Deep Reinforcement Learning For Sequence to Sequence Models
Stars: 768
Forks: 161
Watchers: 768
Open Issues: 13
67a8ce4: fixed decoding during intradecoder/temporal training
bdd611a: added avoiding trigrams during decoding, shared output projection matrix, fixed minor bugs
0095a76: added support for intermediate and discounted reward for policy gradient
586104b: added support for intermediate and discounted reward for policy gradient
eae3a3d: added support for intermediate and discounted reward for policy gradient
7e019e8: added support for intermediate and discounted reward for policy gradient
ac94ea6: fixing rouge tensor to return tensor rather than just one value
8e0890a: Update to Python 2.7 & CUDA 9 and TensorFlow 1.10 latest version
d1ae031: Update to Python 2.7 & CUDA 9 and TensorFlow 1.10 latest version
9d5ba41: Update to Python 2.7 & CUDA 9 and TensorFlow 1.10 latest version
35faf80: now you can choose between self-critic or vanilla policy gradient and using discounted reward error for policy training