RL training environments with verifiable rewards for coding agents. Works with TRL, Unsloth, verl, OpenRLHF.
Stars: 34
Forks: 1
Watchers: 34
Open Issues: 0
Commits: 20
88de371 Harden benchmark loading and mixed batch routing for imported suites
d135a99 Restore Python 3.10 compatibility for the 0.2.0 release lane
e081105 Unblock DeepGym release CI by matching the enforced Ruff policy
7b4f1a7 Make DeepGym releases publishable from the actual repo root
6b36dba Cut a 0.2.0 release line for benchmark-backed training
95fab5e Enable benchmark-backed reward routing for repo and terminal tasks
aa583e3 feat: add Axolotl integration with PRM data generation
ef10d61 docs: position README against current RL code training landscape
f71fac1 docs: replace Mermaid diagrams with ASCII art in README
b0a3344 docs: rewrite README with Mermaid diagrams, add wiki, update .gitignore