An Efficient and User-Friendly Scaling Library for Reinforcement Learning with Large Language Models
Stars: 3.1k
Forks: 269
Watchers: 3.1k
Open Issues: 95
a49a915 fix: use Cluster instead of WorkerConfig for dynamic batching dp_size
b39681b remove IPA config yaml not needed for OpenReward integration
52e0978 fix: disable reward normalization for SWE configs with group_size=1
9c6ce5c fix: MultipleChoiceBoxedRuleRewardWorker returns a zero reward
156 commits
33 commits
31 commits
25 commits
25 commits
17 commits
14 commits
14 commits
11 commits
10 commits