ThinkRL is a reinforcement learning from human feedback (RLHF) library that aims to make advanced AI training more widely accessible. Built around a zero-dependency core, it gives researchers and developers RLHF algorithms, reasoning capabilities, and multimodal support in a single platform.
Stars: 7
Forks: 1
Watchers: 7
Open Issues: 0
Commit counts: 212, 25, 14, 9, 3
7068cde test: Add CLI tests for the `star` command to verify argument parsing and parameter passing.
7b01d77 feat: Introduce STaR, GRPO, and PRIME algorithms with associated training infrastructure, datasets, and CLIs, replacing `run_eval.py`.
7c1eae7 Potential fix for code scanning alert no. 178: Unused import
4fb11ed Potential fix for code scanning alert no. 177: Unused import
d91061a Potential fix for code scanning alert no. 181: Unused local variable
47cb108 Potential fix for code scanning alert no. 180: Unused local variable
191baf6 Potential fix for code scanning alert no. 176: Unused import
df0f00d Potential fix for code scanning alert no. 175: Unused import
83be029 Potential fix for code scanning alert no. 179: Unused import
aa8389a feat: Implement STaR (Self-Taught Reasoner) algorithm, trainer, and CLI
fbe3cd9 feat: Add a new CLI command for Group Relative Policy Optimization (GRPO) training.
654bc88 feat: Add ThinkRL CLI with initial commands for training, generation, merging, info, SFT, and GRPO.