Extend OpenRLHF to support LMM RL training for reproduction of DeepSeek-R1 on multimodal tasks.
Stars: 845
Forks: 53
Watchers: 845
Open Issues: 17
Commits: 464, 308, 239, 69, 38, 31, 23, 22, 12, 9
cfd369b: Remove unused _register_to_autoclass method from Gemma3_Patch to streamline the class implementation.
fef546e: Remove dataset argument from command line options in math verifier and update error handling for unknown prompt templates.
657b10f: Add handling for fake pixel values in Gemma3_Patch to ensure inputs_embeds are updated correctly when no valid image features are available.
aff8ea9: Refactor Gemma3_VLDataProcessor error handling and message formatting
41dab70: Add model parameter filtering in DeepspeedStrategy initialization
89d358c: Add command line option to enable/disable format reward calculation in math verifier
4f4fbc7: Implement liger kernel support in Gemma3 by integrating apply_liger_kernel_to_gemma3 function