Stars: 0
Forks: 0
Watchers: 0
Open Issues: 0
Overall repository health assessment
No package.json found, so this might not be a Node.js project.
96 commits
9f65633 grpo: load SFT adapter from HF Hub (zcheng256/TermyLambda_sft)
0aff916 readme: document Dr.GRPO max_completion_length tuning and temperature=1.0
980e872 grpo: set rollout temperature=1.0, reduce max_completion_length to 8192
08387b8 grpo_lora_mt: reduce max_completion_length 28672→8192, 2 epochs
70c5bcc readme: add MT-GRPO data curation pipeline documentation
213348e data curation: GPT-5-mini task analysis + quality-filtered 100-task dataset
ff0def5 multi_turn_grpo: add efficiency bonus, support PEFT adapter resume
e720b50 multi_turn_grpo: remove mask_truncated_completions for multi-turn
250395b readme: add nemotron base TB2 results (7.87%) and cross-model comparison
58160c2 multi_turn_grpo: add turn_rewards, expand tests, update trainer and config
c4bbda3 multi_turn_grpo: fix cache_position, tensor dtypes; align params with eval
97b4260 multi_turn_grpo: fix num_items_in_batch type + add smoke config
facdc0e readme: update max_output_tokens to 8192 in eval command