Use PEFT or full-parameter training to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3.5, DeepSeek-R1, GLM-5, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Llava, Phi4, ...) (AAAI 2025).
Stars: 13.6k
Forks: 1.3k
Watchers: 13.6k
Open Issues: 1.0k
Commit counts (associated names not preserved in the source): 1.8k, 572, 443, 56, 23, 18, 16, 13, 13, 12
Recent commits:
03b5297 [megatron] fix vit_attn_impl megatron (compat mcore-bridge) (#9019)
662aa31 Fix NPU device mismatch in Megatron DPO/RLHF training (#9014)
a635d8b [bugfix] Fix mm token type ids (transformers 5.5.0 infer); update json.dumps ensure_ascii (#9015)