LLaMA-TRL: Fine-tuning LLaMA with PPO and LoRA
Stars: 240 · Forks: 24 · Watchers: 240 · Open Issues: 8
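The project's core recipe is RLHF-style PPO fine-tuning of LLaMA with LoRA adapters. The sketch below shows how that combination typically looks with the `trl` and `peft` libraries; the checkpoint name, hyperparameters, and fixed reward are placeholder assumptions rather than this repository's actual configuration, and it assumes the classic `PPOTrainer` API from older `trl` releases.

```python
# Minimal sketch of PPO fine-tuning with a LoRA adapter, using the classic
# trl PPOTrainer API together with peft. The checkpoint name, hyperparameters,
# and the hard-coded reward are placeholders, not this repo's actual setup.
import torch
from peft import LoraConfig
from transformers import AutoTokenizer
from trl import AutoModelForCausalLMWithValueHead, PPOConfig, PPOTrainer

model_name = "huggyllama/llama-7b"  # placeholder LLaMA checkpoint

# LoRA on the attention projections, a common choice for LLaMA.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    bias="none",
    task_type="CAUSAL_LM",
)

tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token

# Policy = base model + value head + LoRA adapter. With a peft model,
# ref_model=None lets trl reuse the frozen base weights as the reference.
model = AutoModelForCausalLMWithValueHead.from_pretrained(
    model_name, peft_config=lora_config
)

ppo_config = PPOConfig(learning_rate=1.4e-5, batch_size=1, mini_batch_size=1)
ppo_trainer = PPOTrainer(
    config=ppo_config, model=model, ref_model=None, tokenizer=tokenizer
)

# One PPO step: query -> sampled response -> scalar reward.
query = tokenizer("How do I bake bread?", return_tensors="pt").input_ids[0]
full = ppo_trainer.generate(query, max_new_tokens=32, do_sample=True)
response = full.squeeze()[query.shape[0]:]  # keep only the generated tokens
reward = torch.tensor(1.0)  # in practice, scored by the trained reward model
stats = ppo_trainer.step([query], [response], [reward])
```

In the full pipeline, the scalar reward would come from a separately trained reward model, which is the subject of commit f280755 in the history below.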
Recent commits:
- `53c7735` Merge pull request #8 from jasonvanf/codex/locate-and-fix-an-important-bug
- `f280755` Increase the flexibility of parameters in the training reward model
- `397e700` Keep the maximum length consistent with 'seq_length' when setting up sft_trainer
- `68f61bf` Support full weight fine-tuning with DeepSpeed stage-3 (offload)
- `ea9ee75` Add `warmup_ratio` and `save_total_limit` argument settings
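Two of these commits (`68f61bf`, `ea9ee75`) concern trainer configuration. As a rough illustration, assuming the standard `transformers` `TrainingArguments` API, the corresponding settings could look like the sketch below; all concrete values are placeholders, not the repository's actual config, and running it requires the `deepspeed` package.

```python
# Hypothetical settings illustrating the commit messages above; the repo's
# actual training scripts and config values may differ.
from transformers import TrainingArguments

# ZeRO stage-3 with CPU offload, enabling full-weight fine-tuning when the
# model does not fit in GPU memory (commit 68f61bf). TrainingArguments
# accepts this config dict directly via its `deepspeed` argument; "auto"
# fields are filled in from the TrainingArguments values at trainer init.
ds_zero3_offload = {
    "zero_optimization": {
        "stage": 3,
        "offload_optimizer": {"device": "cpu"},
        "offload_param": {"device": "cpu"},
    },
    "train_batch_size": "auto",
    "train_micro_batch_size_per_gpu": "auto",
    "gradient_accumulation_steps": "auto",
}

training_args = TrainingArguments(
    output_dir="./checkpoints",
    num_train_epochs=3,
    per_device_train_batch_size=4,
    warmup_ratio=0.03,    # LR warmup over the first 3% of steps (ea9ee75)
    save_total_limit=3,   # keep only the 3 most recent checkpoints (ea9ee75)
    deepspeed=ds_zero3_offload,
)
```

Commit `397e700` similarly suggests that the SFT stage passes the script's 'seq_length' setting through as the trainer's maximum sequence length, so that tokenization and training agree on a single length.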