EdwinSJ
Performed supervised fine-tuning (SFT) on Llama 3.1 8B using HH-RLHF, and ranked 10K responses with Llama 3.1 70B to build a safety-optimized dataset.