SulRash
Minimal yet high-performance code for pretraining LLMs. Attempts to implement some SOTA features. Supports training through DeepSpeed, Megatron-LM, and FSDP. Work in progress (WIP).