Build a Tiny Transformer: a 124M-parameter GPT-2 model
Stars: 2
Forks: 0
Watchers: 2
Open Issues: 0
3e57f2b evaluate epoch 1 model checkpoints loss and perplexity on 1% samples from the FineWeb-edu test set
51a9c31 disabling the torch.backends.cudnn.deterministic doesn't give a 2~3x speed boost
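The evaluation commit reports loss and perplexity on a 1% sample of the test set. As a rough sketch of how those two numbers relate, the snippet below draws a reproducible 1% subset of indices and turns per-token cross-entropy losses into a perplexity. The helper names `sample_indices` and `perplexity`, the seed, and the input format are assumptions made for illustration and are not taken from the repository; the code also assumes losses are measured in nats, which is what PyTorch's cross-entropy returns.

```python
import math
import random


def sample_indices(n_items, fraction=0.01, seed=0):
    """Pick a reproducible random subset of indices (hypothetical helper)."""
    k = max(1, int(n_items * fraction))  # always keep at least one item
    rng = random.Random(seed)            # local RNG, so global state is untouched
    return sorted(rng.sample(range(n_items), k))


def perplexity(token_losses):
    """Return (mean loss, perplexity) for per-token cross-entropy in nats."""
    mean_loss = sum(token_losses) / len(token_losses)
    return mean_loss, math.exp(mean_loss)  # perplexity = exp(mean loss)


if __name__ == "__main__":
    subset = sample_indices(1000)
    print(len(subset), "of 1000 items selected")
    # Placeholder losses, not measured results from any checkpoint.
    loss, ppl = perplexity([math.log(4.0)] * 5)
    print(f"loss={loss:.4f} perplexity={ppl:.4f}")
```

Because perplexity is the exponential of the mean loss, small differences in loss between checkpoints show up as larger differences in perplexity, which is one reason to report both.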