Minimalistic, extremely fast, and hackable researcher's toolbench for GPT models in 307 lines of code. Reaches <3.8 validation loss on wikitext-103 on a single A100 in <100 seconds. Scales to larger models with one parameter change (feature currently in alpha).
Stars: 355
Forks: 27
Watchers: 355
Open Issues: 4
13 commits
1 commits
d848543 make default run parameters a bit more friendly (avoid grad norm soft lock @ 1k iterations).
038af4e Update hlb-gpt to v0.4.0. Darn big update. See patch notes for thread link for more info.
0ce502e Merge pull request #5 from tysam-code/torch_compile_hotfix
fe47f53 Port and update changes as needed from https://github.com/daniel-p-gonzalez/hlb-gpt, as mentioned and discussed in https://github.com/tysam-code/hlb-gpt/pull/4.
1c3e0fb Fix dataset loading and Pytorch 1.x compatibility