Train the smallest LM you can that fits in 16MB. Best model wins!
Stars: 4.7k · Forks: 3.0k · Watchers: 4.7k · Open Issues: 970
Top contributors (names not captured) by commit count: 33, 20, 11, 4, 4, 2, 2, 2, 2, 2
Recent commits:
2443851  Merge pull request #1019 from abaybektursun/record/ar-selfgen-gptq-xsa-bigramhash3072 (Record: AR Self-Gen GPTQ + XSA-all + BigramHash 3072×112 — val_bpb 1.11473, 3-seed mean)
d7fbe3d  Non-record: Depth Recurrence in Parameter-Constrained Transformers — What Works, What Doesn't, and Why (#363)
50390d6  Record Submission: 1.1570 BPB - 73.7M Ternary U-Net (10L 768d 8192BPE relu² 4xMLP FP8) (#640)
69bc84e  Notable Non-Record Submission: 1.1239 BPB - 106.2M Binary Asymmetric U-Net + NeoMuon + 4xrelu²MLP + Smear + Fact Tied Emb + Poly5 Softcap + YaRN2048 + 8192BPE + FP8 + Bit-packing LZMA + Stride-16 Eval - 2h (#641)
9855688
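The 16MB budget from the tagline is tight enough that weight packing matters, which is why submissions above mention ternary/binary weights and bit-packing. As a back-of-envelope sketch (my assumptions, not the repo's official rules: a 16 MiB budget of 16×2²⁰ bytes counting only packed weights), here is why the 73.7M-parameter ternary model from commit 50390d6 needs sub-2-bit packing to fit:

```python
import math

PARAMS = 73_700_000   # parameter count from the ternary U-Net record submission
BUDGET = 16 * 2**20   # assumed 16 MiB budget in bytes

# Naive encoding: 2 bits per ternary weight {-1, 0, +1}
naive_bytes = PARAMS * 2 / 8

# Information-theoretic floor: log2(3) ≈ 1.585 bits per ternary weight,
# approachable in practice with arithmetic coding or generic compressors (LZMA)
packed_bytes = PARAMS * math.log2(3) / 8

print(f"2-bit packing:   {naive_bytes / 2**20:.1f} MiB")   # → 17.6 MiB (over budget)
print(f"log2(3) packing: {packed_bytes / 2**20:.1f} MiB")  # → 13.9 MiB (fits)
```

Under these assumptions, a plain 2-bit encoding overshoots the budget while entropy-level packing clears it, which is consistent with the "Bit-packing LZMA" trick appearing in the submission titles.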