Tiny C++ LLM inference implementation from scratch
Stars: 110
Forks: 15
Watchers
Open Issues: 2
64 commits
deps: Upgrade TinyTorch
098f296
docs: Update readme
a903d6f
aae5624
feat: Enable flash attention
35650be
refactor: Refactor tokenizer interface
bdbd912
c5a0f51
97b2392
b074ef7
feat: Add support for qwen3 models
0978c27
feat: batch inference
242451f
feat: Add support for qwen3 tokenizer
ee2dfdd
fix: Fix 'lm_head' module names
3607ce0
b8445d8
feat: Add support for Mistral models
0368c4c
c88ce8c