Ultra-lightweight C++ inference engine for BitMamba-2 (1.58-bit SSM). Runs 1B models on consumer CPUs at 50+ tok/s using <700MB RAM. No heavy dependencies.
Stars: 14
Forks: 3
Watchers: 14
Open Issues: 1
19 commits
5a5bebc  Update README to specify current directory for tokenizer.bin
a01fa8c  Merge branch 'main' of github.com:Zhayr1/bitmamba.cpp