llama.cpp fork with additional SOTA quants and improved performance
Stars: 2.0k
Forks: 251
Watchers: 2.0k
Open Issues: 45
5e8bb72 server: support slot save/restore/erase for mtmd tokens and checkpoints (#1584)
73742c5 mtmd: be able to use alternative types for the K*Q multiplication (#1567)
8b575c4 Fix re-quantizing a model using row-interleaved quants (#1561)

799 commits
179 commits
137 commits
98 commits
71 commits
56 commits
42 commits
40 commits
32 commits
29 commits