MartinCrespoC
Run any LLM on any hardware. 130% faster MoE inference with ExpertFlow + TurboQuant KV compression. Ollama-compatible API. Built on llama.cpp.