MartinCrespoC
Run any LLM on any hardware. 130% faster MoE inference with ExpertFlow + TurboQuant KV compression. Ollama-compatible API. Built on llama.cpp.