Found 4 repositories (showing 4)
jagsan-cyber
World's first TurboQuant KV cache compression for llama.cpp on AMD ROCm (RX 9070 / gfx1201)
selmand
TurboQuant: Run larger AI models with longer context on your GPU — powered by Google's TurboQuant KV cache compression.
JohnnyDillinger-hub
No description available
thepradip
No description available
All 4 repositories loaded