Found 4 repositories (showing 4)
jagsan-cyber
World's first TurboQuant KV cache compression for llama.cpp on AMD ROCm (RX 9070 / gfx1201)
selmand
TurboQuant: Run larger AI models with longer context on your GPU — powered by Google's TurboQuant KV cache compression.
JohnnyDillinger-hub
No description available
thepradip
No description available
All 4 repositories loaded