codepawl
PyTorch implementation of TurboQuant: near-optimal vector quantization for KV cache compression and vector search. 3-bit quantization with zero accuracy loss.