Extreme KV Cache Compression (1-3 bit) for LLMs natively on Apple Silicon (MLX). Features TurboQuant, asymmetric PolarQuant caching, and OpenAI server compatibility.
Stars
29
Forks
3
Watchers
29
Open Issues
3
Overall repository health assessment
No package.json found
This might not be a Node.js project
No contributors data available