Optimized local inference for LLMs with HuggingFace-like APIs for quantization, vision/language models, multimodal agents, speech, vector DB, and RAG.
Stars: 366
Forks: 64
Watchers: 366
Open Issues: 64
Overall repository health assessment
No package.json found; this might not be a Node.js project.