High-speed Large Language Model Serving for Local Deployment
Stars: 9.3k
Forks: 553
Watchers: 9.3k
Open Issues: 130
Fix segmentation fault for models exceeding 40B on AMD GPUs & optimize mul_mat_axpy operation (#217) (6ae7e06)
add convert-hf-to-powerinfer-gguf.py to CMakeLists.txt (#205) (61cac9b)
Commit counts: 401, 81, 60, 59, 49, 38, 38, 32, 26, 23