A high-performance inference engine for LLMs, optimized for diverse AI accelerators.
Stars: 1.2k
Forks: 171
Watchers: 1.2k
Open Issues: 155
Commits per contributor (top 10): 82, 80, 64, 59, 47, 46, 43, 42, 39, 26
bugfix: fix DeepSeek-3.2 failures when ACL Graph is enabled. (#1172) fb683db
docs: add bilingual generative recommendation design docs. (#1182) 535b3bc
feat: add onerec in supported model docs and align rec utility style. (#1055) 990393f
feat: adapt MooncakeTransferEngine for AscendDirectTransport. (#1201) b3e1c16
refactor: rename .agent to .agents and refine AGENTS.md. (#1208) 918f99c
bugfix: fix qwen3.5 gated delta net conv state indices for acl graph[5/N]. (#1171) ae9c62d