Community-maintained hardware plugin for vLLM on Ascend
Stars: 1.9k
Forks: 1.0k
Watchers: 1.9k
Open Issues: 1.6k
0fccd72 [Feature] Optimize host-device sync problem in prefill phase for Qwen3Next/Qwen3.5 (#7967)
0d768aa [Ascend950][quant][Feature] Add W4A4 MXFP4 quantization support for Ascend950 (#7877)
0b4bafc [BugFix][xlite] Clamp block_size to 128 for xlite graph compatibility (#7943)
223f647 [CI][Lint] Restrict python-init check to tracked package directories (#7939)
2599328 [CI] add nightly Qwen3.5_27B_w8a8;MiniMax-M2.5-w8a8;Qwen3.5-397B-w8a8-mtp (#7745)
9afc389 [Feature] Support Flash Comm V1 for Qwen3-VL models (#7897)
b68212e [Feature] [310p] Add recurrent_gated_delta_rule_310 AscendC Custom Op (#7926)
d39031b [Doc][Misc] Add GLM5 to supported model list and update deployment document for GLM5 (#7958)
aa04fa5 [Performance][model_runner_v2]:optimize the performance of the _ranks_kernel and _min_p_kernel (#7767)
Commit counts: 276, 185, 121, 110, 84, 84, 56, 50, 47, 47