An open-source library for hardware-aware model optimization. It learns soft gates to prune and fine-tune neural networks for specific GPUs, balancing latency against accuracy. It includes tools for latency-driven training, export with kernel-aligned sizes, and integration with a Hugging Face model zoo.
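The ideas in the description (soft gates, a latency term traded off against accuracy, and kernel-aligned export sizes) can be illustrated with a minimal sketch in plain Python. This is not the library's API: every function name, the sigmoid gate form, the per-channel latency costs, the 0.5 threshold, and the multiple of 8 are assumptions chosen for illustration only.

```python
import math

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def expected_latency(gate_logits, per_channel_latency_ms):
    """Latency proxy: each channel's cost weighted by its soft gate value.

    The per-channel costs are hypothetical; in practice they would come
    from measurements on the target GPU.
    """
    return sum(sigmoid(g) * c for g, c in zip(gate_logits, per_channel_latency_ms))

def total_loss(task_loss, gate_logits, per_channel_latency_ms, lam):
    """Task loss plus a latency penalty; lam (assumed) sets the trade-off."""
    return task_loss + lam * expected_latency(gate_logits, per_channel_latency_ms)

def aligned_keep_count(gate_logits, multiple=8, threshold=0.5):
    """Count gates above the threshold, then round up to a multiple.

    The result is capped at the layer width, so a layer whose width is
    not itself a multiple may end up unaligned.
    """
    kept = sum(1 for g in gate_logits if sigmoid(g) > threshold)
    rounded = math.ceil(kept / multiple) * multiple
    return min(rounded, len(gate_logits))

def select_channels(gate_logits, multiple=8, threshold=0.5):
    """Indices of the channels to keep, highest gate logits first."""
    k = aligned_keep_count(gate_logits, multiple, threshold)
    order = sorted(range(len(gate_logits)), key=lambda i: gate_logits[i], reverse=True)
    return sorted(order[:k])

if __name__ == "__main__":
    # 16 channels, 5 of which have gates above 0.5.
    logits = [2.0, 1.5, 1.0, 0.5, 0.1] + [-1.0] * 11
    print(aligned_keep_count(logits))
    print(select_channels(logits))
```

Rounding the kept-channel count up rather than down keeps every channel whose gate passed the threshold, at the cost of retaining a few that did not; a real exporter might instead pick whichever aligned size measures fastest on the target device.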
Stars: 1
Forks: 0
Watchers: 1
Open Issues: 0
Commits: 34