User-friendly implementation of Mixture-of-Sparse-Attention (MoSA). MoSA selects a distinct set of tokens for each attention head using expert-choice routing, which yields a content-based sparse attention mechanism.
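To make the description concrete, here is a minimal pure-Python sketch of per-head token selection with expert-choice routing. It is not this repository's code or API: the names `sparse_head` and `topk_indices`, the linear router, and the use of raw token vectors as queries, keys, and values (no learned projections) are all assumptions made for illustration.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def topk_indices(scores, k):
    """Indices of the k largest scores, returned in sequence order."""
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    return sorted(top)


def sparse_head(tokens, router_w, k):
    """One hypothetical sparse head: pick k tokens, attend only among them.

    tokens:   list of T vectors, each a list of d floats
    router_w: list of d floats, this head's router weights (assumed linear)
    Returns (selected indices, {index: output vector}).
    """
    # Expert-choice routing: the head scores every token and keeps its own top-k.
    scores = [dot(tok, router_w) for tok in tokens]
    selected = topk_indices(scores, k)

    d = len(tokens[0])
    outputs = {}
    for i in selected:
        # Attention is computed over the k selected tokens only, not all T.
        logits = [dot(tokens[i], tokens[j]) / math.sqrt(d) for j in selected]
        weights = softmax(logits)
        outputs[i] = [
            sum(w * tokens[j][c] for w, j in zip(weights, selected))
            for c in range(d)
        ]
    return selected, outputs


if __name__ == "__main__":
    rng = random.Random(0)
    T, d, num_heads, k = 8, 4, 3, 3
    tokens = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(T)]
    for h in range(num_heads):
        # Each head has its own router, so each head can select different tokens.
        router_w = [rng.gauss(0, 1) for _ in range(d)]
        selected, _ = sparse_head(tokens, router_w, k)
        print(f"head {h} attends over tokens {selected}")
```

Because each head attends over only k of the T tokens, the attention cost per head in this sketch scales with k² rather than T², and the selection depends on token content through the router scores.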
Stars: 28
Forks: 4
Watchers: 28
Open Issues: 1
Commits: 3