Found 11,810 repositories (showing 30)
deepseek-ai
DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding
deepseek-ai
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
vllm-project
System Level Intelligent Router for Mixture-of-Models at Cloud, Data Center and Edge
togethercomputer
Together Mixture-of-Agents (MoA): 65.1% on AlpacaEval with OSS models
PKU-YuanGroup
[TMM 2025] Mixture-of-Experts for Large Vision-Language Models
MoonshotAI
MoBA: Mixture of Block Attention for Long-Context LLMs
openai
Code for the paper "PixelCNN++: A PixelCNN Implementation with Discretized Logistic Mixture Likelihood and Other Modifications"
deepseek-ai
DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models
XueFuzhao
A family of open-sourced Mixture-of-Experts (MoE) Large Language Models
alelievr
Mixture is a powerful node-based tool crafted in Unity to generate all kinds of textures in real time
XueFuzhao
A collection of AWESOME things about mixture-of-experts
davidmrau
PyTorch Re-Implementation of "The Sparsely-Gated Mixture-of-Experts Layer" by Noam Shazeer et al. https://arxiv.org/abs/1701.06538
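The entry above re-implements Shazeer et al.'s sparsely-gated MoE layer. As a rough sketch of its core idea, here is top-k gating in plain Python: only the k highest-scoring experts run, and their outputs are mixed with softmax weights renormalized over just those k. The scalar toy experts and the names `topk_gate`/`moe` are inventions for illustration, not the repo's API.

```python
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def topk_gate(logits, k):
    """Sparsely-gated routing: keep the top-k gate logits and
    renormalize with a softmax over only those experts."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    probs = softmax([logits[i] for i in top])
    return dict(zip(top, probs))

# Toy "experts": scalar functions standing in for expert networks.
experts = [lambda x: x + 1, lambda x: 2 * x, lambda x: x * x, lambda x: -x]

def moe(x, gate_logits, k=2):
    gates = topk_gate(gate_logits, k)
    # Only the selected experts are evaluated -- the "sparse" in sparse MoE.
    return sum(w * experts[i](x) for i, w in gates.items())
```

In the real layer the gate logits come from a learned linear projection of the token, and an auxiliary load-balancing loss keeps routing from collapsing onto a few experts.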
MoonshotAI
Kimi-VL: Mixture-of-Experts Vision-Language Model for Multimodal Reasoning, Long-Context Understanding, and Strong Agent Capabilities
allenai
OLMoE: Open Mixture-of-Experts Language Models
pjlab-sys4nlp
LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training (EMNLP 2024)
microsoft
Tutel MoE: Optimized Mixture-of-Experts Library, supporting GptOss/DeepSeek/Kimi-K2/Qwen3 with FP8/NVFP4/MXFP4
melange-re
A mixture of tooling combined to produce JavaScript from OCaml & Reason
Time-MoE
[ICLR 2025 Spotlight] Official implementation of "Time-MoE: Billion-Scale Time Series Foundation Models with Mixture of Experts"
skapadia3214
Mixture of Agents using Groq
lucidrains
A PyTorch implementation of Sparsely-Gated Mixture of Experts, for massively increasing the parameter count of language models
hongyangAndroid
Supports mixed text-and-image layout on Android, with text wrapping around images
hardmaru
Multilayer LSTM and Mixture Density Network for modelling path-level SVG Vector Graphics data in TensorFlow
AviSoori1x
From scratch implementation of a sparse mixture of experts language model inspired by Andrej Karpathy's makemore :)
drawbridge
A TensorFlow Keras implementation of "Modeling Task Relationships in Multi-task Learning with Multi-gate Mixture-of-Experts" (KDD 2018)
hardmaru
Generative Handwriting using LSTM Mixture Density Network with TensorFlow
codecaution
A curated reading list of research in Mixture-of-Experts (MoE).
ldeecke
Gaussian mixture models in PyTorch.
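The entry above fits Gaussian mixtures in PyTorch; as a rough illustration of the underlying EM algorithm, here is a toy one-dimensional version in plain Python (not the repo's API; `em_gmm` and its crude min/max initialization are made up for this sketch).

```python
import math

def gauss_pdf(x, mu, var):
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def em_gmm(data, k=2, iters=50):
    """Fit a 1-D Gaussian mixture by expectation-maximization."""
    mus = [min(data), max(data)]          # crude init at the extremes (k=2)
    vars_ = [1.0] * k
    pis = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in data:
            w = [pis[j] * gauss_pdf(x, mus[j], vars_[j]) for j in range(k)]
            s = sum(w)
            resp.append([wj / s for wj in w])
        # M-step: re-estimate weights, means, variances.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            pis[j] = nj / len(data)
            mus[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            vars_[j] = sum(r[j] * (x - mus[j]) ** 2
                           for r, x in zip(resp, data)) / nj
            vars_[j] = max(vars_[j], 1e-6)  # guard against variance collapse
    return pis, mus, vars_
```

A PyTorch version vectorizes both steps over a batch and works in log-space for numerical stability, but the E/M alternation is the same.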
raymin0223
Mixture-of-Recursions: Learning Dynamic Recursive Depths for Adaptive Token-Level Computation (NeurIPS 2025)
jsn5
DanceNet: Dance generator using Autoencoder, LSTM and Mixture Density Network (Keras)
shiimizu
Tiled Diffusion, MultiDiffusion, Mixture of Diffusers, and optimized VAE