A lightweight LLM inspired by Qwen3, built from scratch in PyTorch. The repository provides a full training pipeline and the transformer components it depends on: RMSNorm, Rotary Position Embeddings (RoPE), Grouped-Query Attention (GQA), and SwiGLU layers. Training uses a hybrid Muon + AdamW optimizer with causal masking and efficient batching, and evaluation tools are included.
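To make three of the components named above concrete, here is a minimal, dependency-free sketch of the arithmetic behind RMSNorm, GQA head sharing, and causal masking. It is written in plain Python rather than PyTorch so it runs anywhere, and it is not taken from the repository: the function names (`rms_norm`, `kv_head_for_query_head`, `causal_mask`), the `eps` default, and the "consecutive query heads share one key/value head" grouping are illustrative assumptions based on how these techniques are commonly implemented.

```python
import math


def rms_norm(x, weight, eps=1e-6):
    """Divide x by its root-mean-square, then apply a per-element gain.

    Hypothetical helper: the repository's own RMSNorm may differ in details.
    """
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    return [v * inv_rms * w for v, w in zip(x, weight)]


def kv_head_for_query_head(q_head, n_q_heads, n_kv_heads):
    """Return the key/value head index used by a given query head.

    Assumes the common GQA layout where consecutive query heads form a group
    that shares one key/value head.
    """
    if n_q_heads % n_kv_heads != 0:
        raise ValueError("n_q_heads must be divisible by n_kv_heads")
    group_size = n_q_heads // n_kv_heads
    return q_head // group_size


def causal_mask(seq_len):
    """mask[i][j] is True where position i may attend to position j (j <= i)."""
    return [[j <= i for j in range(seq_len)] for i in range(seq_len)]
```

With 8 query heads and 2 key/value heads, for example, `kv_head_for_query_head` assigns query heads 0 to 3 to key/value head 0 and query heads 4 to 7 to key/value head 1, which is where GQA's reduction in key/value cache size comes from.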
Stars: 3
Forks: 1
Watchers: 3
Open Issues: 0
Commits: 4