Hasin-Al
This is an LLM that uses decoupled RoPE, Multi-Head Latent Attention, and Transformer blocks with both pre- and post-normalization, together with a Mixture of Experts (MoE). The basic idea is to build an LLM from scratch.