Found 2 repositories (showing 2)
lucidrains
Implementation of Bottleneck Transformer in PyTorch
allenkchau
A from-scratch implementation of a tiny Transformer inference engine in PyTorch, built to study prefill vs. decode performance, attention mechanics, and GPU bottlenecks. The project focuses on correctness and profiling-driven optimization rather than training.