Stars: 3 | Forks: 1 | Watchers: 3 | Open Issues: 0
Overall repository health assessment: no package.json found, so this is likely not a Node.js project.
19 commits
b3b8bfb  feat: :sparkles: add Zero3Layer class for efficient parameter management in distributed training
660d881  feat: add LoRA modules with linear and embedding layers for efficient parameterization
a35351b  feat: add FeedbackAttention module with support for key-value precomputation and positional embeddings
85a0a42  feat: add GLU variants implementation with Tiny Shakespeare dataset and training framework
d47fe7a  feat: :zap: add GPT model implementation with custom optimizer and training configurations
6c95195  feat: add CompressiveTransformer and related classes for enhanced memory compression in transformer models
48e1bd9  feat: add BERTChunkEmbeddings and RetroIndex for enhanced text processing and embedding retrieval
3010ef5  feat: :sparkles: implement Attention with Linear Biases (ALiBi) for input length extrapolation
9fe182f  feat: add Rotary Position Embedding and RotaryPEMultiHeadAttention classes
96a9563  feat: add Relative Multi-Headed Attention implementation with shift functionality
91597ad  feat: enhance TransformerXL with improved forward method and layer normalization
b54585f  docs: add reference link for Transformer XL attention span explanation
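Commit 3010ef5 implements Attention with Linear Biases (ALiBi). The repository's own code is not shown here, so the following is only a minimal sketch of the published ALiBi idea, assuming the standard formulation: each attention head gets a fixed slope from a geometric sequence, and attention logits are penalized linearly by query-key distance.

```python
def alibi_slopes(n_heads):
    # Per-head slopes from the ALiBi paper: for n_heads a power of two,
    # the slopes are 2^(-8*i / n_heads) for i = 1 .. n_heads.
    start = 2.0 ** (-8.0 / n_heads)
    return [start ** i for i in range(1, n_heads + 1)]

def alibi_bias(seq_len, slope):
    # Causal bias added to attention logits before softmax:
    # position i attends to j <= i with penalty -slope * (i - j).
    return [[-slope * (i - j) for j in range(i + 1)] for i in range(seq_len)]
```

Because the bias depends only on distance, a model trained at one sequence length can extrapolate to longer inputs, which matches the commit message's "input length extrapolation".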
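Commit 9fe182f adds Rotary Position Embedding (RoPE) classes. Without the repository source, here is a hedged sketch of the standard RoPE transform, assuming a NumPy implementation: consecutive feature pairs are rotated by position-dependent angles, so relative position is encoded in the inner product of rotated queries and keys.

```python
import numpy as np

def rotary_embed(x, base=10000.0):
    # x: (seq_len, dim) with even dim. Rotate each pair (x[2i], x[2i+1])
    # at position p by angle p * base^(-2i / dim).
    seq_len, dim = x.shape
    half = dim // 2
    inv_freq = base ** (-np.arange(half) * 2.0 / dim)         # (half,)
    angles = np.arange(seq_len)[:, None] * inv_freq[None, :]  # (seq_len, half)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, 0::2], x[:, 1::2]
    out = np.empty_like(x)
    out[:, 0::2] = x1 * cos - x2 * sin
    out[:, 1::2] = x1 * sin + x2 * cos
    return out
```

The rotation is norm-preserving and leaves position 0 unchanged; in a full RotaryPEMultiHeadAttention module it would be applied to queries and keys before the attention score computation.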
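Commit 660d881 adds LoRA modules with linear and embedding layers. The repository's classes are not shown here, so this is only an illustrative sketch of the LoRA idea for the linear case, with all names hypothetical: the frozen base weight is augmented by a trainable low-rank update B @ A, scaled by alpha / r, with B initialized to zero so training starts from the base model exactly.

```python
import numpy as np

class LoRALinear:
    # Hypothetical minimal LoRA linear layer (not the repo's class):
    # y = x @ W.T + (x @ A.T) @ B.T * (alpha / r)
    def __init__(self, in_features, out_features, r=4, alpha=8, seed=0):
        rng = np.random.default_rng(seed)
        self.weight = rng.standard_normal((out_features, in_features)) * 0.02  # frozen base weight
        self.A = rng.standard_normal((r, in_features)) * 0.01  # low-rank factor A (trainable)
        self.B = np.zeros((out_features, r))  # B starts at zero, so the LoRA path is initially a no-op
        self.scale = alpha / r

    def __call__(self, x):
        base = x @ self.weight.T
        lora = (x @ self.A.T) @ self.B.T * self.scale
        return base + lora
```

Only A and B would be updated during fine-tuning, which is what makes the parameterization "efficient" in the commit message's sense: r * (in + out) trainable values instead of in * out.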