Mastering Large Language Models: Build Your Own LLM from Scratch
Stars: 206
Forks: 29
Watchers: 206
Open Issues: 10
15 commits
9b5bafc: Implement top-p filtering and enhance the generate method in UstaModel for improved token sampling. Adjust parameters for temperature, top_k, and top_p to refine output generation process.
bfec4f3: Update execution counts in module_3_2.ipynb for consistent notebook flow and enhance model evaluation output with updated training metrics. Adjust model loading to include device specification for improved compatibility.
e367670: Implement automatic model download in app.py and update requirements.txt for package version compatibility. Replace local model path with a download method for u_model.pth and upgrade Gradio and Torch versions.
7119c19: Update model path in app.py for consistency and adjust execution counts in demo.ipynb for accurate notebook flow. Revise readme.md to reflect course title change and expand course details, including learning outcomes and module descriptions.
34cc940: Add Gradio interface for Usta Model, including model loading, chat functionality, and example prompts. Create README_HF.md for Hugging Face integration and update requirements.txt. Remove unused files and refactor tokenizer and model structure for improved performance.
75ede35: Enhance module_3_2.ipynb by adding model evaluation code, updating training epochs to 1,000,000, and implementing model saving/loading functionality. Update readme.md to include a link for running the model in Colab. Refactor UstaModel to utilize UstaEmbedding for improved embedding management.
4303bdb: Update module_pytorch_train.ipynb to correct training data indexing, adjust model architecture by simplifying the neural network layers, and enhance output logging with updated accuracy and loss metrics for improved training feedback.
ae32def: Update .gitignore to exclude 'data/' directory and enhance module_3_1.ipynb with additional code cells for model loading and tokenization examples, including execution count adjustments and output updates.
a8d5742: Refactor UstaModel to incorporate UstaDecoderBlock and update architecture to support multiple layers and linear output head
016a713: Refactor UstaCausalAttention to UstaMultiHeadAttention and update UstaModel to utilize multi-head attention mechanism with context length support
29c1061: Add UstaCausalAttention class and update model to use causal attention mechanism
639d4b3: Refactor text_dataset.py to include DataLoader and remove tokenizer.py
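
The most recent commit in the log (9b5bafc) mentions temperature, top_k, and top_p as sampling parameters of the generate method. The repository's actual code is not shown here, so the following is only a minimal, framework-free sketch of how these three controls are commonly combined when picking the next token. The function name `sample_next_token`, its signature, and the choice to apply top-p to the original (not renormalized) probabilities after top-k are all assumptions made for illustration, not the UstaModel implementation.

```python
import math
import random


def sample_next_token(logits, temperature=1.0, top_k=None, top_p=None, rng=random):
    """Hypothetical helper: sample one token id from raw logits."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")

    # Temperature scaling: values below 1 sharpen the distribution,
    # values above 1 flatten it.
    scaled = [x / temperature for x in logits]

    # Numerically stable softmax over the scaled logits.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Rank token ids from most to least probable.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)

    # Top-k: keep only the k most probable tokens (at least one).
    if top_k is not None:
        ranked = ranked[:max(1, top_k)]

    # Top-p (nucleus): keep the smallest prefix of the ranking whose
    # cumulative probability reaches top_p.
    if top_p is not None:
        kept, cumulative = [], 0.0
        for i in ranked:
            kept.append(i)
            cumulative += probs[i]
            if cumulative >= top_p:
                break
        ranked = kept

    # random.choices treats the weights as relative, so the surviving
    # probabilities are effectively renormalized here.
    weights = [probs[i] for i in ranked]
    return rng.choices(ranked, weights=weights, k=1)[0]


if __name__ == "__main__":
    example_logits = [2.0, 1.0, 0.5, -1.0]  # made-up values for illustration
    print(sample_next_token(example_logits, temperature=0.8, top_k=3, top_p=0.9))
```

With `top_k=1` the sketch reduces to greedy decoding, since only the most probable token survives the filter. A tensor-based version in PyTorch would follow the same steps but operate on batches.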