Implementation of a GPT-style language model from scratch, following the concepts and code presented in Sebastian Raschka’s Build a Large Language Model From Scratch. This repository includes step-by-step implementations, experiments, and notes to deepen understanding of transformer-based architectures.
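To make the architecture concrete, here is a minimal, hedged sketch of a GPT-style model in PyTorch: token and positional embeddings, a stack of transformer blocks with a causal attention mask, and a linear output head producing next-token logits. All names and sizes (`MiniGPT`, `vocab_size=256`, etc.) are illustrative assumptions, not identifiers from this repository; the sketch uses `nn.TransformerEncoderLayer` for brevity where the book builds the blocks by hand.

```python
import torch
import torch.nn as nn

class MiniGPT(nn.Module):
    """Illustrative GPT-style decoder; not this repository's actual code."""

    def __init__(self, vocab_size=256, ctx_len=64, emb_dim=32, n_heads=4, n_layers=2):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, emb_dim)
        self.pos_emb = nn.Embedding(ctx_len, emb_dim)
        block = nn.TransformerEncoderLayer(
            d_model=emb_dim, nhead=n_heads, dim_feedforward=4 * emb_dim,
            batch_first=True, norm_first=True,
        )
        self.blocks = nn.TransformerEncoder(block, num_layers=n_layers)
        self.out_head = nn.Linear(emb_dim, vocab_size)

    def forward(self, idx):
        b, t = idx.shape
        x = self.tok_emb(idx) + self.pos_emb(torch.arange(t, device=idx.device))
        # Causal mask: True entries are blocked, so each position only
        # attends to itself and earlier positions.
        mask = torch.triu(torch.ones(t, t, dtype=torch.bool, device=idx.device), diagonal=1)
        x = self.blocks(x, mask=mask)
        return self.out_head(x)  # logits of shape (batch, seq_len, vocab_size)

model = MiniGPT()
logits = model(torch.randint(0, 256, (2, 16)))
print(logits.shape)
```

Generation would then sample from `logits[:, -1, :]` and append the chosen token to the context, repeating up to `ctx_len`.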
Stars: 5 · Forks: 0 · Watchers: 5 · Open issues: 0
15 commits:

- 799314a: Added the final training notebook, rewrote the README, and finished the training run and the training images.
- e188dc8: Implemented the data loader and tokenizer along with their tests.
- 8d95ccf: Implemented the loss function and added its test.
- 3c2ff86: Tested each layer file to make sure it runs correctly.
- baa469e: Added the text_loader to read external files from a directory and reorganized the folder structure.
- 2271eb9: Implemented the GPT model that calls all the Transformer blocks to build the GPT; edited the README to reflect the directory changes.
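The data loader and tokenizer mentioned in the commits can be sketched as a sliding-window dataset: each sample is a fixed-length window of token IDs, and its target is the same window shifted one position right. This is a hedged illustration, not the repository's code; a character-level vocabulary stands in for the book's BPE tokenizer, and names like `SlidingWindowDataset` are hypothetical.

```python
import torch
from torch.utils.data import Dataset, DataLoader

class SlidingWindowDataset(Dataset):
    """Illustrative next-token dataset with a character-level tokenizer."""

    def __init__(self, text, max_length=8, stride=4):
        vocab = sorted(set(text))
        self.stoi = {ch: i for i, ch in enumerate(vocab)}  # char -> id
        ids = [self.stoi[ch] for ch in text]
        self.inputs, self.targets = [], []
        # Slide a window over the token stream; the target at each
        # position is the token one step ahead of the input.
        for start in range(0, len(ids) - max_length, stride):
            self.inputs.append(torch.tensor(ids[start : start + max_length]))
            self.targets.append(torch.tensor(ids[start + 1 : start + max_length + 1]))

    def __len__(self):
        return len(self.inputs)

    def __getitem__(self, i):
        return self.inputs[i], self.targets[i]

text = "hello world, this is a tiny corpus for testing the loader."
loader = DataLoader(SlidingWindowDataset(text), batch_size=2, shuffle=False)
x, y = next(iter(loader))
print(x.shape, y.shape)
```

With `stride` smaller than `max_length` the windows overlap, which yields more training samples from the same text at the cost of some redundancy between batches.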