This repository is based on material from the book Build a Large Language Model (From Scratch).
Stars: 1
Forks: 0
Watchers: 1
Open Issues: 0
25 commits
ec85826  Extend with more tests for the create dataloader function to showcase different strides/max_length
ad10e29  Implement the GPTDatasetV1 and the create_dataloader_v1 method.
5aef635  torch will need numpy, therefore add that to the project
0c345e9  Create a test for the exercise 2.1 byte pair encoding of unknown words
015b001  Update the creation of the vocabulary so it will add the two special tokens, <|endoftext|> and <|unknown|>, to the list
f9ffaed  Create a test that shows what happens when a word is missing from the vocabulary
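
The commits on GPTDatasetV1 and create_dataloader_v1 mention varying the stride and max_length. As a rough illustration of what those two parameters control, here is a minimal pure-Python sketch of sliding-window sampling over a token sequence. It is not the repository's implementation: the function name `sliding_windows` is hypothetical, and the integer list stands in for real tokenizer output.

```python
def sliding_windows(token_ids, max_length, stride):
    """Return (input, target) pairs; each target is its input shifted by one token.

    Hypothetical helper for illustration only, not the repository's GPTDatasetV1.
    """
    pairs = []
    # Stop early enough that every target window still has max_length tokens.
    for start in range(0, len(token_ids) - max_length, stride):
        inputs = token_ids[start:start + max_length]
        targets = token_ids[start + 1:start + max_length + 1]
        pairs.append((inputs, targets))
    return pairs


# Stand-in for tokenizer output (assumption: real IDs would come from a tokenizer).
token_ids = list(range(10))

# stride=1 gives heavily overlapping windows; stride=max_length gives disjoint ones.
overlapping = sliding_windows(token_ids, max_length=4, stride=1)
disjoint = sliding_windows(token_ids, max_length=4, stride=4)
```

A smaller stride produces more, overlapping samples from the same text, while a stride equal to max_length avoids overlap between input windows.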