Tiny inference-only implementation of LLaMA
Stars: 92
Forks: 8
Watchers
Open Issues: 1
Commits
342412f  Implement Fairscale parallelism
1f7e8c5  Add Readme
507103c  Generation with caching
0da8109  Working generation.
83115f8  Feed forward layer
3789e14  Attention layer
e631ae6  Use memmap
db75286  Torch loader