CenterForOpenScience/llm-benchmarking - GitHub Explorer

Stars: 4

Forks: 1

Watchers: 4

Open Issues: 1

Repository Health Score

45/100

Poor

Overall repository health assessment

Score Breakdown

Activity

Regular updates - updated this month

20/30

67%

Recent Commits

gpt-4o results, 30-39

Bang Nguyen • 1 week ago

48de78e

Merge branch 'main' of github.com:CenterForOpenScience/llm-benchmarking

Dominik Soós • 1 week ago

12052a2

gpt-4o experiments + evaluation 20-29

Dominik Soós • 1 week ago

7e7eb3f

support .sav data loading

Bang Nguyen • 2 weeks ago

e467a95

round 2 data, o3 python result 30-39s

Bang Nguyen • 2 weeks ago

115baef

python experiment + evaluation (20-29) for o3

Dominik Soós • 2 weeks ago

b75db69

readded ItemsList_Final.csv file

Dominik Soós • 2 weeks ago

65f52eb

removed .DS_Stores

Dominik Soós • 2 weeks ago

2d59305

reset original directories 20-29

Dominik Soós • 2 weeks ago

fe87a34

updated initial details for case study 4

Dominik Soós • 2 weeks ago

a573aea

gpt-5 experiment + evaluation results 20-29

Dominik Soós • 2 weeks ago

a6ca47c

Merge branch 'main' of https://github.com/CenterForOpenScience/llm-benchmarking

Bang Nguyen • 3 weeks ago

2e3e3a2

Round 2 data, gpt5 python results 30-39

Bang Nguyen • 3 weeks ago

e545d25

gpt-5 python experiment 20-29

Dominik Soós • 3 weeks ago

15e4d3f

update make eval commands after data reformat

Bang Nguyen • 1 month ago

086015b
