InternLM/WildClawBench - GitHub Explorer

Stars: 249

Forks: 17

Watchers: 249

Open Issues: 1

Repository Health Score: 65/100 (Fair)

Overall repository health assessment

Score Breakdown

Activity: 30/30 (100%)

Active development - updated this week

Recent Commits

<doc> update LLM evaluation results and model details

Lennox Dai•1 week ago

a0d3b03

docs: update README leaderboard (14 models, 2026-03-27)

mark12ding•1 week ago

cb03b44

fix: include the one-time search case for task 04_10

mark12ding•1 week ago

e3ec8e4

fix: add two additional ground truth paths for google scholar search task

mark12ding•1 week ago

ef1e125

docs: add repository citation metadata

Cooperx521•1 week ago

ac7477b

Use per-run unique ID to prevent parallel run collisions

dingshuangrui•1 week ago

e50cf51

clean run.sh examples

Mark Ding•1 week ago

3f5aad9

Revise the font of the Google Drive link.

Mark Ding•1 week ago

6889f29

refactor: extract hardcoded judge model to JUDGE_MODEL env var

mark12ding•1 week ago

745a8de

<doc> add eval details

Lennox Dai•1 week ago

90b9a91

Update .env

Mark Ding•1 week ago

abc2abd

Fix global summary scoring and optional thinking config.

Cooperx521•1 week ago

59f6be1

Update README with custom model endpoint warning

Mark Ding•1 week ago

e817e5d

Update .env

Xing Long•1 week ago

3a66206

Revert script/run.sh to previous version.

Cooperx521•1 week ago

1c0deb6

Issues Activity: Last 6 months

Hottest Issues