GitHub Explorer

by Alexey Ratnikov

Search Results

Found 1 repository (showing 1)

oolong-pairs

zircote

❤️40

Benchmark harness for A/B testing Claude Code plugins against OOLONG long-context reasoning tasks. Compare truncation vs RLM-RS recursive chunking strategies. Features Claude Code hooks integration, SQLite persistence, and comprehensive scoring aligned with the OOLONG paper methodology.

Python

Updated 1 month ago

ai-evaluation, anthropic, benchmark, +14
