Benchmark harness for A/B testing Claude Code plugins on OOLONG long-context reasoning tasks. Compares truncation and RLM-RS recursive chunking strategies. Features Claude Code hooks integration, SQLite persistence, and scoring aligned with the OOLONG paper methodology.
Stars: 3
Forks: 0
Watchers: 3
Open Issues: 0
Commits: 9