Found 30 repositories (showing 30)
suyoumo
OpenClawProBench is a live-first benchmark harness for evaluating LLM agents in the OpenClaw runtime with deterministic grading and repeated-trial reliability.
InternLM
An in-the-wild benchmark for AI agents in the OpenClaw Environment.
claw-bench
The Definitive AI Agent Benchmark
InternScience
ResearchClawBench: Evaluating AI Agents for Automated Research from Re-Discovery to New-Discovery
devswha
Benchmark suite: Claw Code (Rust) vs Claude Code (Node.js)
bionicdl-sustech
No description available
TRIBE-INC
Benchmarks for model performance in OpenClaw
Mosi-AI
No description available
Xubqpanda
Benchmarking and reproducibility suite for EcoClaw across PinchBench and other agent datasets.
just-claw-it
A CLI that runs standardized tests (benchmarking) against any OpenClaw skill and produces a quality scorecard.
OperatingSystem-1
No description available
biostochastics
Standalone Python audit suite that evaluates ClawBio, an agentic bioinformatics tool suite, for safety, security, and correctness.
InternScience
No description available
ufatfat
No description available
PilotBenchAnonymous
No description available
Herrieson
Agent for ClawBench
maxdevfranklin
No description available
mxn2020
No description available
wild-balthazar224
Measure AI agents’ performance with standardized tests across 314 tasks, 33 domains, and 4 difficulty levels for clear, reproducible comparison.
KyoukoLi
No description available
thompcd
No description available
Milbaxter
🧠 ClawdBot Self-Improvement Benchmark - WeaveHacks 3 Hackathon | Separating signal from noise in agent self-improvement
Oooer8
Using Codex to control a robot in RoboCasa
eprinsell
Repository for https://replit.com/@757pwq4bhv/MainClaw-BenchBot
biostochastics
Reproducible safety benchmark for pharmacogenomics AI tools provided by ClawBio
ste11-xinxin
ResearchClawBench for AAAI
clawinfra
Ablation benchmark suite for claw-forge middleware features
kenahrens
No description available
mxn2020
No description available
DataScienceNigeria
No description available