Found 105 repositories (showing 30)
Accenture
MCP-Bench: Benchmarking Tool-Using LLM Agents with Complex Real-World Tasks via MCP Servers
mcp-tool-bench
MCPToolBench++: an MCP (Model Context Protocol) tool-use benchmark for AI agent and model tool-use ability
AIS2Lab
MCPSecBench: A Systematic Security Benchmark and Playground for Testing Model Context Protocols
twilio-labs
No description available
Brunestuder
No description available
thiagomendes
No description available
K1ta141k
MCP server that routes design tasks to designarena.ai's current top-ranked model via OpenRouter
longevity-genie
MCP server to interact with Benchlong
0x5457
A reproducible TypeScript benchmark comparing MCP-native agents vs mcp-cli, capturing token usage, tool calls, retries, and latency across shared MCP tasks
Eliovp-BV
A very simple proof-of-concept MCP server for running vLLM benchmarks
benchopt
Benchopt benchmark for MCP regression
zhiqiangwang4
MCPTox: A Benchmark for Tool Poisoning Attack on Real-World MCP Servers
portal-labs-infrastructure
An automated benchmark and public leaderboard for Model Context Protocol (MCP) clients. Test your client and see how it ranks!
MikeyBeez
🧠 Tool-Augmented Reasoning MCP Server - Computational verification tools that achieved 58.3% on BIG-Bench Hard evaluation (+29.7pp improvement)
scalekit-inc
Rigorous benchmark comparing MCP servers vs CLI tools for AI agents
SajmustafaKe
A Model Context Protocol (MCP) server that provides AI assistance for Frappe/ERPNext development. This server offers tools to help with creating DocTypes, running bench commands, managing apps, and other Frappe development tasks.
doksihq
Doksi Device MCP Test Bench
pranav-deshmukh
No description available
quangminh1212
No description available
mpecan
Benchmark tool for measuring MCP server effectiveness in LLM-assisted development
rezapirighadim
No description available
GG-science
MCP server that turns any SQL database into an AI-native data layer. Pluggable backends (DuckDB, Athena) with a 4-layer knowledge system that learns from real sessions.
zengzhuozhen
MCP server for the benchmark-proxy
fjm2u
MCP Adversarial Benchmark: a benchmark for measuring defenses against adversarial inputs when using MCP
ArcadeAI
No description available
lucianfialho
Reproducible benchmark: Google Analytics MCP vs gmp CLI token cost. Real data, no mocks.
thiagomendes
No description available
matsonj
BIRD-Bench multi-model text-to-SQL evaluation harness with MotherDuck MCP
nothingtosurprise
MCP-Bench: Benchmarking Tool-Using LLM Agents with Complex Real-World Tasks via MCP Servers
alexbel83
MCP server in C# that runs BenchmarkDotNet on demand and returns results & artifacts to LLM clients (MCP Inspector, Visual Studio Copilot Tools).