Playwright-based testing and eval framework for MCP servers with LLM-as-a-judge
Stars: 5
Forks: 2
Watchers: 5
Open Issues: 4
Commits: 233, 6, 5, 4, 2
02f5c3b chore(deps): override release-it's undici to fix 3 security alerts (#159)
ed9fbd4 chore: fix dependabot security alerts (168→16 vulns) (#158)
e1b6565 chore(deps): bump hono from 4.12.5 to 4.12.9 (#145)
afef0d4 chore(deps): bump path-to-regexp from 8.3.0 to 8.4.1 (#146)
c91478c chore(deps-dev): bump picomatch from 2.3.1 to 2.3.2 (#147)
e1bd0f3 chore(deps-dev): bump flatted from 3.3.3 to 3.4.2 (#148)
6edd72a chore(deps): bump lodash.template from 4.5.0 to 4.18.1 (#154)
1cf39e0 feat: add vertex-anthropic and anthropic-agent-sdk judge providers (#152)
110eab1 feat: custom judge registry for user-defined judge executors (#149)
3601898 fix: DetailModal crash when clicking on failed eval cases (#144)
cc24678 feat: regex pattern matching for tool call arguments (#143)
e450d28 feat: include request data in eval results and HTML report (#142)
2f5fe3d Bump `undici` to fix security vulnerability (#141)