Search Results

Found 6 repositories (showing 6)

agent-pr-replay

sshh12

💛70

Agent PR Replay takes merged PRs from any repository, reverse-engineers the task prompt, runs Claude Code against it, and compares what the agent did versus what humans actually shipped. The result is targeted, empirical guidance.

MIT

Python

Updated 20 hours ago

claude, claude-code, coding-agents, +2

Multi-agent-deep-Q-network-with-stochastic-prioritized-replay

codurr1

❤️25

No description available

Jupyter Notebook

Updated 1 year ago

replayfix

sarvanithin

🧡65

Session replay → UX anomaly detection → RAG → multi-agent fixes → draft GitHub PR (PostHog-shaped demo)

Python

Updated 9 hours ago

pr_ai_benchmark

geekychris

🧡55

Take an exemplar Git repo with known PRs and replay them, letting PR agents run against them. Then gather the issues and label them for evaluation of PR tools.

Updated 3 weeks ago

replayCI

aayushimalhotra3

❤️45

ReplayCI: PR-native regression tests for tool-using AI agents with deterministic replay, behavior diffs, and cost/safety gates.

Python

Updated 2 months ago

Agent-Reliability-Suite-tracing-replay-eval-gates-dashboard-with-a-prompt-injection-firewall-plugin.

MadScientist1912

❤️35

No description available

Python

Updated 1 month ago
