Let autonomous AI agents hack you so the real ones can't. Attacks LLM endpoints, web apps, npm packages, and source code. Blind PoC verification to minimize false positives.
Stars: 8
Forks: 1
Watchers: 8
Open Issues: 6
Dependency versions: ^12.8.0, ^0.44.0, ^25.5.0, ^0.27.4

f28c081 Update docs and blog with XBOW 70% results and shell-first data
1768195 Move Azure config to secrets, remove hardcoded endpoint URL
a10f2a9 Fix XBOW CI: add Azure base URL, model, and wire API env vars
0455a83 Release v0.4.2: shell-first web pentesting, 70% on XBOW
a0e7d10 XBOW 70%: blind SQLi cracked with 25 turns, 7/10 buildable
e7020fe Update XBOW results: 6/10 (60%) with shell-first approach
0a1c755 Wire shell-first as default web pentesting mode, add blog post
fa6dde1 Shell-first validated: 4/4 XBOW challenges cracked with flag extraction
b4179bc Add shell_exec tool: full shell access for pentesting exploitation
f8d81ac Improve crawl tool: include page text content for credential discovery
c38c682 Add agentic web pentesting: crawl, submit_form, web prompts, pipeline wiring
4edaf59 Fix XBOW benchmark: flag extraction only, enable agentic mode
463d4b1 Add two staggered blog posts: XBOW benchmark + AI attack surface