pwnkit

Fully autonomous agentic pentesting framework. Attacks AI/LLM apps, web apps, package ecosystems, and source code. Blind PoC verification to minimize false positives.

One command, zero config

Run pwnkit scan --target <url> and get a verified security report in minutes.

Blind verification

Every finding is independently re-exploited by a second agent that never sees the original reasoning. False positives are killed automatically.
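The idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not pwnkit's actual implementation: the `Finding` and `BlindClaim` types, the `blind_verify` function, and the choice of which fields the verifier receives (target, vulnerability class, PoC) are all hypothetical. The only property taken from the description above is that the verifier never sees the original reasoning, and that a finding survives only if it is re-exploited.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Finding:
    """Hypothetical record produced by the research agent."""
    target: str
    vuln_class: str
    poc: str
    reasoning: str  # research agent's notes; must not reach the verifier


@dataclass(frozen=True)
class BlindClaim:
    """Hypothetical view handed to the verifier: no reasoning field at all."""
    target: str
    vuln_class: str
    poc: str


def to_blind_claim(finding: Finding) -> BlindClaim:
    # Copy only the fields the verifier is allowed to see (an assumption).
    return BlindClaim(finding.target, finding.vuln_class, finding.poc)


def blind_verify(
    findings: List[Finding],
    reexploit: Callable[[BlindClaim], bool],
) -> List[Finding]:
    # Keep a finding only if the independent re-exploit attempt succeeds.
    return [f for f in findings if reexploit(to_blind_claim(f))]


# Stub verifier standing in for the second agent (illustration only).
def stub_reexploit(claim: BlindClaim) -> bool:
    return "alert(1)" in claim.poc


findings = [
    Finding("https://example.test", "xss", "<script>alert(1)</script>", "reflected in q"),
    Finding("https://example.test", "sqli", "' OR 1=1 --", "error page looked odd"),
]
confirmed = blind_verify(findings, stub_reexploit)
print([f.vuln_class for f in confirmed])
```

Stripping the reasoning at the type level, rather than trusting the verifier to ignore it, means the verifier cannot be swayed by the first agent's explanation.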

Bring your own AI

Use your API key (OpenRouter, Anthropic, Azure OpenAI, OpenAI) or spawn Claude Code, Codex, or Gemini CLI with your existing subscription.

Full-spectrum pentesting

Covers AI/LLM apps, web applications, package ecosystems, and source code repositories. pwnkit is not limited to AI security; it covers traditional web vulnerabilities too.

Cybench: 36/40 = 90.0% on the first scored full 40-challenge run (single-config gpt-5.4, single-shot).

XBOW retained-artifact aggregate: 103/104 = 99.0% (only XBEN-030 remains unsolved in every mode), with the load-bearing gpt-5.4 cohort at 93/95 = 97.9% black-box at $5.20/flag (~$0.48/run). The historical mixed publication line is documented separately on the benchmark page.
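The quoted percentages follow directly from the solved/total counts. As a quick check, here is a minimal Python sketch; the `pass_rate` helper and the dictionary labels are illustrative names, and only the counts come from the figures above.

```python
def pass_rate(solved: int, total: int) -> float:
    """Return the solve rate as a percentage rounded to one decimal place."""
    return round(100 * solved / total, 1)


# Counts taken from the benchmark figures quoted above.
results = {
    "Cybench": (36, 40),
    "XBOW aggregate": (103, 104),
    "XBOW gpt-5.4 black-box": (93, 95),
}

for name, (solved, total) in results.items():
    print(f"{name}: {solved}/{total} = {pass_rate(solved, total):.1f}%")
```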

Explore the docs

How it works

Architecture — the 5-stage pipeline, runtime adapters, and MCP integration
Agent Loop — how the autonomous loop thinks, calls tools, reflects, and stops
Finding Triage — the multi-layer pipeline between research and verify agents
Blind Verification — why findings are re-exploited before they count
Adversarial Evals — how pwnkit extends beyond classic pentesting into attack-driven evaluation of AI systems

Research and benchmark analysis

Research overview — the index of design notes and experiments
FP Reduction Moat — the full false-positive reduction stack
Finding Triage ML — the shipped triage layers and why they exist
XBOW Analysis — what moved the score and what still blocks the remaining challenges
Benchmark — current results, caveats, and historical context