Configuration
pwnkit is designed for zero-config usage, but every default can be overridden via CLI flags or environment variables.
Runtime modes
pwnkit is an agentic harness: bring your own AI. The --runtime flag controls which LLM backend powers the agents.
| Runtime | Flag | Description |
|---|---|---|
| api | --runtime api | Uses your configured direct provider (ChatGPT Codex subscription auth, OpenRouter, Anthropic, Azure OpenAI, or OpenAI). Best for CI and quick scans. Default. |
| claude | --runtime claude | Spawns the Claude Code CLI with your existing subscription. Best for deep analysis. |
| codex | --runtime codex | Uses the Codex CLI for source review. For live target scans, routes to the direct ChatGPT Codex provider when PWNKIT_CHATGPT_OAUTH_REFRESH_TOKEN is configured. |
| gemini | --runtime gemini | Spawns the Gemini CLI. Best for large-context source analysis. |
| auto | --runtime auto | Auto-detects installed CLIs and picks the best one per pipeline stage. |
API runtime
The default api runtime makes direct HTTP calls to an LLM provider. It requires one of these environment variables:

    export PWNKIT_CHATGPT_OAUTH_REFRESH_TOKEN="..."   # ChatGPT/Codex subscription auth
    export OPENROUTER_API_KEY="sk-or-..."             # Recommended
    export ANTHROPIC_API_KEY="sk-ant-..."
    export AZURE_OPENAI_API_KEY="..."
    export OPENAI_API_KEY="sk-..."

See API Keys for the full priority order and provider details.

If you use Azure, also set AZURE_OPENAI_BASE_URL and AZURE_OPENAI_MODEL unless pwnkit can read them from a valid Azure-backed ~/.codex/config.toml. For the Responses API, the base URL should include /openai/v1. pwnkit fails fast on incomplete Azure config instead of attempting a scan with guessed defaults.
For ChatGPT Codex subscription auth, run codex login, then provide the
refresh token from ~/.codex/auth.json as PWNKIT_CHATGPT_OAUTH_REFRESH_TOKEN.
When that variable is set it takes provider priority over API-key based
providers.
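
A preflight check can catch a missing credential before a scan starts. The sketch below is a hypothetical helper, not part of pwnkit: it prints the first provider variable that is set. Only the first position (the ChatGPT refresh token) follows from the text above; the order of the remaining entries is an assumption, and the API Keys page remains the authority on priority.

```shell
# Hypothetical preflight helper; pwnkit does not ship this function.
# Prints the name of the first provider variable that is set and non-empty.
# Only the first entry's position is documented; the rest of the order
# is an assumption made for illustration.
pwnkit_provider_var() {
  for name in \
    PWNKIT_CHATGPT_OAUTH_REFRESH_TOKEN \
    OPENROUTER_API_KEY \
    ANTHROPIC_API_KEY \
    AZURE_OPENAI_API_KEY \
    OPENAI_API_KEY
  do
    # Look up the value of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -n "$value" ]; then
      printf '%s\n' "$name"
      return 0
    fi
  done
  echo "no provider credentials set for --runtime api" >&2
  return 1
}
```

A CI step could call `pwnkit_provider_var >/dev/null || exit 1` before invoking pwnkit, so a missing secret fails the job early.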
CLI runtimes (claude, codex, gemini)
These runtimes spawn the respective CLI tool as a subprocess. You must have the CLI installed and authenticated:

    # Claude Code CLI
    npm i -g @anthropic-ai/claude-code

    # Codex CLI
    npm i -g @openai/codex

    # Gemini CLI
    npm i -g @google/gemini-cli

Then use them:

    pwnkit scan --target https://api.example.com/chat --runtime claude
    pwnkit review ./my-repo --runtime codex --depth deep

The Codex CLI is not used as a live target wrapper. That MCP-backed path was removed because it added a target-interaction bottleneck. To use your Codex subscription for live scans, configure the direct provider:

    export PWNKIT_CHATGPT_OAUTH_REFRESH_TOKEN="..."
    pwnkit scan --target https://example.com --runtime codex

Codex runtime parity matrix
--runtime codex works across every pwnkit entry point as long as
either the local codex CLI binary is installed OR the direct ChatGPT
Codex provider is configured via PWNKIT_CHATGPT_ACCESS_TOKEN /
PWNKIT_CHATGPT_OAUTH_REFRESH_TOKEN. When the binary is absent and the
subscription env is set, pwnkit routes the request through the API
runtime against chatgpt.com/backend-api/codex/responses (the same
endpoint the upstream codex CLI uses).
| Surface | Command | Supported via direct provider |
|---|---|---|
| Web / URL scan | pwnkit scan --target https://… --runtime codex | yes |
| npm package audit | pwnkit audit lodash --ecosystem npm --runtime codex | yes |
| PyPI package audit | pwnkit audit requests --ecosystem pypi --runtime codex | yes |
| crates.io package audit | pwnkit audit tokio --ecosystem cargo --runtime codex | yes |
| OCI image audit | pwnkit audit nginx:1.25 --ecosystem oci --runtime codex | yes |
| Default source-code review | pwnkit review ./repo --runtime codex | yes |
| Linux kernel review | pwnkit review ./linux --profile linux-kernel --runtime codex | yes |
| C/C++ library review | pwnkit review ./lib --profile c-library --runtime codex | yes |
Before pwnkit#402 only the web / URL scan path honoured the direct
provider — the other surfaces aborted with
Requested runtime 'codex' is not available whenever the codex CLI
binary was missing, even with subscription auth configured. Cloud
sandbox dispatch (0cloud worker-controller) still gates codex on
target_ecosystem === "web" and is tracked as a separate follow-up.
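
The availability rule described above can be approximated in a wrapper script. This is a minimal sketch under stated assumptions: `codex_route` is a hypothetical helper name, and it only mirrors the documented condition (local binary present, or one of the two subscription variables set). It does not reproduce pwnkit's internal routing.

```shell
# Hypothetical helper mirroring the documented availability rule.
# Prints "cli" when a codex binary is on PATH, "direct" when only the
# subscription variables are set, and fails otherwise.
codex_route() {
  if command -v codex >/dev/null 2>&1; then
    echo cli
  elif [ -n "${PWNKIT_CHATGPT_ACCESS_TOKEN:-}" ] ||
       [ -n "${PWNKIT_CHATGPT_OAUTH_REFRESH_TOKEN:-}" ]; then
    echo direct
  else
    echo "Requested runtime 'codex' is not available" >&2
    return 1
  fi
}
```

A script could run `codex_route >/dev/null || exit 1` before passing `--runtime codex`, which surfaces a missing binary or token before any scan work begins.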
Scan modes
The --mode flag controls what kind of target is being scanned.
| Mode | Description |
|---|---|
| deep | Full autonomous pentest. Runs the research + verify agents with the full 40-turn budget. Default when the target is an https:// URL. |
| probe | Lightweight surface scan: recon and fingerprinting without deep exploitation. |
| web | Shell-first autonomous pentesting for web applications. The agent uses bash (curl, python3, bash) as its primary tool to probe for CORS, headers, exposed files, SSRF, XSS, SQLi, SSTI, and more. |
| mcp | Scan MCP (Model Context Protocol) servers for tool poisoning and schema abuse. Default when the target starts with mcp://. |

    # LLM API scan (default)
    pwnkit scan --target https://api.example.com/chat

    # Web app scan
    pwnkit scan --target https://example.com --mode web

Depth settings
The --depth flag controls how thorough the scan is.
| Depth | Test Cases | Typical Time | Best For |
|---|---|---|---|
| quick | ~15 | ~1 min | CI pipelines, smoke tests |
| default | ~50 | ~3 min | Day-to-day scanning |
| deep | ~150 | ~10 min | Pre-launch audits, thorough review |

    pwnkit scan --target https://api.example.com/chat --depth quick
    pwnkit audit express --depth deep
    pwnkit review ./my-repo --depth deep --runtime claude

Output formats
pwnkit supports multiple output formats:
| Format | Description |
|---|---|
| terminal | Human-readable terminal summary with share URL |
| html | Rich browser report saved to a temporary file |
| pdf | Printable report saved to a temporary file |
| json | Machine-readable JSON output for pipelines |
| sarif | SARIF format for the GitHub Security tab |
| markdown | Human-readable Markdown report |
In CI (GitHub Action), set format: sarif to populate the Security tab:

    - uses: 0sec-labs/pwnkit@main
      with:
        mode: review
        path: .
        format: sarif

Diff-aware review
For PR workflows, review only changed files against a base branch:

    pwnkit review ./my-repo --diff-base origin/main --changed-only

This is particularly useful in CI to avoid scanning the entire codebase on every PR.
Verbose output
Use --verbose to see the animated attack replay and detailed agent reasoning:

    pwnkit scan --target https://api.example.com/chat --verbose

Feature flags
pwnkit ships a set of agent-improvement features behind environment-variable flags so you can A/B test them and opt in/out per run. Every flag is read at process start; set <FLAG>=0 or <FLAG>=false to disable, anything else to enable.
| Flag | Default | What it enables |
|---|---|---|
| PWNKIT_FEATURE_EARLY_STOP | on | Early-stop at 50% budget if no findings, then retry with a different strategy. |
| PWNKIT_FEATURE_LOOP_DETECTION | on | Detects A-A-A and A-B-A-B action loops, injects a warning to break the cycle. |
| PWNKIT_FEATURE_CONTEXT_COMPACTION | on | Compresses middle-of-conversation messages when the context exceeds 30k tokens. |
| PWNKIT_FEATURE_SCRIPT_TEMPLATES | on | Adds exploit-script templates (blind SQLi, SSTI, auth chain) to the shell prompt. |
| PWNKIT_FEATURE_DYNAMIC_PLAYBOOKS | off | Injects technology-specific vulnerability playbooks after the recon phase. |
| PWNKIT_FEATURE_EXTERNAL_MEMORY | off | Agent writes plan/creds to disk, re-injected at reflection checkpoints. |
| PWNKIT_FEATURE_PROGRESS_HANDOFF | off | Injects prior-attempt findings when retrying, so retries don’t restart from zero. |
| PWNKIT_FEATURE_WEB_SEARCH | off | Lets the agent search the web for CVE details, vendor docs, and technique references. |
| PWNKIT_FEATURE_TARGET_HISTORY_PRESEED | on | Preloads source-review prompts with prior target CVE/GHSA audit graph leads inferred from repo metadata. |
| PWNKIT_FEATURE_DOCKER_EXECUTOR | off | Runs every bash command inside a Kali Linux container with the full pentesting toolchain. |
| PWNKIT_FEATURE_CLOUD_SINK | on | Allows opt-in streaming of findings/final reports to a remote scan sink when the cloud env vars are set. |
| PWNKIT_FEATURE_PTY_SESSION | off | Interactive PTY sessions for exploits requiring interactivity (reverse shells, DB clients, SSH). |
| PWNKIT_FEATURE_EGATS | off | Evidence-Gated Attack Tree Search: beam search over a hypothesis tree. Also toggled by --egats. |
| PWNKIT_FEATURE_CONSENSUS_VERIFY | off | Self-consistency voting: runs the verify pipeline N times and takes the majority vote. |
| PWNKIT_FEATURE_DEBATE | off | Adversarial debate: prosecutor vs. defender agents argue each finding, and a skeptical judge decides. |
| PWNKIT_FEATURE_MULTIMODAL | off | Cross-validates findings against Foxguard (Rust pattern scanner). |
| PWNKIT_FEATURE_REACHABILITY_GATE | off | Suppresses findings whose sink is not reachable from an application entry point. |
| PWNKIT_FEATURE_POV_GATE | off | Requires a working executable PoC per finding, otherwise downgrades to info. |
| PWNKIT_FEATURE_TRIAGE_MEMORIES | off | Injects Semgrep-style per-target persistent FP memories into the verify pipeline. Pairs with pwnkit-cli triage. |
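
The on/off rule stated above (0 or false disables, anything else enables, otherwise the table default applies) can be sketched as a small helper for wrapper scripts. `feature_enabled` is a hypothetical name, not a pwnkit command. Two details are assumptions taken from a literal reading of the rule: an empty value counts as enabled, and `false` is matched case-sensitively.

```shell
# Hypothetical helper: succeeds when the flag named in $1 is enabled.
# $2 is the documented default ("on" or "off"), used only when the
# variable is unset. Empty-string and case handling are assumptions.
feature_enabled() {
  eval "value=\${$1-__unset__}"
  case "$value" in
    __unset__) [ "$2" = "on" ] ;;   # fall back to the table default
    0|false)   return 1 ;;          # explicit disable
    *)         return 0 ;;          # any other value enables
  esac
}
```

For example, `feature_enabled PWNKIT_FEATURE_WEB_SEARCH off && echo "web search on"` prints only when the variable has been set to an enabling value.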
Static analyzer selection
pwnkit review, source-code pipeline scans, and package source scans use Foxguard by default for pre-agent static leads. Set PWNKIT_STATIC=semgrep to route those static leads through Semgrep instead. Diff-aware --changed-only source reviews preserve the same changed-file narrowing with either scanner. Dependency advisory checks (npm audit, OSV, and OCI package inventory) remain separate and still run for package targets.

    PWNKIT_STATIC=semgrep pwnkit review ./repo --depth quick

Semgrep remains available as an explicit compatibility and comparison path while Foxguard carries the default static lead role.
Docker executor overrides
When PWNKIT_FEATURE_DOCKER_EXECUTOR=1 is enabled, these extra env vars
control the container image, networking, and bootstrap behavior:
| Variable | Default | Purpose |
|---|---|---|
| PWNKIT_DOCKER_IMAGE | ghcr.io/0sec-labs/pwnkit:latest | Override the executor image |
| PWNKIT_DOCKER_NETWORK | bridge | Docker network mode for the executor container |
| PWNKIT_DOCKER_BOOTSTRAP_TOOLS | auto | Force or disable apt-based tool bootstrap inside the container |
Bootstrap rules:
- default GHCR image -> no bootstrap, use the pre-baked toolchain
- kalilinux/kali-rolling -> bootstrap tools on first start
- PWNKIT_DOCKER_BOOTSTRAP_TOOLS=1 -> always bootstrap
- PWNKIT_DOCKER_BOOTSTRAP_TOOLS=0 -> never bootstrap
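
The bootstrap rules above can be read as a small decision function. This is a sketch under stated assumptions: `should_bootstrap` is a hypothetical helper, "yes" for kalilinux/kali-rolling stands for "bootstrap on first start", and the answer for any other image in auto mode is not documented here, so the sketch reports it as unspecified rather than guessing.

```shell
# Hypothetical helper restating the bootstrap rules; not pwnkit's code.
# Prints "yes", "no", or "unspecified" for the current environment.
should_bootstrap() {
  image=${PWNKIT_DOCKER_IMAGE:-ghcr.io/0sec-labs/pwnkit:latest}
  case "${PWNKIT_DOCKER_BOOTSTRAP_TOOLS:-auto}" in
    1) echo yes ;;                                   # forced on
    0) echo no ;;                                    # forced off
    *)
      case "$image" in
        ghcr.io/0sec-labs/pwnkit:latest) echo no ;;  # pre-baked toolchain
        kalilinux/kali-rolling*)         echo yes ;; # raw Kali image
        *)                               echo unspecified ;;
      esac ;;
  esac
}
```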
Networking rules:
- default is bridge: the executor container gets its own network stack. This is the safe default (no exposure of the host’s localhost services to the container) and is fine for public targets.
- set PWNKIT_DOCKER_NETWORK=host when the scan target is served from the same host, e.g. local XBOW challenges on localhost:<port> or a docker-compose target on the default bridge. The container needs to reach host.docker.internal / localhost to hit the service.
- any valid docker run --network <name> value works: pass a custom compose network name to land the executor on the same network as the target stack.
Cost ceiling
You can bound API spend per scan, audit, or review:

    export PWNKIT_COST_CEILING_USD=5
    pwnkit scan --target https://example.com --mode web

Or override it per command:

    pwnkit audit lodash --cost-ceiling 2
    pwnkit review ./my-repo --cost-ceiling 10

If the ceiling is exceeded, pwnkit preserves partial findings and exits with code 4.
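
Because the ceiling is signalled through the exit status, a wrapper can tell a budget stop apart from other failures. The sketch below assumes only what is stated above (exit code 4 means the ceiling was exceeded); `run_with_ceiling` is a hypothetical helper name, and it makes no claim about the meaning of other exit codes.

```shell
# Hypothetical wrapper: runs the given command, passes its exit status
# through, and prints a note when the status is 4 (cost ceiling exceeded).
run_with_ceiling() {
  status=0
  "$@" || status=$?
  if [ "$status" -eq 4 ]; then
    echo "cost ceiling reached; partial findings were preserved" >&2
  fi
  return "$status"
}
```

Usage would look like `run_with_ceiling pwnkit audit lodash --cost-ceiling 2`, after which the caller can decide whether a status of 4 should fail the pipeline or only warn.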
Cloud sink
If you want to stream findings and the final report to an orchestration layer:

    export PWNKIT_CLOUD_SINK=https://api.example.com
    export PWNKIT_CLOUD_SCAN_ID=scan_123
    export PWNKIT_CLOUD_TOKEN=secret-token

When set, pwnkit posts:
- each finding as { "finding": ... }
- the final report as { "report": ..., "final": true }

to:

    ${PWNKIT_CLOUD_SINK}/scans/${PWNKIT_CLOUD_SCAN_ID}/findings

Set PWNKIT_FEATURE_CLOUD_SINK=0 to disable this behavior even when the env vars are present.
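
When building a receiver for these posts, it can help to compute the exact endpoint that the environment resolves to. The sketch below is a hypothetical helper, `cloud_sink_url`, that applies the URL template above. Stripping one trailing slash from the base URL is an assumption, and how PWNKIT_CLOUD_TOKEN is transmitted is not described here, so the sketch does not touch it.

```shell
# Hypothetical helper: prints the findings endpoint built from the
# documented template. Trailing-slash trimming is an assumption.
cloud_sink_url() {
  if [ -z "${PWNKIT_CLOUD_SINK:-}" ] || [ -z "${PWNKIT_CLOUD_SCAN_ID:-}" ]; then
    echo "PWNKIT_CLOUD_SINK and PWNKIT_CLOUD_SCAN_ID must both be set" >&2
    return 1
  fi
  printf '%s/scans/%s/findings\n' "${PWNKIT_CLOUD_SINK%/}" "$PWNKIT_CLOUD_SCAN_ID"
}
```

With the example values above, the helper resolves to https://api.example.com/scans/scan_123/findings, which is the path a test receiver would need to accept.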
Machine-readable result line
Set:

    export PWNKIT_EMIT_RESULT_LINE=1

to make the CLI print one final PWNKIT_RESULT=... JSON line summarizing:
- success/failure
- exit code and exit reason
- target type
- finding counts
- estimated cost and token usage when available
This is useful for wrappers, CI parsers, and the cloud orchestration path.
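
A wrapper can pull that line out of the CLI output with standard tools. The sketch below is a hypothetical helper, `extract_result`, which assumes the marker sits at the start of a line and that the rest of the line is the JSON payload. The JSON field names are not listed here, so the helper returns the payload as-is for a JSON tool to interpret.

```shell
# Hypothetical helper: reads CLI output on stdin and prints the JSON
# payload of the last line that starts with PWNKIT_RESULT=.
# Prints nothing when no such line is present.
extract_result() {
  sed -n 's/^PWNKIT_RESULT=//p' | tail -n 1
}
```

A typical call would pipe the scan output through the helper, for example `pwnkit scan --target https://example.com | extract_result`, with PWNKIT_EMIT_RESULT_LINE=1 exported beforehand.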
Example: maximum-accuracy pentest
Turn on every false-positive reduction feature for a client-ready scan:

    export PWNKIT_FEATURE_CONSENSUS_VERIFY=1
    export PWNKIT_FEATURE_REACHABILITY_GATE=1
    export PWNKIT_FEATURE_POV_GATE=1
    export PWNKIT_FEATURE_TRIAGE_MEMORIES=1
    export PWNKIT_FEATURE_MULTIMODAL=1

    pwnkit scan --target https://example.com --mode web --depth deep

Example: Kali toolchain + web search

    export PWNKIT_FEATURE_DOCKER_EXECUTOR=1
    export PWNKIT_FEATURE_WEB_SEARCH=1

    pwnkit scan --target https://example.com --mode web

Example: raw Kali fallback

    export PWNKIT_FEATURE_DOCKER_EXECUTOR=1
    export PWNKIT_DOCKER_IMAGE=kalilinux/kali-rolling
    export PWNKIT_DOCKER_BOOTSTRAP_TOOLS=1

    pwnkit scan --target https://example.com --mode web