Configuration

pwnkit is designed for zero-config usage, but every default can be overridden via CLI flags or environment variables.

pwnkit is an agentic harness — bring your own AI. The --runtime flag controls which LLM backend powers the agents.

| Runtime | Flag | Description |
| --- | --- | --- |
| api | --runtime api | Uses your configured direct provider (ChatGPT Codex subscription auth, OpenRouter, Anthropic, Azure OpenAI, or OpenAI). Best for CI and quick scans. Default. |
| claude | --runtime claude | Spawns the Claude Code CLI with your existing subscription. Best for deep analysis. |
| codex | --runtime codex | Uses the Codex CLI for source review. For live target scans, routes to the direct ChatGPT Codex provider when PWNKIT_CHATGPT_OAUTH_REFRESH_TOKEN is configured. |
| gemini | --runtime gemini | Spawns the Gemini CLI. Best for large-context source analysis. |
| auto | --runtime auto | Auto-detects installed CLIs and picks the best one per pipeline stage. |

The default api runtime makes direct HTTP calls to an LLM provider. It requires one of these environment variables:

Terminal window
export PWNKIT_CHATGPT_OAUTH_REFRESH_TOKEN="..." # ChatGPT/Codex subscription auth
export OPENROUTER_API_KEY="sk-or-..." # Recommended
export ANTHROPIC_API_KEY="sk-ant-..."
export AZURE_OPENAI_API_KEY="..."
export OPENAI_API_KEY="sk-..."

See API Keys for the full priority order and provider details.
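
As a rough preflight for the api runtime, the sketch below lists which of the provider variables shown above are set in the current shell. The names configured_providers and require_provider are hypothetical helpers, not part of pwnkit, and the loop order simply mirrors this page's listing rather than pwnkit's actual priority order.

```shell
# Print the name of every provider variable from this page that is non-empty.
# Assumption: the order below is the page's listing order, not pwnkit's
# documented priority order (see the API Keys page for that).
configured_providers() {
  for name in PWNKIT_CHATGPT_OAUTH_REFRESH_TOKEN OPENROUTER_API_KEY \
              ANTHROPIC_API_KEY AZURE_OPENAI_API_KEY OPENAI_API_KEY; do
    eval "value=\${$name:-}"   # indirect lookup of the variable called $name
    if [ -n "$value" ]; then
      printf '%s\n' "$name"
    fi
  done
}

# Hypothetical preflight: fail when no provider variable is set at all.
require_provider() {
  if [ -z "$(configured_providers)" ]; then
    echo "no LLM provider variable is set for --runtime api" >&2
    return 1
  fi
}
```

A wrapper script could call require_provider before invoking pwnkit scan so that a missing key is reported before any scan starts.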

If you use Azure, also set AZURE_OPENAI_BASE_URL and AZURE_OPENAI_MODEL unless pwnkit can read them from a valid Azure-backed ~/.codex/config.toml. For the Responses API, the base URL should include /openai/v1. pwnkit fails fast on incomplete Azure config instead of attempting a scan with guessed defaults.
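
pwnkit performs its own fail-fast check. The following is only an approximation of it for use in a wrapper script. It assumes the environment is the sole source of configuration, so it ignores the ~/.codex/config.toml fallback, and check_azure_env is a hypothetical name.

```shell
# Return 0 when the three Azure variables are set and the base URL looks
# suitable for the Responses API; print each problem to stderr otherwise.
# Assumption: the ~/.codex/config.toml fallback is not consulted here.
check_azure_env() {
  status=0
  for name in AZURE_OPENAI_API_KEY AZURE_OPENAI_BASE_URL AZURE_OPENAI_MODEL; do
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      status=1
    fi
  done
  case "${AZURE_OPENAI_BASE_URL:-}" in
    *"/openai/v1"*) ;;   # contains the expected path segment
    "") ;;               # already reported as missing above
    *) echo "AZURE_OPENAI_BASE_URL does not include /openai/v1" >&2
       status=1 ;;
  esac
  return "$status"
}
```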

For ChatGPT Codex subscription auth, run codex login, then provide the refresh token from ~/.codex/auth.json as PWNKIT_CHATGPT_OAUTH_REFRESH_TOKEN. When that variable is set it takes provider priority over API-key based providers.

The claude, codex, and gemini runtimes spawn their respective CLI tool as a subprocess. You must have the CLI installed and authenticated:

Terminal window
# Claude Code CLI
npm i -g @anthropic-ai/claude-code
# Codex CLI
npm i -g @openai/codex
# Gemini CLI
npm i -g @google/gemini-cli

Then use them:

Terminal window
pwnkit scan --target https://api.example.com/chat --runtime claude
pwnkit review ./my-repo --runtime codex --depth deep

The Codex CLI is not used as a live target wrapper. That MCP-backed path was removed because it added a target-interaction bottleneck. To use your Codex subscription for live scans, configure the direct provider:

Terminal window
export PWNKIT_CHATGPT_OAUTH_REFRESH_TOKEN="..."
pwnkit scan --target https://example.com --runtime codex

--runtime codex works across every pwnkit entry point as long as either the local codex CLI binary is installed or the direct ChatGPT Codex provider is configured via PWNKIT_CHATGPT_ACCESS_TOKEN or PWNKIT_CHATGPT_OAUTH_REFRESH_TOKEN. When the binary is absent and one of those variables is set, pwnkit routes the request through the API runtime against chatgpt.com/backend-api/codex/responses (the same endpoint the upstream codex CLI uses).

| Surface | Command | Supported via direct provider |
| --- | --- | --- |
| Web / URL scan | pwnkit scan --target https://… --runtime codex | yes |
| npm package audit | pwnkit audit lodash --ecosystem npm --runtime codex | yes |
| PyPI package audit | pwnkit audit requests --ecosystem pypi --runtime codex | yes |
| crates.io package audit | pwnkit audit tokio --ecosystem cargo --runtime codex | yes |
| OCI image audit | pwnkit audit nginx:1.25 --ecosystem oci --runtime codex | yes |
| Default source-code review | pwnkit review ./repo --runtime codex | yes |
| Linux kernel review | pwnkit review ./linux --profile linux-kernel --runtime codex | yes |
| C/C++ library review | pwnkit review ./lib --profile c-library --runtime codex | yes |

Before pwnkit#402 only the web / URL scan path honoured the direct provider — the other surfaces aborted with Requested runtime 'codex' is not available whenever the codex CLI binary was missing, even with subscription auth configured. Cloud sandbox dispatch (0cloud worker-controller) still gates codex on target_ecosystem === "web" and is tracked as a separate follow-up.
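
The availability rule for --runtime codex can be approximated in a wrapper script. This sketch only reports which of the two preconditions hold in the current shell. It does not reproduce pwnkit's routing decisions, and codex_runtime_sources is a hypothetical helper name.

```shell
# Print one line per satisfied precondition and return 0 if at least one
# holds, mirroring the "CLI binary or direct provider" rule described above.
codex_runtime_sources() {
  found=1
  if command -v codex >/dev/null 2>&1; then
    echo "codex CLI binary on PATH"
    found=0
  fi
  if [ -n "${PWNKIT_CHATGPT_ACCESS_TOKEN:-}" ] ||
     [ -n "${PWNKIT_CHATGPT_OAUTH_REFRESH_TOKEN:-}" ]; then
    echo "direct ChatGPT Codex provider configured"
    found=0
  fi
  return "$found"
}
```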

The --mode flag controls what kind of target is being scanned.

| Mode | Description |
| --- | --- |
| deep | Full autonomous pentest. Runs the research + verify agents with the full 40-turn budget. Default when the target is an https:// URL. |
| probe | Lightweight surface scan: recon and fingerprinting without deep exploitation. |
| web | Shell-first autonomous pentesting for web applications. The agent uses bash (curl, python3, bash) as its primary tool to probe for CORS, headers, exposed files, SSRF, XSS, SQLi, SSTI, and more. |
| mcp | Scan MCP (Model Context Protocol) servers for tool poisoning and schema abuse. Default when the target starts with mcp://. |

Terminal window
# LLM API scan (default)
pwnkit scan --target https://api.example.com/chat
# Web app scan
pwnkit scan --target https://example.com --mode web

The --depth flag controls how thorough the scan is.

| Depth | Test Cases | Typical Time | Best For |
| --- | --- | --- | --- |
| quick | ~15 | ~1 min | CI pipelines, smoke tests |
| default | ~50 | ~3 min | Day-to-day scanning |
| deep | ~150 | ~10 min | Pre-launch audits, thorough review |

Terminal window
pwnkit scan --target https://api.example.com/chat --depth quick
pwnkit audit express --depth deep
pwnkit review ./my-repo --depth deep --runtime claude

pwnkit supports multiple output formats:

| Format | Description |
| --- | --- |
| terminal | Human-readable terminal summary with share URL |
| html | Rich browser report saved to a temporary file |
| pdf | Printable report saved to a temporary file |
| json | Machine-readable JSON output for pipelines |
| sarif | SARIF format for the GitHub Security tab |
| markdown | Human-readable Markdown report |

In CI (GitHub Action), set format: sarif to populate the Security tab:

- uses: 0sec-labs/pwnkit@main
  with:
    mode: review
    path: .
    format: sarif

For PR workflows, review only changed files against a base branch:

Terminal window
pwnkit review ./my-repo --diff-base origin/main --changed-only

This is particularly useful in CI to avoid scanning the entire codebase on every PR.
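
In GitHub Actions, the base branch of a pull request is exposed as GITHUB_BASE_REF, which is empty for non-PR events. The sketch below uses that to build the diff flags only when a base branch is known. The helper name diff_review_args is hypothetical, and the use of the origin remote is an assumption about the checkout.

```shell
# Emit the diff-aware flags for PR builds and nothing for other events.
# Assumption: the base branch has been fetched as origin/<branch>.
diff_review_args() {
  if [ -n "${GITHUB_BASE_REF:-}" ]; then
    printf '%s\n' "--diff-base origin/${GITHUB_BASE_REF} --changed-only"
  fi
}

# Usage (left unquoted on purpose so the flags split into separate words):
#   pwnkit review . $(diff_review_args)
```

The base branch has to exist as a remote-tracking ref for the comparison to work, which in a shallow CI checkout typically means fetching it first.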

Use --verbose to see the animated attack replay and detailed agent reasoning:

Terminal window
pwnkit scan --target https://api.example.com/chat --verbose

pwnkit ships a set of agent-improvement features behind environment-variable flags so you can A/B test them and opt in/out per run. Every flag is read at process start; set <FLAG>=0 or <FLAG>=false to disable, anything else to enable.

| Flag | Default | What it enables |
| --- | --- | --- |
| PWNKIT_FEATURE_EARLY_STOP | on | Early-stop at 50% budget if no findings, then retry with a different strategy. |
| PWNKIT_FEATURE_LOOP_DETECTION | on | Detects A-A-A and A-B-A-B action loops, injects a warning to break the cycle. |
| PWNKIT_FEATURE_CONTEXT_COMPACTION | on | Compresses middle-of-conversation messages when the context exceeds 30k tokens. |
| PWNKIT_FEATURE_SCRIPT_TEMPLATES | on | Adds exploit-script templates (blind SQLi, SSTI, auth chain) to the shell prompt. |
| PWNKIT_FEATURE_DYNAMIC_PLAYBOOKS | off | Injects technology-specific vulnerability playbooks after the recon phase. |
| PWNKIT_FEATURE_EXTERNAL_MEMORY | off | Agent writes plan/creds to disk, re-injected at reflection checkpoints. |
| PWNKIT_FEATURE_PROGRESS_HANDOFF | off | Injects prior-attempt findings when retrying, so retries don’t restart from zero. |
| PWNKIT_FEATURE_WEB_SEARCH | off | Lets the agent search the web for CVE details, vendor docs, and technique references. |
| PWNKIT_FEATURE_TARGET_HISTORY_PRESEED | on | Preloads source-review prompts with prior target CVE/GHSA audit graph leads inferred from repo metadata. |
| PWNKIT_FEATURE_DOCKER_EXECUTOR | off | Runs every bash command inside a Kali Linux container with the full pentesting toolchain. |
| PWNKIT_FEATURE_CLOUD_SINK | on | Allows opt-in streaming of findings/final reports to a remote scan sink when the cloud env vars are set. |
| PWNKIT_FEATURE_PTY_SESSION | off | Interactive PTY sessions for exploits requiring interactivity (reverse shells, DB clients, SSH). |
| PWNKIT_FEATURE_EGATS | off | Evidence-Gated Attack Tree Search (beam search over a hypothesis tree). Also toggled by --egats. |
| PWNKIT_FEATURE_CONSENSUS_VERIFY | off | Self-consistency voting: runs the verify pipeline N times and takes the majority vote. |
| PWNKIT_FEATURE_DEBATE | off | Adversarial debate: prosecutor vs. defender agents argue each finding, a skeptical judge decides. |
| PWNKIT_FEATURE_MULTIMODAL | off | Cross-validates findings against foxguard (Rust pattern scanner). |
| PWNKIT_FEATURE_REACHABILITY_GATE | off | Suppresses findings whose sink is not reachable from an application entry point. |
| PWNKIT_FEATURE_POV_GATE | off | Requires a working executable PoC per finding, otherwise downgrades to info. |
| PWNKIT_FEATURE_TRIAGE_MEMORIES | off | Injects Semgrep-style per-target persistent FP memories into the verify pipeline. Pairs with pwnkit-cli triage. |
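
The on/off rule for these flags (0 or false disables, anything else enables, unset falls back to the default) can be mirrored in a script that needs to know how a run will behave. This is a sketch of that rule rather than pwnkit's own parser. It assumes that an empty value counts as unset and that only the exact lowercase string false disables. The name flag_enabled is hypothetical.

```shell
# flag_enabled NAME DEFAULT
# Exit status 0 when the flag counts as on. DEFAULT is "on" or "off",
# taken from the table above.
flag_enabled() {
  eval "value=\${$1:-}"        # indirect lookup of the variable called $1
  if [ -z "$value" ]; then
    [ "$2" = "on" ]            # unset or empty: fall back to the default
    return
  fi
  case "$value" in
    0|false) return 1 ;;       # the two documented "disable" values
    *)       return 0 ;;       # anything else enables
  esac
}

# Usage:
#   if flag_enabled PWNKIT_FEATURE_WEB_SEARCH off; then echo "web search on"; fi
```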

pwnkit review, source-code pipeline scans, and package source scans use Foxguard by default for pre-agent static leads. Set PWNKIT_STATIC=semgrep to route those static leads through Semgrep instead. Diff-aware --changed-only source reviews preserve the same changed-file narrowing with either scanner. Dependency advisory checks (npm audit, OSV, and OCI package inventory) remain separate and still run for package targets.

Terminal window
PWNKIT_STATIC=semgrep pwnkit review ./repo --depth quick

Semgrep remains available as an explicit compatibility and comparison path while Foxguard carries the default static lead role.

When PWNKIT_FEATURE_DOCKER_EXECUTOR=1 is enabled, these extra env vars control the container image, networking, and bootstrap behavior:

| Variable | Default | Purpose |
| --- | --- | --- |
| PWNKIT_DOCKER_IMAGE | ghcr.io/0sec-labs/pwnkit:latest | Override the executor image |
| PWNKIT_DOCKER_NETWORK | bridge | Docker network mode for the executor container |
| PWNKIT_DOCKER_BOOTSTRAP_TOOLS | auto | Force or disable apt-based tool bootstrap inside the container |

Bootstrap rules:

  • default GHCR image -> no bootstrap, use the pre-baked toolchain
  • kalilinux/kali-rolling -> bootstrap tools on first start
  • PWNKIT_DOCKER_BOOTSTRAP_TOOLS=1 -> always bootstrap
  • PWNKIT_DOCKER_BOOTSTRAP_TOOLS=0 -> never bootstrap
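
The four bootstrap rules can be read as a small decision function. The sketch below encodes only what the list states. For any other image under auto it prints unknown, because this page does not say what pwnkit does in that case. The name will_bootstrap is hypothetical.

```shell
# will_bootstrap IMAGE MODE
# Prints yes, no, or unknown. Empty arguments fall back to the documented
# defaults (the GHCR image and "auto").
will_bootstrap() {
  image=${1:-ghcr.io/0sec-labs/pwnkit:latest}
  mode=${2:-auto}
  case "$mode" in
    1) echo yes; return ;;     # explicit override: always bootstrap
    0) echo no; return ;;      # explicit override: never bootstrap
  esac
  case "$image" in
    ghcr.io/0sec-labs/pwnkit:latest) echo no ;;   # pre-baked toolchain
    kalilinux/kali-rolling)          echo yes ;;  # tools installed on first start
    *)                               echo unknown ;;
  esac
}

# Usage:
#   will_bootstrap "${PWNKIT_DOCKER_IMAGE:-}" "${PWNKIT_DOCKER_BOOTSTRAP_TOOLS:-}"
```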

Networking rules:

  • default is bridge — the executor container gets its own network stack. This is the safe default (no exposure of the host’s localhost services to the container) and is fine for public targets.
  • set PWNKIT_DOCKER_NETWORK=host when the scan target is served from the same host, e.g. local XBOW challenges on localhost:<port> or a docker-compose target on the default bridge. The container needs to reach host.docker.internal / localhost to hit the service.
  • any valid docker run --network <name> value works — pass a custom compose network name to land the executor on the same network as the target stack.

You can bound API spend per scan, audit, or review:

Terminal window
export PWNKIT_COST_CEILING_USD=5
pwnkit scan --target https://example.com --mode web

Or override it per command:

Terminal window
pwnkit audit lodash --cost-ceiling 2
pwnkit review ./my-repo --cost-ceiling 10

If the ceiling is exceeded, pwnkit preserves partial findings and exits with code 4.
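
A CI wrapper may want to tell a budget stop apart from other outcomes. The sketch below runs any command and labels exit code 4 specifically. Every other code is passed through unlabelled because this page does not describe them. The name run_and_classify and the label strings are hypothetical.

```shell
# run_and_classify COMMAND [ARGS...]
# Runs the command, prints a short label for its exit status, and returns
# that same status to the caller.
run_and_classify() {
  code=0
  "$@" || code=$?
  case "$code" in
    4) echo "budget-exceeded" ;;   # cost ceiling hit, partial findings kept
    *) echo "exit-$code" ;;
  esac
  return "$code"
}

# Usage:
#   run_and_classify pwnkit review ./my-repo --cost-ceiling 10
```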

If you want to stream findings and the final report to an orchestration layer:

Terminal window
export PWNKIT_CLOUD_SINK=https://api.example.com
export PWNKIT_CLOUD_SCAN_ID=scan_123
export PWNKIT_CLOUD_TOKEN=secret-token

When set, pwnkit posts:

  • each finding as { "finding": ... }
  • the final report as { "report": ..., "final": true }

to:

${PWNKIT_CLOUD_SINK}/scans/${PWNKIT_CLOUD_SCAN_ID}/findings

Set PWNKIT_FEATURE_CLOUD_SINK=0 to disable this behavior even when the env vars are present.
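
When building a receiver for these posts, it helps to compute the exact URL pwnkit will target. The sketch below joins the variables as in the template above and checks whether streaming would be active. It assumes all three cloud variables must be set, which this page implies but does not spell out. Both function names are hypothetical.

```shell
# Print the findings endpoint built from the two documented variables.
cloud_sink_url() {
  printf '%s/scans/%s/findings\n' "$PWNKIT_CLOUD_SINK" "$PWNKIT_CLOUD_SCAN_ID"
}

# Return 0 when streaming would be active under the stated assumption.
cloud_sink_active() {
  case "${PWNKIT_FEATURE_CLOUD_SINK:-}" in
    0|false) return 1 ;;           # feature flag explicitly disabled
  esac
  [ -n "${PWNKIT_CLOUD_SINK:-}" ] &&
    [ -n "${PWNKIT_CLOUD_SCAN_ID:-}" ] &&
    [ -n "${PWNKIT_CLOUD_TOKEN:-}" ]
}
```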

Set:

Terminal window
export PWNKIT_EMIT_RESULT_LINE=1

to make the CLI print one final PWNKIT_RESULT=... JSON line summarizing:

  • success/failure
  • exit code and exit reason
  • target type
  • finding counts
  • estimated cost and token usage when available

This is useful for wrappers, CI parsers, and the cloud orchestration path.
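
A wrapper can pick the summary out of captured output by matching the PWNKIT_RESULT= prefix. The sketch below returns the JSON payload of the last such line and leaves parsing to the caller, since the field names are not listed on this page. The name extract_result_json is hypothetical.

```shell
# Read CLI output on stdin and print the JSON after the final
# PWNKIT_RESULT= prefix. Prints nothing when no such line exists.
extract_result_json() {
  grep '^PWNKIT_RESULT=' | tail -n 1 | sed 's/^PWNKIT_RESULT=//'
}

# Usage:
#   PWNKIT_EMIT_RESULT_LINE=1 pwnkit scan --target https://example.com \
#     | extract_result_json
```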

Turn on every false-positive reduction feature for a client-ready scan:

Terminal window
export PWNKIT_FEATURE_CONSENSUS_VERIFY=1
export PWNKIT_FEATURE_REACHABILITY_GATE=1
export PWNKIT_FEATURE_POV_GATE=1
export PWNKIT_FEATURE_TRIAGE_MEMORIES=1
export PWNKIT_FEATURE_MULTIMODAL=1
pwnkit scan --target https://example.com --mode web --depth deep

Run every bash command inside the executor container and let the agent search the web:

Terminal window
export PWNKIT_FEATURE_DOCKER_EXECUTOR=1
export PWNKIT_FEATURE_WEB_SEARCH=1
pwnkit scan --target https://example.com --mode web

Use the kalilinux/kali-rolling image for the executor and force the tool bootstrap:

Terminal window
export PWNKIT_FEATURE_DOCKER_EXECUTOR=1
export PWNKIT_DOCKER_IMAGE=kalilinux/kali-rolling
export PWNKIT_DOCKER_BOOTSTRAP_TOOLS=1
pwnkit scan --target https://example.com --mode web