Reads your source code and threat model, maps vulnerabilities by category, and outputs a structured report with severity and remediation steps for your security team.
Best for: Engineers and security leads who need a quick, repeatable static scan before code review or deployment.
Creator's repository · anthropics/defending-code-reference-harness
License: NOASSERTION
---
name: vuln-scan
description: >-
Static source-code vulnerability scan. Reads a target directory (and
THREAT_MODEL.md if present), spawns parallel review subagents per focus
area, and writes VULN-FINDINGS.json + .md for /triage to consume. Read-only
— no building, running, or network. For execution-verified crashes, use
vuln-pipeline instead. Use when asked to "scan for vulns", "review this code
for security issues", "find bugs in <dir>", or as the step between
/threat-model and /triage.
argument-hint: "<target-dir> [--focus <area>] [--single] [--extra <file>] [--no-score]"
allowed-tools:
- Read
- Glob
- Grep
- Write
- Task
- Bash(rg:*)
- Bash(grep:*)
- Bash(ls:*)
- Bash(wc:*)
- Bash(head:*)
- Bash(file:*)
---
# /vuln-scan
Static vulnerability review of a source tree. Produces `VULN-FINDINGS.json`
(+ a human-readable `.md`) that `/triage` ingests directly.
**This skill does not execute code.** It reads source and reasons about it.
For execution-verified findings (ASAN crashes, reproducing PoCs), point the
user at `vuln-pipeline run <target>` — see README Step 2.
**Tool fallbacks.** Prefer the dedicated Glob and Grep tools. Some sessions
do not provision them — `allowed-tools` is a permission filter, not a loader,
so listing them here does not make them appear. When Glob/Grep are
unavailable, fall back to the read-only Bash commands whitelisted above:
`rg --files <scope>` / `ls -R` for enumeration, `rg -n` / `grep -rn` for
search, `wc` / `head` / `file` for sniffing. These are the ONLY permitted
Bash commands; do not write helper scripts or pipe target content into a
shell interpreter.
## Arguments
- `<target-dir>` (required) — directory to scan. Relative or absolute.
- `--focus <area>` — scan only this focus area (repeatable). Skips recon.
- `--single` — no subagent fan-out; one sequential pass. Use on tiny targets
or when debugging the prompt.
- `--extra <file>` — append the contents of `<file>` to the review brief
(after the category list). Use to add org-specific vulnerability classes,
compliance checks, or stack-specific patterns. Plain text; same shape as
the category blocks below.
- `--no-score` — skip the Step 3b confidence pass (saves a round of
subagents). Findings keep the scanner's self-reported confidence only.
## Step 1 — Scope
1. Resolve `<target-dir>`. If it doesn't exist or has no source files, stop
with an error.
2. Look for `<target-dir>/THREAT_MODEL.md`. If present, parse its section 3 "Entry
points & trust boundaries" table and section 4 "Threats" table for focus areas
and threat classes. This is the preferred scoping input.
3. If no THREAT_MODEL.md and no `--focus`: do a **quick recon** — list the
source tree, read entry points and dispatch code, and propose 3-10 focus
areas using the pattern `<subsystem> (<function/file>) — <key operations>`.
Same shape as `harness/prompts/recon_prompt.py`.
4. If `--focus` was given, use exactly those.
Tell the user the focus areas you'll scan and the source-file count before
fanning out.
## Step 2 — Fan out
Unless `--single`, spawn **one Task subagent per focus area** in parallel.
Cap at 10 concurrent. Each subagent gets the review brief below with its
focus area filled in. On tiny targets (<15 source files), fall through to
`--single` automatically.
### Review brief (per subagent)
```
You are conducting authorized static security review of source code. Your
focus area: **{focus_area}**. Other agents cover other areas; duplication
is wasted effort.
TARGET: {target_dir}
TRUST BOUNDARY: {from THREAT_MODEL.md section 3, or "untrusted input → process memory"}
TASK: read the source in your focus area and identify candidate
vulnerabilities. This is static review — do NOT build, run, or probe
anything. Reason from the code.
REPORTING BAR: report anything with a plausible exploit path. Skip style
concerns, best-practice gaps, and purely theoretical issues with no attack
story at all — but if you're unsure whether something is real, REPORT IT
with a low confidence score rather than dropping it. A downstream triage
step does the rigorous verification; your job is to not miss things.
WHAT TO LOOK FOR:
MEMORY SAFETY (C/C++ and unsafe/FFI blocks) — HIGH VALUE:
- heap-buffer-overflow / stack-buffer-overflow / global-buffer-overflow
- heap-use-after-free / double-free
- integer overflow feeding an allocation or index
- format-string bugs
- unbounded recursion or allocation driven by untrusted size fields
INJECTION & CODE EXECUTION — HIGH VALUE:
- SQL / command / LDAP / XPath / NoSQL / template injection
- path traversal in file operations
- unsafe deserialization (pickle, YAML, native), eval injection
- XSS (reflected, stored, DOM-based) — but see React/Angular note below
AUTH, CRYPTO, DATA — HIGH VALUE:
- authentication or authorization bypass, privilege escalation
- TOCTOU on a security check
- hardcoded secrets, weak crypto, broken cert validation
- sensitive data (secrets, PII) in logs or error responses
LOW VALUE — note briefly, keep looking:
- null-pointer deref at small fixed offsets with no attacker control
- assertion failures / clean error returns (correct handling, not a bug)
DO NOT REPORT (common false positives — skip even if technically present):
- volumetric DoS / rate-limiting / resource-exhaustion — BUT unbounded
recursion, algorithmic-complexity blowup, or ReDoS driven by untrusted
input ARE reportable
- memory-safety findings in memory-safe languages outside unsafe/FFI
- XSS in React/Angular/Vue unless via dangerouslySetInnerHTML,
bypassSecurityTrustHtml, v-html, or equivalent raw-HTML escape hatch
- findings in test files, fixtures, build scripts, docs, or .ipynb
- missing hardening / best-practice gaps with no concrete exploit
- env vars and CLI flags as the attack vector (operator-controlled)
- regex injection, log spoofing, open redirect, missing audit logs
- outdated third-party dependency versions
{if --extra <file> was given: append its contents here verbatim}
For each finding you DO report, trace: where does the untrusted input
enter, what path reaches the sink, and what condition triggers it.
OUTPUT — one block per finding, nothing else:
<finding>
<id>F-{focus_idx:02d}-{n:02d}</id>
<file>{relative/path}</file>
<line>{line_number}</line>
<category>{heap-buffer-overflow | use-after-free | integer-overflow | sql-injection | command-injection | path-traversal | deserialization | xss | auth-bypass | hardcoded-secret | ...}</category>
<severity>{HIGH | MEDIUM | LOW}</severity>
<confidence>{0.0-1.0}</confidence>
<title>{one line}</title>
<description>{root cause, attacker control, trigger condition, data flow from entry to sink. Cite line numbers.}</description>
<exploit_scenario>{concrete attack: what input, from where, causing what outcome}</exploit_scenario>
<recommendation>{specific fix: parameterize the query, bounds-check before memcpy, etc.}</recommendation>
</finding>
SEVERITY: HIGH = directly exploitable → RCE, data breach, auth bypass.
MEDIUM = significant impact under specific conditions. LOW = defense-in-
depth.
If you find nothing reportable in your area after a thorough read, emit a
single <finding> with category=none and a one-line note of what you covered.
```
## Step 3 — Collate
1. Collect `<finding>` blocks from all subagents. Drop `category=none`
placeholders.
2. **Light dedupe** — if two findings cite the same `file:line` with the
same category, keep the one with the longer description and note the
duplicate id. (Heavy dedupe is `/triage`'s job; don't over-engineer here.)
3. Assign stable ids `F-001`, `F-002`, ... in (severity desc, file, line)
order.
## Step 3b — Confidence pass (skip if `--no-score`)
A cheap second-opinion read that **ranks** findings by signal quality.
**Nothing is dropped** — this pass calibrates `confidence` so humans and
`/triage` see high-signal findings first. Spawn **one Task subagent per
finding** in parallel with the brief below. Shallow: re-read and score, not
a full reachability trace.
### Scoring brief (per finding)
```
You are giving ONE candidate security finding an independent confidence
score. You are NOT deciding whether to keep it — every finding is kept.
You are deciding how likely it is to survive rigorous triage.
FINDING:
{the full <finding> block}
TARGET: {target_dir} (you may Read/Grep inside it; do NOT execute)
STEP 1 — Re-read the cited code. Open {file} around line {line}. Does the
code actually do what the description claims?
STEP 2 — Check against common false-positive patterns (volumetric DoS,
memory-safe language, test/fixture/doc file, framework auto-escape, env-var
vector, missing-hardening-only, regex/log injection, outdated dep). A match
lowers confidence sharply but does not auto-zero it.
STEP 3 — Score 1-10 that this is a real, actionable vulnerability:
1-3 likely false positive or noise
4-5 plausible but speculative
6-7 credible, needs investigation
8-10 high confidence, clear pattern
OUTPUT (exactly this, nothing else):
CONFIDENCE: <1-10>
REASON: <one line>
```
**Resolve:** overwrite each finding's `confidence` with the score
(normalized to 0.0-1.0) and attach `confidence_reason`. Re-sort findings
by (`confidence` desc, `severity` desc, `file`, `line`) and reassign ids
`F-001..` in that order. Compute `low_confidence_count` = findings with
confidence < 0.4, for the summary line.
## Step 4 — Write output
Write **both** files to `<target-dir>/`:
**`VULN-FINDINGS.json`** — the `/triage` ingest shape:
```json
{
"target": "<target-dir>",
"scanned_at": "<iso8601>",
"focus_areas": ["..."],
"findings": [
{
"id": "F-001",
"file": "relative/path.c",
"line": 123,
"category": "heap-buffer-overflow",
"severity": "HIGH",
"confidence": 0.9,
"title": "...",
"description": "...",
"exploit_scenario": "...",
"recommendation": "...",
"confidence_reason": "..."
}
],
"summary": {"total": 0, "high": 0, "medium": 0, "low": 0, "low_confidence": 0}
}
```
Findings are sorted by `confidence` desc (then severity, file, line), so
the top of the file is the highest-signal material.
**`VULN-FINDINGS.md`** — human-readable: a summary table (id | severity |
category | file:line | title), then one `### F-NNN` section per finding with
the full description.
## Step 5 — Hand back
Tell the user:
1. Counts: N findings (H/M/L split, X low-confidence), across K focus
areas, from M source files.
2. Top 3 by confidence, one line each.
3. Next step: `> /triage <target-dir>/VULN-FINDINGS.json --repo <target-dir>`
4. Remind: these are **static candidates**, not verified. For
execution-verified crashes, `vuln-pipeline run <target>` (README Step 2).
## Constraints
- **Never execute target code.** No Bash, no builds, no `docker`, no network.
If the user asks you to "reproduce" or "confirm with a PoC," decline and
point at `vuln-pipeline`.
- **Don't fabricate line numbers.** Every `file:line` you emit must be
something you Read or Grep'd. If unsure of the exact line, cite the
function and say so in the description.
- **Stay in `<target-dir>`.** Don't follow symlinks or `..` out of it.
- Findings are candidates for `/triage`, not final verdicts. **This skill
never drops a finding** — Step 3b only ranks. `/triage` does the rigorous
N-vote verification and is where false positives actually get removed.
## Provenance
The focus-area recon pattern and memory-safety quality tiers are lifted
from this repo's own `harness/prompts/find_prompt.py` and
`harness/prompts/recon_prompt.py` — the same logic the autonomous pipeline
uses, applied statically. The broader category menu, DO-NOT-REPORT
exclusions, per-finding confidence pass, and
`exploit_scenario`/`recommendation` output fields are adapted from
[`anthropics/claude-code-security-review`](https://github.com/anthropics/claude-code-security-review)'s
`/security-review` command.