Reads a security scanner report, weeds out false positives, merges duplicates, ranks by real exploitability, and outputs a prioritized list with owner assignments.
Best for: Engineering leaders who need to turn a wall of scanner noise into a one-page action list.
Creator's repository · anthropics/defending-code-reference-harness
License: NOASSERTION
---
name: triage
description: Triage a batch of raw security findings. Verify each is real,
collapse duplicates, re-rank by derived exploitability, and tag with an
owner. Takes a directory or file of scanner output and writes TRIAGE.json
+ TRIAGE.md sorted by what actually needs engineering attention. Use when
asked to "triage findings", "validate scanner output", "prioritize vulns",
or "review the backlog". Runs interactively by default; pass --auto to
skip the interview.
argument-hint: "<findings-path> [--auto] [--votes N] [--repo PATH] [--fp-rules FILE] [--fresh]"
allowed-tools:
- Read
- Glob
- Grep
- Write
- Task
- AskUserQuestion
- Bash(git log:*)
- Bash(jq:*)
- Bash(find:*)
- Bash(ls:*)
- Bash(wc:*)
- Bash(python3 .claude/skills/_lib/checkpoint.py:*)
---
# triage
Adversarial triage of raw security-scanner output. Does four jobs:
**verify** each finding is real, **deduplicate** across runs and scanners,
**rank** survivors by derived exploitability rather than the scanner's
claimed severity, and **route** each to a component owner. Output is a
short, ranked, owned list instead of a raw dump.
Invoke with `/triage <findings-path> [--auto] [--votes N] [--repo PATH] [--fp-rules FILE]`.
**Arguments** (parse from `$ARGUMENTS`; positional `$1`/`$2` expansion is
not stable across runtimes):
- findings path (first positional, required): a JSON file, a directory of
JSON files, a `VULN-FINDINGS.json`, a pipeline `results/<target>/<ts>/`
directory, or a markdown report.
- `--auto`: skip the interview and use defaults. Default mode is
**interactive**.
- `--votes N`: verifier votes per finding (default 3; use 1 for a quick
pass, 5 for high-stakes batches).
- `--repo PATH`: path to the target codebase, read-only (default cwd).
Verification needs source access; the skill stops with an error if the
cited files aren't reachable.
- `--fp-rules FILE`: append the contents of FILE to the verifier's
exclusion-rule list (Phase 3a). Use for org-specific precedents: "we use
Prisma ORM everywhere — raw-query SQLi only", "k8s resource limits cover
DoS", etc. Plain text, one rule per line or paragraph.
- `--fresh`: ignore any existing checkpoint in `./.triage-state/` and start
from Phase 0. Without this flag the skill resumes from the last completed
phase if a checkpoint is present.
**Tools:** Read, Glob, Grep, Write, Task, AskUserQuestion. Bash is
permitted only for `git`, `find`, `wc`, `ls`, `jq`, and
`python3 .claude/skills/_lib/checkpoint.py` (checkpoint I/O).
**Do not execute target code.** No building, running, installing
dependencies, or sending requests. A proof-of-concept that accidentally
works against something real is unacceptable, and "couldn't write a working
PoC" is weak evidence of non-exploitability. Every conclusion comes from
reading source. This applies to the orchestrator and every subagent;
include the constraint in every Task prompt. For high-confidence HIGH
findings, recommend a human-built PoC as a follow-up instead.
**Do not reach the network.** No package-registry lookups, CVE-database
queries, or upstream-commit fetches.
---
## Checkpointing (runs before Phase 0 and after every phase)
On large finding batches a full run can exhaust context or hit rate limits
mid-way — particularly Phase 3, which spawns `candidates × votes` verifiers.
Phase state persists to `./.triage-state/` so a fresh `/triage` session can
resume without re-asking the interview or re-spawning verifiers.
All checkpoint I/O goes through `python3 .claude/skills/_lib/checkpoint.py`
(atomic writes, JSON-validated). Never use the Write tool for `progress.json`
directly. Never pass payload via heredoc or stdin; target-derived strings
could collide with the heredoc delimiter and break out to shell. The
Write→`--from` pattern keeps repo-derived bytes out of Bash argv.
State files in `./.triage-state/`:
- `progress.json` — **single source of truth** for resume position:
`{"status": "running"|"complete", "phase_done": N, "shards_done": [...]}`.
Resume decisions read ONLY this file, never a glob of `phase*.json` or
shard files (stale files from a prior run must not be trusted).
- `phaseN.json` — data payload for phase N (schemas at the tail of each phase
section below).
- `_chunk.tmp` — transient payload buffer; overwritten before every
`save`/`shard`/`append` call.
**Start of run — resume check.** Bash:
`python3 .claude/skills/_lib/checkpoint.py load ./.triage-state`
- `status == "absent"` OR `"complete"`, OR `--fresh` in `$ARGUMENTS` →
**fresh start.** Bash:
`python3 .claude/skills/_lib/checkpoint.py reset ./.triage-state`,
then proceed to Phase 0.
- `status == "running"` with `phase_done == N` → **resume.** Read
`./.triage-state/phase0.json` through `phaseN.json` **in order** (and any
`shard_*.json` files listed in `shards_done`), merging keys into working
state (later files override earlier — checkpoints may be deltas). Print
`Resuming from checkpoint: Phase N complete (./.triage-state/phaseN.json)`,
and **skip directly to Phase N+1**.
**End of every phase N.** Two tool calls:
1. Write tool → `./.triage-state/_chunk.tmp` containing the phase's output
JSON (schema at the tail of each phase section).
2. Bash → `python3 .claude/skills/_lib/checkpoint.py save ./.triage-state <N> <name> --from ./.triage-state/_chunk.tmp`
**End of run.** After writing `TRIAGE.json` and `TRIAGE.md`, Bash:
`python3 .claude/skills/_lib/checkpoint.py done ./.triage-state 6`
---
## Phase 0: Mode select and interview
### 0a. Parse arguments
From `$ARGUMENTS`: extract the findings path (first positional), `--auto`
flag, `--votes N` (default 3), `--repo PATH` (default `.`), `--fp-rules
FILE` (default none). If no findings path was given, ask for one and stop.
If `--fp-rules` was given, Read the file now and carry its contents as
`context.extra_fp_rules` for injection into the Phase 3a verifier prompt.
### 0b. Interactive mode (default): interview the user
Unless `--auto` was passed, use **AskUserQuestion** to gather context that
shapes verification and ranking. Batch into one or two calls of up to four
questions. Expect free-text answers via "Other"; the multiple-choice options
are prompts, not constraints.
**Round 1** (single AskUserQuestion call):
1. **Environment & trust boundary** (header `Environment`, single-select)
`What kind of system are these findings from, and where does untrusted
input enter it?`
Options: `Internet-facing web service (HTTP is untrusted)`,
`Internal service (callers are authenticated peers)`,
`Library / SDK (caller is the trust boundary)`,
`CLI / batch tool (operator inputs trusted, file inputs not)`,
`Embedded / firmware (physical access in scope)`.
Reachability is judged against this boundary; "command injection from env
var" is a true positive in a multi-tenant web service and a rule-8 false
positive in an operator CLI.
2. **Threat model** (header `Threat model`, multi-select)
`What does a worst-case attacker look like for this system, and what
must never happen? Free text is best.`
Options: `Unauthenticated remote code execution`,
`Tenant-to-tenant data leakage`, `Privilege escalation to admin`,
`Supply-chain compromise of downstream users`,
`Denial of service against a paid SLA`,
`Compliance-scoped data exposure (PII / PCI / PHI)`.
Phase 4 boosts findings that map onto a stated threat.
3. **Scoring standard** (header `Scoring`, single-select)
`How should severity be expressed in the output?`
Options: `Derived HIGH/MEDIUM/LOW from preconditions (default)`,
`CVSS v3.1 vector + base score`, `CVSS v4.0 vector + base score`,
`OWASP Risk Rating (likelihood x impact)`,
`Organization bug-bar (describe in Other)`.
The precondition rule is always computed; this controls what
`severity_label` additionally shows.
4. **Noise tolerance** (header `Noise tolerance`, single-select)
`When verifiers disagree, which way should ties break?`
Options:
`Precision: drop anything not majority-confirmed (fewer FPs, may miss real bugs)`,
`Recall: keep split votes as needs_manual_test (more to review, fewer misses)`,
`Ask me per-finding when it happens`.
**Round 2** (conditional): if the threat-model answer was empty or generic,
or the scoring answer was `Organization bug-bar`, ask one targeted follow-up.
Record the answers as a `context` dict carried through every phase and
echoed in the output under `triage_context`.
### 0c. Auto mode defaults
When `--auto` is set, do not call AskUserQuestion. Use:
- Environment: `Unknown. Treat any externally-reachable entry point as
untrusted; flag trust-boundary assumptions explicitly in rationale.`
- Threat model: empty (no boost).
- Scoring: derived HIGH/MEDIUM/LOW.
- Noise tolerance: precision.
**Checkpoint:** Write tool → `./.triage-state/_chunk.tmp`:
```json
{"phase": 0, "context": {mode, environment, threat_model, scoring, noise_tolerance, votes_per_finding, repo, findings_path}}
```
Then Bash:
`python3 .claude/skills/_lib/checkpoint.py save ./.triage-state 0 interview --from ./.triage-state/_chunk.tmp`
On resume past Phase 0, the interview is **not** re-asked; `context` is
restored from this file.
---
## Phase 1: Ingest and normalize
Turn the input into a flat `findings[]` list with stable ids, regardless of
source format.
### 1a. Detect input shape
Inspect the findings path:
- **Directory**: Glob for `**/*.json` and `**/*.jsonl`. Recognized
containers, in priority order:
- `VULN-FINDINGS.json` (a `{findings: [...]}` container): read
`.findings[]`.
- `reports/bug_*/report.json` or `reports/manifest.jsonl` (this repo's
pipeline output): one finding per `bug_NN`. Map `crash.crash_type` →
`category`, `verdict.severity_rating` → `severity`, the prose `report` →
`description`, crash file from the ASAN top frame → `file`/`line`.
- `found_bugs.jsonl`: one finding per line.
- Any other `*.json` whose top level is a list of objects, or an object
with a `findings`/`results`/`issues`/`vulnerabilities` array: that
array.
- **Single `.json` / `.jsonl` file**: same recognition as above.
- **Markdown / text**: split on level-2/3 headings or `---` rules; for each
section, extract `file`, `line`, `category`, `severity`, `description` by
pattern (`File:`, `Line:`, `Severity:` labels or `path:NN` spans).
Best-effort; mark `source_format: "markdown_heuristic"`.
If nothing parseable is found, stop and report what was seen.
### 1b. Normalize fields
For each raw record, build a finding dict. **Pull what's present; never
guess what's absent.** Field map (source-key aliases → canonical):
| Canonical | Also accept |
|-----------------|----------------------------------------------------------|
| `file` | `path`, `location.file`, `filename`, ASAN top-frame file |
| `line` | `line_number`, `location.line`, `lineno` |
| `category` | `type`, `cwe`, `rule_id`, `crash_type`, `vulnerability_class` |
| `severity` | `severity_rating`, `level`, `priority`, `risk` |
| `title` | `name`, `summary`, `message` |
| `description` | `details`, `report`, `body`, `evidence` |
| `exploit_scenario` | `attack_scenario`, `poc`, `reproduction` |
| `preconditions` | `requirements`, `assumptions` |
| `recommendation`| `fix`, `remediation`, `mitigation` |
| `scanner_confidence` | `confidence`, `score`, `certainty` (normalize to 0.0-1.0) |
Attach to every finding:
- `id`: `f001`, `f002`, ... in ingest order. If `scanner_confidence` is
present on most findings, order ingest by it descending so high-signal
findings get verified (and surface in partial output) first; otherwise
keep source order. This is a scheduling prior only — it does not affect
verdicts.
- `source`: relative path of the file it came from, plus source format.
- `missing_fields`: list of canonical fields that were absent. If `file` is
missing or does not resolve under `--repo`, the finding is
**unlocatable**: it skips dedup and verification and is emitted directly
with `verdict: false_positive`, `verify_verdict: needs_manual_test`,
`confidence: 0`, `refute_reasons: ["doesnt_exist"]`, `rationale: "no
source location in input; cannot verify statically; human review
required"`. Never emit a confident verdict on a finding you could not
locate, and never let it absorb or be absorbed by dedup.
### 1c. Locate the target codebase
Resolve `--repo` (default cwd). For the first 5 findings with a `file`,
check the path resolves under the repo. Try, in order: (a) `repo/file`
as-given; (b) `file` as an absolute or cwd-relative path; (c) `repo/file`
with common prefixes stripped from `file` (`src/`, `app/`, `./`, or the
repo's own basename, e.g. `harness/grade.py` with `--repo harness`).
Record which resolution worked and apply it to every finding. If none
resolve, **stop**: tell the user verification needs source access and the
cited files aren't reachable, and suggest a `--repo` value based on the
longest common suffix you can see.
**Checkpoint:** Write tool → `./.triage-state/_chunk.tmp`:
```json
{"phase": 1, "context": {...}, "findings": [ {normalized finding dicts with id/source/file/line/category/...} ], "path_resolution": "<which of a/b/c worked>"}
```
Then Bash:
`python3 .claude/skills/_lib/checkpoint.py save ./.triage-state 1 ingest --from ./.triage-state/_chunk.tmp`
---
## Phase 2: Deduplicate (before verification)
Collapse repeats so duplicate findings don't each burn N verifiers.
### 2a. Deterministic pass (inline, no subagent)
Cluster findings where all of:
- same `file` (after path normalization), AND
- same `category` (case-insensitive, punctuation stripped), AND
- `line` numbers within 10 of each other. Both-missing matches; one-side-
missing does NOT (a line-less record must not absorb a located one).
Within each cluster, the canonical is the record with the fewest
`missing_fields`; ties break to lowest `id`. Every other member gets
`verdict: duplicate`, `duplicate_of: <canonical id>`, and is removed from
the working set. Record duplicate ids on the canonical as `absorbed: [...]`.
### 2b. Semantic pass (one subagent, only if >1 cluster survives)
Spawn ONE Task with `subagent_type: "general-purpose"` and this prompt:
```
You are deduplicating security findings before expensive verification. Two
findings are DUPLICATES if fixing one would also fix the other. Two findings
are DISTINCT if they have genuinely independent root causes, even if they
share a category or file.
Treat as DUPLICATE:
- Same root cause described with different wording or by different scanners
- A shared vulnerable helper function reported once per call site
- A missing global protection (auth check, output encoding) reported once
per endpoint that lacks it
- A cause ("missing input validation on `name`") and its consequence
("SQL injection via `name`") in the same code path
Treat as DISTINCT:
- Different categories in the same file region (an "ssrf" near a
"buffer_overflow" is not a duplicate just because the lines are close)
- Same file, same category, but different tainted variables reaching
different sinks
- Same helper, but two independent bugs inside it
- Two endpoints missing the same check, where the fix is per-endpoint
rather than a shared gate
Below are the candidate findings (one per line: id | file:line | category |
title). Group them. Respond with ONLY lines of the form:
GROUP: <canonical_id> <- <dup_id>, <dup_id>, ...
One line per group that has duplicates. Omit singletons. Pick the most
specific / best-described finding as canonical. No prose.
CANDIDATES:
{one line per surviving finding: "f003 | src/auth.py:112 | sql_injection | User lookup concatenates name into query"}
```
Parse `GROUP:` lines. For each, mark the listed dup ids with
`verdict: duplicate`, `duplicate_of: <canonical>`, append them to the
canonical's `absorbed`, and drop them from the working set.
Carry forward `candidates[]` = the surviving canonicals.
**Checkpoint:** Write tool → `./.triage-state/_chunk.tmp`:
```json
{"phase": 2, "context": {...}, "findings": [ {all findings; duplicates carry verdict/duplicate_of} ], "candidates": ["f001", "f003", "..."]}
```
Then Bash:
`python3 .claude/skills/_lib/checkpoint.py save ./.triage-state 2 dedup --from ./.triage-state/_chunk.tmp`
---
## Phase 3: Verify
For each candidate, N independent adversarial verifiers re-derive the claim
from the code and vote. Each verifier's stance is "find any reason this is
wrong." Each starts from the code at the cited location, not the scanner's
description, and never sees the other verifiers' reasoning (shared context
propagates blind spots).
### 3a. Verifier prompt (assemble once, reuse for every spawn)
```
You are a skeptical security engineer adversarially verifying ONE finding
from an automated scanner. Your default assumption is that the scanner is
WRONG. Your job is to re-derive the claim from the source code yourself and
decide TRUE_POSITIVE or FALSE_POSITIVE.
You have read-only access to the target codebase at: {REPO_PATH}
You may use Read, Glob, and Grep, but ONLY on paths inside {REPO_PATH}.
Do NOT read, grep, or glob outside that root: anything outside it (the
triage pipeline itself, scanner outputs, fixtures, other repos on disk) is
out of scope and citing it contaminates your verdict. If a finding's
`file` resolves outside {REPO_PATH}, return CANNOT_VERIFY with
REFUTE_REASON: doesnt_exist. You may NOT build, run, or test the target,
install dependencies, or reach the network. Every conclusion must come
from reading source under {REPO_PATH}.
ENVIRONMENT (from the operator; this defines the trust boundary):
{context.environment or "Unknown. Treat any externally-reachable entry point as untrusted."}
────────────────────────────────────────────────────────────────────────
PROCEDURE: follow all four steps. Each exists because skipping it lets a
specific false-positive class through.
1. READ THE CODE AT THE CITED LOCATION YOURSELF.
Open {file} at line {line}. Understand what the code actually does. Do
NOT trust the scanner's description: scanners misread code surprisingly
often, and if you start from the summary you inherit the misreading.
2. TRACE REACHABILITY BACKWARDS FROM THE SINK.
Grep for callers of this function/method. Follow imports. Establish
whether attacker-controlled input (per the ENVIRONMENT above) can
actually reach this line. A plausible-sounding chain is NOT enough: for
at least the FIRST link in the chain, READ the actual call site and
QUOTE the file:line in your rationale. Unreachable code is the single
largest false-positive source.
3. HUNT FOR PROTECTIONS.
Actively look for reasons the finding is WRONG:
- Input validation / sanitization upstream of the sink
- Framework auto-escaping, parameterized queries, prepared statements
- Type constraints (the value is an int, an enum, a fixed-length token)
- Authentication / authorization gates before this path
- Configuration that limits exposure (feature flag off, debug-only)
- Dead code, test-only code, example/fixture code
4. STRESS-TEST EACH PROTECTION.
For each protection you found: is it applied on EVERY path to the sink,
or only the one the scanner happened to trace? Are there encodings,
edge cases, or alternate entry points that bypass it?
────────────────────────────────────────────────────────────────────────
EXCLUSION RULES: if the finding matches any of these, it is FALSE_POSITIVE
even if technically accurate. Cite the rule number in your verdict.
1. Volumetric DoS or missing rate-limiting (handled at infrastructure
layer). ReDoS, algorithmic complexity, and unbounded recursion ARE
still valid findings.
2. Test-only code, dead code, example/fixture code, or a crash with no
security impact.
3. Behavior that is the intended design (compression middleware, a
backward-compatible weak algorithm offered alongside a strong one).
4. Memory-safety concerns in memory-safe languages outside `unsafe` /
FFI blocks.
5. SSRF where the attacker controls only the path, not the host or
protocol.
6. User input flowing into an AI/LLM prompt (prompt injection is not a
code vulnerability in the target).
7. Path traversal in object storage (S3/GCS) where `../` does not escape
a trust boundary.
8. Trusted inputs used as the attack vector (env vars, CLI flags set by
the operator), UNLESS the ENVIRONMENT above marks them untrusted.
9. Client-side code flagged for server-side vulnerability classes.
10. Outdated dependency versions (managed by a separate process).
11. Weak random used for non-security purposes (jitter, shuffling,
dev-only fallbacks).
12. Low-impact nuisance issues (log spoofing, CSRF on logout, self-XSS,
tabnabbing, open redirect, regex injection).
13. Missing hardening or best-practice gap with no concrete exploit path
(missing security headers, no audit logging, permissive config that
isn't actually reached by untrusted input).
14. XSS in a framework with default auto-escaping (React, Angular, Vue,
Jinja2 autoescape=on) UNLESS the sink is a raw-HTML escape hatch
(dangerouslySetInnerHTML, bypassSecurityTrustHtml, v-html, |safe).
15. Identifiers that are unguessable by construction (UUIDv4, 128-bit+
random tokens) flagged as "predictable" or "needs validation".
16. Race conditions or TOCTOU that are theoretical only — no realistic
window, or no security-relevant state changes between check and use.
{if context.extra_fp_rules: append here verbatim under an
"ORG-SPECIFIC RULES:" heading}
────────────────────────────────────────────────────────────────────────
VERDICT: your response MUST end with EXACTLY this block:
VERDICT: TRUE_POSITIVE | FALSE_POSITIVE | CANNOT_VERIFY
CONFIDENCE: <0-10>
REFUTE_REASON: <one of: doesnt_exist, already_handled,
implausible_trigger, intentional_behavior, misread_code, duplicate,
not_actionable, n/a>
EXCLUSION_RULE: <1-16, org rule, or none>
FIRST_LINK: <file:line of the first call site you read, or "none found">
RATIONALE: <2-5 sentences citing specific file:line evidence for
reachability, protections found/absent, and why each held or didn't>
TRUE_POSITIVE requires ALL of: path is reachable from untrusted input per
the ENVIRONMENT; protections are insufficient or bypassable; real-world
exploitation is feasible.
FALSE_POSITIVE requires ANY of: unreachable from untrusted input;
adequately protected on all paths; scanner misread the code; an exclusion
rule applies.
CANNOT_VERIFY: static reasoning genuinely hit its limit (e.g. behavior
depends on runtime configuration you cannot read, or the code path crosses
into a binary you cannot inspect). Use sparingly; it must not become the
default.
```
### 3b. Spawn N verifiers per candidate, all in one message
For each finding in `candidates[]`, build N Task calls (N = `--votes`,
default 3) with `subagent_type: "general-purpose"` and `description:
"verify {id} vote {k}/{N}"`.
**Always set `subagent_type`; never fork.** Omitting `subagent_type` forks
the orchestrator, and a fork inherits the full conversation context: every
other finding's description, the scanner's prose, and any prior verifier
results. That defeats verifier independence and re-introduces the
inherited-framing failure mode this phase exists to prevent. Each verifier
must start with a fresh, empty context and receive only the 3a prompt
plus the single finding under review. The same applies to the ranking
subagents in 4a.
Each prompt is the verifier prompt from 3a with this block appended:
```
────────────────────────────────────────────────────────────────────────
FINDING UNDER REVIEW (from the scanner; treat as a CLAIM, not a fact):
id: {id}
file: {file}
line: {line}
category: {category}
severity (claimed): {severity}
title: {title}
description:
{description}
exploit_scenario:
{exploit_scenario or "(not provided)"}
preconditions (claimed):
{preconditions as bullets or "(not provided)"}
You are vote {k} of {N}. You have NOT seen the other verifiers' reasoning
and you must NOT try to find it. Work independently from the code.
```
**Put all verifier Task calls in a single assistant message** so they run
concurrently. Do not set `run_in_background`; you need the final text, not
an async handle. If `len(candidates) * N` exceeds ~40, shard into
sequential batches of ~40, but keep each batch a single message.
**Prompt size at scale.** The 3a prompt is ~1200 words. When
`candidates * votes > ~50`, use this compact form instead (same procedure
and output contract, prose stripped):
```
Adversarially verify ONE scanner finding. Default: scanner is WRONG.
Read-only access scoped to {REPO_PATH} ONLY. No exec, no network.
ENVIRONMENT: {context.environment}
Steps: (1) Read {file}:{line} yourself; don't trust the description.
(2) Trace callers backwards; quote the first call-site file:line.
(3) Hunt for protections: validation, escaping, type bounds, auth gates,
dead/test code. (4) Stress-test each protection on every path.
Exclusion rules (FALSE_POSITIVE if matched): 1 volumetric DoS;
2 test/dead/fixture code; 3 intended design; 4 memory-safety in safe
lang outside unsafe/FFI; 5 SSRF path-only; 6 LLM prompt input;
7 object-storage traversal; 8 trusted operator env/CLI inputs;
9 client code, server vuln class; 10 outdated deps; 11 weak random
non-security; 12 low-impact nuisance (log spoof, open redirect, regex
inject); 13 missing-hardening-only, no concrete exploit; 14 XSS in
auto-escape framework w/o raw-HTML escape hatch; 15 unguessable
UUID/token flagged predictable; 16 theoretical-only race/TOCTOU.
{+ org rules from --fp-rules if any}
End with EXACTLY:
VERDICT: TRUE_POSITIVE | FALSE_POSITIVE | CANNOT_VERIFY
CONFIDENCE: <0-10>
REFUTE_REASON: <doesnt_exist|already_handled|implausible_trigger|
intentional_behavior|misread_code|duplicate|not_actionable|n/a>
EXCLUSION_RULE: <1-16, org rule, or none>
FIRST_LINK: <file:line or "none found">
RATIONALE: <2-5 sentences, file:line cited>
FINDING: {id} {file}:{line} {category} (claimed {severity})
{title}
{description}
Vote {k}/{N}. Independent; do not seek other votes.
```
Findings with a `file` but no `line` get **one** verifier vote regardless
of `--votes` (a file-level sweep is expensive and doesn't benefit from
voting).
**If any Task call returns `status: "async_launched"` instead of the
verifier's text**, the runtime backgrounded it (some runtimes do this
automatically for large parallel batches). Pick one recovery and use it for
the whole batch:
- If completion notifications arrive in your conversation: parse each
verifier's VERDICT block from its notification `result` as it lands.
Do not end your turn until every vote is accounted for.
- If notifications do not arrive: do not poll transcript files. Re-spawn
the missing verifiers in a fresh Task batch (smaller shard size, e.g.
10) and use the synchronous results.
The same recovery applies to the dedupe subagent in 2b and the ranking
subagents in 4a.
### 3c. Tally votes
For each candidate, parse the trailing block from each of its N verifiers
(tolerate code fences and whitespace). If a verifier errored, timed out,
or produced no parseable VERDICT block, re-spawn it once. If the retry
also fails, count that vote as `cannot_verify` with `confidence: 0` and
note `"verifier_error"` in `refute_reasons`. The remaining N-1 votes still
decide.
Build:
- `vote_breakdown`: `{"true_positive": x, "false_positive": y,
"cannot_verify": z}`
- `confidence`: mean CONFIDENCE across votes that agree with the majority,
rounded to one decimal.
- `exclusion_rule`: the modal EXCLUSION_RULE among FALSE_POSITIVE votes,
else `null`.
- `refute_reasons`: sorted unique REFUTE_REASON values from FALSE_POSITIVE
votes.
- `first_links`: unique FIRST_LINK values across all votes (reachability
audit trail).
- `rationale`: the RATIONALE from the highest-confidence vote on the
winning side, verbatim.
**Decide `verdict`:**
- Majority TRUE_POSITIVE → `verdict: true_positive`. Proceeds to Phase 4.
- Majority FALSE_POSITIVE → `verdict: false_positive`. Skips Phase 4.
- No majority (tie, or majority CANNOT_VERIFY):
- Noise tolerance `precision` → `verdict: false_positive`; append
`"(split vote, dropped under precision policy)"` to rationale.
- Noise tolerance `recall` → `verdict: true_positive` with
`verify_verdict: needs_manual_test`. Proceeds to Phase 4.
- Noise tolerance `ask` → collect all split findings and present them in
one AskUserQuestion call at the end of Phase 3 (header: id + title,
options: keep / drop), then apply the user's choices.
Build `confirmed[]` = candidates with `verdict == true_positive`.
**Checkpoint:** Write tool → `./.triage-state/_chunk.tmp`:
```json
{"phase": 3, "context": {...}, "findings": [ {all findings with verdict/vote_breakdown/confidence/refute_reasons/first_links/rationale/exclusion_rule} ], "confirmed": ["f001", "..."]}
```
Then Bash:
`python3 .claude/skills/_lib/checkpoint.py save ./.triage-state 3 verify --from ./.triage-state/_chunk.tmp`
This is the most expensive checkpoint. When `len(candidates) * votes` exceeds
~40 and verifier spawns are sharded into sequential batches, additionally
checkpoint **per candidate** as its votes are tallied:
1. Write tool → `./.triage-state/_chunk.tmp` = that finding's post-tally dict.
2. Bash:
`python3 .claude/skills/_lib/checkpoint.py shard ./.triage-state <id> --from ./.triage-state/_chunk.tmp`
On resume at `phase_done == 2`, the Phase-3 entry point reads
`progress.json:shards_done` (default `[]` — do **not** glob shard files on
disk; stale shards from a prior run may exist), loads the corresponding
`shard_{id}.json` files, and spawns verifiers only for `candidates[]` ids
from `phase2.json` that are NOT in `shards_done`. Once every candidate is in
`shards_done`, write the consolidated `phase3.json` checkpoint as above.
---
## Phase 4: Rank by exploitability (confirmed findings only)
Recompute severity from preconditions and reachability rather than category
name, and judge the scanner's claimed severity separately. Verification and
severity are independent judgments; "this is real" must not inflate into
"this is critical."
### 4a. Ranking prompt
Spawn one Task per confirmed finding (`subagent_type: "general-purpose"`,
all in one message) with:
```
You are assigning severity to a CONFIRMED security finding. Verification
already happened; assume the finding is real. Your only job is to derive
how bad it is, independently of what the scanner claimed.
You may Read/Grep the codebase at {REPO_PATH} to check preconditions. Do
NOT execute code.
ENVIRONMENT: {context.environment}
THREAT MODEL (operator-stated, may be empty):
{context.threat_model as bullets, or "(none provided)"}
SCORING STANDARD: {context.scoring}
FINDING:
id: {id}
file: {file}:{line}
category: {category}
claimed severity: {severity}
reachability evidence: {first_links from Phase 3}
verifier rationale: {rationale from Phase 3}
────────────────────────────────────────────────────────────────────────
STEP 1: Enumerate EVERY precondition that must hold for exploitation.
Be concrete: required auth state, configuration, prior request, race
window, attacker position. Then state the minimum ACCESS LEVEL required
(unauthenticated remote / authenticated / local / physical).
STEP 2: Derive severity from the precondition count and access level:
| Preconditions | Access required | Severity |
|---------------|--------------------------|----------|
| 0 | Unauthenticated remote | HIGH |
| 1-2 | Authenticated | MEDIUM |
| 3+ | Local-only / no demo path| LOW |
Evaluate each column independently and take the LOWER result. Example:
0 preconditions but authenticated-only is MEDIUM, not HIGH; 1
precondition but local-only is LOW. Cross-check: if your preconditions
list has 3+ items, HIGH is almost certainly wrong.
STEP 3: Threat-model match. If the THREAT MODEL is non-empty and this
finding maps onto one of its entries, note which one. A match may raise
severity by ONE step (LOW to MEDIUM or MEDIUM to HIGH), never two. If the
threat model is empty, skip this step.
STEP 4: Judge the scanner's claimed severity. From the perspective of an
engineer who has reviewed two hundred scanner findings this week and is
allergic to inflation: would the CLAIMED severity contribute to alert
fatigue? Is it comparable to a real CVE at that level? Is the code in test
fixtures or dev-only config? Score in -5..+5:
+3..+5 claimed severity is justified or understated
0..+2 roughly right
-1..-3 inflated by one level
-4..-5 badly inflated (LOW dressed as HIGH)
STEP 5: verify_verdict. Exactly one of:
exploitable preconditions are realistically satisfiable
mitigated real, but a deployed control reduces it below the
derived severity (name the control)
needs_manual_test severity hinges on something only a runtime test can
settle; recommend a human build a PoC
STEP 6: If SCORING STANDARD is a CVSS or OWASP variant, emit a
`severity_label` in that format (vector string + base score for CVSS;
likelihood x impact for OWASP). Otherwise set it equal to the derived
HIGH/MEDIUM/LOW.
────────────────────────────────────────────────────────────────────────
Respond with ONLY this block:
PRECONDITIONS:
- <one per line>
ACCESS_LEVEL: <unauthenticated_remote|authenticated|local|physical>
SEVERITY: <HIGH|MEDIUM|LOW>
SEVERITY_LABEL: <per scoring standard>
THREAT_MATCH: <matched threat-model entry, or none>
SEVERITY_ALIGNMENT: <-5..+5>
VERIFY_VERDICT: <exploitable|mitigated|needs_manual_test>
RANK_RATIONALE: <2-4 sentences>
```
### 4b. Merge
For each confirmed finding, parse the block and attach `preconditions`
(replacing any scanner-supplied list), `access_level`, `severity`
(recomputed), `severity_label`, `threat_match`, `severity_alignment`,
`verify_verdict`, and append RANK_RATIONALE to `rationale` (separated by a
blank line from the Phase-3 rationale).
For findings that did NOT reach Phase 4 (`false_positive`, `duplicate`,
unlocatable): set `severity: null`, `verify_verdict: null`,
`severity_alignment: null`, `preconditions: []`.
**Checkpoint:** Write tool → `./.triage-state/_chunk.tmp`:
```json
{"phase": 4, "context": {...}, "findings": [ {all findings with severity/severity_label/preconditions/access_level/threat_match/severity_alignment/verify_verdict} ]}
```
Then Bash:
`python3 .claude/skills/_lib/checkpoint.py save ./.triage-state 4 rank --from ./.triage-state/_chunk.tmp`
---
## Phase 5: Route
Tag each confirmed true-positive with the most specific component or owner
inferable. For each finding in `confirmed[]`, stop at the first hit:
1. **CODEOWNERS / OWNERS.** Grep `--repo` for `CODEOWNERS`, `OWNERS`,
`.github/CODEOWNERS`, `docs/CODEOWNERS`. If found, match the finding's
`file` against its patterns (last match wins). Hint:
`"CODEOWNERS: <pattern> -> <owner(s)>"`.
2. **git log.** If `--repo` is a git checkout, run
`git -C {REPO} log --format='%an' -n 50 -- "{file}" | sort | uniq -c | sort -rn | head -3`.
Hint: `"top committer: <name> (<n>/<total> recent commits); no
CODEOWNERS entry"`.
3. **Module fallback.** Hint: `"component: <top-level dir of file>/; no
CODEOWNERS or git history"`.
Attach as `owner_hint`. State the source so confidence is clear; a bare
username is less useful than `"component: auth/; no CODEOWNERS entry; top
committer jsmith (14/20 recent commits)"`. For non-true-positive findings,
set `owner_hint: null`.
**Checkpoint:** Write tool → `./.triage-state/_chunk.tmp`:
```json
{"phase": 5, "context": {...}, "findings": [ {all findings with owner_hint} ]}
```
Then Bash:
`python3 .claude/skills/_lib/checkpoint.py save ./.triage-state 5 route --from ./.triage-state/_chunk.tmp`
---
## Phase 6: Output
### 6a. Sort
Order all findings by:
1. `verdict`: `true_positive`, then `duplicate`, then `false_positive`.
2. Within true positives: `severity` HIGH > MEDIUM > LOW, then `confidence`
descending, then `severity_alignment` descending.
3. Within others: original `id`.
### 6b. Write `./TRIAGE.json`
```json
{
"triage_completed": true,
"triage_context": {
"mode": "interactive|auto",
"environment": "...",
"threat_model": ["..."],
"scoring": "...",
"noise_tolerance": "...",
"votes_per_finding": 3,
"repo": "..."
},
"summary": {
"input_count": 0,
"duplicates": 0,
"false_positives": 0,
"true_positives": 0,
"needs_manual_test": 0,
"by_severity": {"HIGH": 0, "MEDIUM": 0, "LOW": 0}
},
"findings": [
{
"id": "f001",
"source": "VULN-FINDINGS.json#0",
"title": "...",
"file": "...",
"line": 0,
"category": "...",
"claimed_severity": "HIGH",
"verdict": "true_positive|false_positive|duplicate",
"verify_verdict": "exploitable|mitigated|needs_manual_test|null",
"confidence": 0.0,
"severity": "HIGH|MEDIUM|LOW|null",
"severity_label": "...",
"severity_alignment": 0,
"preconditions": ["..."],
"access_level": "...",
"threat_match": "...|null",
"rationale": "file:line-cited prose: reachability, protections, why each held or didn't; then ranking rationale",
"vote_breakdown": {"true_positive": 0, "false_positive": 0, "cannot_verify": 0},
"refute_reasons": ["..."],
"exclusion_rule": null,
"first_links": ["file:line", "..."],
"duplicate_of": null,
"absorbed": ["..."],
"owner_hint": "...",
"missing_fields": ["..."]
}
]
}
```
Every input finding appears exactly once (duplicates reference their
canonical via `duplicate_of`). Do not silently drop anything. Do not print
this JSON to the terminal; write to file only.
### 6c. Write `./TRIAGE.md`
Reviewer-facing report. Build it **incrementally**. Do NOT emit the whole
file in one Write. One chunk per finding; a stalled chunk loses that one
section, not the file.
**Step 1 — header.** Write tool → `./TRIAGE.md` (clobbers any prior file)
containing only the title block, summary, and `## Act on these` heading:
```
# Triage Report
{summary line: N in -> D duplicates, F false positives, T confirmed (H high / M med / L low), X need manual test}
Context: {mode}; environment = {environment}; scoring = {scoring}; {votes}-vote verification.
## Act on these
```
**Step 2 — per finding.** For each true_positive in severity order:
1. Write tool → `./.triage-state/_chunk.tmp` containing ONE finding's section:
```
### [{severity}] {title} ({id})
`{file}:{line}` | {category} | claimed {claimed_severity} (alignment {severity_alignment:+d}) | confidence {confidence}/10
**Owner:** {owner_hint}
**Verdict:** {verify_verdict}, votes {vote_breakdown}
**Preconditions ({n}):** {bulleted}
**Threat-model match:** {threat_match or "none"}
**Why:** {rationale}
**Reachability evidence:** {first_links}
{if verify_verdict == needs_manual_test:}
> Recommend a human build a PoC; static reasoning hit its limit.
```
2. Bash:
`python3 .claude/skills/_lib/checkpoint.py append ./TRIAGE.md --from ./.triage-state/_chunk.tmp`
Repeat for each true_positive.
**Step 3 — footer.** Write tool → `./.triage-state/_chunk.tmp` containing the
Dropped table, then `checkpoint.py append` it the same way:
```
## Dropped
| id | title | file:line | why dropped |
{false_positives: refute_reasons + exclusion_rule}
{duplicates: "duplicate of {duplicate_of}"}
{unlocatable: "no source location in input"}
```
**Checkpoint (final):** Bash:
`python3 .claude/skills/_lib/checkpoint.py done ./.triage-state 6`
The next invocation's resume check sees `status == "complete"` and starts
fresh.
### 6d. Terminal summary
Under ~12 lines:
```
Triage complete: {N} findings -> {T} confirmed, {F} false positives, {D} duplicates.
HIGH: {n} {title of top HIGH, owner_hint}
MEDIUM: {n}
LOW: {n}
Needs manual test: {n}
Top refute reasons: {top 3 refute_reasons with counts}
Wrote ./TRIAGE.md and ./TRIAGE.json
```
---
## Testing this skill
Smoke test (five-finding fixture: 2 real, 1 dup, 2 FP):
```
/triage .claude/skills/triage/fixtures/canary-findings.json --auto --repo targets/canary
```
Expected: f001 and f003 confirmed; f002 duplicate of f001; f004 dropped
(`misread_code`: it's a read buffer, not a randomness source); f005 dropped
(`already_handled`: there is a null check at line 68).
Or against pipeline output:
```
vuln-pipeline run drlibs --runs 3 --parallel --stream
/triage results/drlibs/<ts>/ --repo targets/drlibs
```
Hand-check a sample of TRUE_POSITIVE/HIGH results (the `first_links` should
point at real call sites) and a sample of FALSE_POSITIVE rejects (the
`exclusion_rule` or `refute_reasons` should be defensible).
---
## Design notes
- **Checkpoints are per-phase JSON**, not conversation state. The pipeline's
`--resume <session_id>` (docs/pipeline.md) restores transcript history but
doesn't help when the orchestrator's context window itself fills;
file-backed checkpoints let a brand-new session pick up from the last
completed phase. `./.triage-state/` is scratch — add to `.gitignore`.
- **Dedupe runs before verify** to cut verifier spend by the duplication
factor (often 2-4x on multi-scanner input) at the cost of one cheap
subagent.
- **Semantic dedupe is one agent**, given only id/file/line/category/title:
enough to cluster, not enough to leak one scanner's reasoning into
another finding's verification.
- **Bash is allowed narrowly** for `git log` (owner hints), `jq`/`find`
(ingest), and `python3 .claude/skills/_lib/checkpoint.py` (state I/O).
The actual safety property is "no execution of target code," which is
preserved.
- **`CANNOT_VERIFY`** exists so verifiers aren't forced into a false
binary. It maps to `needs_manual_test` under recall policy and to a drop
under precision policy.
- **Threat-model boost is capped at one step** so a stated threat can't
re-inflate a LOW back to HIGH and defeat the precondition rule.
- **`severity_label` is separate from `severity`.** Sorting always uses the
precondition-derived HIGH/MEDIUM/LOW; the label is presentation-layer for
whatever standard the reviewer's tooling expects.
- **Pipeline `report.json` ingest is best-effort.** Those reports describe
ASAN crashes with prose exploitability analysis rather than the
file/line/category shape static verifiers expect. Expect more
`needs_manual_test` verdicts on that input than on static-scanner JSON.
- **Sharding at ~40 parallel Tasks** is a conservative ceiling for typical
agent-spawn limits; tune up if your runtime allows.
- **No network**, deliberately. CVE-database enrichment and upstream-fix
checks would help ranking but break the air-gapped-review property.