From ad6da302c7a438a804ffeac9b037683e37d45cb8 Mon Sep 17 00:00:00 2001
From: Oscar Wallberg
Date: Mon, 27 Apr 2026 03:37:03 +0200
Subject: [PATCH] fix(claude): walk audit findings tier by tier

---
 .claude/skills/audit/SKILL.md | 141 +++++++++++++++++++++-------------
 1 file changed, 87 insertions(+), 54 deletions(-)

diff --git a/.claude/skills/audit/SKILL.md b/.claude/skills/audit/SKILL.md
index 19ccc85..68296d1 100644
--- a/.claude/skills/audit/SKILL.md
+++ b/.claude/skills/audit/SKILL.md
@@ -19,6 +19,10 @@ description: Run a deep, multi-lens review of existing code state (not a diff).
 
 If the user invokes `/audit` with no arguments, infer the primary source tree from the project layout (Rust: `crates/*/src/`; TS/JS: `src/` or `packages/*/src/`; Python: the package directory). Ask briefly only if ambiguous.
 
+## Communicating with the user
+
+Phases are internal scaffolding for organizing this skill, not concepts the user needs to track. Do not announce them in user-facing text. No "Phase 3: validating findings before reporting", no "moving on to Phase 5", no "Phase 4 triage complete". Brief, plain progress notes are fine when warranted ("validating findings before reporting", "running the gate"), but they should describe the action, not name a phase.
+
 ## Phase 1: Context gather
 
 Before spawning review agents:
@@ -38,19 +42,19 @@ Send a single message with multiple `Agent` tool uses, each `subagent_type: gene
 
 The `Agent` tool accepts a `model: "sonnet" | "opus" | "haiku"` parameter. Pick deliberately - some lenses are pattern-matching (cheap), others are reasoning-heavy (expensive but worth it).
 
-| lens | model | why |
-|------|-------|-----|
-| reuse | sonnet | pattern recognition across files, fits sonnet's strengths |
-| quality | sonnet | structural critique, naming, dead code; sonnet is enough |
-| efficiency | **opus** | needs reasoning about hot paths, allocations, asymptotic patterns |
-| errors | **opus** | control-flow analysis, silent-failure detection wants careful reading |
-| api | sonnet | visibility analysis, type design - mostly mechanical |
-| bugs | **opus** | correctness reasoning is the place not to skimp |
-| docs (opt-in) | haiku | "does the comment still match the code?" - cheap |
-| tests (opt-in) | sonnet | gap analysis with semantic context |
-| security (opt-in) | **opus** | high-stakes correctness, needs careful reading |
-| a11y (opt-in) | sonnet | pattern matching with semantic context |
-| deps (opt-in) | haiku | mostly file scanning |
+| lens              | model    | why                                                                   |
+| ----------------- | -------- | --------------------------------------------------------------------- |
+| reuse             | sonnet   | pattern recognition across files, fits sonnet's strengths             |
+| quality           | sonnet   | structural critique, naming, dead code; sonnet is enough              |
+| efficiency        | **opus** | needs reasoning about hot paths, allocations, asymptotic patterns     |
+| errors            | **opus** | control-flow analysis, silent-failure detection wants careful reading |
+| api               | sonnet   | visibility analysis, type design - mostly mechanical                  |
+| bugs              | **opus** | correctness reasoning is the place not to skimp                       |
+| docs (opt-in)     | haiku    | "does the comment still match the code?" - cheap                      |
+| tests (opt-in)    | sonnet   | gap analysis with semantic context                                    |
+| security (opt-in) | **opus** | high-stakes correctness, needs careful reading                        |
+| a11y (opt-in)     | sonnet   | pattern matching with semantic context                                |
+| deps (opt-in)     | haiku    | mostly file scanning                                                  |
 
 The validation agent in Phase 3 also runs on **opus** - false negatives drop real findings, so this is the wrong place to economize.
 
@@ -59,6 +63,7 @@ These are defaults; if a project's lens is unusually subtle (e.g. obscure embedd
 ### Default lenses
 
 Each lens prompt must include:
+
 - One-paragraph project summary (language, domain, what the code does).
 - The scope: exact file/directory list the agent must read.
 - The lens's concrete focus (see below).
@@ -138,64 +143,92 @@ Classify each confirmed finding into one of four tiers:
 
 This phase is **classification only**. Do NOT apply any fixes here, do NOT edit `TODO.md` here. Recording happens in the next phase, after the user has seen the proposed plan.
 
-## Phase 5: Report and pause
+## Phase 5: Report and apply tier by tier
 
-Present a single structured report to the user. Layout:
+Don't dump every tier at once. The user shouldn't have to scroll back through a wall of findings to track decisions. Walk through one tier at a time: present, get approval, apply, commit, gate, then move to the next.
+
+### Set up internal tracking
+
+Before presenting anything, use `TaskCreate` to record one task per non-empty tier in the order below. The full finding set lives in those tasks, so you can hold detail internally and surface only the active tier to the user. Mark each tier's task complete as you finish it.
+
+### Tier order
+
+1. **Suggested backlog additions** - lock these in first. A single `TODO.md` append is cheap and ensures nothing is lost if a later code change goes sideways.
+2. **Trivial fixes** - grouped by theme (e.g. "use existing helpers", "drop dead code"), one commit per theme.
+3. **Substantive fixes** - one commit per logical change. Commit message explains the why.
+4. **Needs discussion** - present each as: issue, two options, tradeoff. Apply only if the user gives specific direction.
+
+Skip any tier that has zero items.
+
+### Opening the report
+
+Only on the first non-empty tier, lead with a single summary line:
 
 ```
 ## Audit findings
 
-Ran lens(es) over . raw findings, confirmed by validation.
-
-### Trivial fixes (will apply on go-ahead)
-- file.rs:42 - . Fix: .
-- ...
-
-### Substantive fixes (will apply on go-ahead, separate commits)
-- file.rs:120 - . Fix: .
-- ...
-
-### Needs discussion (no action yet, want your read)
-- file.rs:220 - . Two options: (a)