Claude Design Pilot — One-Pager

Why now

Across five feedback passes (2026-03-26 → 2026-04-10), Sergey has said the same thing with increasing sharpness: the product reads like a telemetry dashboard; it needs to read like a remediation decision tool. Every correction he has made has been about narrative architecture — not features, not tech. The April 10 AWS demo review made the failure mode concrete: 13 paths surfaced instead of 4 stories, KPI strips instead of "What Happened" + "Am I Exposed?" prose, and a missing middle layer between Overview and Access Chain detail.

Anthropic Labs launched Claude Design on 2026-04-17. Its differentiators relative to Figma are (1) codebase-aware design-system ingestion, (2) Opus 4.7 conversational iteration with comment/slider refinement, and (3) a Claude Code handoff bundle. If those claims hold on a real TypeScript + Tailwind v4 + TanStack Query codebase, it collapses our slowest loop: narrative-layout iteration with the founder.

Hypothesis

Claude Design can produce token-accurate Brief / Strategic Context / access-chain prototypes that Sergey reviews in his comment flow instead of Slack — and the resulting Claude Code handoff lands as a mergeable PR in hours, not days.

Scope — in

Enable Claude Design for the SecurityV0 org; ingest sv0-platform (tokens in ui/src/index.css, components in ui/src/components/, UX rules in UX-GUIDE.md).
Fidelity check: reproduce the current Brief page using only our tokens. Zero arbitrary values (mt-[17px], text-[#abc123]).
Generate prototypes:
- 3× Brief variants for the AWS demo "Drifted Sensitive Access" cluster, using Sergey's verbatim copy from §5 of the 2026-04-10 feedback (What Happened + Am I Exposed?).
- 1× Strategic Context panel emphasizing posture compression over telemetry counts.
- 1× Access-chain narrative header (1–2 sentence summary above the path visualization).
Founder review via Claude Design's inline comments / sliders.
Export winning variant via Claude Code handoff → PR with visual-verification evidence (visual-qa.ts, visual-diff-report.ts --agent-report).

Scope — out

Evaluator prioritization fixes (issue #334).
Label semantics / llm_egress fan-out (issue #333).
Whole-product redesign or replacing Figma org-wide.
Copy drafting. Copy remains authored in-repo / in docs; Claude Design is a layout lab, not a CMS.
Making Likely language changes (still under founder review).

Success criteria — all must hold

Token fidelity. Fidelity-check output uses only tokens defined in ui/src/index.css; no arbitrary Tailwind values.
Founder signal. Sergey approves ≥1 Brief variant's shape ("ship this") or leaves actionable inline feedback that round-trips cleanly through the tool.
Handoff quality. Claude Code handoff → reviewer-landable PR in ≤4 hours of manual cleanup.
Visual gate. visual-diff-report.ts --agent-report returns PASS on affected pages; no regressions elsewhere.

Failure criteria — any kills the pilot

Tokens ignored; output invents colors, spacing, or typography.
Handoff requires >8 hours of normalization to fit TanStack Query + React Router v7 + Tailwind v4 patterns.
Sergey rejects all variants with structural feedback that does not round-trip through the tool.

Timeline — 1 engineer, 1 week

Day	Deliverable
1	Org access enabled. Codebase ingested. Fidelity-check screenshot vs. current Brief.
2–3	Five prototypes generated (3 Brief + 1 Strategic Context + 1 Access Chain header).
4	Internal review → Sergey review in Claude Design.
5	Winning variant → Claude Code handoff → PR with visual diff report. Verdict filled below.

Guardrails

No production paths redesigned speculatively. Only the three surfaces above.
UX-GUIDE is the source of truth. Any prompt to Claude Design must cite UX-GUIDE.md §3 (Four Questions) and §4 (Three-Zone Model). If the tool's output violates either, that's a failure signal, not a design opinion.
No copy invention. Narrative copy uses Sergey's verbatim text from the 2026-04-10 feedback. If a variant needs new copy, the human writes it, not the tool.
Determinism boundary. Claude Design may draft; humans approve. No generated copy ships without review.
Memory hygiene. Pilot outputs and project state live in Claude Design only during the pilot. Winning patterns are codified back into UX-GUIDE.md and the component library.

Exit artifacts

This spec with the Verdict section (below) filled in.
PR with the winning variant — or a short write-up on why no variant won.
If Go: a 3-line update to UX-GUIDE.md stating when to reach for Claude Design vs. hand-coding.
If No-Go: a memory entry capturing the specific failure mode so we don't re-pilot on false hope.

Tracking issue: SecurityV0/sv0-platform#410.
Platform issues addressed in parallel (out of scope here): #313 (Brief narrative structure), #334 (overview story ranking), #333 (llm_egress fan-out).
UX canon: sv0-platform/UX-GUIDE.md — sections 3 (Four Questions), 4 (Three-Zone Model), 5.3 (Detail Page).
Feedback trail: 2026-03-26, 2026-03-31, 2026-04-03, 2026-04-07, 2026-04-10.

Why now​

Hypothesis​

Scope — in​

Scope — out​

Success criteria — all must hold​

Failure criteria — any kills the pilot​

Timeline — 1 engineer, 1 week​

Guardrails​

Exit artifacts​

Related​

Verdict — fill at pilot end​