Skip to main content

Claude Design Pilot — One-Pager

Why now

Across five feedback passes (2026-03-26 → 2026-04-10), Sergey has said the same thing with increasing sharpness: the product reads like a telemetry dashboard; it needs to read like a remediation decision tool. Every correction he has made has been about narrative architecture — not features, not tech. The April 10 AWS demo review made the failure mode concrete: 13 paths surfaced instead of 4 stories, KPI strips instead of "What Happened" + "Am I Exposed?" prose, and a missing middle layer between Overview and Access Chain detail.

Anthropic Labs launched Claude Design on 2026-04-17. Its differentiators relative to Figma are (1) codebase-aware design-system ingestion, (2) Opus 4.7 conversational iteration with comment/slider refinement, and (3) a Claude Code handoff bundle. If those claims hold on a real TypeScript + Tailwind v4 + TanStack Query codebase, it collapses our slowest loop: narrative-layout iteration with the founder.

Hypothesis

Claude Design can produce token-accurate Brief / Strategic Context / access-chain prototypes that Sergey reviews in his comment flow instead of Slack — and the resulting Claude Code handoff lands as a mergeable PR in hours, not days.

Scope — in

  1. Enable Claude Design for the SecurityV0 org; ingest sv0-platform (tokens in ui/src/index.css, components in ui/src/components/, UX rules in UX-GUIDE.md).
  2. Fidelity check: reproduce the current Brief page using only our tokens. Zero arbitrary values (mt-[17px], text-[#abc123]).
  3. Generate prototypes:
    • 3× Brief variants for the AWS demo "Drifted Sensitive Access" cluster, using Sergey's verbatim copy from §5 of the 2026-04-10 feedback (What Happened + Am I Exposed?).
    • 1× Strategic Context panel emphasizing posture compression over telemetry counts.
    • 1× Access-chain narrative header (1–2 sentence summary above the path visualization).
  4. Founder review via Claude Design's inline comments / sliders.
  5. Export winning variant via Claude Code handoff → PR with visual-verification evidence (visual-qa.ts, visual-diff-report.ts --agent-report).

Scope — out

  • Evaluator prioritization fixes (issue #334).
  • Label semantics / llm_egress fan-out (issue #333).
  • Whole-product redesign or replacing Figma org-wide.
  • Copy drafting. Copy remains authored in-repo / in docs; Claude Design is a layout lab, not a CMS.
  • Making Likely language changes (still under founder review).

Success criteria — all must hold

  1. Token fidelity. Fidelity-check output uses only tokens defined in ui/src/index.css; no arbitrary Tailwind values.
  2. Founder signal. Sergey approves ≥1 Brief variant's shape ("ship this") or leaves actionable inline feedback that round-trips cleanly through the tool.
  3. Handoff quality. Claude Code handoff → reviewer-landable PR in ≤4 hours of manual cleanup.
  4. Visual gate. visual-diff-report.ts --agent-report returns PASS on affected pages; no regressions elsewhere.

Failure criteria — any kills the pilot

  • Tokens ignored; output invents colors, spacing, or typography.
  • Handoff requires >8 hours of normalization to fit TanStack Query + React Router v7 + Tailwind v4 patterns.
  • Sergey rejects all variants with structural feedback that does not round-trip through the tool.

Timeline — 1 engineer, 1 week

DayDeliverable
1Org access enabled. Codebase ingested. Fidelity-check screenshot vs. current Brief.
2–3Five prototypes generated (3 Brief + 1 Strategic Context + 1 Access Chain header).
4Internal review → Sergey review in Claude Design.
5Winning variant → Claude Code handoff → PR with visual diff report. Verdict filled below.

Guardrails

  • No production paths redesigned speculatively. Only the three surfaces above.
  • UX-GUIDE is the source of truth. Any prompt to Claude Design must cite UX-GUIDE.md §3 (Four Questions) and §4 (Three-Zone Model). If the tool's output violates either, that's a failure signal, not a design opinion.
  • No copy invention. Narrative copy uses Sergey's verbatim text from the 2026-04-10 feedback. If a variant needs new copy, the human writes it, not the tool.
  • Determinism boundary. Claude Design may draft; humans approve. No generated copy ships without review.
  • Memory hygiene. Pilot outputs and project state live in Claude Design only during the pilot. Winning patterns are codified back into UX-GUIDE.md and the component library.

Exit artifacts

  1. This spec with the Verdict section (below) filled in.
  2. PR with the winning variant — or a short write-up on why no variant won.
  3. If Go: a 3-line update to UX-GUIDE.md stating when to reach for Claude Design vs. hand-coding.
  4. If No-Go: a memory entry capturing the specific failure mode so we don't re-pilot on false hope.
  • Tracking issue: SecurityV0/sv0-platform#410.
  • Platform issues addressed in parallel (out of scope here): #313 (Brief narrative structure), #334 (overview story ranking), #333 (llm_egress fan-out).
  • UX canon: sv0-platform/UX-GUIDE.md — sections 3 (Four Questions), 4 (Three-Zone Model), 5.3 (Detail Page).
  • Feedback trail: 2026-03-26, 2026-03-31, 2026-04-03, 2026-04-07, 2026-04-10.

Verdict — fill at pilot end

  • Go / No-Go / Scoped-continue
  • Token-fidelity result:
  • Founder-signal result:
  • Handoff-quality result (hours of cleanup):
  • Visual-gate result:
  • Notes / follow-ups: