Multi-Perspective Platform Review — Round 2, March 2026
Executive Summary
SecurityV0's second MPAS-7 review cycle — the first with actual visual input — reveals a platform that improved in depth but regressed in surface quality. The evidence engine retains distinct capabilities (SHA256 integrity-hashed evidence packs, named departed owners, deterministic explanations). New pages (Data Domains, Execution Chains) are strong additions. Impact scores were correctly removed per Sergey's decision (PR #89). The CRITICAL security finding (JWT verification) is resolved by production auth configuration.
But two broken pages are now the #1 issue. The scope_drift_sensitive cluster and the Exposure Detail page return errors that any demo stakeholder will encounter within 3-4 clicks. The four Phase 0 blockers from the consolidated action plan — remediation naming specific objects, authority path role visibility, breadcrumb hash IDs, and compliance mapping — remain substantially unresolved.
Net result: 1 of 7 MPAS-7 targets met (Security Auditor). The platform is ready for internal demo, not for design partner demo.
What Changed Between Rounds
| Dimension | Round 1 (Mar 15) | Round 2 (Mar 19) | Change |
|---|---|---|---|
| Review method | Code/API analysis only | 35 visual screenshots + code audit | First visual review |
| Impact scores | Inverted/misleading (CRITICAL bug) | Removed entirely (PR #89) | Fixed |
| Path detail | Text-only path description | Visual execution diagram + risk badges | Major improvement |
| Remediation | Generic ("Restrict LLM endpoint access") | Partially named ("svc-foundry-ascribe-prod > LLM Egress") | Partial improvement |
| New pages | N/A | Data Domains, Execution Chains, Identities enriched | Strong additions |
| Finding triage | No workflow actions | Acknowledge + Mark False Positive buttons | New capability |
| Broken pages | 0 | 2 (scope_drift cluster, Exposure Detail) | Regression |
| Security CRITICAL | 1 (JWT not verified) | 0 (production uses API key auth) | Target met |
| Cluster detail structure | Sections A-D Exposure Brief | Flat table (per UX Critic) | Regression |
MPAS-7 Scores
| Role | Round 1 (Mar 15) | Round 2 (Mar 19) | Target | Delta | Met? |
|---|---|---|---|---|---|
| CISO Executive | 70% | 68% | ≥85% | -2% | No |
| SecOps Analyst | 70% | 74% | ≥80% | +4% | No |
| Product QA | 8 partial, 2 missing | 6 partial, 1 missing, 2 diverged | ≤2 partial, 0 missing | Improved | No |
| UX Critic | B- / 23 terms | B / 19 terms | A- / ≤5 terms | +1 grade / -4 terms | No |
| Security Auditor | 1 CRITICAL, 3 HIGH | 0 CRITICAL, 2 HIGH | 0 critical | CRITICAL resolved | Yes |
| Enterprise Executive | 1.8/5 | 2.1/5 | ≥3.5/5 | +0.3 | No |
| CEO Reviewer | 18/28 (64%) | ~19/28 (68%) | ≥24/28 (86%) | +1 item | No |
Why the CISO score went down: The CISO reviewer saw what the code-only review could not — two broken detail pages, hash IDs rendering in breadcrumbs, and the unchanged cluster card hierarchy (path count still dominates over verdict sentence). The visual review was harder on presentation quality than the code review was.
Review Team & Reports
| Reviewer | Verdict | Key Finding | Report |
|---|---|---|---|
| CISO Executive | NEEDS WORK (68%) | "Improvements are below the fold; regressions are at click distance." Broken pages and unchanged stat cards offset path detail improvements. | Full report |
| SecOps Analyst | NEEDS WORK (74%) | New pages (Data Domains, Chains, Identities) improve investigation. "What changed since yesterday" still #1 blocker. Two broken pages are regressions. | Full report |
| Product QA | Internal demo ready | 19 implemented (+2), 6 partial (-2), 1 missing (-1), 2 diverged (+1). Mixed remediation specificity "arguably worse than uniformly generic." | Full report |
| UX Critic | Grade B / 19 terms | Cluster detail lost Sections A-D Exposure Brief — "the most significant regression." Breadcrumbs, orphan pages, jargon unchanged. | Full report |
| Security Auditor | 0 CRITICAL (target met) | JWT reclassified to HIGH (Bearer auth not active). Tenant isolation now enforced. Rate limiting added. New finding: PATCH /findings lacks Zod validation. | Full report |
| Enterprise Executive | 2.1/5 (NEEDS REWRITE) | Compliance mapping absent (single largest score drag). Partner rewrite down from 60-70% to 50-60%. "Two lowest-effort, highest-impact items remain unshipped." | Full report |
| CEO Reviewer | YES WITH CHANGES | Data Domains is strongest new addition. 4 blocking fixes before partner demo: broken cluster, named remediation, breadcrumbs, broken exposure detail. | Full report |
Sergey's 28 Feedback Items — Status After Round 2
Round 1 review tracked 28 inline feedback items from Sergey. Here is their status after Round 2:
| Status | Count (R1) | Count (R2) | Change |
|---|---|---|---|
| DONE | 1 | 3 | +2 (impact scores removed, risk reducer sorted list, operational detail scope control) |
| Partial | 0 | 5 | +5 (some progress on multiple items) |
| Not Done | 18 (ACCEPTED) | 9 | -9 (5 moved to partial, 4 remain accepted-not-started) |
| DEFERRED | 4 | 4 | No change (correctly held) |
| OPEN QUESTION | 5 | 5 | No change (4 pending decisions still open) |
Key Items Resolved
| # | Item | Resolution |
|---|---|---|
| 14 | "Remove scores entirely" | DONE — PR #89 shipped. Impact bars gone. Remediation renders as sorted list. |
| 25 | "Don't push operational details where they don't belong" | DONE — Cluster view stays at cluster level, path details stay in path view. |
| 20 | "Drop effort/cost estimates if unreliable" | DONE — No effort estimates visible anywhere. |
Key Items Still Blocked
| # | Item | Status | Blocker |
|---|---|---|---|
| 1 | "CISO/SI pull data into presentations" | Partial | No export capability |
| 3 | "Show impact of both problem AND solution" | Not done | Remediation still doesn't show business impact of fix |
| 7 | "Day-1 analyst productivity / Wiz-like simplicity" | Partial | Broken pages, no "what changed" filter |
| 9 | "WOW effect without 3 clicks" | Partial | Data Domains is a WOW page; broken cluster is anti-WOW |
| 12 | "Plain English labels instead of ABC grades" | Not done | Execution confidence labels not implemented |
| 18 | "Create Ticket — wire ServiceNow" | Not done | Not started |
| 21 | "Channel repackages on own paper; executive output critical" | Not done | No export, no report generator yet |
Pending Decisions (Unchanged from Round 1)
| # | Decision | Impact |
|---|---|---|
| 1 | Delta badges on Overview — keep, contextualize, or remove? | Still showing +838% type badges |
| 2 | "Authority path" terminology — understood by SIs/CISOs? | Term used unchanged throughout |
| 3 | Global risk ranking on Overview — top-3 absolute risks? | Not implemented |
| 4 | Evidence pack definition — jargon-free one-liner | Not written |
What the Research Tells Us
The AutoResearchClaw study (March 18-19, 23 stages, 50 synthetic NHI scenarios, 7 simulated reviewer personas) produced five findings directly applicable to our forward plan. Full summary: Research Findings Summary.
Finding 1: Fix the data first, UI second
Data completeness fixes (filling empty target_resource, fixing count discrepancies, populating added_roles) are projected to improve technical reviewer acceptance by +11 to +14 points with zero UI changes. Effects are multiplicative — presentation on bad data doesn't stick.
Round 2 validation: The security auditor confirmed that data quality items from Phase 3 (bySeverity/byType page-scoped, identity labels showing raw IDs) remain open. The broken scope_drift_sensitive cluster is itself a data/routing issue. The research's "engine first" ordering is confirmed correct.
Finding 2: Platform and report are different products
Analyst trust requires full evidence and drill-down. Executive confidence requires opinionated verdicts and business language. Serving both from the same screen creates structural conflict. The projected improvement from a two-output architecture: +13 points combined acceptance, partner rewrite dropping from 60-65% to 15-20%.
Round 2 validation: The enterprise executive scored 2.1/5, confirming the gap. The assessment: "the data foundation remains strong; the executive wrapper is still what is missing." The report generator (Phase 4) remains the correct architectural answer.
Finding 3: Opinionated reports outsell rich ones ("legibility inversion")
Non-technical buyers rate single-verdict opinionated reports ~2 points higher on purchase intent than analytically rich formats. Even a collapsed methodology appendix hurts — it signals complexity and shifts the buyer's frame from "expert recommendation" to "complex analysis."
Application: Assessment report template (Phase 4.3) must NOT include a methodology appendix. The cluster verdict sentences — which all 7 reviewers independently validated as the platform's strongest element — are the right pattern. Extend downward.
Finding 4: Implementation order is structurally required
The three improvement categories (data quality, presentation clarity, report generation) interact multiplicatively:
- Engine completeness alone: +9 points
- Architecture split alone: +7 points
- Both together: +19 points (not +16 — interaction adds +3)
Round 2 validation: This matches our observation. Remediation is partially named (presentation improvement), but broken pages (data/routing issue) undermine the improvement. The cluster detail lost its Exposure Brief structure (presentation regression on top of data issues). Phase ordering matters.
Finding 5: NHI governance is a new category — decisions now define norms
NHI governance emerged as a distinct category in 2025 (OWASP NHI Top 10, CSA MAESTRO). SecurityV0 is among the first platforms. The compliance mapping we add now (Phase 1.3 / Phase 4.1) becomes the reference mapping others adopt.
Implication: Compliance mapping isn't just a partner enablement feature — it's a category-defining move. Low effort, high strategic value.
Cross-Cutting Issues (Found by 3+ Reviewers)
These issues were independently flagged by multiple review agents across both rounds:
| Issue | R1 Reviewers | R2 Reviewers | Status |
|---|---|---|---|
| Broken pages (scope_drift cluster, Exposure Detail) | — | All 7 | NEW regression |
| Remediation missing object names | CISO, QA, SecOps, EE | CISO, QA, SecOps, EE, CEO | Partially improved |
| Breadcrumbs show hash IDs | QA, UX | CISO, QA, UX, EE, CEO | Unchanged |
| Compliance mapping absent | EE | EE, CEO | Unchanged |
| No "what changed" filter | SecOps | SecOps, CEO | Unchanged |
| Finding descriptions contain hex IDs | QA, UX | QA, UX, CISO, CEO | Unchanged |
| Export/PDF disabled | CISO, SecOps, EE | CISO, SecOps, EE, CEO | Unchanged |
| Secondary stat cards are inventory, not risk | CISO | CISO, UX | Unchanged |
| Governance label deduplication | QA, UX | — | Not retested (cluster structure changed) |
| Authority path role collapsing | CISO, QA, UX | QA | Partial improvement |
What Improved (Confirmed Across Both Rounds)
| Improvement | Evidence | Which Reviewers Confirmed |
|---|---|---|
| Impact scores removed (PR #89) | No score bars anywhere. Sorted list conveys priority. | All 7 |
| Path detail execution diagram | Visual workload → identity → destination chain with role labels | CISO, SecOps, QA, CEO |
| "Top Risk Reducers" section | Ordered remediation actions with risk condition references | All 7 |
| Data Domains page (new) | 7 domains, 27 resources, sensitivity classifications (restricted/confidential) | SecOps, QA, UX, CEO ("strongest new addition") |
| Execution Chains page (new) | 6 chains with egress, ownership, sensitivity columns | SecOps, QA, UX, CEO |
| Finding triage buttons (new) | "Acknowledge" and "Mark False Positive" on finding detail | SecOps, QA |
| Ownership shows real names | "Maria Lopez" as departed owner, "Not assigned" for current | CISO, QA, SecOps, CEO |
| Tenant isolation enforced | requireTenant middleware on all /api/v1 routes | Security Auditor |
| Rate limiting implemented | Two-tier per-tenant rate limiting | Security Auditor |
| Legacy Dashboard removed | /dashboard no longer exists; RG1-RG5 jargon gone | UX Critic |
| Identities table enriched | "Sensitive Domains" column shows data impact per identity | SecOps, UX, CEO |
What Regressed
| Regression | Severity | How Detected |
|---|---|---|
| scope_drift_sensitive cluster broken: "Risk cluster is disabled" with raw internal key in error | CRITICAL (demo blocker) | Visual review (Round 1 couldn't detect this; code-only) |
| Exposure Detail "Entity not found" — EXP-hash format doesn't resolve to entity ID | CRITICAL (demo blocker) | Visual review — new page, new bug |
| Cluster detail lost Exposure Brief structure — Sections A-D (narrative, governance, remediation) replaced by flat table | HIGH | UX Critic — "most significant regression from Round 1" |
| Finding breadcrumb worse: shows full eval:05d2c303... prefix instead of truncation | MEDIUM | QA, UX |
| "backstack" placeholder text in path detail breadcrumb | LOW | UX Critic |
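The breadcrumb regressions in this table come from rendering raw route segments. The sketch below shows one way the lookup behind a hook such as useBreadcrumbLabel() (named in the action plan) might work: prefer a display name from data the page already fetched, and fall back to a truncated ID. The function name resolveBreadcrumbLabel, the knownLabels map, the ID pattern, and the 8-character truncation are all assumptions for illustration, not the platform's actual code.

```typescript
// Hypothetical helper; names, ID pattern, and truncation length are assumptions.
type LabelLookup = ReadonlyMap<string, string>;

// Matches segments that look like internal IDs: an optional "eval:" or "EXP-"
// prefix followed by 8 or more hex characters.
const RAW_ID_PATTERN = /^(?:eval:|EXP-)?[0-9a-f]{8,}$/i;
const ID_PREFIX_PATTERN = /^(?:eval:|EXP-)/i;

function resolveBreadcrumbLabel(
  segment: string,
  knownLabels: LabelLookup,
  maxIdLength: number = 8,
): string {
  // 1. Prefer a display name from data the page has already fetched.
  const label = knownLabels.get(segment);
  if (label !== undefined && label.trim() !== "") {
    return label;
  }
  // 2. Ordinary route segments ("clusters", "findings") pass through unchanged.
  if (!RAW_ID_PATTERN.test(segment)) {
    return segment;
  }
  // 3. Fallback for unresolved IDs: drop the internal prefix and truncate.
  const bare = segment.replace(ID_PREFIX_PATTERN, "");
  return bare.length > maxIdLength ? bare.slice(0, maxIdLength) + "..." : bare;
}
```

Keeping the resolution in a pure function means a single implementation could serve every route, with the hook only supplying the per-page label map.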
Release Readiness
| Level | Round 1 | Round 2 | Blocker |
|---|---|---|---|
| Internal demo | Ready | Ready | — |
| Design partner demo | Not ready | Not ready | Broken pages, remediation naming, breadcrumbs, navigation orphans |
| Broader pilot | Not ready | Not ready | Also needs export, compliance mapping, "what changed" filter, Create Ticket |
The Path Forward
The consolidated action plan's phase ordering remains correct, validated by both the research (multiplicative effects: engine → presentation → reports) and Round 2 results (presentation on broken data doesn't stick).
Immediate: Fix Demo Blockers (1-2 sessions)
These are P0 — fix before showing to anyone outside the team:
| # | Fix | Effort | Impact |
|---|---|---|---|
| 1 | Fix scope_drift_sensitive cluster — either fix rendering or remove from cluster list. Never expose internal keys in errors. | <1 session | Removes the #1 demo-killing regression |
| 2 | Fix Exposure Detail — resolve EXP-hash → entity ID mapping, or remove clickable links from list. | <1 session | Removes the #2 demo-killing regression |
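For the "never expose internal keys in errors" rule, a minimal sketch of an error-translation step is given here. It assumes errors arrive with a machine-readable code and an optional internal key; the codes cluster_disabled and entity_not_found, the message wording, and the type names are hypothetical and not taken from the codebase.

```typescript
// Hypothetical shapes and codes; the real error model may differ.
interface InternalError {
  code: string;
  internalKey?: string; // e.g. a cluster key; for server-side logs only
}

interface UserFacingError {
  title: string;
  detail: string;
}

const GENERIC_DETAIL =
  "Something went wrong while loading this page. Please try again.";

// Only messages listed here ever reach the UI.
const FRIENDLY_DETAILS = new Map<string, string>([
  ["cluster_disabled", "This risk cluster is not available right now."],
  ["entity_not_found", "This item could not be found."],
]);

function toUserFacingError(err: InternalError): UserFacingError {
  // internalKey is deliberately never read here, so it cannot leak into the UI.
  const detail = FRIENDLY_DETAILS.get(err.code) ?? GENERIC_DETAIL;
  return { title: "Unable to load this view", detail };
}
```

Using an allowlist of messages, rather than stripping keys out of raw error strings, means an unrecognised error degrades to the generic message instead of leaking internals.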
This Sprint: Close the Phase 0 Gap (3-5 sessions)
These are the items Sergey explicitly flagged. They were planned, partially started, and must be completed:
| # | Fix | Effort | Impact | Research Validation |
|---|---|---|---|---|
| 3 | Complete remediation object naming (Phase 0.1) — every action must specify which role, which identity, which system. The entityContext is available; pipe it consistently. | 2-3 sessions | Unblocks partner demos. Resolves the #1 cross-cutting issue across 5 of 7 reviewers. | Engine completeness: projected +11-14pp for technical reviewers |
| 4 | Fix breadcrumbs (Phase 2.3) — implement useBreadcrumbLabel() that resolves IDs to display names using already-fetched page data. One pass covers all routes. | 1 session | Removes the "developer tool" perception that 5 of 7 reviewers flagged. | |
| 5 | Add compliance mapping (Phase 1.3 / 4.1) — static deterministic lookup: orphaned_ownership → OWASP ASI-03, NIST AC-2; scope_drift → ASI-10, NIST AC-6; llm_egress → ASI-02, NIST SC-7. | 1-2 sessions | Enterprise Executive's single largest score drag. Category-defining for NHI governance space. | Research Finding 5: decisions now define category norms |
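The static lookup described in item 5 could be as small as the sketch below. The three condition keys and control IDs are copied from the row above and have not been independently verified against the OWASP or NIST catalogues; the type names and the complianceLabels helper are assumptions for illustration.

```typescript
// Condition keys and control IDs come from the action plan row;
// type and function names are hypothetical.
type RiskCondition = "orphaned_ownership" | "scope_drift" | "llm_egress";

interface ComplianceControls {
  owasp: string;
  nist: string;
}

const COMPLIANCE_MAP: Readonly<Record<RiskCondition, ComplianceControls>> = {
  orphaned_ownership: { owasp: "ASI-03", nist: "AC-2" },
  scope_drift: { owasp: "ASI-10", nist: "AC-6" },
  llm_egress: { owasp: "ASI-02", nist: "SC-7" },
};

// Returns de-duplicated display labels, in input order, for a set of conditions.
function complianceLabels(conditions: readonly RiskCondition[]): string[] {
  const labels: string[] = [];
  for (const condition of conditions) {
    const controls = COMPLIANCE_MAP[condition];
    for (const label of ["OWASP " + controls.owasp, "NIST " + controls.nist]) {
      if (labels.indexOf(label) === -1) {
        labels.push(label);
      }
    }
  }
  return labels;
}
```

Because the map is keyed by a closed union type, adding a new risk condition without a mapping becomes a compile-time error, which keeps the lookup deterministic.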
This Sprint: CISO Clarity (5-7 sessions)
Phase 1 items from the consolidated action plan — still valid, still unstarted:
| # | Fix | Effort |
|---|---|---|
| 6 | Invert visual hierarchy on cluster cards (verdict dominant, path count secondary) | Low |
| 7 | Add execution confidence labels ("Execution Confirmed" / "Standing Authority Only") | Low |
| 8 | Replace secondary stat cards with business metrics (Sensitive Domains Reached, Departed Owners, LLM Endpoints) | Low |
| 9 | Promote highest-risk path + global top-3 on Overview | Low-Medium |
| 10 | Add "What changed since yesterday" filter (?changed_since + "New since last visit" section) | Medium |
| 11 | Add Execution Chains, Findings, and Exposures to sidebar navigation | Low |
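For item 10, a rough sketch of how a ?changed_since value might be applied to a list of records. It assumes the parameter is an ISO 8601 timestamp and that each record carries an updatedAt field; both are assumptions, as are the function and type names.

```typescript
// Hypothetical record shape; updatedAt is assumed to be an ISO 8601 string.
interface TimestampedItem {
  id: string;
  updatedAt: string;
}

// Parses the raw query value; returns null when it is absent or not a valid date.
function parseChangedSince(raw: string | undefined): Date | null {
  if (raw === undefined || raw.trim() === "") {
    return null;
  }
  const parsed = new Date(raw);
  return isNaN(parsed.getTime()) ? null : parsed;
}

// Keeps items updated strictly after `since`; with no cutoff, returns a copy of all items.
function filterChangedSince<T extends TimestampedItem>(
  items: readonly T[],
  since: Date | null,
): T[] {
  if (since === null) {
    return items.slice();
  }
  const cutoff = since.getTime();
  return items.filter((item) => {
    const updated = new Date(item.updatedAt).getTime();
    return !isNaN(updated) && updated > cutoff;
  });
}
```

The same filter could back both the query parameter and a "New since last visit" section, with the cutoff taken from the stored last-visit time instead of the URL.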
Next Sprint: Reports & Partner Deliverables (Phase 4)
The research confirms this is the correct next major investment — but only after data quality and presentation clarity:
| # | Item | Effort | Research Constraint |
|---|---|---|---|
| 12 | Report Service + Store (Phase 4.2) — two API families: full-fidelity (platform) and pre-synthesized (reports) | 4-6 sessions | Report generator must NOT access raw evidence (legibility inversion) |
| 13 | Assessment Report template (Phase 4.3) — 5 sections, NO methodology appendix | 2-3 sessions | Collapsed appendix hurts purchase intent by ~0.4 points |
| 14 | Scan Digest (Phase 4.3) — 1-page post-scan summary | 1-2 sessions | — |
| 15 | Business glossary (src/lib/business-glossary.ts) — used ONLY in report rendering | 1 session | "Service principal" → "automated account" only in reports, not platform |
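A minimal sketch of what the report-only glossary in item 15 might look like. The single entry comes from the table above; the data structure, the toBusinessLanguage helper, and the capitalisation handling are assumptions rather than the planned design of src/lib/business-glossary.ts.

```typescript
// Glossary entries: [platform term, report term]. Only the first pair is from
// the action plan; further entries would be added by the team.
const BUSINESS_GLOSSARY: ReadonlyArray<readonly [string, string]> = [
  ["service principal", "automated account"],
];

// Escapes characters that have special meaning in a regular expression.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Rewrites platform terms into business language. Intended for report
// rendering only; platform screens would keep the technical terms.
function toBusinessLanguage(text: string): string {
  let result = text;
  for (const [term, replacement] of BUSINESS_GLOSSARY) {
    const pattern = new RegExp("\\b" + escapeRegExp(term) + "\\b", "gi");
    result = result.replace(pattern, (match: string) =>
      // Preserve a leading capital so sentence starts still read correctly.
      /^[A-Z]/.test(match)
        ? replacement.charAt(0).toUpperCase() + replacement.slice(1)
        : replacement,
    );
  }
  return result;
}
```

Whole-word matching means plurals such as "service principals" are not rewritten in this sketch; a real glossary would need explicit entries or stemming for those.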
Deferred (Correctly Held)
- Posture trend chart — research first (Sergey #28)
- Create Ticket / ServiceNow — accepted but lower priority than export
- Ownership inheritance logic — needs design
- Cross-source divergence scoring — requires 4+ connectors (research Finding 4)
Projected Scores After Fixes
If items 1-11 are completed:
| Role | Current | After Fixes | Target | Gap |
|---|---|---|---|---|
| CISO | 68% | ~82-85% | ≥85% | At or near target |
| SecOps | 74% | ~80-82% | ≥80% | Met |
| Product QA | 6P/1M/2D | ~2P/0M/0D | ≤2P/0M | Met |
| UX Critic | B/19 | ~B+/10-12 | A-/≤5 | Jargon requires terminology standardization (Phase 5.6) |
| Security Auditor | 0 CRITICAL | 0 CRITICAL | 0 CRITICAL | Already met |
| Enterprise Exec | 2.1/5 | ~3.0-3.2/5 | ≥3.5/5 | Report generator (Phase 4) needed for final gap |
| CEO | 19/28 | ~23-24/28 | ≥24/28 | At or near target |
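The research projects that completing Phase 4 (reports) on top of items 1-11 would push Enterprise Executive to 3.5-4.0/5 and CEO to 25-26/28. That would meet all 7 targets only if the UX Critic jargon gap is also closed through terminology standardization (Phase 5.6).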
What to Demo Today
The product's strongest demo path (updated from Round 1):
- Overview → 769 executions, 29 authority paths, 19 with invalid ownership. Four risk cluster cards with verdict sentences.
- Data Domains → 7 domains, 27 resources with sensitivity labels. "These are the sensitive systems your automated accounts can reach." This is the new WOW page.
- Click "Orphaned + Sensitive" cluster → Verdict sentence + 13-path authority table with data domains, sensitivity, egress badges.
- Click into Agent Ascribe_Summarizer path → Visual execution diagram (workload → identity → destination), risk condition tiles (Scope drift, Invalid owner, Sensitive data, LLM egress), "Top Risk Reducers" sorted list.
- Scroll to Ownership → "Maria Lopez" (departed), "Not assigned." Named accountability.
- Execution Chains → 6 chains showing cross-system risk at workload level.
What to avoid: Exposure Detail (broken), scope_drift_sensitive cluster (broken), Finding Detail breadcrumbs (hash IDs), Temporal Comparison (empty), Graph Explorer (overwhelming for non-analysts).
Review Process Notes
What Was Different About Round 2
This was the first review cycle under the hardened review process:
- Visual snapshots captured before review: 35 screenshots (25 pages + 10 entity detail captures) from the running platform, stored in sv0-intelligence/store/snapshots/2026-03-19-demo-w1/.
- All 7 agents received screenshot context: agents evaluated the rendered UI, not just source code and API responses. This caught regressions invisible to code review (broken pages, hash IDs in breadcrumbs, lost Exposure Brief structure).
- MPAS-7 scores tracked — first baseline in the structured scoring format.
- Parallel execution — all 7 reviewers ran simultaneously instead of sequentially.
What the Visual Review Revealed That Code Review Missed
The CISO score dropped from 70% to 68% precisely because the visual review detected presentation issues invisible to code analysis:
- Two broken pages (routing/data issues that only manifest when rendered)
- Hash IDs in breadcrumbs (the eval: prefix is particularly bad: a 40-character string in the nav bar)
- Cluster card visual hierarchy (path count visually dominates verdict sentence; a CSS issue, not a data issue)
- "backstack" placeholder text in path detail breadcrumb
- Lost Exposure Brief structure (replaced by flat table — visible only in rendered output)
This validates the hardened process: visual review is harder but more honest than code review.
Appendix: Research Artifacts
| Document | Purpose |
|---|---|
| Research Findings Summary | CEO-readable summary of the AutoResearchClaw study — what was found and what we're using |
| Full Research Paper | 23-stage academic paper — hypotheses, literature synthesis, experiment design |
| FACET Paper | Full FACET framework paper |
| Acceptance Validation Research Brief | Proposed next research cycle — simulation-based validation of 18 proposed changes |
| Consolidated Action Plan | Single source of truth for what to build next (incorporates research findings) |
| Sergey Feedback Tracker | All 28 feedback items with implementation status |
Review Caveats
Visual-only limitation: In CLI mode (used for this review), screenshots are listed in the prompt but not sent as base64 images. However, each sub-agent independently loaded and viewed the screenshot files using the Read tool, achieving true visual review. The had_visual_input field in run.json reflects this.
Broken pages may be a seed data issue: The scope_drift_sensitive cluster error and Exposure Detail "Entity not found" may be specific to the demo-w1 tenant seed data rather than code bugs. Either way, they must be fixed, because the demo environment IS the product for evaluation purposes.
UX Critic's Exposure Brief regression: The UX Critic observed that cluster detail pages lost the Sections A-D structure. This may reflect a UI refactor that traded narrative synthesis for data density. Whether this is a regression or an intentional change needs clarification — the cluster detail table is richer than Round 1, but the CISO narrative layer is gone.
Score projections are estimates: The "Projected Scores After Fixes" section uses directional estimates informed by research projections and reviewer feedback, not empirical measurements. Actual scores will be validated in Round 3.