
Multi-Perspective Platform Review — Round 2, March 2026


Executive Summary

SecurityV0's second MPAS-7 review cycle, the first with actual visual input, reveals a platform that improved in depth but regressed in surface quality. The evidence engine retains distinct capabilities (SHA256 integrity-hashed evidence packs, named departed owners, deterministic explanations). New pages (Data Domains, Execution Chains) are strong additions. Impact scores were correctly removed per Sergey's decision (PR #89). The CRITICAL security finding (JWT not verified) no longer applies at CRITICAL severity: production uses API key auth, so Bearer auth is not active, and the Security Auditor reclassified the finding to HIGH.

But two broken pages are now the #1 issue. The scope_drift_sensitive cluster and the Exposure Detail page return errors that any demo stakeholder will encounter within 3-4 clicks. The four Phase 0 blockers from the consolidated action plan — remediation naming specific objects, authority path role visibility, breadcrumb hash IDs, and compliance mapping — remain substantially unresolved.

Net result: 1 of 7 MPAS-7 targets met (Security Auditor). The platform is ready for internal demo, not for design partner demo.

What Changed Between Rounds

| Dimension | Round 1 (Mar 15) | Round 2 (Mar 19) | Change |
| --- | --- | --- | --- |
| Review method | Code/API analysis only | 35 visual screenshots + code audit | First visual review |
| Impact scores | Inverted/misleading (CRITICAL bug) | Removed entirely (PR #89) | Fixed |
| Path detail | Text-only path description | Visual execution diagram + risk badges | Major improvement |
| Remediation | Generic ("Restrict LLM endpoint access") | Partially named ("svc-foundry-ascribe-prod > LLM Egress") | Partial improvement |
| New pages | N/A | Data Domains, Execution Chains, Identities enriched | Strong additions |
| Finding triage | No workflow actions | Acknowledge + Mark False Positive buttons | New capability |
| Broken pages | 0 | 2 (scope_drift cluster, Exposure Detail) | Regression |
| Security CRITICAL | 1 (JWT not verified) | 0 (production uses API key auth) | Target met |
| Cluster detail structure | Sections A-D Exposure Brief | Flat table (per UX Critic) | Regression |

MPAS-7 Scores

| Role | Round 1 (Mar 15) | Round 2 (Mar 19) | Target | Delta | Met? |
| --- | --- | --- | --- | --- | --- |
| CISO Executive | 70% | 68% | ≥85% | -2% | No |
| SecOps Analyst | 70% | 74% | ≥80% | +4% | No |
| Product QA | 8 partial, 2 missing | 6 partial, 1 missing, 2 diverged | ≤2 partial, 0 missing | Improved | No |
| UX Critic | B- / 23 terms | B / 19 terms | A- / ≤5 terms | +1 grade / -4 terms | No |
| Security Auditor | 1 CRITICAL, 3 HIGH | 0 CRITICAL, 2 HIGH | 0 critical | CRITICAL resolved | Yes |
| Enterprise Executive | 1.8/5 | 2.1/5 | ≥3.5/5 | +0.3 | No |
| CEO Reviewer | 18/28 (64%) | ~19/28 (68%) | ≥24/28 (86%) | +1 item | No |

Why the CISO score went down: The CISO reviewer saw what the code-only review could not — two broken detail pages, hash IDs rendering in breadcrumbs, and the unchanged cluster card hierarchy (path count still dominates over verdict sentence). The visual review was harder on presentation quality than the code review was.


Review Team & Reports

| Reviewer | Verdict | Key Finding | Report |
| --- | --- | --- | --- |
| CISO Executive | NEEDS WORK (68%) | "Improvements are below the fold; regressions are at click distance." Broken pages and unchanged stat cards offset path detail improvements. | Full report |
| SecOps Analyst | NEEDS WORK (74%) | New pages (Data Domains, Chains, Identities) improve investigation. "What changed since yesterday" still #1 blocker. Two broken pages are regressions. | Full report |
| Product QA | Internal demo ready | 19 implemented (+2), 6 partial (-2), 1 missing (-1), 2 diverged (+1). Mixed remediation specificity "arguably worse than uniformly generic." | Full report |
| UX Critic | Grade B / 19 terms | Cluster detail lost Sections A-D Exposure Brief: "the most significant regression." Breadcrumbs, orphan pages, jargon unchanged. | Full report |
| Security Auditor | 0 CRITICAL (target met) | JWT reclassified to HIGH (Bearer auth not active). Tenant isolation now enforced. Rate limiting added. New finding: PATCH /findings lacks Zod validation. | Full report |
| Enterprise Executive | 2.1/5 (NEEDS REWRITE) | Compliance mapping absent (single largest score drag). Partner rewrite down from 60-70% to 50-60%. "Two lowest-effort, highest-impact items remain unshipped." | Full report |
| CEO Reviewer | YES WITH CHANGES | Data Domains is strongest new addition. 4 blocking fixes before partner demo: broken cluster, named remediation, breadcrumbs, broken exposure detail. | Full report |

Sergey's 28 Feedback Items — Status After Round 2

Round 1 review tracked 28 inline feedback items from Sergey. Here is their status after Round 2:

| Status | Count (R1) | Count (R2) | Change |
| --- | --- | --- | --- |
| DONE | 1 | 3 | +2 (impact scores removed, risk reducer sorted list, operational detail scope control) |
| Partial | 0 | 5 | +5 (some progress on multiple items) |
| Not Done | 18 (ACCEPTED) | 9 | -9 (5 moved to partial, 4 remain accepted-not-started) |
| DEFERRED | 4 | 4 | No change (correctly held) |
| OPEN QUESTION | 5 | 5 | No change (4 pending decisions still open) |

Key Items Resolved

| # | Item | Resolution |
| --- | --- | --- |
| 14 | "Remove scores entirely" | DONE: PR #89 shipped. Impact bars gone. Remediation renders as sorted list. |
| 25 | "Don't push operational details where they don't belong" | DONE: Cluster view stays at cluster level, path details stay in path view. |
| 20 | "Drop effort/cost estimates if unreliable" | DONE: No effort estimates visible anywhere. |

Key Items Still Blocked

| # | Item | Status | Blocker |
| --- | --- | --- | --- |
| 1 | "CISO/SI pull data into presentations" | Partial | No export capability |
| 3 | "Show impact of both problem AND solution" | Not done | Remediation still doesn't show business impact of fix |
| 7 | "Day-1 analyst productivity / Wiz-like simplicity" | Partial | Broken pages, no "what changed" filter |
| 9 | "WOW effect without 3 clicks" | Partial | Data Domains is a WOW page; broken cluster is anti-WOW |
| 12 | "Plain English labels instead of ABC grades" | Not done | Execution confidence labels not implemented |
| 18 | "Create Ticket: wire ServiceNow" | Not done | Not started |
| 21 | "Channel repackages on own paper; executive output critical" | Not done | No export, no report generator yet |

Pending Decisions (Unchanged from Round 1)

| # | Decision | Impact |
| --- | --- | --- |
| 1 | Delta badges on Overview: keep, contextualize, or remove? | Still showing +838% type badges |
| 2 | "Authority path" terminology: understood by SIs/CISOs? | Term used unchanged throughout |
| 3 | Global risk ranking on Overview: top-3 absolute risks? | Not implemented |
| 4 | Evidence pack definition: jargon-free one-liner | Not written |

What the Research Tells Us

The AutoResearchClaw study (March 18-19, 23 stages, 50 synthetic NHI scenarios, 7 simulated reviewer personas) produced five findings directly applicable to our forward plan. Full summary: Research Findings Summary.

Finding 1: Fix the data first, UI second

Data completeness fixes (filling empty target_resource, fixing count discrepancies, populating added_roles) are projected to improve technical reviewer acceptance by +11 to +14 points with zero UI changes. Effects are multiplicative — presentation on bad data doesn't stick.

Round 2 validation: The security auditor confirmed that data quality items from Phase 3 (bySeverity/byType page-scoped, identity labels showing raw IDs) remain open. The broken scope_drift_sensitive cluster is itself a data/routing issue. The research's "engine first" ordering is confirmed correct.

Finding 2: Platform and report are different products

Analyst trust requires full evidence and drill-down. Executive confidence requires opinionated verdicts and business language. Serving both from the same screen creates structural conflict. The projected improvement from a two-output architecture: +13 points combined acceptance, partner rewrite dropping from 60-65% to 15-20%.

Round 2 validation: The enterprise executive scored 2.1/5, confirming the gap. The assessment: "the data foundation remains strong; the executive wrapper is still what is missing." The report generator (Phase 4) remains the correct architectural answer.

Finding 3: Opinionated reports outsell rich ones ("legibility inversion")

Non-technical buyers rate single-verdict opinionated reports ~2 points higher on purchase intent than analytically rich formats. Even a collapsed methodology appendix hurts — it signals complexity and shifts the buyer's frame from "expert recommendation" to "complex analysis."

Application: Assessment report template (Phase 4.3) must NOT include a methodology appendix. The cluster verdict sentences — which all 7 reviewers independently validated as the platform's strongest element — are the right pattern. Extend downward.

Finding 4: Implementation order is structurally required

The three improvement categories (data quality, presentation clarity, report generation) interact multiplicatively:

  • Engine completeness alone: +9 points
  • Architecture split alone: +7 points
  • Both together: +19 points (not +16 — interaction adds +3)

Round 2 validation: This matches our observation. Remediation is partially named (presentation improvement), but broken pages (data/routing issue) undermine the improvement. The cluster detail lost its Exposure Brief structure (presentation regression on top of data issues). Phase ordering matters.

Finding 5: NHI governance is a new category — decisions now define norms

NHI governance emerged as a distinct category in 2025 (OWASP NHI Top 10, CSA MAESTRO). SecurityV0 is among the first platforms. The compliance mapping we add now (Phase 1.3 / Phase 4.1) becomes the reference mapping others adopt.

Implication: Compliance mapping isn't just a partner enablement feature — it's a category-defining move. Low effort, high strategic value.


Cross-Cutting Issues (Found by 3+ Reviewers)

These issues were independently flagged by multiple review agents across both rounds:

| Issue | R1 Reviewers | R2 Reviewers | Status |
| --- | --- | --- | --- |
| Broken pages (scope_drift cluster, Exposure Detail) | | All 7 | NEW regression |
| Remediation missing object names | CISO, QA, SecOps, EE | CISO, QA, SecOps, EE, CEO | Partially improved |
| Breadcrumbs show hash IDs | QA, UX | CISO, QA, UX, EE, CEO | Unchanged |
| Compliance mapping absent | EE | EE, CEO | Unchanged |
| No "what changed" filter | SecOps | SecOps, CEO | Unchanged |
| Finding descriptions contain hex IDs | QA, UX | QA, UX, CISO, CEO | Unchanged |
| Export/PDF disabled | CISO, SecOps, EE | CISO, SecOps, EE, CEO | Unchanged |
| Secondary stat cards are inventory, not risk | CISO | CISO, UX | Unchanged |
| Governance label deduplication | QA, UX | | Not retested (cluster structure changed) |
| Authority path role collapsing | CISO, QA, UX | QA | Partial improvement |

What Improved (Confirmed Across Both Rounds)

| Improvement | Evidence | Which Reviewers Confirmed |
| --- | --- | --- |
| Impact scores removed (PR #89) | No score bars anywhere. Sorted list conveys priority. | All 7 |
| Path detail execution diagram | Visual workload → identity → destination chain with role labels | CISO, SecOps, QA, CEO |
| "Top Risk Reducers" section | Ordered remediation actions with risk condition references | All 7 |
| Data Domains page (new) | 7 domains, 27 resources, sensitivity classifications (restricted/confidential) | SecOps, QA, UX, CEO ("strongest new addition") |
| Execution Chains page (new) | 6 chains with egress, ownership, sensitivity columns | SecOps, QA, UX, CEO |
| Finding triage buttons (new) | "Acknowledge" and "Mark False Positive" on finding detail | SecOps, QA |
| Ownership shows real names | "Maria Lopez" as departed owner, "Not assigned" for current | CISO, QA, SecOps, CEO |
| Tenant isolation enforced | requireTenant middleware on all /api/v1 routes | Security Auditor |
| Rate limiting implemented | Two-tier per-tenant rate limiting | Security Auditor |
| Legacy Dashboard removed | /dashboard no longer exists; RG1-RG5 jargon gone | UX Critic |
| Identities table enriched | "Sensitive Domains" column shows data impact per identity | SecOps, UX, CEO |

What Regressed

| Regression | Severity | How Detected |
| --- | --- | --- |
| scope_drift_sensitive cluster broken: "Risk cluster is disabled" with raw internal key in error | CRITICAL (demo blocker) | Visual review. Round 1 couldn't detect this (code-only) |
| Exposure Detail "Entity not found": EXP-hash format doesn't resolve to entity ID | CRITICAL (demo blocker) | Visual review. New page, new bug |
| Cluster detail lost Exposure Brief structure: Sections A-D (narrative, governance, remediation) replaced by flat table | HIGH | UX Critic: "most significant regression from Round 1" |
| Finding breadcrumb worse: shows full eval:05d2c303... prefix instead of truncation | MEDIUM | QA, UX |
| "backstack" placeholder text in path detail breadcrumb | LOW | UX Critic |

Release Readiness

| Level | Round 1 | Round 2 | Blocker |
| --- | --- | --- | --- |
| Internal demo | Ready | Ready | |
| Design partner demo | Not ready | Not ready | Broken pages, remediation naming, breadcrumbs, navigation orphans |
| Broader pilot | Not ready | Not ready | Also needs export, compliance mapping, "what changed" filter, Create Ticket |

The Path Forward

The consolidated action plan's phase ordering remains correct, validated by both the research (multiplicative effects: engine → presentation → reports) and Round 2 results (presentation on broken data doesn't stick).

Immediate: Fix Demo Blockers (1-2 sessions)

These are P0 — fix before showing to anyone outside the team:

| # | Fix | Effort | Impact |
| --- | --- | --- | --- |
| 1 | Fix scope_drift_sensitive cluster: either fix rendering or remove from cluster list. Never expose internal keys in errors. | <1 session | Removes the #1 demo-killing regression |
| 2 | Fix Exposure Detail: resolve EXP-hash → entity ID mapping, or remove clickable links from list. | <1 session | Removes the #2 demo-killing regression |
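
The "never expose internal keys in errors" rule in item 1 could be enforced at a single translation point. The sketch below is a minimal illustration, assuming a hypothetical `InternalError` shape, hypothetical error codes, and hypothetical message wording; none of these names come from the SecurityV0 codebase.

```typescript
// Hypothetical error shape and messages; the platform's real error handling is not shown in this report.
interface InternalError {
  code: "CLUSTER_DISABLED" | "ENTITY_NOT_FOUND" | "UNKNOWN";
  internalKey?: string; // e.g. a raw cluster key; must never reach the UI
}

// One user-facing message per error code, with no internal identifiers.
const USER_MESSAGES: Record<InternalError["code"], string> = {
  CLUSTER_DISABLED: "This risk cluster is temporarily unavailable.",
  ENTITY_NOT_FOUND: "We could not find this item. It may have been removed.",
  UNKNOWN: "Something went wrong. Please try again.",
};

// Splits an internal error into a message that is safe to render and a
// detail string intended for server-side logs only.
function toUserFacingError(err: InternalError): { message: string; logDetail: string } {
  return {
    message: USER_MESSAGES[err.code],
    logDetail: `${err.code}:${err.internalKey ?? "n/a"}`,
  };
}

// Usage: the internal key appears in the log detail but not in the rendered message.
const shown = toUserFacingError({ code: "CLUSTER_DISABLED", internalKey: "scope_drift_sensitive" });
```

Keeping the mapping in one place means a new internal error code fails type-checking until a user-facing message is written for it.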

This Sprint: Close the Phase 0 Gap (3-5 sessions)

These are the items Sergey explicitly flagged. They were planned, partially started, and must be completed:

| # | Fix | Effort | Impact | Research Validation |
| --- | --- | --- | --- | --- |
| 3 | Complete remediation object naming (Phase 0.1): every action must specify which role, which identity, which system. The entityContext is available; pipe it consistently. | 2-3 sessions | Unblocks partner demos. Resolves the #1 cross-cutting issue across 5 of 7 reviewers. | Engine completeness: projected +11-14pp for technical reviewers |
| 4 | Fix breadcrumbs (Phase 2.3): implement useBreadcrumbLabel() that resolves IDs to display names using already-fetched page data. One pass covers all routes. | 1 session | Removes the "developer tool" perception that 5 of 7 reviewers flagged. | |
| 5 | Add compliance mapping (Phase 1.3 / 4.1): static deterministic lookup: orphaned_ownership → OWASP ASI-03, NIST AC-2; scope_drift → ASI-10, NIST AC-6; llm_egress → ASI-02, NIST SC-7. | 1-2 sessions | Enterprise Executive's single largest score drag. Category-defining for NHI governance space. | Research Finding 5: decisions now define category norms |
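
The static lookup in item 5 could be as small as one typed table. The sketch below is illustrative only: the names `RiskType`, `ComplianceRef`, `COMPLIANCE_MAP`, and `complianceFor` are hypothetical, and the only values included are the three mappings listed in item 5, copied as written.

```typescript
// Hypothetical names throughout; only the mapping values come from item 5.
type RiskType = "orphaned_ownership" | "scope_drift" | "llm_egress";

interface ComplianceRef {
  owasp: string; // OWASP identifier as written in the action plan
  nist: string;  // NIST control identifier as written in the action plan
}

// Static, deterministic lookup: the same risk type always yields the same references.
const COMPLIANCE_MAP: Readonly<Record<RiskType, ComplianceRef>> = {
  orphaned_ownership: { owasp: "ASI-03", nist: "AC-2" },
  scope_drift: { owasp: "ASI-10", nist: "AC-6" },
  llm_egress: { owasp: "ASI-02", nist: "SC-7" },
};

// Returns undefined for unmapped risk types so the UI can omit the badge
// rather than display a guessed control.
function complianceFor(riskType: string): ComplianceRef | undefined {
  return Object.prototype.hasOwnProperty.call(COMPLIANCE_MAP, riskType)
    ? COMPLIANCE_MAP[riskType as RiskType]
    : undefined;
}
```

Because the table is typed by `RiskType`, adding a new risk type without a mapping would fail at compile time, which keeps the lookup complete as the engine grows.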

This Sprint: CISO Clarity (5-7 sessions)

Phase 1 items from the consolidated action plan — still valid, still unstarted:

| # | Fix | Effort |
| --- | --- | --- |
| 6 | Invert visual hierarchy on cluster cards (verdict dominant, path count secondary) | Low |
| 7 | Add execution confidence labels ("Execution Confirmed" / "Standing Authority Only") | Low |
| 8 | Replace secondary stat cards with business metrics (Sensitive Domains Reached, Departed Owners, LLM Endpoints) | Low |
| 9 | Promote highest-risk path + global top-3 on Overview | Low-Medium |
| 10 | Add "What changed since yesterday" filter (?changed_since + "New since last visit" section) | Medium |
| 11 | Add Execution Chains, Findings, and Exposures to sidebar navigation | Low |
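
For item 10, the server-side half of the ?changed_since filter might look like the sketch below. It assumes each record carries an ISO 8601 `updatedAt` timestamp; that field, the `ChangeTracked` interface, and the `filterChangedSince` name are hypothetical, and only the parameter name comes from the action plan.

```typescript
// Hypothetical record shape; only the ?changed_since parameter name comes from item 10.
interface ChangeTracked {
  id: string;
  updatedAt: string; // ISO 8601 timestamp (assumed field)
}

// Returns items changed strictly after the given timestamp.
// A missing or unparseable changed_since value disables the filter.
// Items whose own timestamp cannot be parsed are excluded when the filter is active.
function filterChangedSince<T extends ChangeTracked>(
  items: readonly T[],
  changedSince: string | null,
): T[] {
  if (changedSince === null) return [...items];
  const cutoff = Date.parse(changedSince);
  if (isNaN(cutoff)) return [...items];
  return items.filter((item) => Date.parse(item.updatedAt) > cutoff);
}

// Usage with two illustrative records and a cutoff between them.
const recentlyChanged = filterChangedSince(
  [
    { id: "path-1", updatedAt: "2026-03-17T09:00:00Z" },
    { id: "path-2", updatedAt: "2026-03-19T09:00:00Z" },
  ],
  "2026-03-18T00:00:00Z",
);
```

Treating a bad parameter as "no filter" is one possible choice; rejecting the request with a validation error would be equally defensible and is arguably safer for an API.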

Next Sprint: Reports & Partner Deliverables (Phase 4)

The research confirms this is the correct next major investment — but only after data quality and presentation clarity:

| # | Item | Effort | Research Constraint |
| --- | --- | --- | --- |
| 12 | Report Service + Store (Phase 4.2): two API families, full-fidelity (platform) and pre-synthesized (reports) | 4-6 sessions | Report generator must NOT access raw evidence (legibility inversion) |
| 13 | Assessment Report template (Phase 4.3): 5 sections, NO methodology appendix | 2-3 sessions | Collapsed appendix hurts purchase intent by ~0.4 points |
| 14 | Scan Digest (Phase 4.3): 1-page post-scan summary | 1-2 sessions | |
| 15 | Business glossary (src/lib/business-glossary.ts): used ONLY in report rendering | 1 session | "Service principal" → "automated account" only in reports, not platform |
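
The constraint on item 15, that glossary substitution happens only in report rendering, can be made explicit in the function signature. This is a minimal sketch of what src/lib/business-glossary.ts might contain; the `RenderTarget` type and `applyGlossary` function are hypothetical, and the single glossary entry is the one pair named in the table.

```typescript
// Hypothetical sketch; only the "service principal" -> "automated account" pair comes from this report.
const BUSINESS_GLOSSARY: ReadonlyArray<readonly [string, string]> = [
  ["service principal", "automated account"],
];

type RenderTarget = "platform" | "report";

// Rewrites technical terms only when rendering a report; platform text is returned untouched.
// Matching is case-insensitive and whole-phrase. Capitalization of the original
// term is not preserved in this sketch.
function applyGlossary(text: string, target: RenderTarget): string {
  if (target !== "report") return text;
  let out = text;
  for (const [technical, business] of BUSINESS_GLOSSARY) {
    // Escape RegExp metacharacters so glossary terms are matched literally.
    const escaped = technical.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
    out = out.replace(new RegExp(`\\b${escaped}\\b`, "gi"), business);
  }
  return out;
}

// Usage: the same sentence rendered for each target.
const platformText = applyGlossary("Service principal svc-example has no owner", "platform");
const reportText = applyGlossary("Service principal svc-example has no owner", "report");
```

Requiring the caller to pass the render target keeps analyst-facing screens on precise terminology while reports get business language, which matches the two-output split described in Finding 2.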

Deferred (Correctly Held)

  • Posture trend chart — research first (Sergey #28)
  • Create Ticket / ServiceNow — accepted but lower priority than export
  • Ownership inheritance logic — needs design
  • Cross-source divergence scoring — requires 4+ connectors (research Finding 4)

Projected Scores After Fixes

If items 1-11 are completed:

| Role | Current | After Fixes | Target | Gap |
| --- | --- | --- | --- | --- |
| CISO | 68% | ~82-85% | ≥85% | At or near target |
| SecOps | 74% | ~80-82% | ≥80% | Met |
| Product QA | 6P/1M/2D | ~2P/0M/0D | ≤2P/0M | Met |
| UX Critic | B/19 | ~B+/10-12 | A-/≤5 | Jargon requires terminology standardization (Phase 5.6) |
| Security Auditor | 0 CRITICAL | 0 CRITICAL | 0 CRITICAL | Already met |
| Enterprise Exec | 2.1/5 | ~3.0-3.2/5 | ≥3.5/5 | Report generator (Phase 4) needed for final gap |
| CEO | 19/28 | ~23-24/28 | ≥24/28 | At or near target |

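The research projects that completing Phase 4 (reports) on top of items 1-11 would push Enterprise Executive to 3.5-4.0/5 and CEO to 25-26/28, meeting both of those targets. Per the table above, the UX Critic target would still depend on terminology standardization (Phase 5.6), and the CISO projection (~82-85%) is at or near target rather than clearly met.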


What to Demo Today

The product's strongest demo path (updated from Round 1):

  1. Overview → 769 executions, 29 authority paths, 19 with invalid ownership. Four risk cluster cards with verdict sentences.
  2. Data Domains → 7 domains, 27 resources with sensitivity labels. "These are the sensitive systems your automated accounts can reach." This is the new WOW page.
  3. Click "Orphaned + Sensitive" cluster → Verdict sentence + 13-path authority table with data domains, sensitivity, egress badges.
  4. Click into Agent Ascribe_Summarizer path → Visual execution diagram (workload → identity → destination), risk condition tiles (Scope drift, Invalid owner, Sensitive data, LLM egress), "Top Risk Reducers" sorted list.
  5. Scroll to Ownership → "Maria Lopez" (departed), "Not assigned." Named accountability.
  6. Execution Chains → 6 chains showing cross-system risk at workload level.

What to avoid: Exposure Detail (broken), scope_drift_sensitive cluster (broken), Finding Detail breadcrumbs (hash IDs), Temporal Comparison (empty), Graph Explorer (overwhelming for non-analysts).


Review Process Notes

What Was Different About Round 2

This was the first review cycle under the hardened review process:

  1. Visual snapshots captured before review — 35 screenshots (25 pages + 10 entity detail captures) from the running platform, stored in sv0-intelligence/store/snapshots/2026-03-19-demo-w1/.
  2. All 7 agents received screenshot context — agents evaluated the rendered UI, not just source code and API responses. This caught regressions invisible to code review (broken pages, hash IDs in breadcrumbs, lost Exposure Brief structure).
  3. MPAS-7 scores tracked — first baseline in the structured scoring format.
  4. Parallel execution — all 7 reviewers ran simultaneously instead of sequentially.

What the Visual Review Revealed That Code Review Missed

The CISO score dropped from 70% to 68% precisely because the visual review detected presentation issues invisible to code analysis:

  • Two broken pages (routing/data issues that only manifest when rendered)
  • Hash IDs in breadcrumbs (the eval: prefix is particularly bad — 40-character string in the nav bar)
  • Cluster card visual hierarchy (path count visually dominates verdict sentence — a CSS issue, not a data issue)
  • "backstack" placeholder text in path detail breadcrumb
  • Lost Exposure Brief structure (replaced by flat table — visible only in rendered output)
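
For the breadcrumb hash IDs, the useBreadcrumbLabel() fix proposed in the action plan (item 4) could delegate to a small pure function like the one sketched here. The `PageEntity` shape, the `resolveBreadcrumbLabel` name, and the truncation length are assumptions for illustration; the actual page data structures are not shown in this report.

```typescript
// Hypothetical shapes; the real page data and hook signature are not shown in this report.
interface PageEntity {
  id: string;           // e.g. a hash or "eval:..." identifier
  displayName?: string; // human-readable name, if the page already fetched one
}

// Pure helper that a useBreadcrumbLabel() hook could call.
// It looks up a route segment in data the page has already loaded and,
// if no display name is available, falls back to a short truncated ID
// instead of the full hash.
function resolveBreadcrumbLabel(
  segment: string,
  loaded: readonly PageEntity[],
  maxIdLength = 8,
): string {
  const match = loaded.find((entity) => entity.id === segment);
  if (match?.displayName) return match.displayName;
  // Fallback: drop a prefix such as "eval:" and truncate the remainder.
  const colon = segment.indexOf(":");
  const bare = colon >= 0 ? segment.slice(colon + 1) : segment;
  return bare.length > maxIdLength ? `${bare.slice(0, maxIdLength)}…` : bare;
}
```

Keeping the resolution logic free of framework imports makes it easy to unit test, and the hook itself then only needs to supply the route segment and the page's already-fetched entities.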

This validates the hardened process: visual review is harder but more honest than code review.


Appendix: Research Artifacts

| Document | Purpose |
| --- | --- |
| Research Findings Summary | CEO-readable summary of the AutoResearchClaw study: what was found and what we're using |
| Full Research Paper | 23-stage academic paper: hypotheses, literature synthesis, experiment design |
| FACET Paper | Full FACET framework paper |
| Acceptance Validation Research Brief | Proposed next research cycle: simulation-based validation of 18 proposed changes |
| Consolidated Action Plan | Single source of truth for what to build next (incorporates research findings) |
| Sergey Feedback Tracker | All 28 feedback items with implementation status |

Review Caveats

Visual-only limitation: In CLI mode (used for this review), screenshots are listed in the prompt but not sent as base64 images. However, each sub-agent independently loaded and viewed the screenshot files using the Read tool, achieving true visual review. The had_visual_input field in run.json reflects this.

Broken pages may be seed data issue: The scope_drift_sensitive cluster error and Exposure Detail "Entity not found" may be specific to the demo-w1 tenant seed data rather than code bugs. Either way, they must be fixed — the demo environment IS the product for evaluation purposes.

UX Critic's Exposure Brief regression: The UX Critic observed that cluster detail pages lost the Sections A-D structure. This may reflect a UI refactor that traded narrative synthesis for data density. Whether this is a regression or an intentional change needs clarification — the cluster detail table is richer than Round 1, but the CISO narrative layer is gone.

Score projections are estimates: The "Projected Scores After Fixes" section uses directional estimates informed by research projections and reviewer feedback, not empirical measurements. Actual scores will be validated in Round 3.