AI Exposure Incident Monitor — Research Overview

Date: 2026-03-08
Status: Research complete, decision pending
Origin: AI Exposure Incident Monitor brief


Original Hypothesis

There are enough publicly observable artifacts related to AI agents, LLM workflows, or AI integrations with execution authority to support at least one strong, credible incident post per week.

Verdict: Signal validated — but the standalone product framing is late. The signal has gone from "maybe enough" to overwhelming. A generic "AI incident monitor" is now a crowded, low-moat space. The question is no longer whether the signal exists — it's how SecurityV0 uses it to demonstrate its narrower, higher-value thesis: deterministic proof of autonomous authority, ownership, and egress inside customer environments.


1. Signal Volume (March 2026)

The original success criterion was "10-20 candidate signals in 7 days." Current reality:

| Metric | Data Point | Source |
| --- | --- | --- |
| Orgs reporting AI agent security incidents | 88% in last year | Help Net Security |
| MCP servers publicly exposed | ~500–1,862 across independent scans (Bitsight: ~1,000; Trend Micro: 492; Knostic: ~1,862). The 8,000+ figure cited in some posts refers to registered listings, not confirmed exposed servers. | Bitsight |
| New AI incident IDs (3-month window) | 108 (Nov 2025 – Jan 2026) | AI Incident Database |
| CISOs observing unintended agent behavior | 47% | PR Newswire |
| CVEs across AI/ML/LLM ecosystem | 100-150+ and doubling YoY | CVE databases, Protect AI Huntr |
| Enterprises running AI agents in production | ~70%, another 23% planning | Gravitee |

Key incidents proving signal density:

  • Check Point discovered RCE + API token exfiltration in Claude Code (CVE-2025-59536, CVE-2026-21852)
  • Supabase Cursor agent hijacked via embedded SQL in support tickets (mid-2025)
  • GitHub MCP vulnerability allowing embedded commands in Issues to hijack developer agents (May 2025)
  • Anthropic's official Git MCP server had path validation bypass

2. Existing Incident Feeds (Competition)

The "incident feed as thought leadership" angle now has competitors:

| Source | What They Do | Differentiation from SecurityV0 |
| --- | --- | --- |
| PointGuard AI | Monthly AI Security Incident Roundup with AISSI severity scoring | Generic incident reporting, no authority path analysis |
| AI Incident Database | Comprehensive public incident tracking (600+ incidents) | Broad AI harms, not focused on execution authority |
| MITRE ATLAS | Adversarial TTPs against AI systems | Attacker-focused, not governance/exposure focused |
| Microsoft Cyber Pulse | AI security reports | Vendor-centric |
| OWASP Agentic Top 10 | Risk taxonomy for agentic AI (2026 edition) | Framework, not incident feed |
| Protect AI Huntr | AI/ML bug bounty with public disclosures | Vulnerability-focused, not authority/governance-focused |

Gap: None of these analyze incidents through an execution authority lens. None ask: "What was the observed vs potential authority path? Where was scope drift? Who owned this agent?"


3. CISO Sentiment — The Buyer Is Ready

| Stat | Source |
| --- | --- |
| 73% of CISOs critically concerned about AI agent risks, only 30% have mature safeguards | PR Newswire |
| 92% lack full visibility into AI identities | Gravitee |
| 78% have no formal policies for AI identity lifecycle | IANS Research |
| NHI-to-human ratio: 82:1 in enterprises | CyberArk |
| 80% reported risky agent behaviors (unauthorized access, data exposure) | Multiple surveys |
| AI agents now SOX-relevant when touching financial processes | SafePaaS |

Regulatory pressure forming:


4. Competitive Landscape — AI Agent Security Products

| Company | Focus | Overlap with SecurityV0 |
| --- | --- | --- |
| Zenity (Microsoft partnership/marketplace) | Runtime monitoring of AI agent behavior, prompt injection. Claims execution-path context, agent discovery, ownership mapping. | Closer than initially assessed, but configuration-derived, not execution-evidence-backed |
| Noma Security | Real-time prompt/response/tool call monitoring | Complementary: sees conversation, not authority |
| Entro Security | NHI + secrets + AI agent identity management. Maps agents to NHIs and human owners, monitors actions. | Closest competitor: positions around discovery + action monitoring, but lacks execution-evidence provenance |
| Proofpoint (acquired Acuvity) | AI security & governance for agentic workspaces | Broad enterprise play |
| Virtue AI | AgentSuite for agentic framework security | Compliance/governance focus |
| Geordie AI | Agent behavior posture | Early stage |
| Prompt Security | LLM input/output security | Prompt-level, not authority-level |
| Lakera | Prompt injection detection (Guard) | Prompt-level |

Key finding: The market is fragmenting into:

  1. Prompt-level security (Lakera, Noma, Prompt Security)
  2. Identity management (Entro, Astrix)
  3. Behavioral monitoring (Zenity, Geordie)
  4. Cloud AI posture (Wiz AI-SPM, Orca AISPM)

The moat is narrower than "we talk about agent authority." Zenity, Entro, and CyberArk (now Palo Alto Networks, acquired Feb 11 2026) are already positioning around agent discovery, ownership, and drift monitoring. The real moat is: deterministic, first-party proof across execution identity, reachable data, egress, and accountable owner — evidence-grade findings, not configuration scanning.


5. SecurityV0 Fit — Authority Path Framework Maps Perfectly

The market pain maps directly to existing SecurityV0 concepts:

| Market Pain (2026) | SecurityV0 Concept | Finding Type |
| --- | --- | --- |
| 92% lack visibility into AI agent identities | Authority Path (observed vs potential) | unknown_identity_binding |
| 78% no policies for AI identity lifecycle | Ownership decay detection | orphaned_ownership, ownership_degraded |
| Agents accumulate access; removing permissions is scary | Scope drift | scope_drift |
| NHI-to-human ratio 82:1 | NHI execution monitoring | Core platform domain |
| AI agents now SOX-relevant | Evidence-grade findings with immutable proof | Evidence packs with SHA256 |
| 80% reported risky agent behaviors | Execution evidence + LLM/external egress detection | llm_egress, external_egress |
| No visibility into what agents actually do | Observed Authority Path (execution-determined) | unproven_execution |

The 12 deterministic finding types already detect the exact failure patterns being reported in production AI agent incidents across the industry.


6. OWASP Top 10 for Agentic Applications 2026

Published December 2025 by the OWASP GenAI Security Project. Developed with 100+ industry experts. This is the definitive risk taxonomy for agentic AI — and maps directly to SecurityV0's finding types.

Full list: OWASP Top 10 for Agentic Applications 2026 | Aikido detailed guide | Palo Alto Networks analysis | Auth0 lessons | Gravitee practical review

| ID | Risk | Description | SecurityV0 Mapping |
| --- | --- | --- | --- |
| ASI01 | Agent Goal Hijack | Attacker alters agent objectives or decision path through malicious text content. Agents can't reliably separate instructions from data. | Scope drift: unauthorized objective changes |
| ASI02 | Tool Misuse and Exploitation | Agents deploy legitimate tools unsafely due to unclear prompts or manipulated inputs, causing data loss or exfiltration. | llm_egress, external_egress: execution authority abuse |
| ASI03 | Identity and Privilege Abuse | Agents inherit user/system identities, which are unintentionally reused, escalated, or passed across agents without scoping. | scope_drift, privilege_justification_gap: direct NHI authority problem |
| ASI04 | Agentic Supply Chain Vulnerabilities | Dynamic runtime components (tools, plugins, MCP servers) can be compromised, altering agent behavior. | Supply chain trust: connector integrity |
| ASI05 | Unexpected Code Execution | Agents generate or run code/commands unsafely: shell commands, scripts, deserialization. | unproven_execution: execution without evidence/approval |
| ASI06 | Memory and Context Poisoning | Attackers poison memory systems, RAG databases, embeddings to influence future agent decisions. Persistent, unlike prompt injection. | Temporal tracking: state manipulation over time |
| ASI07 | Insecure Inter-Agent Communication | Unauthenticated or unencrypted multi-agent communication enables message interception and instruction injection. | unknown_identity_binding: agent-to-agent trust gaps |
| ASI08 | Cascading Failures | Errors in one agent propagate across planning, execution, and downstream systems. Small misalignments compound into system-wide failures. | Authority path analysis: blast radius visualization |
| ASI09 | Human-Agent Trust Exploitation | Users over-trust agent outputs; attackers exploit that trust to influence decisions or extract sensitive information. | Ownership governance: accountability gaps |
| ASI10 | Rogue Agents | Compromised or misaligned agents act harmfully while appearing legitimate. May self-repeat, persist across sessions, or impersonate other agents. | orphaned_ownership + dormant_authority: persistent unauthorized execution |

Core principle: Least Agency — autonomy is a feature that should be earned, not a default setting. This aligns perfectly with SecurityV0's "observed vs potential authority" model.

SecurityV0 coverage: 7 of 10 OWASP Agentic risks map directly to existing SecurityV0 finding types. The remaining 3 (ASI04 supply chain, ASI06 memory poisoning, ASI09 trust exploitation) are adjacencies that could be addressed in future wedges.
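The coverage count can be checked by transcribing the mapping column into a small lookup. This is only a sketch: the dictionary is copied from the table above, `None` marks the three risks the memo treats as adjacencies, and the `authority_path_analysis` label for ASI08 is a placeholder because the table names a capability there rather than a finding type.

```python
# OWASP Agentic risk ID -> SecurityV0 finding types, transcribed from the table.
# None = listed in the memo as an adjacency, not a direct mapping.
OWASP_TO_SV0 = {
    "ASI01": ["scope_drift"],
    "ASI02": ["llm_egress", "external_egress"],
    "ASI03": ["scope_drift", "privilege_justification_gap"],
    "ASI04": None,
    "ASI05": ["unproven_execution"],
    "ASI06": None,
    "ASI07": ["unknown_identity_binding"],
    "ASI08": ["authority_path_analysis"],  # placeholder label, not a named finding type
    "ASI09": None,
    "ASI10": ["orphaned_ownership", "dormant_authority"],
}

covered = sorted(risk for risk, findings in OWASP_TO_SV0.items() if findings)
adjacent = sorted(risk for risk, findings in OWASP_TO_SV0.items() if not findings)

print(f"{len(covered)} of {len(OWASP_TO_SV0)} risks map directly")
print("Adjacencies:", ", ".join(adjacent))
```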


7. Key Researchers and Organizations

Individual Researchers

Simon Willison — "The Lethal Trifecta"

  • Focus: Prompt injection, MCP security, AI agent risk communication
  • Key contribution: Defined the "Lethal Trifecta" — when an AI agent has private data access + untrusted content exposure + an exfiltration vector, data theft via prompt injection becomes inevitable
  • MCP security: Published analysis showing how MCP servers create prompt injection attack surfaces. Demonstrated a GitHub MCP exploit where issues in public repos could hijack agents and exfiltrate private repo data.
  • Position on guardrails: Deeply skeptical of "95% accuracy" guardrail products — "in web application security, 95% is a failing grade"
  • Links: Blog | Newsletter | X/Twitter
  • SecurityV0 relevance: His Lethal Trifecta maps to SecurityV0's llm_egress + reachable_sensitive_domain + external_egress finding combination
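
The trifecta is a conjunction of three capabilities, so it can be sketched as a single boolean check. The `AgentProfile` record and its field names below are hypothetical illustrations, not SecurityV0's schema or Willison's own formulation in code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentProfile:
    """Hypothetical summary of one agent's capabilities (illustrative only)."""
    name: str
    reads_private_data: bool         # e.g. private repos, mailboxes, internal DBs
    ingests_untrusted_content: bool  # e.g. public issues, web pages, support tickets
    can_send_data_out: bool          # e.g. HTTP requests, outbound email, PR comments


def has_lethal_trifecta(agent: AgentProfile) -> bool:
    """True only when all three conditions hold at the same time."""
    return (
        agent.reads_private_data
        and agent.ingests_untrusted_content
        and agent.can_send_data_out
    )


triage_bot = AgentProfile("issue-triage-bot", True, True, True)
docs_bot = AgentProfile("docs-bot", False, True, True)

print(has_lethal_trifecta(triage_bot))  # True
print(has_lethal_trifecta(docs_bot))    # False
```

Removing any one of the three capabilities breaks the combination, which is why the check is a plain `and` rather than a score.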

Johann Rehberger — Cross-Agent Privilege Escalation

  • Focus: Indirect prompt injection, AI coding agent exploitation, cross-agent attacks
  • Key contributions:
    • Coined "Cross-Agent Privilege Escalation" — multiple coding agents (GitHub Copilot, Claude Code) on the same system can be tricked into escalating each other's privileges
    • Demonstrated Gemini Advanced memory manipulation via "delayed tool invocation" — invisible instructions trigger on common words like "yes" or "sure"
    • Proved Claude data exfiltration — attackers exploit tool capabilities to read user data, save it, and use Anthropic's own APIs to send files to attacker accounts
    • Talk at 39C3: "Agentic ProbLLMs" — demonstrated RCE and data theft in AI coding agents
  • SecurityV0 relevance: Cross-Agent Privilege Escalation is essentially multi-identity scope_drift — SecurityV0's authority path model could visualize these escalation chains

Daniel Kang (UIUC) — Autonomous Agent Exploitation

Organizations and Frameworks

Invariant Labs — MCP Tool Poisoning

  • Focus: MCP security, tool poisoning attacks, agent guardrails
  • Key contributions:
    • First to propose Tool Poisoning Attacks (TPA) paradigm in April 2025
    • Built attack models: Shadowing Attacks, MCP Rug Pulls (tool descriptions change between approval and execution)
    • Demonstrated GitHub MCP exploit — private repo data exfiltration via public issue injection
    • Real-world impact: TPAs have compromised WhatsApp chat histories, GitHub private repos, and SSH credentials
    • Released MCP-Scan — security scanner for MCP servers (now maintained by Snyk)
  • Links: Blog | MCP injection experiments repo

Trail of Bits — Agent Security Auditing

  • Focus: Threat modeling, prompt injection auditing, formal verification for AI
  • Key contributions:
    • Audited Perplexity's Comet browser agent — found 4 prompt injection techniques that could exfiltrate private Gmail data
    • Exposed agentic browser vulnerabilities resembling XSS and CSRF attacks
    • Bypassed human approval protections for system command execution in AI agents, achieving RCE in 3 agent platforms
    • Released Slither-MCP and security wrappers for MCP
    • Placed second in DARPA AIxCC (AI Cyber Challenge) at DEF CON 2025; its system found 28 vulnerabilities and patched 19
  • Links: Blog | awesome-ml-security | 2025 Year in Review

MITRE ATLAS — Adversarial AI Framework

Protect AI / Huntr — AI/ML Bug Bounty

  • Focus: World's first bug bounty platform for AI/ML vulnerabilities
  • Scale: 10,000+ security researchers, 125+ ML repos in scope, payouts up to $50,000 for critical vulnerabilities
  • Key data: Became the world's 5th largest CNA (CVE Numbering Authority), generating hundreds of vulnerability reports per year. 50.5% of vulnerabilities found have been fixed, 49.5% remain open.
  • Links: Huntr platform | AI Exploits collection

Wiz / Orca — AI Security Posture Management (AI-SPM)

  • Focus: Cloud-native AI security — discovering AI assets, misconfigurations, and exposure in cloud environments
  • Wiz AI-SPM: Maps entire AI estate via Security Graph, detects model exposure, prompt injection risk, misconfigured permissions. Found 85% of orgs using AI, 74% using managed AI services in State of AI in Cloud 2025 report.
  • Orca AISPM: Covers 50+ AI models, alerts on misconfigurations, overprivileged permissions, internet exposure. Provides automated remediation.
  • SecurityV0 differentiation: Wiz/Orca show configuration posture (what could happen). SecurityV0 shows execution posture (what did happen). Complementary, not competitive.
  • Links: Wiz AI-SPM | Orca AI-SPM

8. Critical Review (Codex Analysis)

Before the recommendation, key corrections and cautions from an independent review:

Scope Conflict

The original idea starts as a content experiment ("not a threat-intel product, for thought leadership only") but the research memo drifts into APIs, UI, and connector productization. This conflicts with:

  • SV0's own positioning as not a vulnerability scanner/SIEM/config linter (vision.md)
  • W1 explicitly excluding continuous monitoring and drift detection (definition.md)

Market Gap Is Narrower Than Claimed

The memo initially said "nobody owns execution-determined authority posture." But current vendor positioning is already close:

  • Zenity claims execution-path context, agent discovery, ownership mapping
  • Entro positions around mapping agents to NHIs and human owners, monitoring actions
  • CyberArk (acquired by Palo Alto Networks, closed Feb 11 2026) positioned around discovery, least privilege, drift monitoring, lifecycle, governance

The moat is not "we talk about agent authority." The moat is "we provide deterministic, first-party proof."

GitHub Connector Feasibility Caveat

The 6-8 week GitHub connector plan relies on the secret-scanning alerts API, but GitHub requires repo or org admin access for this endpoint. The plan therefore works for customer-owned repos, not for broad public signal gathering. Many findings from public scanning also won't satisfy W1's stricter definition of exposure (evidence-backed execution + reachable data + egress).

Timing

  • A standalone incident monitor pushes toward a low-moat threat-content business
  • Generic AI-agent-security messaging is already noisy: the OWASP Agentic Top 10 is live, Proofpoint acquired Acuvity on Feb 12 2026, and CyberArk (now Palo Alto Networks) described 2026 as the shift from pilots to production
  • The category exists. The window for category creation through vocabulary alone is closing.

8b. Recommendation (Revised)

Keep this inside SecurityV0. Do not build as a separate product.

The Clean Path

1. Now: SV0 Thought Leadership (Option B — content only)

  • Use public incidents to demonstrate the Authority Path framework
  • Every post answers: what authority existed, what identity carried it, where could data go, who owned it
  • Not a product, not an API — a content property that builds the SV0 vocabulary

2. Next: W1 Discovery/Assessment (already in progress)

  • Build paid SV0 around W1 first — point-in-time discovery inside customer environments
  • This is where the defensible moat lives: first-party execution evidence, not public scanning

3. Later: Continuous Drift/Monitoring (W2/W3)

  • Add continuous monitoring as the next wedge, not the first product
  • This is where the "Authority Breach Radar" concept lives — but only after W1 proves value

4. Feedback loop

  • Option B content validates which failure modes resonate most with CISOs
  • Informs W1 finding prioritization and W2/W3 feature design

The Positioning Line

Others tell CISOs that AI agent incidents are happening. SV0 shows where the same failure mode already exists in their environment — with proof.


9. Implementation Feasibility (Platform Analysis)

Based on deep exploration of the SecurityV0 codebase (sv0-platform, sv0-connectors, sv0-documentation).

Note: This section documents what's technically possible if/when a GitHub connector is prioritized for customer-owned repos (W2/W3 scope). It is not a recommendation to build this now — see Section 8b for the recommended path. The GitHub secret-scanning API requires admin access, so this applies to customer environments, not public signal gathering.

Verdict: Technically Feasible — 6-8 Weeks for Customer-Repo MVP

The platform already has all building blocks. No architectural changes needed.

What Already Exists

| Component | Status | Reuse Level |
| --- | --- | --- |
| Pluggable connector framework (NormalizedGraph output) | Production | 100% (just add a new connector) |
| Deterministic trigger evaluator (14 rules) | Production | 100% (add new rule files) |
| Temporal tracking (entity_versions, events, intervals) | Production | 100% (works out of the box) |
| Diff engine with circuit breakers | Production | 100% (prevents data loss) |
| Evidence pack generator (SHA256 integrity) | Production | 100% (immutable proof) |
| Findings API + UI (table, detail, timeline) | Production | 80% (extend with incident-specific views) |
| 9-entity graph model | Production | 100% (GitHub repos = workloads, leaked keys = credentials) |
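
To make the "SHA256 integrity" row concrete, here is a minimal sketch of sealing and verifying an evidence payload. The pack layout and the function names are assumptions for illustration; the actual evidence pack generator in sv0-platform may work differently.

```python
import hashlib
import json


def canonical_bytes(payload: dict) -> bytes:
    """Serialize with sorted keys and fixed separators so equal content hashes equally."""
    return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")


def seal_evidence(payload: dict) -> dict:
    """Return a pack holding the payload plus its SHA-256 digest (hypothetical layout)."""
    digest = hashlib.sha256(canonical_bytes(payload)).hexdigest()
    return {"payload": payload, "sha256": digest}


def verify_evidence(pack: dict) -> bool:
    """Recompute the digest; any change to the payload makes this return False."""
    expected = hashlib.sha256(canonical_bytes(pack["payload"])).hexdigest()
    return expected == pack["sha256"]


pack = seal_evidence({"finding": "ai_credential_exposed", "workload": "acme/agent-runner"})
print(verify_evidence(pack))  # True

tampered = {"payload": {**pack["payload"], "workload": "acme/other"}, "sha256": pack["sha256"]}
print(verify_evidence(tampered))  # False
```

Canonical serialization matters here: without sorted keys, two packs with identical content could produce different digests.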

New Connector: GitHub AI Signals

Pattern to follow: sv0-connectors/integrations/entra-servicenow/ (existing reference implementation)

sv0-connectors/integrations/github-ai-signals/
├── src/github_ai_signals/
│ ├── core/
│ │ ├── extractor.py # GitHub API calls
│ │ └── transformer.py # → NormalizedGraph
│ └── __init__.py
├── tests/
└── pyproject.toml

GitHub APIs to use:

  • GET /repos/{owner}/{repo}/secret-scanning/alerts: exposed credentials
  • GET /repos/{owner}/{repo}/dependabot/alerts: vulnerable dependencies
  • GET /repos/{owner}/{repo}/code-scanning/alerts: code issues
  • GET /search/repositories?q={agent-framework-keywords}: find public AI agent repos
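
As a rough sketch of what the extractor's request construction might look like, the helper below builds (but does not send) a request for one alert type. The function name and the `state`/`per_page` choices are assumptions; the headers are the ones GitHub's REST documentation recommends, and the Dependabot path used is `/repos/{owner}/{repo}/dependabot/alerts` as documented there.

```python
import urllib.request

GITHUB_API = "https://api.github.com"
ALERT_KINDS = {"secret-scanning", "dependabot", "code-scanning"}


def build_alerts_request(owner: str, repo: str, kind: str, token: str) -> urllib.request.Request:
    """Build a GET request for open alerts of one kind on one repository."""
    if kind not in ALERT_KINDS:
        raise ValueError(f"unsupported alert kind: {kind}")
    url = f"{GITHUB_API}/repos/{owner}/{repo}/{kind}/alerts?state=open&per_page=100"
    headers = {
        "Accept": "application/vnd.github+json",
        "Authorization": f"Bearer {token}",
        "X-GitHub-Api-Version": "2022-11-28",
    }
    return urllib.request.Request(url, headers=headers, method="GET")


# "acme/agent-runner" and the token value are placeholders.
request = build_alerts_request("acme", "agent-runner", "secret-scanning", "TOKEN_PLACEHOLDER")
print(request.get_method(), request.full_url)
```

Sending the request would be a call to `urllib.request.urlopen(request)`. For the secret-scanning endpoint this only succeeds with a token that has admin-level access to the repository, which is why the sketch assumes a customer-provided repo list.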

Entity mapping:

  • GitHub repo → workload (nodeType)
  • Exposed API key → credential (nodeType)
  • Repo owner → owner (nodeType)
  • Edge: EXPOSED_CREDENTIAL (workload → credential)
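
The mapping above could be sketched as a small transformer. The exact NormalizedGraph schema is not shown in this memo, so the dictionary shape, ID format, and `secretType` attribute below are assumptions; only the node types and the `EXPOSED_CREDENTIAL` edge come from the list. How the owner node is linked to the workload is left out because the memo does not specify it.

```python
from typing import Any


def to_normalized_graph(owner: str, repo: str, alerts: list[dict[str, Any]]) -> dict[str, list]:
    """Map one repo's secret-scanning alerts onto workload, owner, and credential nodes."""
    workload_id = f"workload:github:{owner}/{repo}"
    nodes = [
        {"id": workload_id, "nodeType": "workload"},
        {"id": f"owner:github:{owner}", "nodeType": "owner"},
    ]
    edges = []
    for alert in alerts:
        # Identify the credential by alert number, never by the secret value itself.
        credential_id = f"credential:github:{owner}/{repo}#{alert['number']}"
        nodes.append({
            "id": credential_id,
            "nodeType": "credential",
            "secretType": alert.get("secret_type", "unknown"),
        })
        edges.append({"type": "EXPOSED_CREDENTIAL", "from": workload_id, "to": credential_id})
    return {"nodes": nodes, "edges": edges}


# Illustrative alert record; real alerts carry more fields.
sample_alerts = [{"number": 7, "secret_type": "openai_api_key", "state": "open"}]
graph = to_normalized_graph("acme", "agent-runner", sample_alerts)
for edge in graph["edges"]:
    print(edge["type"], edge["from"], "->", edge["to"])
```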

New Finding Rules (4-5 Rules, ~1-2 Hours Each)

Add to sv0-platform/src/evaluator/rules/:

| Rule | Trigger | Severity |
| --- | --- | --- |
| ai_credential_exposed | Credential found in GitHub secret scanning | CRITICAL |
| ai_workload_public_exposure | Public repo + secret_alerts > 0 | HIGH |
| ai_vulnerable_dependency | Workload with Dependabot alerts on AI framework | HIGH |
| ai_unknown_owner | Exposed credential with no traceable owner | HIGH |
| ai_egress_to_llm | Workload with egress classified as LLM | HIGH |

Pattern to follow: sv0-platform/src/evaluator/rules/orphaned-ownership.ts — each rule is ~50-100 lines of pure deterministic logic.
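
To illustrate what "pure deterministic logic" means for two of these rules, here is a minimal sketch. The platform's rules are TypeScript files; Python is used here only for illustration, and the graph shape, the `owner` attribute, and the finding record are assumptions rather than the real evaluator interface.

```python
def evaluate_rules(graph: dict) -> list[dict]:
    """Evaluate ai_credential_exposed and ai_unknown_owner over a small graph."""
    nodes = {node["id"]: node for node in graph["nodes"]}
    findings = []
    for edge in graph["edges"]:
        if edge["type"] != "EXPOSED_CREDENTIAL":
            continue
        findings.append(
            {"rule": "ai_credential_exposed", "severity": "CRITICAL", "entity": edge["to"]}
        )
        # No owner recorded on the workload means the exposure has no accountable party.
        if not nodes[edge["from"]].get("owner"):
            findings.append(
                {"rule": "ai_unknown_owner", "severity": "HIGH", "entity": edge["to"]}
            )
    # Sort so the same input always yields the same output order.
    return sorted(findings, key=lambda f: (f["rule"], f["entity"]))


graph = {
    "nodes": [
        {"id": "w1", "nodeType": "workload", "owner": "platform-team"},
        {"id": "w2", "nodeType": "workload", "owner": None},
        {"id": "c1", "nodeType": "credential"},
        {"id": "c2", "nodeType": "credential"},
    ],
    "edges": [
        {"type": "EXPOSED_CREDENTIAL", "from": "w1", "to": "c1"},
        {"type": "EXPOSED_CREDENTIAL", "from": "w2", "to": "c2"},
    ],
}
for finding in evaluate_rules(graph):
    print(finding["severity"], finding["rule"], finding["entity"])
```

The rule has no randomness, clock reads, or network calls, so re-running it on the same graph snapshot reproduces the same findings.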

API Extensions

Add to sv0-platform/src/api/routes/:

GET  /api/v1/incidents?severity=critical&type=ai_credential_exposed
GET  /api/v1/incidents/:id              # Detail + timeline
POST /api/v1/incidents/:id/acknowledge

Effort: 2-3 days (reuse existing FindingDoc schema + query patterns from findings.ts)

UI Extensions

Reuse existing components from sv0-platform/ui/src/:

  1. Incident Feed Page — clone FindingsList.tsx, filter by AI finding types (~300 lines)
  2. Incident Detail — extend existing FindingDetail with public exposure metadata (repo URL, commit hash, first/last detected)
  3. Dashboard Widget — "Critical AI Exposures" count card (~100 lines)

Effort: 3-4 days (80% reuse of existing components)

Continuous Monitoring: Already Supported

The existing pipeline handles this natively:

  • Connectors run on schedule (1-24 hour intervals)
  • Diff engine detects changes automatically
  • Findings track intervals (active/resolved periods)
  • Entity versions preserve full temporal history
  • Circuit breakers prevent false negatives (block if >50% entities disappear)

No new infrastructure needed — just configure the GitHub connector sync frequency.
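
The circuit-breaker behaviour described above (block the sync if more than 50% of entities disappear) can be sketched as follows. The function name, return shape, and default threshold parameter are assumptions, not the diff engine's actual interface.

```python
def diff_with_circuit_breaker(
    previous: set[str], current: set[str], max_loss_ratio: float = 0.5
) -> dict:
    """Diff two entity-ID snapshots, refusing to apply if too many entities vanished."""
    removed = previous - current
    added = current - previous
    loss_ratio = len(removed) / len(previous) if previous else 0.0
    if loss_ratio > max_loss_ratio:
        # Likely an API outage or permission change, not a real mass deletion.
        return {"applied": False, "loss_ratio": loss_ratio, "added": [], "removed": []}
    return {
        "applied": True,
        "loss_ratio": loss_ratio,
        "added": sorted(added),
        "removed": sorted(removed),
    }


last_sync = {"a", "b", "c", "d"}
print(diff_with_circuit_breaker(last_sync, {"a", "b", "c", "e"}))  # applied, 25% loss
print(diff_with_circuit_breaker(last_sync, {"a"}))                 # blocked, 75% loss
```

Blocking the second sync keeps three findings from being marked resolved just because the connector returned a partial result.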

Phase 2 Sources (3-4 Weeks Each, Independent)

| Source | API | New Rules |
| --- | --- | --- |
| CVE/NVD Feed | NVD API v2 | ai_framework_cve: CVE filed against AI framework in use |
| MCP Server Exposure | Shodan/Censys API | mcp_server_exposed: public MCP endpoint detected |
| RSS Security Feeds | RSS + NLP | ai_incident_mention: customer infrastructure mentioned in security blog |

Risks

| Risk | Mitigation |
| --- | --- |
| GitHub API rate limits | OAuth app token, incremental sync mode, caching |
| False positives | false_positive status field; user can dismiss |
| Credential exposure in evidence packs | Never store raw secrets (hash only); RBAC on API |
| Public repo scope explosion | Require explicit customer repo list (no open-ended scanning) |
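
The "hash only" mitigation could look roughly like the sketch below, which drops the raw secret from an alert record before it is stored and keeps only a SHA-256 fingerprint. The `secret` and `secret_sha256` field names and the helper names are assumptions for illustration.

```python
import hashlib


def fingerprint_secret(secret: str) -> str:
    """Return a SHA-256 hex digest that identifies the secret without revealing it."""
    return hashlib.sha256(secret.encode("utf-8")).hexdigest()


def redact_alert(alert: dict) -> dict:
    """Copy an alert record, replacing any raw secret with its fingerprint."""
    redacted = {key: value for key, value in alert.items() if key != "secret"}
    if "secret" in alert:
        redacted["secret_sha256"] = fingerprint_secret(alert["secret"])
    return redacted


# Placeholder value, not a real credential.
raw_alert = {"number": 7, "secret_type": "api_key", "secret": "example-not-a-real-key"}
safe_alert = redact_alert(raw_alert)
print(sorted(safe_alert))  # ['number', 'secret_sha256', 'secret_type']
```

A plain hash is adequate for long random API keys; for short or guessable secrets a keyed hash such as HMAC would be the safer choice, since unkeyed digests of low-entropy values can be brute-forced.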

Key Files Reference

Architecture docs:

  • sv0-documentation/docs/architecture/00-overview.md — system design
  • sv0-documentation/docs/architecture/02-processing-pipeline.md — pipeline + scan safety
  • sv0-documentation/docs/architecture/05-connectors.md — connector interface contract

Platform code:

  • sv0-platform/src/evaluator/rules/ — 14 existing rule implementations
  • sv0-platform/src/evaluator/index.ts — rule evaluation pipeline
  • sv0-platform/src/ingestion/diff-engine.ts — change detection (280 lines)
  • sv0-platform/src/workers/handlers/sync-ingestion.ts — end-to-end pipeline (410 lines)
  • sv0-platform/src/api/routes/findings.ts — API template
  • sv0-platform/ui/src/pages/FindingsList.tsx — UI template

Connector examples:

  • sv0-connectors/integrations/entra-servicenow/src/entra_servicenow/core/transformer.py — transformer pattern (500+ lines)
  • sv0-connectors/integrations/azure-foundry/src/ — Azure integration pattern

10. Key Insight

The original question was "is there enough signal for a weekly incident feed?"

The answer: yes, overwhelmingly — but that's no longer the right question. A standalone incident monitor is a low-moat content business in a space that's already getting noisy. The part that's still early enough is SecurityV0's narrower thesis: deterministic proof of autonomous authority, ownership, and egress inside customer environments.

Use the incident signal as fuel for SV0 thought leadership. Build the paid product around W1 first-party evidence. Add continuous monitoring later.

The positioning that matters: others tell CISOs that AI agent incidents are happening. SV0 shows where the same failure mode already exists in their environment — with proof.


Sources