AI Exposure Incident Monitor — Research Overview

Date: 2026-03-08
Status: Research complete, decision pending
Origin: AI Exposure Incident Monitor brief

Original Hypothesis

There are enough publicly observable artifacts related to AI agents, LLM workflows, or AI integrations with execution authority to support at least one strong, credible incident post per week.

Verdict: Signal validated — but the standalone product framing is late. The signal has gone from "maybe enough" to overwhelming. A generic "AI incident monitor" is now a crowded, low-moat space. The question is no longer whether the signal exists — it's how SecurityV0 uses it to demonstrate its narrower, higher-value thesis: deterministic proof of autonomous authority, ownership, and egress inside customer environments.

1. Signal Volume (March 2026)

The original success criterion was "10-20 candidate signals in 7 days." Current reality:

Metric	Data Point	Source
Orgs reporting AI agent security incidents	88% in last year	Help Net Security
MCP servers publicly exposed	~500–1,862 across independent scans (Bitsight: ~1,000; Trend Micro: 492; Knostic: ~1,862). The 8,000+ figure cited in some posts refers to registered listings, not confirmed exposed servers.	Bitsight
New AI incident IDs (3-month window)	108 (Nov 2025 – Jan 2026)	AI Incident Database
CISOs observing unintended agent behavior	47%	PR Newswire
CVEs across AI/ML/LLM ecosystem	100-150+ and doubling YoY	CVE databases, Protect AI Huntr
Enterprises running AI agents in production	~70%, another 23% planning	Gravitee

Key incidents proving signal density:

Check Point discovered RCE + API token exfiltration in Claude Code (CVE-2025-59536, CVE-2026-21852)
Supabase Cursor agent hijacked via embedded SQL in support tickets (mid-2025)
GitHub MCP vulnerability allowing embedded commands in Issues to hijack developer agents (May 2025)
Anthropic's official Git MCP server had path validation bypass

2. Existing Incident Feeds (Competition)

The "incident feed as thought leadership" angle now has competitors:

Source	What They Do	Differentiation from SecurityV0
PointGuard AI	Monthly AI Security Incident Roundup with AISSI severity scoring	Generic incident reporting, no authority path analysis
AI Incident Database	Comprehensive public incident tracking (600+ incidents)	Broad AI harms, not focused on execution authority
MITRE ATLAS	Adversarial TTPs against AI systems	Attacker-focused, not governance/exposure focused
Microsoft Cyber Pulse	AI security reports	Vendor-centric
OWASP Agentic Top 10	Risk taxonomy for agentic AI (2026 edition)	Framework, not incident feed
Protect AI Huntr	AI/ML bug bounty with public disclosures	Vulnerability-focused, not authority/governance-focused

Gap: None of these analyze incidents through an execution authority lens. None ask: "What was the observed vs potential authority path? Where was scope drift? Who owned this agent?"

3. CISO Sentiment — The Buyer Is Ready

Stat	Source
73% of CISOs critically concerned about AI agent risks, only 30% have mature safeguards	PR Newswire
92% lack full visibility into AI identities	Gravitee
78% have no formal policies for AI identity lifecycle	IANS Research
NHI-to-human ratio: 82:1 in enterprises	CyberArk
80% reported risky agent behaviors (unauthorized access, data exposure)	Multiple surveys
AI agents now SOX-relevant when touching financial processes	SafePaaS

Regulatory pressure forming:

NIST launched the AI Agent Standards Initiative
EU AI Act obligations for high-risk AI systems enforceable by August 2026
OWASP published dedicated Top 10 for Agentic Applications 2026

4. Competitive Landscape — AI Agent Security Products

Company	Focus	Overlap with SecurityV0
Zenity (Microsoft partnership/marketplace)	Runtime monitoring of AI agent behavior, prompt injection. Claims execution-path context, agent discovery, ownership mapping.	Closer than initially assessed — but configuration-derived, not execution-evidence-backed
Noma Security	Real-time prompt/response/tool call monitoring	Complementary — sees conversation, not authority
Entro Security	NHI + secrets + AI agent identity management. Maps agents to NHIs and human owners, monitors actions.	Closest competitor — positions around discovery + action monitoring, but lacks execution-evidence provenance
Proofpoint (acquired Acuvity)	AI security & governance for agentic workspaces	Broad enterprise play
Virtue AI	AgentSuite for agentic framework security	Compliance/governance focus
Geordie AI	Agent behavior posture	Early stage
Prompt Security	LLM input/output security	Prompt-level, not authority-level
Lakera	Prompt injection detection (Guard)	Prompt-level

Key finding: The market is fragmenting into:

Prompt-level security (Lakera, Noma, Prompt Security)
Identity management (Entro, Astrix)
Behavioral monitoring (Zenity, Geordie)
Cloud AI posture (Wiz AI-SPM, Orca AISPM)

The moat is narrower than "we talk about agent authority." Zenity, Entro, and CyberArk (now Palo Alto Networks, acquired Feb 11 2026) are already positioning around agent discovery, ownership, and drift monitoring. The real moat is: deterministic, first-party proof across execution identity, reachable data, egress, and accountable owner — evidence-grade findings, not configuration scanning.

5. SecurityV0 Fit — Authority Path Framework Maps Perfectly

The market pain maps directly to existing SecurityV0 concepts:

Market Pain (2026)	SecurityV0 Concept	Finding Type
92% lack visibility into AI agent identities	Authority Path view (observed vs potential)	`unknown_identity_binding`
78% no policies for AI identity lifecycle	Ownership decay detection	`orphaned_ownership`, `ownership_degraded`
Agents accumulate access; removing permissions feels risky	Scope drift	`scope_drift`
NHI-to-human ratio 82:1	NHI execution monitoring	Core platform domain
AI agents now SOX-relevant	Evidence-grade findings with immutable proof	Evidence packs with SHA256
80% reported risky agent behaviors	Execution evidence + LLM/external egress detection	`llm_egress`, `external_egress`
No visibility into what agents actually do	Observed Authority Path (execution-determined)	`unproven_execution`

The 12 deterministic finding types already detect the exact failure patterns being reported in production AI agent incidents across the industry.

6. OWASP Top 10 for Agentic Applications 2026

Published December 2025 by the OWASP GenAI Security Project. Developed with 100+ industry experts. This is the definitive risk taxonomy for agentic AI — and maps directly to SecurityV0's finding types.

Full list: OWASP Top 10 for Agentic Applications 2026 | Aikido detailed guide | Palo Alto Networks analysis | Auth0 lessons | Gravitee practical review

ID	Risk	Description	SecurityV0 Mapping
ASI01	Agent Goal Hijack	Attacker alters agent objectives or decision path through malicious text content. Agents can't reliably separate instructions from data.	Scope drift — unauthorized objective changes
ASI02	Tool Misuse and Exploitation	Agents deploy legitimate tools unsafely due to unclear prompts or manipulated inputs, causing data loss or exfiltration.	`llm_egress`, `external_egress` — execution authority abuse
ASI03	Identity and Privilege Abuse	Agents inherit user/system identities, which are unintentionally reused, escalated, or passed across agents without scoping.	`scope_drift`, `privilege_justification_gap` — direct NHI authority problem
ASI04	Agentic Supply Chain Vulnerabilities	Dynamic runtime components (tools, plugins, MCP servers) can be compromised, altering agent behavior.	Supply chain trust — connector integrity
ASI05	Unexpected Code Execution	Agents generate or run code/commands unsafely — shell commands, scripts, deserialization.	`unproven_execution` — execution without evidence/approval
ASI06	Memory and Context Poisoning	Attackers poison memory systems, RAG databases, embeddings to influence future agent decisions. Persistent, unlike prompt injection.	Temporal tracking — state manipulation over time
ASI07	Insecure Inter-Agent Communication	Unauthenticated or unencrypted multi-agent communication enables message interception and instruction injection.	`unknown_identity_binding` — agent-to-agent trust gaps
ASI08	Cascading Failures	Errors in one agent propagate across planning, execution, and downstream systems. Small misalignments compound into system-wide failures.	Authority path analysis — blast radius visualization
ASI09	Human-Agent Trust Exploitation	Users over-trust agent outputs; attackers exploit that trust to influence decisions or extract sensitive information.	Ownership governance — accountability gaps
ASI10	Rogue Agents	Compromised or misaligned agents act harmfully while appearing legitimate. May self-replicate, persist across sessions, or impersonate other agents.	`orphaned_ownership` + `dormant_authority`: persistent unauthorized execution

Core principle: Least Agency — autonomy is a feature that should be earned, not a default setting. This aligns perfectly with SecurityV0's "observed vs potential authority" model.

SecurityV0 coverage: 7 of 10 OWASP Agentic risks map directly to existing SecurityV0 finding types. The remaining 3 (ASI04 supply chain, ASI06 memory poisoning, ASI09 trust exploitation) are adjacencies that could be addressed in future wedges.

7. Key Researchers and Organizations

Individual Researchers

Simon Willison — "The Lethal Trifecta"

Focus: Prompt injection, MCP security, AI agent risk communication
Key contribution: Defined the "Lethal Trifecta" — when an AI agent has private data access + untrusted content exposure + an exfiltration vector, data theft via prompt injection becomes inevitable
MCP security: Published analysis showing how MCP servers create prompt injection attack surfaces. Demonstrated a GitHub MCP exploit where issues in public repos could hijack agents and exfiltrate private repo data.
Position on guardrails: Deeply skeptical of "95% accuracy" guardrail products — "in web application security, 95% is a failing grade"
Links: Blog | Newsletter | X/Twitter
SecurityV0 relevance: His Lethal Trifecta maps to SecurityV0's `llm_egress` + `reachable_sensitive_domain` + `external_egress` finding combination
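The trifecta condition described above can be sketched as a simple conjunction of three capability flags. This is a minimal illustration only: the `AgentCapabilities` class and its field names are hypothetical and are not a SecurityV0 schema, and how each flag would be derived from real findings is left open.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentCapabilities:
    """Hypothetical capability flags for one agent (illustrative only)."""
    name: str
    reads_private_data: bool         # e.g. private repos, mailboxes, tickets
    ingests_untrusted_content: bool  # e.g. public issues, web pages, inbound email
    can_send_data_out: bool          # e.g. HTTP calls, outbound email, public comments


def has_lethal_trifecta(agent: AgentCapabilities) -> bool:
    """Return True only when all three conditions hold at the same time."""
    return (
        agent.reads_private_data
        and agent.ingests_untrusted_content
        and agent.can_send_data_out
    )


if __name__ == "__main__":
    bot = AgentCapabilities("triage-bot", True, True, True)
    print(bot.name, has_lethal_trifecta(bot))
```

Removing any one of the three capabilities makes the check return False, which is the practical point of the framing: breaking a single leg is enough to break the attack chain.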

Johann Rehberger — Cross-Agent Privilege Escalation

Focus: Indirect prompt injection, AI coding agent exploitation, cross-agent attacks
Key contributions:
- Coined "Cross-Agent Privilege Escalation" — multiple coding agents (GitHub Copilot, Claude Code) on the same system can be tricked into escalating each other's privileges
- Demonstrated Gemini Advanced memory manipulation via "delayed tool invocation" — invisible instructions trigger on common words like "yes" or "sure"
- Demonstrated data exfiltration from Claude: attackers exploit tool capabilities to read user data, save it, and use Anthropic's own APIs to send files to attacker accounts
- Talk at 39C3: "Agentic ProbLLMs" — demonstrated RCE and data theft in AI coding agents
SecurityV0 relevance: Cross-Agent Privilege Escalation is essentially multi-identity `scope_drift`; SecurityV0's authority path model could visualize these escalation chains

Daniel Kang (UIUC) — Autonomous Agent Exploitation

Focus: Dangerous capabilities of AI agents, autonomous vulnerability exploitation
Key contributions:
- Showed that a GPT-4 agent can autonomously exploit 87% of the one-day vulnerabilities in the study's benchmark when given the CVE description (versus 0% for every other model tested)
- Showed teams of LLM agents can exploit zero-day vulnerabilities
- Created CVE-Bench — award-winning benchmark (ICML spotlight) used by frontier labs and governments to measure AI agents' exploitation capabilities
- Created InjecAgent — one of the first AI agent safety benchmarks
Links: Publications | Illinois Experts
SecurityV0 relevance: Demonstrates why `dormant_authority` on AI-connected identities is high-risk: an agent with standing permissions is an autonomous exploitation capability

Organizations and Frameworks

Invariant Labs — MCP Tool Poisoning

Focus: MCP security, tool poisoning attacks, agent guardrails
Key contributions:
- First to propose Tool Poisoning Attacks (TPA) paradigm in April 2025
- Built attack models: Shadowing Attacks, MCP Rug Pulls (tool descriptions change between approval and execution)
- Demonstrated GitHub MCP exploit — private repo data exfiltration via public issue injection
- Real-world impact: TPAs have compromised WhatsApp chat histories, GitHub private repos, and SSH credentials
- Released MCP-Scan — security scanner for MCP servers (now maintained by Snyk)
Links: Blog | MCP injection experiments repo

Trail of Bits — Agent Security Auditing

Focus: Threat modeling, prompt injection auditing, formal verification for AI
Key contributions:
- Audited Perplexity's Comet browser agent — found 4 prompt injection techniques that could exfiltrate private Gmail data
- Exposed agentic browser vulnerabilities resembling XSS and CSRF attacks
- Bypassed human approval protections for system command execution in AI agents, achieving RCE in 3 agent platforms
- Released Slither-MCP and security wrappers for MCP
- Placed second in DARPA's AI Cyber Challenge (AIxCC) finals at DEF CON 2025; the team's system found 28 vulnerabilities and patched 19
Links: Blog | awesome-ml-security | 2025 Year in Review

MITRE ATLAS — Adversarial AI Framework

Focus: Adversarial ML knowledge base mapping TTPs against AI systems (modeled after MITRE ATT&CK)
Current scope: 15 tactics, 66 techniques, 46 sub-techniques, 26 mitigations, 33 real-world case studies
2025 update: Collaborated with Zenity Labs to add 14 new techniques specifically for AI Agents and GenAI systems
Links: ATLAS | NIST presentation

Protect AI / Huntr — AI/ML Bug Bounty

Focus: World's first bug bounty platform for AI/ML vulnerabilities
Scale: 10,000+ security researchers, 125+ ML repos in scope, payouts up to $50,000 for critical vulnerabilities
Key data: Became the world's 5th largest CNA (CVE Numbering Authority), generating hundreds of vulnerability reports per year. 50.5% of vulnerabilities found have been fixed; 49.5% remain open.
Links: Huntr platform | AI Exploits collection

Wiz / Orca — AI Security Posture Management (AI-SPM)

Focus: Cloud-native AI security — discovering AI assets, misconfigurations, and exposure in cloud environments
Wiz AI-SPM: Maps entire AI estate via Security Graph, detects model exposure, prompt injection risk, misconfigured permissions. Found 85% of orgs using AI, 74% using managed AI services in State of AI in Cloud 2025 report.
Orca AISPM: Covers 50+ AI models, alerts on misconfigurations, overprivileged permissions, internet exposure. Provides automated remediation.
SecurityV0 differentiation: Wiz/Orca show configuration posture (what could happen). SecurityV0 shows execution posture (what did happen). Complementary, not competitive.
Links: Wiz AI-SPM | Orca AI-SPM

8. Critical Review (Codex Analysis)

Before the recommendation, key corrections and cautions from an independent review:

Scope Conflict

The original idea starts as a content experiment ("not a threat-intel product, for thought leadership only") but the research memo drifts into APIs, UI, and connector productization. This conflicts with:

SV0's own positioning as not a vulnerability scanner/SIEM/config linter (vision.md)
W1 explicitly excluding continuous monitoring and drift detection (definition.md)

Market Gap Is Narrower Than Claimed

The memo initially said "nobody owns execution-determined authority posture." But current vendor positioning is already close:

Zenity claims execution-path context, agent discovery, ownership mapping
Entro positions around mapping agents to NHIs and human owners, monitoring actions
CyberArk (acquired by Palo Alto Networks, closed Feb 11 2026) positioned around discovery, least privilege, drift monitoring, lifecycle, governance

The moat is not "we talk about agent authority." The moat is "we provide deterministic, first-party proof."

GitHub Connector Feasibility Caveat

The 6-8 week GitHub connector plan relies on the secret-scanning alerts API, but GitHub requires repo or org admin access for this endpoint. The plan therefore works for customer-owned repos, not for broad public signal gathering. Many findings from public scanning also won't satisfy W1's stricter definition of exposure (evidence-backed execution + reachable data + egress).

Timing

A standalone incident monitor pushes toward a low-moat threat-content business
Generic AI-agent-security messaging is already noisy: the OWASP Agentic Top 10 is live, Proofpoint acquired Acuvity (Feb 12 2026), and CyberArk (now Palo Alto Networks) described 2026 as the shift from pilots to production
The category exists. The window for category creation through vocabulary alone is closing.

8b. Recommendation (Revised)

Keep this inside SecurityV0. Do not build as a separate product.

The Clean Path

1. Now: SV0 Thought Leadership (Option B — content only)

Use public incidents to demonstrate the Authority Path framework
Every post answers: what authority existed, what identity carried it, where could data go, who owned it
Not a product, not an API — a content property that builds the SV0 vocabulary

2. Next: W1 Discovery/Assessment (already in progress)

Build paid SV0 around W1 first — point-in-time discovery inside customer environments
This is where the defensible moat lives: first-party execution evidence, not public scanning

3. Later: Continuous Drift/Monitoring (W2/W3)

Add continuous monitoring as the next wedge, not the first product
This is where the "Authority Breach Radar" concept lives — but only after W1 proves value

4. Feedback loop

Option B content validates which failure modes resonate most with CISOs
Informs W1 finding prioritization and W2/W3 feature design

The Positioning Line

Others tell CISOs that AI agent incidents are happening. SV0 shows where the same failure mode already exists in their environment — with proof.

9. Implementation Feasibility (Platform Analysis)

Based on deep exploration of the SecurityV0 codebase (sv0-platform, sv0-connectors, sv0-documentation).

Note: This section documents what's technically possible if/when a GitHub connector is prioritized for customer-owned repos (W2/W3 scope). It is not a recommendation to build this now — see Section 8b for the recommended path. The GitHub secret-scanning API requires admin access, so this applies to customer environments, not public signal gathering.

Verdict: Technically Feasible — 6-8 Weeks for Customer-Repo MVP

The platform already has all the building blocks. No architectural changes are needed.

What Already Exists

Component	Status	Reuse Level
Pluggable connector framework (NormalizedGraph output)	Production	100% — just add a new connector
Deterministic trigger evaluator (14 rules)	Production	100% — add new rule files
Temporal tracking (entity_versions, events, intervals)	Production	100% — works out of the box
Diff engine with circuit breakers	Production	100% — prevents data loss
Evidence pack generator (SHA256 integrity)	Production	100% — immutable proof
Findings API + UI (table, detail, timeline)	Production	80% — extend with incident-specific views
9-entity graph model	Production	100% — GitHub repos = workloads, leaked keys = credentials

New Connector: GitHub AI Signals

Pattern to follow: sv0-connectors/integrations/entra-servicenow/ (existing reference implementation)

sv0-connectors/integrations/github-ai-signals/
├── src/github_ai_signals/
│   ├── core/
│   │   ├── extractor.py      # GitHub API calls
│   │   └── transformer.py    # → NormalizedGraph
│   └── __init__.py
├── tests/
└── pyproject.toml

GitHub APIs to use:

GET /repos/{owner}/{repo}/secret-scanning/alerts (exposed credentials)
GET /repos/{owner}/{repo}/dependabot/alerts (vulnerable dependencies)
GET /repos/{owner}/{repo}/code-scanning/alerts (code issues)
GET /search/repositories?q={agent-framework-keywords} (find public AI agent repos)
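As a rough illustration of what the extractor's call to the first endpoint might look like, the sketch below builds (but does not send) a request for open secret-scanning alerts using only the standard library. The function name, the `acme/agent-svc` repo, and the token value are hypothetical; the headers and the `state`, `per_page`, and `page` parameters follow GitHub's public REST API conventions, and the token would need the admin-level access noted in Section 8.

```python
from urllib.parse import urlencode
from urllib.request import Request

GITHUB_API = "https://api.github.com"


def build_secret_alerts_request(owner: str, repo: str, token: str, page: int = 1) -> Request:
    """Build a GET request for one page of open secret-scanning alerts."""
    query = urlencode({"state": "open", "per_page": 100, "page": page})
    url = f"{GITHUB_API}/repos/{owner}/{repo}/secret-scanning/alerts?{query}"
    return Request(
        url,
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
            "X-GitHub-Api-Version": "2022-11-28",
        },
        method="GET",
    )


if __name__ == "__main__":
    # Hypothetical repo and token, shown only to illustrate the resulting URL.
    req = build_secret_alerts_request("acme", "agent-svc", "example-token")
    print(req.get_method(), req.full_url)
```

Keeping request construction separate from sending makes the extractor easy to unit test without network access; a real connector would also need pagination and rate-limit handling.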

Entity mapping:

GitHub repo → workload (nodeType)
Exposed API key → credential (nodeType)
Repo owner → owner (nodeType)
Edge: EXPOSED_CREDENTIAL (workload → credential)
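The mapping above could be sketched as a small transformer function. The real NormalizedGraph schema is not shown in this memo, so the dict layout, the ID formats, and the `secretType` and `state` attribute names here are assumptions; only the `nodeType` values and the `EXPOSED_CREDENTIAL` edge come from the mapping itself. The alert fields read (`number`, `secret_type`, `state`) are ones GitHub's secret-scanning alert objects expose.

```python
from typing import Any


def to_graph(owner: str, repo: str, alerts: list[dict[str, Any]]) -> dict[str, list[dict[str, Any]]]:
    """Map one repo and its secret-scanning alerts to nodes and edges."""
    workload_id = f"github:repo:{owner}/{repo}"  # hypothetical ID format
    owner_id = f"github:owner:{owner}"           # hypothetical ID format
    nodes: list[dict[str, Any]] = [
        {"id": workload_id, "nodeType": "workload", "name": f"{owner}/{repo}"},
        {"id": owner_id, "nodeType": "owner", "name": owner},
    ]
    edges: list[dict[str, Any]] = []
    for alert in alerts:
        cred_id = f"{workload_id}:secret-alert:{alert['number']}"
        # Deliberately copy metadata only; the raw secret value is never stored.
        nodes.append({
            "id": cred_id,
            "nodeType": "credential",
            "secretType": alert.get("secret_type", "unknown"),
            "state": alert.get("state", "unknown"),
        })
        edges.append({"type": "EXPOSED_CREDENTIAL", "from": workload_id, "to": cred_id})
    return {"nodes": nodes, "edges": edges}


if __name__ == "__main__":
    sample = [{"number": 7, "secret_type": "openai_api_key", "state": "open"}]
    print(to_graph("acme", "agent-svc", sample))
```

Deriving the credential ID from the alert number rather than the secret value keeps IDs stable across syncs while keeping secrets out of the graph.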

New Finding Rules (4-5 Rules, ~1-2 Hours Each)

Add to sv0-platform/src/evaluator/rules/:

Rule	Trigger	Severity
`ai_credential_exposed`	Credential found in GitHub secret scanning	CRITICAL
`ai_workload_public_exposure`	Public repo + secret_alerts > 0	HIGH
`ai_vulnerable_dependency`	Workload with Dependabot alerts on AI framework	HIGH
`ai_unknown_owner`	Exposed credential with no traceable owner	HIGH
`ai_egress_to_llm`	Workload with egress classified as LLM	HIGH

Pattern to follow: sv0-platform/src/evaluator/rules/orphaned-ownership.ts — each rule is ~50-100 lines of pure deterministic logic.
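To give a sense of how small such a rule can be, here is a logic-only sketch of the `ai_workload_public_exposure` trigger from the table above. The platform's rules are TypeScript files, so this Python version is not the implementation pattern itself; the `Workload` and `Finding` shapes and the `is_public` field name are assumptions, while the trigger condition and severity are taken from the table.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Workload:
    """Hypothetical view of a workload node as seen by a rule."""
    id: str
    is_public: bool
    secret_alerts: int


@dataclass(frozen=True)
class Finding:
    rule: str
    severity: str
    entity_id: str


def ai_workload_public_exposure(workload: Workload) -> Optional[Finding]:
    """Fire when a public repo has at least one secret-scanning alert."""
    if workload.is_public and workload.secret_alerts > 0:
        return Finding("ai_workload_public_exposure", "HIGH", workload.id)
    return None


if __name__ == "__main__":
    print(ai_workload_public_exposure(Workload("github:repo:acme/agent-svc", True, 2)))
```

The rule is a pure function of its input with no clock, randomness, or I/O, which is what makes the same graph always produce the same findings.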

API Extensions

Add to sv0-platform/src/api/routes/:

GET  /api/v1/incidents?severity=critical&type=ai_credential_exposed
GET  /api/v1/incidents/:id          # Detail + timeline
POST /api/v1/incidents/:id/acknowledge

Effort: 2-3 days (reuse existing FindingDoc schema + query patterns from findings.ts)

UI Extensions

Reuse existing components from sv0-platform/ui/src/:

Incident Feed Page — clone FindingsList.tsx, filter by AI finding types (~300 lines)
Incident Detail — extend existing FindingDetail with public exposure metadata (repo URL, commit hash, first/last detected)
Dashboard Widget — "Critical AI Exposures" count card (~100 lines)

Effort: 3-4 days (80% reuse of existing components)

Continuous Monitoring: Already Supported

The existing pipeline handles this natively:

Connectors run on schedule (1-24 hour intervals)
Diff engine detects changes automatically
Findings track intervals (active/resolved periods)
Entity versions preserve full temporal history
Circuit breakers prevent false negatives: a sync is blocked if more than 50% of entities disappear

No new infrastructure needed — just configure the GitHub connector sync frequency.
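The circuit-breaker behavior described above can be illustrated with a short check on entity IDs between two syncs. This is a sketch under stated assumptions, not the diff engine's actual code: the function name and the set-of-IDs input are hypothetical, and only the "block if more than 50% disappear" threshold comes from the description.

```python
def should_block_sync(previous_ids: set[str], current_ids: set[str],
                      max_loss_ratio: float = 0.5) -> bool:
    """Return True when too many previously seen entities are missing."""
    if not previous_ids:
        # First sync: nothing to compare against, so nothing can disappear.
        return False
    disappeared = previous_ids - current_ids
    return len(disappeared) / len(previous_ids) > max_loss_ratio


if __name__ == "__main__":
    before = {"repo-a", "repo-b", "repo-c", "repo-d"}
    after = {"repo-a"}  # e.g. the API token lost access to most repos
    print(should_block_sync(before, after))
```

Blocking in this case avoids treating a failed or partial extraction as if the missing entities, and their open findings, had genuinely gone away.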

Phase 2 Sources (3-4 Weeks Each, Independent)

Source	API	New Rules
CVE/NVD Feed	NVD API v2	`ai_framework_cve` — CVE filed against AI framework in use
MCP Server Exposure	Shodan/Censys API	`mcp_server_exposed` — public MCP endpoint detected
RSS Security Feeds	RSS + NLP	`ai_incident_mention` — customer infrastructure mentioned in security blog

Risks

Risk	Mitigation
GitHub API rate limits	OAuth app token, incremental sync mode, caching
False positives	`false_positive` status field; user can dismiss
Credential exposure in evidence packs	Never store raw secrets — hash only, RBAC on API
Public repo scope explosion	Require explicit customer repo list (no open-ended scanning)
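The "hash only" mitigation and the SHA256 integrity check on evidence packs could look roughly like the sketch below. Both function names and the evidence dict layout are hypothetical, and the canonical-JSON approach is one reasonable choice rather than the platform's confirmed method. A plain hash suits high-entropy API keys; for guessable secrets a keyed HMAC may be a safer fingerprint.

```python
import hashlib
import json
from typing import Any


def secret_fingerprint(raw_secret: str) -> str:
    """Return a SHA-256 hex digest so the raw secret never needs storing."""
    return hashlib.sha256(raw_secret.encode("utf-8")).hexdigest()


def evidence_digest(evidence: dict[str, Any]) -> str:
    """Hash a canonical JSON form so key order does not change the digest."""
    canonical = json.dumps(evidence, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    evidence = {
        "rule": "ai_credential_exposed",
        "repo": "acme/agent-svc",  # hypothetical repo
        "secret_fingerprint": secret_fingerprint("example-secret-value"),
    }
    print(evidence_digest(evidence))
```

Recomputing the digest later and comparing it with the stored value is enough to detect any change to the evidence contents.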

Key Files Reference

Architecture docs:

sv0-documentation/docs/architecture/00-overview.md — system design
sv0-documentation/docs/architecture/02-processing-pipeline.md — pipeline + scan safety
sv0-documentation/docs/architecture/05-connectors.md — connector interface contract

Platform code:

sv0-platform/src/evaluator/rules/ — 14 existing rule implementations
sv0-platform/src/evaluator/index.ts — rule evaluation pipeline
sv0-platform/src/ingestion/diff-engine.ts — change detection (280 lines)
sv0-platform/src/workers/handlers/sync-ingestion.ts — end-to-end pipeline (410 lines)
sv0-platform/src/api/routes/findings.ts — API template
sv0-platform/ui/src/pages/FindingsList.tsx — UI template

Connector examples:

sv0-connectors/integrations/entra-servicenow/src/entra_servicenow/core/transformer.py — transformer pattern (500+ lines)
sv0-connectors/integrations/azure-foundry/src/ — Azure integration pattern

10. Key Insight

The original question was "is there enough signal for a weekly incident feed?"

The answer: yes, overwhelmingly — but that's no longer the right question. A standalone incident monitor is a low-moat content business in a space that's already getting noisy. The part that's still early enough is SecurityV0's narrower thesis: deterministic proof of autonomous authority, ownership, and egress inside customer environments.

Use the incident signal as fuel for SV0 thought leadership. Build the paid product around W1 first-party evidence. Add continuous monitoring later.

The positioning that matters: others tell CISOs that AI agent incidents are happening. SV0 shows where the same failure mode already exists in their environment — with proof.
