AWS Connector — Research & Implementation Plan

Date: 2026-03-11
Status: Draft v2, review feedback incorporated (Delta blockers 1–4, warnings 5–7)
Scope: sv0-connectors (new aws connector), sv0-platform (graph model extensions)
Trigger: Inetum customer engagement confirmed AWS workload coverage is a critical gap; customers run mixed Azure + AWS environments where authority paths cross cloud boundaries.

1. Context and Motivation

SecurityV0 currently covers the Microsoft identity plane (Entra ID, Azure Foundry, ServiceNow). Enterprise customers — including Inetum — operate significant workloads on AWS: Lambda functions calling RDS, ECS containers pulling from ECR, Bedrock agents orchestrating across accounts, and Step Functions triggering cross-account executions.

Without an AWS connector, authority paths that originate in or terminate on AWS resources cannot be materialised. This is a visibility gap: a Lambda function assuming an over-privileged IAM Role and reading production S3 data is structurally identical to the Azure Logic App → ServiceNow scenarios we already model — but completely invisible to the platform.

What the AWS connector must answer

What workloads run on AWS and what identities do they assume?
What resources can those identities access, and via which policies?
What is the code execution chain? (ECR → container → task role → resource)
Where does authority cross account or cloud boundaries?
What is the observed execution evidence? (CloudTrail as temporal data source)

2. AWS Identity Model — Key Concepts

Understanding the AWS IAM model is prerequisite to correct graph modelling.

2.1 Identity Types

AWS Entity	SecurityV0 Entity Type	Notes
IAM User	`identity` (`subtype: iam_user`)	Human or programmatic; avoid in modern AWS
IAM Role	`identity` (`subtype: iam_role`)	Primary execution identity; assumed via STS
IAM Group	`identity` (`subtype: iam_group`)	Collection of users that inherit its attached policies; cannot be assumed or act as a principal
IAM Identity Center Permission Set	`identity` (`subtype: sso_permission_set`)	Federated human access; maps to role assumption
Service-Linked Role	`identity` (`subtype: service_linked_role`)	AWS-managed; low risk but should be visible
Instance Profile	edge annotation	Wraps an IAM Role for EC2 attachment; not a separate entity
OIDC Federated Identity	edge (`TRUSTS`)	Kubernetes Service Account → IAM Role via IRSA

2.2 Policy Types (Authority Sources)

Policy Type	Priority for Graph
Identity-based policies (inline + managed)	Critical — direct authority grant
Resource-based policies (S3 Bucket Policy, Lambda Resource Policy)	High — cross-account and cross-service authority
Permission Boundaries	High — ceiling on effective permissions; must be modelled as a constraint edge
Service Control Policies (SCPs)	High — org-level ceiling; organisation connector scope
Session Policies	Medium — short-lived; capture from CloudTrail
ACLs (legacy S3)	Low — mostly legacy, can be deferred

2.3 Structural vs Effective Permissions — Critical Scope Caveat

This connector models structural (granted) permissions, not effective permissions.

AWS IAM Conditions (aws:SourceVpc, aws:RequestedRegion, aws:PrincipalOrgID, etc.) are not evaluated by the connector. A policy statement that grants s3:ListBucket with a condition like "StringEquals": {"s3:prefix": "logs/"} will be ingested as an unconditional GRANTS edge to that S3 bucket.

This means:

Authority paths may over-report reachability when conditions restrict access at runtime
The via_roles and actions fields reflect policy text, not runtime enforcement
Wherever conditions are material to risk, the finding explanation should carry a conditions_not_evaluated: true flag and the UI should surface a caveat

Implication for Phase 1: all authority paths produced by the AWS connector carry an implicit "structurally reachable, conditions not evaluated" qualifier. This is consistent with how the Azure connector handles ARM RBAC conditions today. The caveat must be documented in the customer-facing setup guide and in the UI tooltip for AWS-sourced paths.

Conditions worth modelling in a future phase: aws:PrincipalOrgID (org boundary), aws:SourceAccount (confused-deputy prevention), sts:ExternalId (cross-account guard), aws:MultiFactorAuthPresent. These reduce the effective authority surface and would shrink false-positive path counts meaningfully.
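The structural ingestion rule above can be sketched as follows. This is a minimal illustration, not the connector's implementation: the function name `statement_to_grants` and the `condition_keys` field are hypothetical, and the flag is set only when a Condition block is present, which is one possible reading of the rule.

```python
from typing import Any


def _as_list(value: Any) -> list:
    """IAM allows either a single string or a list for Action and Resource."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def statement_to_grants(policy_arn: str, statement: dict) -> list[dict]:
    """Emit one structural GRANTS edge per resource in an Allow statement.

    Conditions are recorded for later phases but never evaluated.
    """
    if statement.get("Effect") != "Allow":
        return []
    condition = statement.get("Condition") or {}
    # Collect the condition keys (e.g. "s3:prefix") across all operators.
    condition_keys = sorted(
        key for operands in condition.values() for key in operands
    )
    return [
        {
            "edge": "GRANTS",
            "source": policy_arn,
            "target": resource,
            "actions": _as_list(statement.get("Action")),
            "conditions_not_evaluated": bool(condition),
            "condition_keys": condition_keys,  # hypothetical field
        }
        for resource in _as_list(statement.get("Resource"))
    ]
```

Keeping the condition keys on the edge means a later phase can start evaluating aws:PrincipalOrgID or sts:ExternalId without re-ingesting policies.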

2.4 Trust Relationships

Every IAM Role has a Trust Policy that defines who can assume it. This is the primary mechanism for cross-account and cross-service authority chains:

Lambda Service Principal → AssumeRole → Execution Role
ECS Task Definition       → AssumeRole → Task Role
Bedrock Agent             → AssumeRole → Agent Execution Role
Cross-Account Caller      → AssumeRole → Role in Target Account
EC2 Instance Profile      → AssumeRole → Role
OIDC Provider (EKS IRSA)  → AssumeRole → Role (via web identity)

These trust relationships map directly to RUNS_AS and ASSUMES edges in the SecurityV0 graph.
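As an illustration, a trust policy document can be flattened into edge records along these lines. The sketch assumes the edge points from the role that owns the trust policy to the trusted principal and uses a `TRUSTS` label; the function name and the `cross_account` and `external_id_required` fields are hypothetical.

```python
def _as_list(value):
    return value if isinstance(value, list) else [value]


def _account_of(principal: str) -> str:
    """Return the account ID from an ARN or a bare 12-digit account principal."""
    if principal.isdigit():
        return principal
    parts = principal.split(":")
    return parts[4] if len(parts) > 4 else ""


def trust_edges(role_arn: str, trust_policy: dict) -> list[dict]:
    """Flatten the Allow statements of a role trust policy into edge records."""
    role_account = role_arn.split(":")[4]
    edges = []
    for stmt in _as_list(trust_policy.get("Statement", [])):
        if stmt.get("Effect") != "Allow":
            continue
        principal = stmt.get("Principal", {})
        if principal == "*":
            principal = {"AWS": "*"}
        condition = stmt.get("Condition") or {}
        guarded = any("sts:ExternalId" in ops for ops in condition.values())
        for kind, values in principal.items():
            for value in _as_list(values):
                edges.append({
                    "edge": "TRUSTS",
                    "source": role_arn,
                    "target": value,
                    "principal_type": kind,  # "Service", "AWS" or "Federated"
                    "cross_account": kind == "AWS"
                    and _account_of(value) != role_account,
                    "external_id_required": guarded,
                })
    return edges
```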

3. Workload Types to Model

3.1 AWS Lambda

What to collect:

Function name, ARN, runtime, description, last modified
Execution role ARN (RUNS_AS edge)
Resource-based policy (who can invoke the function — cross-service / cross-account)
VPC configuration (network isolation context)
Package type: Zip vs Image — if Image, ECR repository URI + image digest
Concurrency settings (reserved / provisioned)
Environment variable keys (never values, which can contain secrets or PII; the presence of keys such as DB_URL or SECRET_ARN informs destination inference)
Layers (shared code; potential additional authority surface)
Event source mappings (what triggers this function: SQS, DynamoDB Streams, Kinesis, EventBridge)

Graph edges:

Lambda Function → RUNS_AS         → IAM Role (execution role)
Lambda Function → TRIGGERED_BY    → EventBridge Rule / SQS Queue / SNS Topic
Lambda Function → DEPLOYED_FROM   → ECR Repository (container image)
Lambda Function → READS_FROM      → S3 Bucket / DynamoDB Table (env var inference)
IAM Role        → HAS_POLICY      → IAM Policy
IAM Policy      → GRANTS          → Permission (action + resource ARN)
Permission      → APPLIES_TO      → AWS Resource (S3, DynamoDB, RDS, etc.)
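The first edges in this list could be derived from a single function description roughly as below. The sketch assumes the input has the shape of a boto3 `get_function` response (`Configuration` and `Code` keys); `lambda_to_graph` and the returned dictionary layout are hypothetical names for illustration.

```python
def lambda_to_graph(fn: dict) -> dict:
    """Derive RUNS_AS / DEPLOYED_FROM edges and env var keys for one function."""
    cfg = fn["Configuration"]
    arn = cfg["FunctionArn"]
    edges = [{"edge": "RUNS_AS", "source": arn, "target": cfg["Role"]}]

    if cfg.get("PackageType") == "Image":
        image_uri = fn.get("Code", {}).get("ImageUri")
        if image_uri:
            # Strip the digest ("@sha256:...") or tag (":latest") suffix.
            repo_uri = image_uri.split("@")[0].split(":")[0]
            edges.append({
                "edge": "DEPLOYED_FROM",
                "source": arn,
                "target": repo_uri,
                "image": image_uri,
            })

    variables = cfg.get("Environment", {}).get("Variables", {})
    return {
        "entity": arn,
        "edges": edges,
        # Keys only: values are deliberately dropped before leaving the extractor.
        "env_var_keys": sorted(variables),
    }
```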

3.2 ECS (Elastic Container Service)

What to collect:

Task Definition: family, revision, task role ARN, execution role ARN
Container definitions: image URI (ECR reference), command, environment variable keys
Services: cluster, desired count, launch type (Fargate vs EC2)
Cluster configuration

Key distinction: ECS has two roles:

Execution Role — used by ECS agent to pull ECR images, write CloudWatch logs
Task Role — used by the application code inside the container

Both must be captured. The Task Role is the workload's runtime identity (authority source). The Execution Role is infrastructure-level (lower priority but relevant for ECR pull chain).

Graph edges:

ECS Task Definition → RUNS_AS       → IAM Task Role
ECS Task Definition → DEPLOYED_FROM → ECR Repository (image URI)
ECS Task Definition → PULLS_VIA     → IAM Execution Role (ECR pull authority)
ECS Service         → RUNS          → ECS Task Definition
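A minimal sketch of how the two roles might be kept apart during transformation. It assumes the input is the `taskDefinition` object from a boto3 `describe_task_definition` response; the function name is hypothetical.

```python
def task_definition_edges(task_def: dict) -> list[dict]:
    """Emit RUNS_AS, PULLS_VIA and DEPLOYED_FROM edges for one task definition."""
    arn = task_def["taskDefinitionArn"]
    edges = []

    # Task Role: the identity the application code runs as.
    if task_def.get("taskRoleArn"):
        edges.append({"edge": "RUNS_AS", "source": arn,
                      "target": task_def["taskRoleArn"]})

    # Execution Role: used by the ECS agent for image pulls and logging.
    if task_def.get("executionRoleArn"):
        edges.append({"edge": "PULLS_VIA", "source": arn,
                      "target": task_def["executionRoleArn"]})

    for container in task_def.get("containerDefinitions", []):
        image = container.get("image", "")
        if ".dkr.ecr." in image:  # only ECR-hosted images join the ECR chain
            edges.append({"edge": "DEPLOYED_FROM", "source": arn, "target": image})
    return edges
```

A task definition with no `taskRoleArn` yields no RUNS_AS edge, which is itself a useful signal: the containers have no runtime identity of their own.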

3.3 ECR (Elastic Container Registry) — The Code-Deploy Chain

ECR is the source of execution artefacts for containerised workloads. It is the AWS equivalent of an artefact registry and sits at the start of the execution chain.

What to collect:

Repository name, ARN, URI, account ID
Repository policy (who can pull/push — cross-account access)
Image tags and digests in use (link to live task definitions / Lambda functions)
Lifecycle policy (image retention — affects drift detection)
Encryption configuration (KMS key ARN)
Scan findings summary (ECR enhanced scanning via Inspector)

Why ECR matters for authority paths: An ECR repository with a permissive resource-based policy is an injection point — a cross-account role that can push images can alter what code runs inside the task, bypassing IAM role controls entirely. This is a high-severity pattern: the authority chain runs through the artefact, not just the role.

Graph edges:

ECR Repository      → HOSTS         → Container Image (tag/digest)
Container Image     → EXECUTED_BY   → Lambda Function / ECS Task Definition
ECR Repository      → PULL_ACCESS   → IAM Role (from repository policy)
ECR Repository      → PUSH_ACCESS   → IAM Role (CI/CD service role — high sensitivity)
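To make the injection-point pattern concrete, the sketch below scans a parsed repository policy for principals that hold push rights. The list of push actions, the function name, and the `cross_account` flag are assumptions for illustration; the policy is expected as a dict (the `policyText` string from ECR parsed with `json.loads`).

```python
from fnmatch import fnmatchcase

# Actions needed to upload an image, lower-cased for case-insensitive matching.
PUSH_ACTIONS = (
    "ecr:putimage",
    "ecr:initiatelayerupload",
    "ecr:uploadlayerpart",
    "ecr:completelayerupload",
)


def _as_list(value):
    return value if isinstance(value, list) else [value]


def _account_of(principal: str) -> str:
    if principal.isdigit():
        return principal
    parts = principal.split(":")
    return parts[4] if len(parts) > 4 else ""


def push_access_edges(repo_arn: str, repo_policy: dict) -> list[dict]:
    """Return PUSH_ACCESS edges for every AWS principal allowed to push."""
    repo_account = repo_arn.split(":")[4]
    edges = []
    for stmt in _as_list(repo_policy.get("Statement", [])):
        if stmt.get("Effect") != "Allow":
            continue
        patterns = [a.lower() for a in _as_list(stmt.get("Action", []))]
        if not any(fnmatchcase(p, pat) for p in PUSH_ACTIONS for pat in patterns):
            continue
        principal = stmt.get("Principal", {})
        targets = ["*"] if principal == "*" else _as_list(principal.get("AWS", []))
        for target in targets:
            edges.append({
                "edge": "PUSH_ACCESS",
                "source": repo_arn,
                "target": target,
                "cross_account": target == "*"
                or _account_of(target) != repo_account,
            })
    return edges
```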

3.4 AWS Bedrock (AI Workloads)

Given increasing Bedrock adoption, this is high value for the AI-workload coverage story.

What to collect:

Bedrock Agent: ID, name, execution role ARN, foundation model ID
Agent Action Groups: Lambda function ARN (the code the agent executes)
Knowledge Bases: ID, data source (S3 bucket), embedding model
Guardrails: ID, topics blocked, PII filtering
Model invocation logging: enabled/disabled (audit trail)

Graph edges:

Bedrock Agent       → RUNS_AS       → IAM Execution Role
Bedrock Agent       → INVOKES       → Lambda Function (action group)
Bedrock Agent       → READS_FROM    → S3 Bucket (knowledge base source)
Lambda Function     → RUNS_AS       → IAM Role (action group execution)

Why this matters: A Bedrock Agent is an autonomous workload. Its authority path — Agent → Role → S3 data → external API — is exactly the LLM egress pattern (egress_category: llm) already defined in the authority paths model. The data model supports it; we just need the connector.

3.5 Step Functions

What to collect:

State machine: ARN, name, type (Standard / Express), IAM role
Definition: states, resource ARNs invoked (Lambda, ECS, DynamoDB, etc.)
Execution logging configuration

Graph edges:

Step Functions SM   → RUNS_AS       → IAM Role
Step Functions SM   → INVOKES       → Lambda Function / ECS Task / SNS / SQS

Step Functions are orchestrators — they chain workloads. A state machine with an over-privileged role can invoke Lambda, start ECS tasks, write to DynamoDB, and call external APIs in a single execution. Visibility into the full invocation graph is critical for authority path reconstruction.
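One way to recover the invocation graph is to walk the state machine definition (Amazon States Language, as a parsed dict) and collect what each Task state calls. This is a sketch under simplifying assumptions: it handles Task, Parallel and Map states only, and for service integrations it reads a static `Parameters.FunctionName` when one exists. The function name is hypothetical.

```python
def task_targets(definition: dict) -> list[tuple[str, str]]:
    """Return (state_name, target) pairs for every Task state, recursively."""
    found = []
    for name, state in definition.get("States", {}).items():
        kind = state.get("Type")
        if kind == "Task":
            resource = state.get("Resource", "")
            params = state.get("Parameters") or {}
            # Service integrations such as arn:aws:states:::lambda:invoke name
            # the real target in Parameters; fall back to Resource otherwise.
            found.append((name, params.get("FunctionName", resource)))
        elif kind == "Parallel":
            for branch in state.get("Branches", []):
                found.extend(task_targets(branch))
        elif kind == "Map":
            inner = state.get("ItemProcessor") or state.get("Iterator") or {}
            found.extend(task_targets(inner))
    return found
```

Each pair would become an INVOKES edge from the state machine to the target; dynamically resolved targets (keys ending in `.$`) stay unresolved and keep the integration ARN.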

3.6 EventBridge

What to collect:

Rules: name, event pattern or schedule, target ARNs, IAM role (if cross-account)
Targets: Lambda, ECS, Step Functions, SNS, SQS, external API destinations

EventBridge rules are trigger sources for workloads. The TRIGGERED_BY edge completes the "what started this?" question in authority path lineage.

3.7 Sensitive Data Stores (Secrets Manager + SSM Parameter Store)

These are the highest-value destination types for sensitive credential data and were missing from the initial draft.

AWS Secrets Manager:

Secret ARN, name, description, KMS key ID (encryption key)
Resource policy (who can read cross-account)
Rotation status (unrotated secrets = stale credential risk)
CloudTrail GetSecretValue events — direct evidence of access

AWS Systems Manager Parameter Store:

Parameter name, ARN, type (String / SecureString / StringList)
KMS key ID (for SecureString)
GetParameter / GetParameters CloudTrail events

Both are resource entities with subtype: secrets_manager_secret / subtype: ssm_parameter. IAM policies granting secretsmanager:GetSecretValue or ssm:GetParameter create GRANTS edges to these destinations with sensitivity: restricted by default.

Graph edges:

IAM Policy     → GRANTS      → Secrets Manager Secret (sensitivity: restricted)
IAM Policy     → GRANTS      → SSM SecureString Parameter (sensitivity: restricted)
Lambda Function → READS_FROM → Secrets Manager Secret (env var `SECRET_ARN` inference)

Minimum policy additions (update §6):

"secretsmanager:ListSecrets", "secretsmanager:DescribeSecret", "secretsmanager:GetResourcePolicy",
"ssm:DescribeParameters", "ssm:ListTagsForResource"

Note: do not add secretsmanager:GetSecretValue, ssm:GetParameter, ssm:GetParameters, or ssm:GetParametersByPath. All four return secret or parameter values, and the connector reads metadata only, never values. ssm:DescribeParameters already returns the name, type, and KMS key ID listed above.

3.8 Lambda → Lambda Direct Invocation

Lambda functions can invoke other Lambda functions directly via lambda:InvokeFunction. This creates workload-to-workload authority edges that are distinct from the workload→identity→resource chain.

Graph edge:

Lambda Function A → INVOKES → Lambda Function B
Lambda Function B → RUNS_AS → IAM Role B → HAS_POLICY → ...

This is significant because a low-sensitivity Lambda with a permissive execution role can transitively reach high-sensitivity data by invoking a privileged Lambda. The auth_chain_depth must be incremented per hop.

How to detect: lambda:InvokeFunction permissions in identity-based policies + Lambda resource-based policies listing caller ARNs. Also collectible from CloudTrail Invoke events with a non-human caller identity.

Defer to Phase 2 but include in schema from Phase 1 to avoid a breaking change later.
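The per-hop increment of auth_chain_depth can be illustrated with a breadth-first walk over INVOKES edges. The adjacency-map input and the function name are assumptions for this sketch; the visited check keeps mutually invoking functions from looping.

```python
from collections import deque


def invoke_depths(start: str, invokes: dict[str, list[str]]) -> dict[str, int]:
    """Return the minimum number of INVOKES hops from `start` to each function."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for callee in invokes.get(current, ()):
            if callee not in depths:  # first visit is the shortest chain
                depths[callee] = depths[current] + 1
                queue.append(callee)
    return depths
```

For example, with `{"A": ["B"], "B": ["C", "A"]}` a walk from `A` reaches `B` at depth 1 and `C` at depth 2, so a path that ends at `C`'s role carries two extra hops.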

4. Cross-Account and Multi-Account Modelling

AWS Organizations is the control plane for multi-account AWS environments. This is where Service Control Policies (SCPs) live — they are org-level authority ceilings that cannot be exceeded by any IAM policy in a member account.

4.1 What to collect (Organizations / IAM Identity Center)

AWS Organizations: Management account, member accounts, OUs, applied SCPs
IAM Identity Center (SSO): Permission sets, account assignments (user/group → permission set → account)
Cross-account trust policies: IAM roles with sts:AssumeRole trust to external accounts

4.2 Graph model for cross-account authority

SCP (Org-level)     → CONSTRAINS    → AWS Account (ceiling on all roles in account)
IAM Role (Account A)→ TRUSTS        → IAM Role (Account B) [cross-account AssumeRole]
IAM Role (Account A)→ TRUSTS        → AWS Service Principal (service-to-service)
IAM Role            → SUBJECT_TO    → Permission Boundary (per-role ceiling)

The CONSTRAINS and SUBJECT_TO edges are ceiling edges — they reduce effective authority. The graph engine must apply these during authority path materialisation to compute the effective permission set, not just the granted set.
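A deliberately simplified sketch of how ceilings narrow a granted set: an action survives only if the identity policy, the permission boundary (when present) and the SCP (when present) all allow it. It matches action patterns only and ignores explicit Deny statements, resources and conditions; the function names are hypothetical.

```python
from fnmatch import fnmatchcase
from typing import Optional


def _allowed(action: str, patterns: list[str]) -> bool:
    """Case-insensitive wildcard match of one action against allow patterns."""
    lowered = action.lower()
    return any(fnmatchcase(lowered, pattern.lower()) for pattern in patterns)


def effective_actions(
    candidates: list[str],
    identity_allow: list[str],
    boundary_allow: Optional[list[str]] = None,
    scp_allow: Optional[list[str]] = None,
) -> list[str]:
    """Keep the candidate actions that every applicable layer allows."""
    result = []
    for action in candidates:
        if not _allowed(action, identity_allow):
            continue
        # A missing ceiling (None) imposes no restriction.
        if boundary_allow is not None and not _allowed(action, boundary_allow):
            continue
        if scp_allow is not None and not _allowed(action, scp_allow):
            continue
        result.append(action)
    return result
```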

5. CloudTrail — Temporal Evidence

CloudTrail is the AWS equivalent of execution logs. It provides the observed execution evidence the authority paths model depends on.

5.1 Priority events to capture

CloudTrail Event	Maps To
`AssumeRole`	Confirms a role assumption occurred; populates `execution_30d`, `last_execution_at`
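`Invoke` (Lambda)	Confirms function execution; links to workload. Lambda data event, so data event logging must be enabled on the trail
`GetSecretValue` (Secrets Manager)	Sensitive data access; informs `data_domain`
`GetObject` / `PutObject` (S3)	Data read/write; informs `actions[]`. S3 data event, so data event logging must be enabled on the trail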
`DescribeInstances` / resource APIs	Resource enumeration; scope drift signal
`CreateRole` / `AttachRolePolicy`	Permission change event; drift trigger
`SwitchRole` (Console)	Human cross-account access

5.2 Architecture — Why `cloudtrail:LookupEvents` is not viable

Do not use cloudtrail:LookupEvents as the primary ingestion path.

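The LookupEvents API is rate-limited to 2 requests per second per account per Region and returns at most 50 events per page, so throughput tops out at roughly 100 events per second. Backfilling a 30-day window for a production account that generates thousands of events per minute would be throttled for far longer than a scan window allows and would return incomplete data. LookupEvents also covers only management events from the last 90 days, so data events such as Lambda Invoke and S3 GetObject never appear, and each call accepts a single lookup attribute, so filtering on several event names requires separate paginated calls. It is useful for one-off debugging, not systematic ingestion.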

5.3 Recommended architecture: S3 + Athena

Option A — S3 direct read (recommended for MVP):

Customer has a CloudTrail trail writing to an S3 bucket (standard setup in most enterprises)
SV0 connector is granted s3:GetObject + s3:ListBucket on that bucket with a path prefix filter (e.g., AWSLogs/{account_id}/CloudTrail/)
Connector downloads compressed JSON log files for the target 30-day window, decompresses, filters for priority events
Aggregates AssumeRole + Invoke + GetSecretValue + GetObject calls per (principal, resource) pair → populates execution_30d / last_execution_at

Cost: S3 GET requests per scan. At ~10K log files per 30 days this is cents per scan. Customer controls the bucket.
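The decompress, filter and aggregate steps of Option A might look like the sketch below for a single downloaded log object. It assumes the standard CloudTrail file layout (a gzip-compressed JSON document with a top-level `Records` array); the resource-resolution order and the function names are assumptions, and restricting the input to the 30-day window is left to the caller.

```python
import gzip
import json
from typing import Optional

PRIORITY_EVENTS = {"AssumeRole", "Invoke", "GetSecretValue", "GetObject"}


def _resource_of(record: dict) -> str:
    """Best-effort resource ARN: the resources list first, then the role ARN."""
    for resource in record.get("resources") or []:
        if resource.get("ARN"):
            return resource["ARN"]
    params = record.get("requestParameters") or {}
    return params.get("roleArn", "unknown")


def aggregate_log_file(raw_gz: bytes, stats: Optional[dict] = None) -> dict:
    """Fold one CloudTrail log file into per-(principal, resource) counters."""
    stats = {} if stats is None else stats
    records = json.loads(gzip.decompress(raw_gz)).get("Records", [])
    for record in records:
        if record.get("eventName") not in PRIORITY_EVENTS:
            continue
        identity = record.get("userIdentity") or {}
        principal = identity.get("arn") or identity.get("invokedBy") or "unknown"
        key = (principal, _resource_of(record))
        entry = stats.setdefault(
            key, {"execution_30d": 0, "last_execution_at": ""}
        )
        entry["execution_30d"] += 1
        # ISO 8601 UTC timestamps sort correctly as plain strings.
        entry["last_execution_at"] = max(
            entry["last_execution_at"], record.get("eventTime", "")
        )
    return stats
```

Passing the same `stats` dict across files lets the extractor stream through the prefix without holding more than one log file in memory.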

Option B — Athena query (recommended for large accounts):

Customer creates an Athena table over the CloudTrail S3 prefix (AWS provides a standard DDL for this)
SV0 connector runs a parameterised Athena query with a 30-day eventTime filter and eventName IN (...) clause
Query results written to a customer-controlled S3 output bucket, connector reads results

Cost: Athena charges ~$5/TB scanned. For 30 days of typical CloudTrail data (~50 GB) this is ~$0.25 per scan.
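A sketch of how the Option B query text could be assembled. The column names (`eventtime`, `eventname`, `useridentity.arn`) are assumed to follow the lower-case layout of the AWS-provided CloudTrail table definition, and the table name is supplied by the customer. Rather than Athena execution parameters, this version validates its inputs and interpolates them, which is a simplification made for the example.

```python
import re
from datetime import datetime, timedelta, timezone

_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_.]*$")


def build_cloudtrail_query(
    table: str, event_names: list[str], now: datetime, days: int = 30
) -> str:
    """Build an aggregate query over the last `days` days of CloudTrail events."""
    if not _IDENTIFIER.match(table):
        raise ValueError(f"unsafe table name: {table!r}")
    if not event_names or not all(name.isalnum() for name in event_names):
        raise ValueError("event names must be non-empty and alphanumeric")

    start = (now - timedelta(days=days)).strftime("%Y-%m-%dT%H:%M:%SZ")
    names = ", ".join(f"'{name}'" for name in event_names)
    return (
        "SELECT useridentity.arn AS principal, eventname, "
        "count(*) AS calls, max(eventtime) AS last_seen "
        f"FROM {table} "
        f"WHERE eventtime >= '{start}' AND eventname IN ({names}) "
        "GROUP BY useridentity.arn, eventname"
    )
```

The resulting string would be passed to `athena:StartQueryExecution`; the `calls` and `last_seen` columns map onto execution_30d and last_execution_at.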

Option C — EventBridge streaming (future): Customer enables CloudTrail → EventBridge; connector subscribes via SQS for near-real-time ingestion. More complex and requires customer-side infrastructure. Design Option A so this is addable as a configuration flag.

Recommendation: Start with Option A for MVP. Document Option B as the upgrade path for accounts producing >1 GB/day of CloudTrail logs.

6. Authentication Strategy

The connector needs AWS credentials. Options:

Method	Pros	Cons
Cross-account IAM Role (recommended)	No long-lived credentials; customer creates role with read-only policies	Requires customer to create role; ARN config per tenant
IAM User + Access Key	Simple to set up	Long-lived credentials; rotation risk
IAM Roles Anywhere (cert-based)	Works from non-AWS hosts	PKI setup complexity

Recommended approach: Cross-account read-only IAM Role with an external ID (ExternalId condition). Customer creates SecurityV0ReadOnlyRole in their account; connector assumes it via STS using the SV0 management account. ExternalId prevents confused-deputy attacks.
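For reference, the trust policy the customer attaches to SecurityV0ReadOnlyRole would have roughly this shape. The builder function and its parameter names are hypothetical; the principal ARN and the ExternalId value are supplied during onboarding.

```python
import json


def customer_trust_policy(sv0_principal_arn: str, external_id: str) -> dict:
    """Trust policy allowing one SV0 principal to assume the role with ExternalId."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": sv0_principal_arn},
                "Action": "sts:AssumeRole",
                # Without a matching ExternalId the AssumeRole call is denied.
                "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder values for illustration only.
    policy = customer_trust_policy(
        "arn:aws:iam::111122223333:user/sv0-connector-example", "example-external-id"
    )
    print(json.dumps(policy, indent=2))
```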

6.1 Bootstrap Credential Problem

The research previously left this unaddressed. This section resolves it.

The cross-account AssumeRole flow requires SV0 to have a starting AWS identity before it can call sts:AssumeRole into the customer account. The connector runs outside AWS (Docker on a Mac Mini M4), so it cannot use EC2 instance metadata or ECS task roles.

Resolved approach — SV0 Service Account IAM User per tenant:

SecurityV0 maintains a dedicated IAM User (e.g., sv0-connector-{tenant_id}) in a SV0-owned AWS management account. This user has only one permission: sts:AssumeRole on the specific customer role ARN.
The IAM User's access key + secret key are stored in 1Password per tenant, resolved at container start into env vars (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY).
The connector calls sts:AssumeRole with RoleArn=customer_role_arn and ExternalId=tenant_specific_secret to get short-lived session credentials.
All subsequent API calls use the short-lived session. Credentials expire after 1 hour (configurable up to 12h); connector refreshes before expiry.

This is structurally identical to how the ServiceNow connector handles Basic Auth credentials — a long-lived credential stored in 1Password bootstraps a shorter-lived session.
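The AssumeRole and refresh steps could be sketched as follows. The STS client is passed in so the logic can be exercised without AWS access; with boto3 it would be `boto3.client("sts")`, which reads AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY from the environment. The session name, the five-minute refresh margin and the function names are assumptions.

```python
from datetime import datetime, timedelta


def assume_customer_role(
    sts_client,
    role_arn: str,
    external_id: str,
    duration_seconds: int = 3600,
) -> dict:
    """Exchange the bootstrap identity for short-lived session credentials."""
    response = sts_client.assume_role(
        RoleArn=role_arn,
        RoleSessionName="sv0-connector",  # hypothetical session name
        ExternalId=external_id,
        DurationSeconds=duration_seconds,
    )
    # Contains AccessKeyId, SecretAccessKey, SessionToken and Expiration.
    return response["Credentials"]


def needs_refresh(
    credentials: dict, now: datetime, margin: timedelta = timedelta(minutes=5)
) -> bool:
    """True when the session expires within `margin` of `now`."""
    return credentials["Expiration"] - now <= margin
```

A long scan would call `needs_refresh` before each extractor run and call `assume_customer_role` again when it returns True.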

Security controls:

The IAM User has no permissions except sts:AssumeRole on the one role ARN
Customer's trust policy requires sts:ExternalId match — prevents impersonation even if the key leaks
Keys are rotated on the same schedule as other secrets (monthly minimum)
Key rotation is documented in the connector setup guide as a customer onboarding step

Future enhancement: Replace the IAM User with IAM Roles Anywhere + a certificate from the SV0 internal CA. Eliminates long-lived credentials entirely. Flag as a Phase 3 security hardening task.

6.2 Minimum Policy for the Customer Read-Only Role

Note: cloudtrail:LookupEvents has been removed (not viable at scale — see §5.2). CloudTrail access is now via S3 GetObject.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "iam:List*", "iam:Get*",
        "lambda:List*", "lambda:Get*",
        "ecs:List*", "ecs:Describe*",
        "ecr:List*", "ecr:Describe*", "ecr:GetRepositoryPolicy",
        "ecr:BatchGetImage",
        "bedrock:List*", "bedrock:Get*",
        "states:List*", "states:Describe*",
        "events:List*", "events:Describe*",
        "organizations:List*", "organizations:Describe*",
        "s3:ListAllMyBuckets", "s3:GetBucketPolicy", "s3:GetBucketAcl", "s3:GetBucketLocation",
        "secretsmanager:ListSecrets", "secretsmanager:DescribeSecret", "secretsmanager:GetResourcePolicy",
        "ssm:DescribeParameters", "ssm:ListTagsForResource",
        "kms:ListKeys", "kms:DescribeKey", "kms:GetKeyPolicy",
        "cloudtrail:GetTrailStatus", "cloudtrail:DescribeTrails",
        "athena:StartQueryExecution", "athena:GetQueryExecution", "athena:GetQueryResults"
      ],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:ListBucket"],
      "Resource": [
        "arn:aws:s3:::{customer-cloudtrail-bucket}",
        "arn:aws:s3:::{customer-cloudtrail-bucket}/*",
        "arn:aws:s3:::{customer-athena-results-bucket}",
        "arn:aws:s3:::{customer-athena-results-bucket}/*"
      ]
    }
  ]
}

The CloudTrail and Athena bucket ARNs are provided by the customer during connector onboarding. Using Resource: "*" for S3 GetObject would be excessively broad and should be resisted even when customers offer it.

7. Connector Architecture — Implementation Sketch

Following the existing Extract → Transform → Diff → Load pattern:

7.1 Extract Phase

AWSExtractor
  ├── IAMExtractor         → users, roles, groups, policies, trust docs
  ├── LambdaExtractor      → functions, execution roles, event sources
  ├── ECSExtractor         → task definitions, services, clusters
  ├── ECRExtractor         → repositories, repository policies, image tags
  ├── BedrockExtractor     → agents, knowledge bases, action groups
  ├── StepFunctionsExtractor → state machines, definitions
  ├── EventBridgeExtractor → rules, targets
  ├── OrgsExtractor        → accounts, OUs, SCPs, SSO assignments
  └── CloudTrailExtractor  → recent AssumeRole, Invoke, data-access events

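All extractors use boto3 with the assumed cross-account role session. Pagination is mandatory: most AWS list and describe operations return results in pages, and an extractor that reads only the first page silently drops entities.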
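A small helper along these lines keeps the pagination rule in one place. It relies on the boto3 paginator interface (`client.get_paginator(...).paginate(...)`); the helper name and its use across extractors are assumptions for this sketch.

```python
def collect_all(client, operation: str, result_key: str, **kwargs) -> list:
    """Drain every page of a paginated boto3 operation into one list.

    Example with a real client:
        roles = collect_all(iam_client, "list_roles", "Roles")
    """
    paginator = client.get_paginator(operation)
    items = []
    for page in paginator.paginate(**kwargs):
        items.extend(page.get(result_key, []))
    return items
```

Operations without a registered paginator raise an error from `get_paginator`, so an extractor can check `client.can_paginate(operation)` first and fall back to manual token handling.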

7.2 Transform Phase

Normalise raw AWS API responses to NormalizedGraph entities:

# Entity type mapping
IAM Role          → Entity(type="identity", subtype="iam_role",          source_system="aws_iam")
Lambda Function   → Entity(type="workload",  subtype="lambda_function",  source_system="aws_lambda")
ECS Task Def      → Entity(type="workload",  subtype="ecs_task",         source_system="aws_ecs")
ECR Repository    → Entity(type="resource",  subtype="ecr_repository",   source_system="aws_ecr")
S3 Bucket         → Entity(type="resource",  subtype="s3_bucket",        source_system="aws_s3")
Secrets Manager   → Entity(type="resource",  subtype="secrets_manager_secret", source_system="aws_secretsmanager")
SSM Parameter     → Entity(type="resource",  subtype="ssm_parameter",    source_system="aws_ssm")
IAM Policy        → Entity(type="role",      subtype="iam_policy",       source_system="aws_iam")
AWS Account       → Entity(type="tenant",    subtype="aws_account",      source_system="aws_orgs")
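Expressed as code, the mapping above could be a lookup table plus a small constructor. The `Entity` fields shown here are a guess at the NormalizedGraph shape rather than its actual definition; `external_id` carries the ARN, and IAM policies receive the `_type_provisional` annotation recommended for the provisional type="role" mapping.

```python
from dataclasses import dataclass, field

# subtype -> (entity type, source system), mirroring the mapping table.
ENTITY_MAP = {
    "iam_role": ("identity", "aws_iam"),
    "lambda_function": ("workload", "aws_lambda"),
    "ecs_task": ("workload", "aws_ecs"),
    "ecr_repository": ("resource", "aws_ecr"),
    "s3_bucket": ("resource", "aws_s3"),
    "secrets_manager_secret": ("resource", "aws_secretsmanager"),
    "ssm_parameter": ("resource", "aws_ssm"),
    "iam_policy": ("role", "aws_iam"),
    "aws_account": ("tenant", "aws_orgs"),
}


@dataclass
class Entity:
    type: str
    subtype: str
    source_system: str
    external_id: str
    annotations: dict = field(default_factory=dict)


def to_entity(subtype: str, arn: str) -> Entity:
    """Build a normalised entity for a known AWS subtype."""
    entity_type, source_system = ENTITY_MAP[subtype]
    annotations = {"_type_provisional": True} if subtype == "iam_policy" else {}
    return Entity(entity_type, subtype, source_system, arn, annotations)
```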

On IAM Policy → type="role": This mapping requires explicit justification against the SecurityV0 9-type data model. IAM Policies are not "roles" in the human sense; however, in the SecurityV0 entity model, type="role" denotes a permission assignment object — an entity that sits between an identity and a permission grant, analogous to an Azure RBAC role assignment or a ServiceNow OAuth scope. An IAM Managed Policy occupies this structural position: it binds to an identity (via HAS_POLICY) and grants permissions (via GRANTS). Using type="role" preserves the existing materialiser logic unchanged. An alternative mapping — type="permission_set" with a new subtype — would be cleaner semantically but requires a data model ADR and a materialiser change. Recommendation: file an ADR to introduce type="permission_set" before Phase 1 ships; use type="role" for the initial prototype only, and flag it with a _type_provisional: true annotation so the migration path is traceable.

7.3 Authority Path Materialisation Hooks

AWS authority paths follow the same materialisation chain as Azure:

Workload → RUNS_AS → IAM Role → HAS_POLICY → IAM Policy → GRANTS → Permission → APPLIES_TO → Resource

But AWS has additional complexity layers the materialiser must handle:

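Effective permissions = identity policy ∩ permission boundary ∩ SCP, minus any explicit Deny. Boundaries and SCPs are ceilings: an action is effective only if every applicable layer allows it. See §2.3 on the structural caveat; conditions are not evaluated in Phase 1.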
Resource policies — S3 bucket policies, Lambda resource policies can grant access independently of the identity policy; these must generate their own GRANTS edges from the resource side, not just the identity side.
KMS key policies, an implicit authority layer (warning): Many AWS resources (S3 objects, RDS clusters, Secrets Manager secrets, EBS volumes) are encrypted with KMS. Reading such a resource requires kms:Decrypt on the key in addition to the permission on the resource itself, so a role that holds s3:GetObject but cannot use the bucket's key cannot read the data. This creates false-positive paths if the connector materialises IAM Role → S3 paths without also checking whether the role can decrypt with the bucket's KMS key. In Phase 1, the materialiser should emit a kms_not_evaluated annotation on any path where the destination resource has a KMS key ARN. Phase 2 should collect KMS key policies and add DECRYPTABLE_BY edges so the materialiser can filter paths that would fail at the KMS layer.

All three require modelling as constraint/annotation edges evaluated at materialisation time.

8. Graph Enhancement Opportunities

8.1 Cross-Cloud Authority Paths (Azure → AWS)

The cross-connector correlation research identified the need for unified paths across platform boundaries. With an AWS connector live, the following cross-cloud patterns become detectable:

Azure Logic App → (HTTP) → API Gateway → Lambda → DynamoDB
Azure Foundry Agent → (HTTP) → Lambda (action group) → S3
GitHub Actions OIDC → AssumeRole → IAM Role → ECR push

These require the cross-connector entity resolution mechanism described in the correlation research. The AWS connector should emit entities with consistent ARN-based external_id values to enable correlation.

8.2 CI/CD → ECR → Runtime (Supply Chain Chain)

GitHub Actions       → (OIDC) → IAM Role (CI role)
IAM Role (CI)        → PUSH_TO → ECR Repository
ECR Repository       → IMAGE_RUNS_IN → Lambda / ECS Task
Lambda / ECS Task    → RUNS_AS → IAM Task Role
IAM Task Role        → HAS_POLICY → IAM Policy → GRANTS → S3/RDS/Secrets

This chain reveals that a misconfigured CI/CD pipeline (or compromised GitHub Actions workflow) can transitively own production data access. This is a supply chain authority path — a new finding type worth introducing.

8.3 Bedrock LLM Egress Paths

Bedrock Agent → RUNS_AS → IAM Role → GRANTS → S3 (knowledge base)
Bedrock Agent → INVOKES → Lambda (action group) → RUNS_AS → IAM Role → GRANTS → RDS

The existing egress_category: llm field in authority paths already anticipates this. The AWS connector enables the first real population of LLM egress paths.

8.4 Cross-Account Trust Amplification

Role in Dev Account → TRUSTS → Role in Prod Account
Role in Prod Account → HAS_POLICY → IAM Policy → GRANTS → Production S3 / RDS

This is a trust amplification pattern — a lower-trust account can reach higher-trust resources via cross-account AssumeRole. The auth_chain_depth field already tracks hop count; cross-account hops should increment this and trigger a finding if depth exceeds threshold (e.g., > 2 hops into a restricted domain).

9. Gaps and Open Questions

Question	Recommended Resolution
Should SCPs be modelled as constraint edges or as a separate "ceiling" entity?	Separate ceiling entity with `CONSTRAINS` edge — mirrors how permission boundaries are handled
CloudTrail costs: who pays for the S3 export?	Customer responsibility; document in setup guide; S3 batch + Athena option keeps cost near zero for SV0
How to handle assume-role chains > 3 hops?	Cap at 5 hops in materialiser; emit a `deep_trust_chain` finding
EKS / Kubernetes identity (IRSA, Pod Identity)?	Defer to v2 of the connector — IRSA maps to IAM Roles and can be added incrementally
RDS / Aurora as destination: what metadata to collect?	Instance ARN, engine, VPC, security groups; no row-level data
Multi-region: how to handle?	Connector iterates all enabled regions; ARN uniquely identifies resources cross-region
How to distinguish `aws_lambda` as `source_system` per customer account?	Tenant-scoped source systems: `aws_lambda:{account_id}` — promote to first-class architectural decision; affects cross-connector entity correlation and UI display
Lambda→Lambda direct invocation (§3.8)?	Include `INVOKES` edge schema in Phase 1; implement detection in Phase 2 via policy inspection + CloudTrail
KMS key policies causing false-positive paths?	Phase 1: annotate paths with `kms_not_evaluated: true` when destination has a KMS key; Phase 2: collect key policies, add `DECRYPTABLE_BY` edges
IAM Access Analyzer as free signal?	IAM Access Analyzer already flags externally-accessible resources in customer accounts. Connector can call `accessanalyzer:ListFindings` to seed `scope_drift` findings without needing to re-derive them from policy text. Add to Phase 3 scope.
`IAM Policy → type="role"` provisional mapping?	File an ADR before Phase 1 ships to introduce `type="permission_set"`; current mapping is a prototype convenience only
ECR Inspector scan findings to graph edge?	`ECR Repository → HAS_FINDING → VulnerabilityFinding` adds a supply chain risk signal. `ecr:DescribeImageScanFindings` is already covered by the `ecr:Describe*` wildcard in the minimum policy (§6.2); implement in Phase 2 alongside ECR pull chain.

10. Phased Delivery

Phase 1 — IAM + Lambda Baseline (MVP)

IAM: roles, policies, trust relationships
Lambda: functions, execution roles, event sources
Secrets Manager + SSM Parameter Store as destination resource types
CloudTrail ingestion via S3 direct read (Option A) for AssumeRole + Invoke events — 30-day window
Bootstrap credential mechanism: SV0 IAM User per tenant + ExternalId (§6.1)
File ADR for type="permission_set" before shipping
All AWS paths carry conditions_not_evaluated: true annotation; surface in UI

Phase 2 — Container, ECR, and Lambda→Lambda Chain

ECR: repositories, repository policies, image-to-workload linkage
ECR Inspector scan findings → HAS_FINDING graph edges
ECS: task definitions, task roles, execution roles
CI/CD → ECR → runtime chain edges (supply chain authority paths)
Lambda→Lambda invocation detection
KMS key policy collection + DECRYPTABLE_BY edges; suppress false-positive paths

Phase 3 — Multi-Account, Bedrock, and IAM Access Analyzer

AWS Organizations: accounts, SCPs, OUs
IAM Identity Center: permission sets, assignments
Bedrock agents, knowledge bases, action groups
Cross-account trust amplification findings
IAM Access Analyzer findings as seeded scope_drift signals
IAM Roles Anywhere as credential bootstrap replacement (security hardening)

Phase 4 — Deep Temporal Evidence and EKS

Full CloudTrail data-access event processing (S3, DynamoDB, Secrets Manager)
Athena upgrade for large accounts (Option B)
EventBridge streaming option (Option C)
EKS / IRSA / Pod Identity support

11. Related Documents

Connector Framework (05-connectors) — interface contract, Extract→Transform→Diff→Load pattern
Access Paths (10-access-paths) — path data model and API
Cross-Connector Entity Correlation Research — unified path materialisation across connectors
Azure Foundry Connector Plan — reference implementation patterns
Inetum Auth Integration Response — customer context driving this work

Next Action

Status: research-complete
Decision needed from: PO (Ivan)
Options:

Adopt — create GitHub issue in sv0-connectors to implement Phase 1 (IAM + Lambda MVP): AWSExtractor, IAM Roles/Policies, Lambda functions, CloudTrail AssumeRole temporal evidence, S3 batch export
Defer — revisit after current Inetum engagement closes (Q2 2026)
Reject — not applicable given Inetum and other customers confirmed AWS workload coverage requirement

GitHub Discussion: not yet created
