
AWS Connector — Research & Implementation Plan

Date: 2026-03-11
Status: Draft v2 (review feedback incorporated: Delta blockers 1–4, warnings 5–7)
Scope: sv0-connectors (new aws connector), sv0-platform (graph model extensions)
Trigger: Inetum customer engagement confirmed AWS workload coverage is a critical gap; customers run mixed Azure + AWS environments where authority paths cross cloud boundaries.


1. Context and Motivation

SecurityV0 currently covers the Microsoft identity plane (Entra ID, Azure Foundry, ServiceNow). Enterprise customers — including Inetum — operate significant workloads on AWS: Lambda functions calling RDS, ECS containers pulling from ECR, Bedrock agents orchestrating across accounts, and Step Functions triggering cross-account executions.

Without an AWS connector, authority paths that originate in or terminate on AWS resources cannot be materialised. This is a visibility gap: a Lambda function assuming an over-privileged IAM Role and reading production S3 data is structurally identical to the Azure Logic App → ServiceNow scenarios we already model — but completely invisible to the platform.

What the AWS connector must answer

  1. What workloads run on AWS and what identities do they assume?
  2. What resources can those identities access, and via which policies?
  3. What is the code execution chain? (ECR → container → task role → resource)
  4. Where does authority cross account or cloud boundaries?
  5. What is the observed execution evidence? (CloudTrail as temporal data source)

2. AWS Identity Model — Key Concepts

Understanding the AWS IAM model is prerequisite to correct graph modelling.

2.1 Identity Types

| AWS Entity | SecurityV0 Entity Type | Notes |
| --- | --- | --- |
| IAM User | identity (subtype: iam_user) | Human or programmatic; avoid in modern AWS |
| IAM Role | identity (subtype: iam_role) | Primary execution identity; assumed via STS |
| IAM Group | identity (subtype: iam_group) | Attached to users; no direct execution |
| IAM Identity Center Permission Set | identity (subtype: sso_permission_set) | Federated human access; maps to role assumption |
| Service-Linked Role | identity (subtype: service_linked_role) | AWS-managed; low risk but should be visible |
| Instance Profile | edge annotation | Wraps an IAM Role for EC2 attachment; not a separate entity |
| OIDC Federated Identity | edge (TRUSTS) | Kubernetes Service Account → IAM Role via IRSA |

2.2 Policy Types (Authority Sources)

| Policy Type | Priority for Graph |
| --- | --- |
| Identity-based policies (inline + managed) | Critical: direct authority grant |
| Resource-based policies (S3 Bucket Policy, Lambda Resource Policy) | High: cross-account and cross-service authority |
| Permission Boundaries | High: ceiling on effective permissions; must be modelled as a constraint edge |
| Service Control Policies (SCPs) | High: org-level ceiling; organisation connector scope |
| Session Policies | Medium: short-lived; capture from CloudTrail |
| ACLs (legacy S3) | Low: mostly legacy, can be deferred |

2.3 Structural vs Effective Permissions — Critical Scope Caveat

This connector models structural (granted) permissions, not effective permissions.

AWS IAM Conditions (aws:SourceVpc, aws:RequestedRegion, aws:PrincipalOrgID, etc.) are not evaluated by the connector. A policy statement that grants s3:ListBucket with a condition like "StringEquals": {"s3:prefix": "logs/"} will be ingested as an unconditional GRANTS edge to that S3 bucket, even though at runtime the grant only covers listings under the logs/ prefix.

This means:

  • Authority paths may over-report reachability when conditions restrict access at runtime
  • The via_roles and actions fields reflect policy text, not runtime enforcement
  • Wherever conditions are material to risk, the finding explanation should carry a conditions_not_evaluated: true flag and the UI should surface a caveat

Implication for Phase 1: all authority paths produced by the AWS connector carry an implicit "structurally reachable, conditions not evaluated" qualifier. This is consistent with how the Azure connector handles ARM RBAC conditions today. The caveat must be documented in the customer-facing setup guide and in the UI tooltip for AWS-sourced paths.

Conditions worth modelling in a future phase: aws:PrincipalOrgID (org boundary), aws:SourceAccount (confused-deputy prevention), sts:ExternalId (cross-account guard), aws:MultiFactorAuthPresent. These reduce the effective authority surface and would shrink false-positive path counts meaningfully.
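As an illustration of the structural-only scope, the sketch below turns one policy statement into GRANTS edges while recording, but never evaluating, its conditions. The function name, the edge dictionary shape, and the condition_keys field are assumptions made for this example, not the connector's actual schema. It sets conditions_not_evaluated only when a Condition block is present; a blanket annotation on every AWS path would simply set it unconditionally.

```python
from typing import Any


def _as_list(value: Any) -> list:
    """IAM accepts either a single string or a list for Action and Resource."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def statement_to_grants(policy_arn: str, statement: dict) -> list[dict]:
    """Turn one Allow statement into structural GRANTS edges (hypothetical shape).

    Condition blocks are recorded by key name only; they are never evaluated.
    """
    if statement.get("Effect") != "Allow":
        return []  # Deny statements are out of scope for this sketch
    condition = statement.get("Condition") or {}
    # Flatten {"StringEquals": {"aws:PrincipalOrgID": ...}} to the key names.
    condition_keys = sorted(
        key for operator_block in condition.values() for key in operator_block
    )
    return [
        {
            "edge": "GRANTS",
            "source": policy_arn,
            "target": resource,
            "actions": _as_list(statement.get("Action")),
            "conditions_not_evaluated": bool(condition_keys),
            "condition_keys": condition_keys,
        }
        for resource in _as_list(statement.get("Resource"))
    ]
```

Keeping the condition key names on the edge would let a later phase pick out the statements guarded by aws:PrincipalOrgID or sts:ExternalId without re-reading the policy text.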


2.4 Trust Relationships

Every IAM Role has a Trust Policy that defines who can assume it. This is the primary mechanism for cross-account and cross-service authority chains:

Lambda Service Principal → AssumeRole → Execution Role
ECS Task Definition → AssumeRole → Task Role
Bedrock Agent → AssumeRole → Agent Execution Role
Cross-Account Caller → AssumeRole → Role in Target Account
EC2 Instance Profile → AssumeRole → Role
OIDC Provider (EKS IRSA) → AssumeRoleWithWebIdentity → Role

These trust relationships map directly to RUNS_AS and ASSUMES edges in the SecurityV0 graph.
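A minimal sketch of how a role's trust policy document could be flattened into edges, assuming the role that owns the trust policy is the edge source and each trusted principal is the target. The function name, the edge fields, and the requires_external_id flag are hypothetical; only the trust policy JSON structure (Statement, Principal, Condition) comes from IAM itself.

```python
def _as_list(value):
    """Statement, Principal values and Action may be a single item or a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def trust_policy_edges(role_arn: str, trust_policy: dict) -> list[dict]:
    """Emit one TRUSTS edge per principal named in the role's trust policy."""
    edges = []
    for statement in _as_list(trust_policy.get("Statement")):
        if statement.get("Effect") != "Allow":
            continue
        principal = statement.get("Principal") or {}
        if principal == "*":
            principal = {"AWS": "*"}  # wildcard principal: anyone may attempt to assume
        condition = statement.get("Condition") or {}
        # Exact-case match on the key name; a simplification for this sketch.
        requires_external_id = any(
            "sts:ExternalId" in operator_block for operator_block in condition.values()
        )
        for kind, values in principal.items():  # kind is AWS, Service or Federated
            for target in _as_list(values):
                edges.append(
                    {
                        "edge": "TRUSTS",
                        "source": role_arn,
                        "target": target,
                        "principal_kind": kind,
                        "actions": _as_list(statement.get("Action")),
                        "requires_external_id": requires_external_id,
                    }
                )
    return edges
```

A Service principal such as lambda.amazonaws.com would typically be resolved into RUNS_AS edges once the functions using the role are known; an AWS principal from another account is the cross-account case.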


3. Workload Types to Model

3.1 AWS Lambda

What to collect:

  • Function name, ARN, runtime, description, last modified
  • Execution role ARN (RUNS_AS edge)
  • Resource-based policy (who can invoke the function — cross-service / cross-account)
  • VPC configuration (network isolation context)
  • Package type: Zip vs Image — if Image, ECR repository URI + image digest
  • Concurrency settings (reserved / provisioned)
  • Environment variable keys (not values — PII risk; but presence of DB_URL, SECRET_ARN etc. informs destination inference)
  • Layers (shared code; potential additional authority surface)
  • Event source mappings (what triggers this function: SQS, DynamoDB Streams, Kinesis, EventBridge)

Graph edges:

Lambda Function → RUNS_AS → IAM Role (execution role)
Lambda Function → TRIGGERED_BY → EventBridge Rule / SQS Queue / SNS Topic
Lambda Function → DEPLOYED_FROM → ECR Repository (container image)
Lambda Function → READS_FROM → S3 Bucket / DynamoDB Table (env var inference)
IAM Role → HAS_POLICY → IAM Policy
IAM Policy → GRANTS → Permission (action + resource ARN)
Permission → APPLIES_TO → AWS Resource (S3, DynamoDB, RDS, etc.)

3.2 ECS (Elastic Container Service)

What to collect:

  • Task Definition: family, revision, task role ARN, execution role ARN
  • Container definitions: image URI (ECR reference), command, environment variable keys
  • Services: cluster, desired count, launch type (Fargate vs EC2)
  • Cluster configuration

Key distinction: ECS has two roles:

  • Execution Role — used by ECS agent to pull ECR images, write CloudWatch logs
  • Task Role — used by the application code inside the container

Both must be captured. The Task Role is the workload's runtime identity (authority source). The Execution Role is infrastructure-level (lower priority but relevant for ECR pull chain).

Graph edges:

ECS Task Definition → RUNS_AS → IAM Task Role
ECS Task Definition → DEPLOYED_FROM → ECR Repository (image URI)
ECS Task Definition → PULLS_VIA → IAM Execution Role (ECR pull authority)
ECS Service → RUNS → ECS Task Definition

3.3 ECR (Elastic Container Registry) — The Code-Deploy Chain

ECR is the source of execution artefacts for containerised workloads. It is the AWS equivalent of an artefact registry and sits at the start of the execution chain.

What to collect:

  • Repository name, ARN, URI, account ID
  • Repository policy (who can pull/push — cross-account access)
  • Image tags and digests in use (link to live task definitions / Lambda functions)
  • Lifecycle policy (image retention — affects drift detection)
  • Encryption configuration (KMS key ARN)
  • Scan findings summary (ECR enhanced scanning via Inspector)

Why ECR matters for authority paths: An ECR repository with a permissive resource-based policy is an injection point — a cross-account role that can push images can alter what code runs inside the task, bypassing IAM role controls entirely. This is a high-severity pattern: the authority chain runs through the artefact, not just the role.

Graph edges:

ECR Repository → HOSTS → Container Image (tag/digest)
Container Image → EXECUTED_BY → Lambda Function / ECS Task Definition
ECR Repository → PULL_ACCESS → IAM Role (from repository policy)
ECR Repository → PUSH_ACCESS → IAM Role (CI/CD service role — high sensitivity)
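The push/pull distinction can be derived from the repository policy's Action list. The sketch below is one way to do it, under the assumption that ecr:PutImage and the layer-upload actions indicate push authority and that ecr:BatchGetImage and ecr:GetDownloadUrlForLayer indicate pull authority. The function name and edge shape are hypothetical, and Deny statements and conditions are ignored.

```python
from fnmatch import fnmatchcase

# Assumed classification of ECR actions for this sketch.
PUSH_ACTIONS = (
    "ecr:PutImage",
    "ecr:InitiateLayerUpload",
    "ecr:UploadLayerPart",
    "ecr:CompleteLayerUpload",
)
PULL_ACTIONS = ("ecr:BatchGetImage", "ecr:GetDownloadUrlForLayer")


def _as_list(value):
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def _allows_any(patterns, actions) -> bool:
    """True if any policy pattern (wildcards allowed) matches any listed action."""
    return any(
        fnmatchcase(action.lower(), pattern.lower())
        for pattern in patterns
        for action in actions
    )


def repository_access_edges(repo_arn: str, repo_policy: dict) -> list[dict]:
    """Emit PUSH_ACCESS / PULL_ACCESS edges from an ECR repository policy."""
    edges = []
    for statement in _as_list(repo_policy.get("Statement")):
        if statement.get("Effect") != "Allow":
            continue
        patterns = _as_list(statement.get("Action"))
        principal = statement.get("Principal") or {}
        if principal == "*":
            principal = {"AWS": "*"}
        principals = [p for values in principal.values() for p in _as_list(values)]
        for edge_type, actions in (("PUSH_ACCESS", PUSH_ACTIONS), ("PULL_ACCESS", PULL_ACTIONS)):
            if _allows_any(patterns, actions):
                edges.extend(
                    {"edge": edge_type, "source": repo_arn, "target": p} for p in principals
                )
    return edges
```

A wildcard such as ecr:* yields both edge types, which is the injection-point case described above.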

3.4 AWS Bedrock (AI Workloads)

Given increasing Bedrock adoption, this is high value for the AI-workload coverage story.

What to collect:

  • Bedrock Agent: ID, name, execution role ARN, foundation model ID
  • Agent Action Groups: Lambda function ARN (the code the agent executes)
  • Knowledge Bases: ID, data source (S3 bucket), embedding model
  • Guardrails: ID, topics blocked, PII filtering
  • Model invocation logging: enabled/disabled (audit trail)

Graph edges:

Bedrock Agent → RUNS_AS → IAM Execution Role
Bedrock Agent → INVOKES → Lambda Function (action group)
Bedrock Agent → READS_FROM → S3 Bucket (knowledge base source)
Lambda Function → RUNS_AS → IAM Role (action group execution)

Why this matters: A Bedrock Agent is an autonomous workload. Its authority path — Agent → Role → S3 data → external API — is exactly the LLM egress pattern (egress_category: llm) already defined in the authority paths model. The data model supports it; we just need the connector.

3.5 Step Functions

What to collect:

  • State machine: ARN, name, type (Standard / Express), IAM role
  • Definition: states, resource ARNs invoked (Lambda, ECS, DynamoDB, etc.)
  • Execution logging configuration

Graph edges:

Step Functions SM → RUNS_AS → IAM Role
Step Functions SM → INVOKES → Lambda Function / ECS Task / SNS / SQS

Step Functions are orchestrators — they chain workloads. A state machine with an over-privileged role can invoke Lambda, start ECS tasks, write to DynamoDB, and call external APIs in a single execution. Visibility into the full invocation graph is critical for authority path reconstruction.
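To recover the invocation graph, the state machine definition (Amazon States Language JSON) can be walked recursively. The sketch below collects every Resource string plus the FunctionName parameter used by the arn:aws:states:::lambda:invoke service integration. The function name is hypothetical, and other service integrations carry their targets in different parameters that this example does not handle.

```python
import json


def invoked_targets(definition: str) -> list[str]:
    """Collect Resource ARNs and Lambda FunctionName targets from an ASL definition.

    Nested Parallel branches and Map processors are covered because the walk
    descends into every dict and list in the document.
    """
    targets = set()

    def walk(node):
        if isinstance(node, dict):
            resource = node.get("Resource")
            if isinstance(resource, str):
                targets.add(resource)
            parameters = node.get("Parameters")
            if isinstance(parameters, dict) and isinstance(parameters.get("FunctionName"), str):
                # Service-integration form: the real target is inside Parameters.
                targets.add(parameters["FunctionName"])
            for value in node.values():
                walk(value)
        elif isinstance(node, list):
            for value in node:
                walk(value)

    walk(json.loads(definition))
    return sorted(targets)
```

Each returned ARN would become the target of an INVOKES edge from the state machine entity.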

3.6 EventBridge

What to collect:

  • Rules: name, event pattern or schedule, target ARNs, IAM role (if cross-account)
  • Targets: Lambda, ECS, Step Functions, SNS, SQS, external API destinations

EventBridge rules are trigger sources for workloads. The TRIGGERED_BY edge completes the "what started this?" question in authority path lineage.

3.7 Sensitive Data Stores (Secrets Manager + SSM Parameter Store)

These are the highest-value destination types for sensitive credential data and were missing from the initial draft.

AWS Secrets Manager:

  • Secret ARN, name, description, KMS key ID (encryption key)
  • Resource policy (who can read cross-account)
  • Rotation status (unrotated secrets = stale credential risk)
  • CloudTrail GetSecretValue events — direct evidence of access

AWS Systems Manager Parameter Store:

  • Parameter name, ARN, type (String / SecureString / StringList)
  • KMS key ID (for SecureString)
  • GetParameter / GetParameters CloudTrail events

Both are resource entities with subtype: secrets_manager_secret / subtype: ssm_parameter. IAM policies granting secretsmanager:GetSecretValue or ssm:GetParameter create GRANTS edges to these destinations with sensitivity: restricted by default.

Graph edges:

IAM Policy → GRANTS → Secrets Manager Secret (sensitivity: restricted)
IAM Policy → GRANTS → SSM SecureString Parameter (sensitivity: restricted)
Lambda Function → READS_FROM → Secrets Manager Secret (env var `SECRET_ARN` inference)

Minimum policy additions (already reflected in the §6.2 policy):

"secretsmanager:ListSecrets", "secretsmanager:DescribeSecret", "secretsmanager:GetResourcePolicy",
"ssm:DescribeParameters", "ssm:GetParametersByPath", "ssm:ListTagsForResource"

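Note: do not add secretsmanager:GetSecretValue or ssm:GetParameter; the connector reads metadata only, never secret values. Caveat: ssm:GetParametersByPath also returns parameter values (plaintext for String and StringList types), which sits in tension with the metadata-only rule. ssm:DescribeParameters alone covers the name, type, and KMS key ID listed above, so GetParametersByPath should be dropped unless a concrete need for it is identified.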

3.8 Lambda → Lambda Direct Invocation

Lambda functions can invoke other Lambda functions directly via lambda:InvokeFunction. This creates workload-to-workload authority edges that are distinct from the workload→identity→resource chain.

Graph edge:

Lambda Function A → INVOKES → Lambda Function B
Lambda Function B → RUNS_AS → IAM Role B → HAS_POLICY → ...

This is significant because a low-sensitivity Lambda with a permissive execution role can transitively reach high-sensitivity data by invoking a privileged Lambda. The auth_chain_depth must be incremented per hop.

How to detect: lambda:InvokeFunction permissions in identity-based policies + Lambda resource-based policies listing caller ARNs. Also collectible from CloudTrail Invoke events with a non-human caller identity.

Defer to Phase 2 but include in schema from Phase 1 to avoid a breaking change later.


4. Cross-Account and Multi-Account Modelling

AWS Organizations is the control plane for multi-account AWS environments. This is where Service Control Policies (SCPs) live — they are org-level authority ceilings that cannot be exceeded by any IAM policy in a member account.

4.1 What to collect (Organizations / IAM Identity Center)

  • AWS Organizations: Management account, member accounts, OUs, applied SCPs
  • IAM Identity Center (SSO): Permission sets, account assignments (user/group → permission set → account)
  • Cross-account trust policies: IAM roles with sts:AssumeRole trust to external accounts

4.2 Graph model for cross-account authority

SCP (Org-level) → CONSTRAINS → AWS Account (ceiling on all roles in account)
IAM Role (Account A) → TRUSTS → IAM Role (Account B) [cross-account AssumeRole]
IAM Role (Account A) → TRUSTS → AWS Service Principal (service-to-service)
IAM Role → SUBJECT_TO → Permission Boundary (per-role ceiling)

The CONSTRAINS and SUBJECT_TO edges are ceiling edges — they reduce effective authority. The graph engine must apply these during authority path materialisation to compute the effective permission set, not just the granted set.


5. CloudTrail — Temporal Evidence

CloudTrail is the AWS equivalent of execution logs. It provides the observed execution evidence the authority paths model depends on.

5.1 Priority events to capture

| CloudTrail Event | Maps To |
| --- | --- |
| AssumeRole | Confirms a role assumption occurred; populates execution_30d, last_execution_at |
| Invoke (Lambda) | Confirms function execution; links to workload |
| GetSecretValue (Secrets Manager) | Sensitive data access; informs data_domain |
| GetObject / PutObject (S3) | Data read/write; informs actions[] |
| DescribeInstances / resource APIs | Resource enumeration; scope drift signal |
| CreateRole / AttachRolePolicy | Permission change event; drift trigger |
| SwitchRole (Console) | Human cross-account access |

5.2 Architecture — Why cloudtrail:LookupEvents is not viable

Do not use cloudtrail:LookupEvents as the primary ingestion path.

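The LookupEvents API is limited to 2 requests per second per account per Region, with at most 50 events per response. A production account generating thousands of AssumeRole and Invoke events per minute will exhaust this budget quickly, and throttled or truncated scans return incomplete data. LookupEvents also returns management events only, so data events such as Lambda Invoke and S3 GetObject / PutObject never appear in its results. It covers only the last 90 days and accepts a single lookup attribute per request, so filtering for several event names means one paginated pass per name. It is useful for one-off debugging, not systematic ingestion.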

Option A — S3 direct read (recommended for MVP):

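  1. Customer has a CloudTrail trail writing to an S3 bucket (standard setup in most enterprises). Lambda Invoke and S3 object-level events are data events, so the trail must have data event logging enabled for those counts to be populated.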
  2. SV0 connector is granted s3:GetObject + s3:ListBucket on that bucket with a path prefix filter (e.g., AWSLogs/{account_id}/CloudTrail/)
  3. Connector downloads compressed JSON log files for the target 30-day window, decompresses, filters for priority events
  4. Aggregates AssumeRole + Invoke + GetSecretValue + GetObject calls per (principal, resource) pair → populates execution_30d / last_execution_at

Cost: S3 GET requests per scan. At ~10K log files per 30 days this is cents per scan. Customer controls the bucket.
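Steps 3 and 4 of Option A can be sketched as follows, assuming each log object has already been downloaded as gzip bytes. CloudTrail log files are gzip-compressed JSON with a top-level Records array; the aggregation key, the request-parameter lookup order, and the "unknown" fallbacks are assumptions of this example. Only four of the priority events are counted here.

```python
import gzip
import json

PRIORITY_EVENTS = {"AssumeRole", "Invoke", "GetSecretValue", "GetObject"}

# Request parameter that names the target for each priority event (assumed mapping).
TARGET_KEYS = ("roleArn", "functionName", "secretId", "bucketName")


def _target(record: dict) -> str:
    params = record.get("requestParameters") or {}
    for key in TARGET_KEYS:
        if key in params:
            return params[key]
    return "unknown"


def aggregate_log_file(raw_gzip: bytes, stats=None) -> dict:
    """Fold one CloudTrail log file into per-(principal, resource) counters.

    Pass the returned dict back in as `stats` to accumulate across files.
    """
    stats = stats if stats is not None else {}
    records = json.loads(gzip.decompress(raw_gzip)).get("Records", [])
    for record in records:
        if record.get("eventName") not in PRIORITY_EVENTS:
            continue
        # Service-initiated calls may carry no caller ARN; fall back to "unknown".
        principal = (record.get("userIdentity") or {}).get("arn", "unknown")
        entry = stats.setdefault(
            (principal, _target(record)),
            {"execution_30d": 0, "last_execution_at": ""},
        )
        entry["execution_30d"] += 1
        # eventTime is ISO 8601 in UTC, so string comparison orders correctly.
        entry["last_execution_at"] = max(
            entry["last_execution_at"], record.get("eventTime", "")
        )
    return stats
```

Processing files one at a time keeps memory flat regardless of how many log objects the 30-day window contains.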

Option B — Athena query (recommended for large accounts):

  1. Customer creates an Athena table over the CloudTrail S3 prefix (AWS provides a standard DDL for this)
  2. SV0 connector runs a parameterised Athena query with a 30-day eventTime filter and eventName IN (...) clause
  3. Query results written to a customer-controlled S3 output bucket, connector reads results

Cost: Athena charges ~$5/TB scanned. For 30 days of typical CloudTrail data (~50 GB) this is ~$0.25 per scan.

Option C — EventBridge streaming (future): Customer enables CloudTrail → EventBridge; connector subscribes via SQS for near-real-time ingestion. More complex and requires customer-side infrastructure. Design Option A so this is addable as a configuration flag.

Recommendation: Start with Option A for MVP. Document Option B as the upgrade path for accounts producing >1 GB/day of CloudTrail logs.


6. Authentication Strategy

The connector needs AWS credentials. Options:

| Method | Pros | Cons |
| --- | --- | --- |
| Cross-account IAM Role (recommended) | No long-lived credentials; customer creates role with read-only policies | Requires customer to create role; ARN config per tenant |
| IAM User + Access Key | Simple to set up | Long-lived credentials; rotation risk |
| IAM Roles Anywhere (cert-based) | Works from non-AWS hosts | PKI setup complexity |

Recommended approach: Cross-account read-only IAM Role with an external ID (ExternalId condition). Customer creates SecurityV0ReadOnlyRole in their account; connector assumes it via STS using the SV0 management account. ExternalId prevents confused-deputy attacks.
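For reference, the trust policy the customer attaches to SecurityV0ReadOnlyRole would look roughly like the document produced below. This is a sketch: the function name is hypothetical, and the principal ARN and ExternalId are placeholders supplied during onboarding.

```python
import json


def customer_trust_policy(sv0_principal_arn: str, external_id: str) -> dict:
    """Build a trust policy that lets one SV0 principal assume the role,
    and only when it presents the agreed ExternalId."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": sv0_principal_arn},
                "Action": "sts:AssumeRole",
                "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder values for illustration only.
    policy = customer_trust_policy(
        "arn:aws:iam::444455556666:user/sv0-connector-example", "example-external-id"
    )
    print(json.dumps(policy, indent=2))
```

Naming a single principal ARN rather than the whole SV0 account keeps the trust as narrow as the per-tenant design intends.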

6.1 Bootstrap Credential Problem

The research previously left this unaddressed. This section resolves it.

The cross-account AssumeRole flow requires SV0 to have a starting AWS identity before it can call sts:AssumeRole into the customer account. The connector runs outside AWS (Docker on a Mac Mini M4), so it cannot use EC2 instance metadata or ECS task roles.

Resolved approach — SV0 Service Account IAM User per tenant:

  1. SecurityV0 maintains a dedicated IAM User (e.g., sv0-connector-{tenant_id}) in a SV0-owned AWS management account. This user has only one permission: sts:AssumeRole on the specific customer role ARN.
  2. The IAM User's access key + secret key are stored in 1Password per tenant, resolved at container start into env vars (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY).
  3. The connector calls sts:AssumeRole with RoleArn=customer_role_arn and ExternalId=tenant_specific_secret to get short-lived session credentials.
  4. All subsequent API calls use the short-lived session. Credentials expire after 1 hour (configurable up to 12h); connector refreshes before expiry.

This is structurally identical to how the ServiceNow connector handles Basic Auth credentials — a long-lived credential stored in 1Password bootstraps a shorter-lived session.
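Steps 3 and 4 above can be sketched as below. The function takes an STS client as a parameter so that it can be exercised without network access; in the connector this would be a boto3 STS client, which reads AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY from the environment. The function names, the session name, and the five-minute refresh margin are assumptions.

```python
from datetime import datetime, timedelta, timezone


def assume_customer_role(sts_client, role_arn: str, external_id: str,
                         duration_seconds: int = 3600) -> dict:
    """Exchange the bootstrap credentials for a short-lived session.

    `sts_client` is expected to behave like boto3.client("sts").
    Returns the Credentials dict (AccessKeyId, SecretAccessKey,
    SessionToken, Expiration).
    """
    response = sts_client.assume_role(
        RoleArn=role_arn,
        RoleSessionName="sv0-connector",  # hypothetical session name
        ExternalId=external_id,
        DurationSeconds=duration_seconds,
    )
    return response["Credentials"]


def needs_refresh(credentials: dict, now=None,
                  margin: timedelta = timedelta(minutes=5)) -> bool:
    """True once the session is within `margin` of its expiry."""
    now = now or datetime.now(timezone.utc)
    return credentials["Expiration"] - now <= margin
```

A long-running scan would check needs_refresh before each extractor starts and call assume_customer_role again when it returns True.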

Security controls:

  • The IAM User's access key has zero permissions except sts:AssumeRole on the one role ARN
  • Customer's trust policy requires an sts:ExternalId match, so a leaked access key alone is not enough to assume the customer role
  • Keys are rotated on the same schedule as other secrets (monthly minimum)
  • Key rotation is documented in the connector setup guide as a customer onboarding step

Future enhancement: Replace the IAM User with IAM Roles Anywhere + a certificate from the SV0 internal CA. Eliminates long-lived credentials entirely. Flag as a Phase 3 security hardening task.

6.2 Minimum Policy for the Customer Read-Only Role

Note: cloudtrail:LookupEvents has been removed (not viable at scale — see §5.2). CloudTrail access is now via S3 GetObject.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "iam:List*", "iam:Get*",
        "lambda:List*", "lambda:Get*",
        "ecs:List*", "ecs:Describe*",
        "ecr:List*", "ecr:Describe*", "ecr:GetRepositoryPolicy",
        "ecr:BatchGetImage",
        "bedrock:List*", "bedrock:Get*",
        "states:List*", "states:Describe*",
        "events:List*", "events:Describe*",
        "organizations:List*", "organizations:Describe*",
        "s3:ListAllMyBuckets", "s3:GetBucketPolicy", "s3:GetBucketAcl", "s3:GetBucketLocation",
        "secretsmanager:ListSecrets", "secretsmanager:DescribeSecret", "secretsmanager:GetResourcePolicy",
        "ssm:DescribeParameters", "ssm:GetParametersByPath", "ssm:ListTagsForResource",
        "kms:ListKeys", "kms:DescribeKey", "kms:GetKeyPolicy",
        "cloudtrail:GetTrailStatus", "cloudtrail:DescribeTrails",
        "athena:StartQueryExecution", "athena:GetQueryExecution", "athena:GetQueryResults"
      ],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:ListBucket"],
      "Resource": [
        "arn:aws:s3:::{customer-cloudtrail-bucket}",
        "arn:aws:s3:::{customer-cloudtrail-bucket}/*",
        "arn:aws:s3:::{customer-athena-results-bucket}",
        "arn:aws:s3:::{customer-athena-results-bucket}/*"
      ]
    }
  ]
}

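The CloudTrail and Athena bucket ARNs are provided by the customer during connector onboarding. Using Resource: "*" for S3 GetObject would be excessively broad and should be resisted even when customers offer it. Note for Option B: Athena writes query results using the caller's credentials, so enabling it additionally requires s3:PutObject on the results bucket and read access to the Glue Data Catalog table (for example glue:GetTable and glue:GetDatabase); neither is included above.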


7. Connector Architecture — Implementation Sketch

Following the existing Extract → Transform → Diff → Load pattern:

7.1 Extract Phase

AWSExtractor
├── IAMExtractor → users, roles, groups, policies, trust docs
├── LambdaExtractor → functions, execution roles, event sources
├── ECSExtractor → task definitions, services, clusters
├── ECRExtractor → repositories, repository policies, image tags
├── BedrockExtractor → agents, knowledge bases, action groups
├── StepFunctionsExtractor → state machines, definitions
├── EventBridgeExtractor → rules, targets
├── OrgsExtractor → accounts, OUs, SCPs, SSO assignments
└── CloudTrailExtractor → recent AssumeRole, Invoke, data-access events

All extractors use boto3 with the assumed cross-account role session. Pagination is mandatory: most AWS list and describe APIs return results in pages, and a single unpaginated call silently truncates the inventory.
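A small helper along these lines would keep pagination out of the individual extractors. It relies on the get_paginator / paginate interface that boto3 clients expose; the helper name and the result_key argument are assumptions of this sketch.

```python
from typing import Any, Iterator


def iter_items(client: Any, operation: str, result_key: str, **kwargs) -> Iterator[dict]:
    """Yield every item from a paginated AWS list/describe operation.

    `client` is expected to behave like a boto3 client; `result_key` is the
    response field holding the items (for example "Roles" for list_roles).
    """
    paginator = client.get_paginator(operation)
    for page in paginator.paginate(**kwargs):
        yield from page.get(result_key, [])


# Example usage with a real session (not executed here):
#   iam = session.client("iam")
#   for role in iter_items(iam, "list_roles", "Roles"):
#       ...
#   lam = session.client("lambda")
#   for fn in iter_items(lam, "list_functions", "Functions"):
#       ...
```

Because the helper is a generator, extractors can stream large inventories into the transform phase without holding every page in memory.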

7.2 Transform Phase

Normalise raw AWS API responses to NormalizedGraph entities:

# Entity type mapping
IAM Role → Entity(type="identity", subtype="iam_role", source_system="aws_iam")
Lambda Function → Entity(type="workload", subtype="lambda_function", source_system="aws_lambda")
ECS Task Def → Entity(type="workload", subtype="ecs_task", source_system="aws_ecs")
ECR Repository → Entity(type="resource", subtype="ecr_repository", source_system="aws_ecr")
S3 Bucket → Entity(type="resource", subtype="s3_bucket", source_system="aws_s3")
Secrets Manager → Entity(type="resource", subtype="secrets_manager_secret", source_system="aws_secretsmanager")
SSM Parameter → Entity(type="resource", subtype="ssm_parameter", source_system="aws_ssm")
IAM Policy → Entity(type="role", subtype="iam_policy", source_system="aws_iam")
AWS Account → Entity(type="tenant", subtype="aws_account", source_system="aws_orgs")

On IAM Policy → type="role": This mapping requires explicit justification against the SecurityV0 9-type data model. IAM Policies are not "roles" in the human sense; however, in the SecurityV0 entity model, type="role" denotes a permission assignment object — an entity that sits between an identity and a permission grant, analogous to an Azure RBAC role assignment or a ServiceNow OAuth scope. An IAM Managed Policy occupies this structural position: it binds to an identity (via HAS_POLICY) and grants permissions (via GRANTS). Using type="role" preserves the existing materialiser logic unchanged. An alternative mapping — type="permission_set" with a new subtype — would be cleaner semantically but requires a data model ADR and a materialiser change. Recommendation: file an ADR to introduce type="permission_set" before Phase 1 ships; use type="role" for the initial prototype only, and flag it with a _type_provisional: true annotation so the migration path is traceable.
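The mapping and the provisional flag could be expressed as below. The Entity dataclass is a stand-in for the platform's real entity class, whose constructor is not shown in this document; the external_id and annotations fields and the three helper names are assumptions. The input keys (Arn, FunctionArn) are the ones returned by the IAM and Lambda list APIs.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """Stand-in for the NormalizedGraph entity (hypothetical fields)."""
    type: str
    subtype: str
    source_system: str
    external_id: str
    annotations: dict = field(default_factory=dict)


def role_entity(role: dict) -> Entity:
    return Entity("identity", "iam_role", "aws_iam", role["Arn"])


def lambda_entity(function: dict) -> Entity:
    return Entity("workload", "lambda_function", "aws_lambda", function["FunctionArn"])


def policy_entity(policy: dict) -> Entity:
    # type="role" is the prototype mapping; the annotation keeps the later
    # migration to a dedicated permission-set type traceable.
    return Entity(
        "role", "iam_policy", "aws_iam", policy["Arn"],
        annotations={"_type_provisional": True},
    )
```

With the flag stored on the entity, a migration script could select every provisional record with a single filter once the ADR lands.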

7.3 Authority Path Materialisation Hooks

AWS authority paths follow the same materialisation chain as Azure:

Workload → RUNS_AS → IAM Role → HAS_POLICY → IAM Policy → GRANTS → Permission → APPLIES_TO → Resource

But AWS has additional complexity layers the materialiser must handle:

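  1. Effective permissions = identity policy ∩ permission boundary ∩ SCP, minus any explicit Deny. Boundaries and SCPs are ceilings: an action is effective only if every applicable ceiling also allows it. See §2.3 on the structural caveat; conditions are not evaluated in Phase 1.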
  2. Resource policies — S3 bucket policies, Lambda resource policies can grant access independently of the identity policy; these must generate their own GRANTS edges from the resource side, not just the identity side.
  3. KMS key policies, an implicit authority layer (warning): Many AWS resources (S3 objects, RDS clusters, Secrets Manager secrets, EBS volumes) are encrypted with KMS. Reading such a resource requires kms:Decrypt on the key in addition to access to the resource itself, so a role that is granted s3:GetObject but cannot use the bucket's KMS key fails at runtime. This creates false-positive paths if the connector materialises IAM Role → S3 paths without also checking whether the role can decrypt with the bucket's KMS key. In Phase 1, the materialiser should emit a kms_not_evaluated annotation on any path where the destination resource has a KMS key ARN. Phase 2 should collect KMS key policies and add DECRYPTABLE_BY edges so the materialiser can filter paths that would fail at the KMS layer.

All three require modelling as constraint/annotation edges evaluated at materialisation time.
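Treating permission boundaries and SCPs as ceilings, an action survives only if every applicable ceiling also allows it. The sketch below shows that intersection together with the Phase 1 path annotations. It assumes policies have already been expanded into plain sets of action names, ignores explicit Deny statements and conditions, and uses hypothetical function names and a hypothetical kms_key_arn field on the destination.

```python
from typing import Optional


def effective_actions(identity: set,
                      boundary: Optional[set] = None,
                      scp: Optional[set] = None) -> set:
    """Intersect granted actions with each ceiling that is present.

    A missing ceiling (None) imposes no restriction.
    """
    effective = set(identity)
    for ceiling in (boundary, scp):
        if ceiling is not None:
            effective &= ceiling
    return effective


def annotate_path(path: dict, destination: dict) -> dict:
    """Attach the Phase 1 caveats to a materialised authority path."""
    annotated = dict(path)
    annotated["conditions_not_evaluated"] = True
    if destination.get("kms_key_arn"):
        # KMS key policy not collected yet, so decryptability is unknown.
        annotated["kms_not_evaluated"] = True
    return annotated
```

For example, an identity policy granting s3:GetObject and iam:PassRole under a boundary that lists only s3:GetObject leaves s3:GetObject as the sole effective action.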


8. Graph Enhancement Opportunities

8.1 Cross-Cloud Authority Paths (Azure → AWS)

The cross-connector correlation research identified the need for unified paths across platform boundaries. With an AWS connector live, the following cross-cloud patterns become detectable:

Azure Logic App → (HTTP) → API Gateway → Lambda → DynamoDB
Azure Foundry Agent → (HTTP) → Lambda (action group) → S3
GitHub Actions OIDC → AssumeRole → IAM Role → ECR push

These require the cross-connector entity resolution mechanism described in the correlation research. The AWS connector should emit entities with consistent ARN-based external_id values to enable correlation.
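Because ARNs follow the fixed layout arn:partition:service:region:account-id:resource, they can be parsed into stable components for correlation. The sketch below shows one way to do that and to detect an account-boundary crossing; the type and function names are hypothetical.

```python
from typing import NamedTuple


class Arn(NamedTuple):
    partition: str
    service: str
    region: str
    account_id: str
    resource: str


def parse_arn(arn: str) -> Arn:
    """Split an ARN into its five components after the 'arn' prefix.

    The resource part may itself contain colons, so split at most five times.
    """
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn":
        raise ValueError(f"not an ARN: {arn!r}")
    return Arn(*parts[1:])


def crosses_account(source_arn: str, target_arn: str) -> bool:
    """True when both ARNs carry an account ID and the IDs differ.

    Some ARNs (S3 buckets, for example) have an empty account field;
    those are treated as 'unknown' rather than as a crossing.
    """
    source, target = parse_arn(source_arn), parse_arn(target_arn)
    return bool(source.account_id and target.account_id
                and source.account_id != target.account_id)
```

Using the full ARN as external_id and the parsed account_id as a separate attribute would support both exact-match correlation and per-account grouping.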

8.2 CI/CD → ECR → Runtime (Supply Chain)

GitHub Actions → (OIDC) → IAM Role (CI role)
IAM Role (CI) → PUSH_TO → ECR Repository
ECR Repository → IMAGE_RUNS_IN → Lambda / ECS Task
Lambda / ECS Task → RUNS_AS → IAM Task Role
IAM Task Role → HAS_POLICY → IAM Policy → GRANTS → S3/RDS/Secrets

This chain reveals that a misconfigured CI/CD pipeline (or compromised GitHub Actions workflow) can transitively own production data access. This is a supply chain authority path — a new finding type worth introducing.

8.3 Bedrock LLM Egress Paths

Bedrock Agent → RUNS_AS → IAM Role → GRANTS → S3 (knowledge base)
Bedrock Agent → INVOKES → Lambda (action group) → RUNS_AS → IAM Role → GRANTS → RDS

The existing egress_category: llm field in authority paths already anticipates this. The AWS connector enables the first real population of LLM egress paths.

8.4 Cross-Account Trust Amplification

Role in Dev Account → TRUSTS → Role in Prod Account
Role in Prod Account → HAS_POLICY → Production S3 / RDS

This is a trust amplification pattern — a lower-trust account can reach higher-trust resources via cross-account AssumeRole. The auth_chain_depth field already tracks hop count; cross-account hops should increment this and trigger a finding if depth exceeds threshold (e.g., > 2 hops into a restricted domain).


9. Gaps and Open Questions

| Question | Recommended Resolution |
| --- | --- |
| Should SCPs be modelled as constraint edges or as a separate "ceiling" entity? | Separate ceiling entity with CONSTRAINS edge; mirrors how permission boundaries are handled |
| CloudTrail costs: who pays for the S3 export? | Customer responsibility; document in setup guide; S3 batch + Athena option keeps cost near zero for SV0 |
| How to handle assume-role chains > 3 hops? | Cap at 5 hops in materialiser; emit a deep_trust_chain finding |
| EKS / Kubernetes identity (IRSA, Pod Identity)? | Defer to v2 of the connector; IRSA maps to IAM Roles and can be added incrementally |
| RDS / Aurora as destination: what metadata to collect? | Instance ARN, engine, VPC, security groups; no row-level data |
| Multi-region: how to handle? | Connector iterates all enabled regions; ARN uniquely identifies resources cross-region |
| How to distinguish aws_lambda as source_system per customer account? | Tenant-scoped source systems: aws_lambda:{account_id}. Promote to first-class architectural decision; affects cross-connector entity correlation and UI display |
| Lambda→Lambda direct invocation (§3.8)? | Include INVOKES edge schema in Phase 1; implement detection in Phase 2 via policy inspection + CloudTrail |
| KMS key policies causing false-positive paths? | Phase 1: annotate paths with kms_not_evaluated: true when destination has a KMS key; Phase 2: collect key policies, add DECRYPTABLE_BY edges |
| IAM Access Analyzer as free signal? | IAM Access Analyzer already flags externally-accessible resources in customer accounts. Connector can call accessanalyzer:ListFindings to seed scope_drift findings without needing to re-derive them from policy text. Add to Phase 3 scope. |
| IAM Policy → type="role" provisional mapping? | File an ADR before Phase 1 ships to introduce type="permission_set"; current mapping is a prototype convenience only |
| ECR Inspector scan findings to graph edge? | ECR Repository → HAS_FINDING → VulnerabilityFinding adds a supply chain risk signal. Add ecr:DescribeImageScanFindings to the minimum policy; implement in Phase 2 alongside ECR pull chain. |

10. Phased Delivery

Phase 1 — IAM + Lambda Baseline (MVP)

  • IAM: roles, policies, trust relationships
  • Lambda: functions, execution roles, event sources
  • Secrets Manager + SSM Parameter Store as destination resource types
  • CloudTrail ingestion via S3 direct read (Option A) for AssumeRole + Invoke events — 30-day window
  • Bootstrap credential mechanism: SV0 IAM User per tenant + ExternalId (§6.1)
  • File ADR for type="permission_set" before shipping
  • All AWS paths carry conditions_not_evaluated: true annotation; surface in UI

Phase 2 — Container, ECR, and Lambda→Lambda Chain

  • ECR: repositories, repository policies, image-to-workload linkage
  • ECR Inspector scan findings → HAS_FINDING graph edges
  • ECS: task definitions, task roles, execution roles
  • CI/CD → ECR → runtime chain edges (supply chain authority paths)
  • Lambda→Lambda invocation detection
  • KMS key policy collection + DECRYPTABLE_BY edges; suppress false-positive paths

Phase 3 — Multi-Account, Bedrock, and IAM Access Analyzer

  • AWS Organizations: accounts, SCPs, OUs
  • IAM Identity Center: permission sets, assignments
  • Bedrock agents, knowledge bases, action groups
  • Cross-account trust amplification findings
  • IAM Access Analyzer findings as seeded scope_drift signals
  • IAM Roles Anywhere as credential bootstrap replacement (security hardening)

Phase 4 — Deep Temporal Evidence and EKS

  • Full CloudTrail data-access event processing (S3, DynamoDB, Secrets Manager)
  • Athena upgrade for large accounts (Option B)
  • EventBridge streaming option (Option C)
  • EKS / IRSA / Pod Identity support


Next Action

Status: research-complete
Decision needed from: PO (Ivan)
Options:

  1. Adopt: create GitHub issue in sv0-connectors to implement Phase 1 (IAM + Lambda MVP): AWSExtractor, IAM Roles/Policies, Lambda functions, CloudTrail AssumeRole temporal evidence, S3 batch export
  2. Defer: revisit after current Inetum engagement closes (Q2 2026)
  3. Reject: not a viable option, since Inetum and other customers have confirmed the AWS workload coverage requirement

GitHub Discussion: not yet created