AWS Connector — Research & Implementation Plan
Date: 2026-03-11
Status: Draft v2 — review feedback incorporated (Delta blockers 1–4, warnings 5–7)
Scope: sv0-connectors (new aws connector), sv0-platform (graph model extensions)
Trigger: Inetum customer engagement confirmed AWS workload coverage is a critical gap; customers run mixed Azure + AWS environments where authority paths cross cloud boundaries.
1. Context and Motivation
SecurityV0 currently covers the Microsoft identity plane (Entra ID, Azure Foundry) and ServiceNow. Enterprise customers, including Inetum, operate significant workloads on AWS: Lambda functions calling RDS, ECS containers pulling from ECR, Bedrock agents orchestrating across accounts, and Step Functions triggering cross-account executions.
Without an AWS connector, authority paths that originate in or terminate on AWS resources cannot be materialised. This is a visibility gap: a Lambda function assuming an over-privileged IAM Role and reading production S3 data is structurally identical to the Azure Logic App → ServiceNow scenarios we already model — but completely invisible to the platform.
What the AWS connector must answer
- What workloads run on AWS and what identities do they assume?
- What resources can those identities access, and via which policies?
- What is the code execution chain? (ECR → container → task role → resource)
- Where does authority cross account or cloud boundaries?
- What is the observed execution evidence? (CloudTrail as temporal data source)
2. AWS Identity Model — Key Concepts
Understanding the AWS IAM model is prerequisite to correct graph modelling.
2.1 Identity Types
| AWS Entity | SecurityV0 Entity Type | Notes |
|---|---|---|
| IAM User | identity (subtype: iam_user) | Human or programmatic; avoid in modern AWS |
| IAM Role | identity (subtype: iam_role) | Primary execution identity; assumed via STS |
| IAM Group | identity (subtype: iam_group) | Attached to users; no direct execution |
| IAM Identity Center Permission Set | identity (subtype: sso_permission_set) | Federated human access; maps to role assumption |
| Service-Linked Role | identity (subtype: service_linked_role) | AWS-managed; low risk but should be visible |
| Instance Profile | edge annotation | Wraps an IAM Role for EC2 attachment; not a separate entity |
| OIDC Federated Identity | edge (TRUSTS) | Kubernetes Service Account → IAM Role via IRSA |
2.2 Policy Types (Authority Sources)
| Policy Type | Priority for Graph |
|---|---|
| Identity-based policies (inline + managed) | Critical — direct authority grant |
| Resource-based policies (S3 Bucket Policy, Lambda Resource Policy) | High — cross-account and cross-service authority |
| Permission Boundaries | High — ceiling on effective permissions; must be modelled as a constraint edge |
| Service Control Policies (SCPs) | High — org-level ceiling; organisation connector scope |
| Session Policies | Medium — short-lived; capture from CloudTrail |
| ACLs (legacy S3) | Low — mostly legacy, can be deferred |
2.3 Structural vs Effective Permissions — Critical Scope Caveat
This connector models structural (granted) permissions, not effective permissions.
AWS IAM Conditions (aws:SourceVpc, aws:RequestedRegion, aws:PrincipalOrgID, etc.) are not evaluated by the connector. A policy statement that grants s3:ListBucket with a condition like "StringEquals": {"s3:prefix": "logs/"} will be ingested as an unconditional GRANTS edge to that S3 bucket.
This means:
- Authority paths may over-report reachability when conditions restrict access at runtime
- The via_roles and actions fields reflect policy text, not runtime enforcement
- Wherever conditions are material to risk, the finding explanation should carry a conditions_not_evaluated: true flag and the UI should surface a caveat
Implication for Phase 1: all authority paths produced by the AWS connector carry an implicit "structurally reachable, conditions not evaluated" qualifier. This is consistent with how the Azure connector handles ARM RBAC conditions today. The caveat must be documented in the customer-facing setup guide and in the UI tooltip for AWS-sourced paths.
Conditions worth modelling in a future phase: aws:PrincipalOrgID (org boundary), aws:SourceAccount (confused-deputy prevention), sts:ExternalId (cross-account guard), aws:MultiFactorAuthPresent. These reduce the effective authority surface and would shrink false-positive path counts meaningfully.
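A minimal sketch of the structural ingestion described in this section, assuming a plain-dict edge shape. The function name `statements_to_grants` and the `condition_keys` field are hypothetical, not part of the SecurityV0 schema. It records that a condition exists without evaluating it.

```python
from typing import Any


def _as_list(value: Any) -> list:
    """IAM accepts a single value or a list for Statement, Action and Resource."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def statements_to_grants(policy_document: dict) -> list:
    """Turn Allow statements into structural GRANTS edges.

    Conditions are recorded but never evaluated (Phase 1 scope).
    Deny statements are skipped in this sketch.
    """
    edges = []
    for stmt in _as_list(policy_document.get("Statement")):
        if stmt.get("Effect") != "Allow":
            continue
        condition = stmt.get("Condition") or {}
        for resource in _as_list(stmt.get("Resource")):
            edges.append({
                "edge": "GRANTS",
                "actions": _as_list(stmt.get("Action")),
                "resource": resource,
                # True whenever the policy text carries a condition we ignored
                "conditions_not_evaluated": bool(condition),
                "condition_keys": sorted(
                    key for operator in condition.values() for key in operator
                ),
            })
    return edges
```

Keeping the ignored condition keys on the edge would let a later phase decide which paths to re-check once keys such as aws:PrincipalOrgID are modelled.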
2.4 Trust Relationships
Every IAM Role has a Trust Policy that defines who can assume it. This is the primary mechanism for cross-account and cross-service authority chains:
Lambda Service Principal → AssumeRole → Execution Role
ECS Task Definition → AssumeRole → Task Role
Bedrock Agent → AssumeRole → Agent Execution Role
Cross-Account Caller → AssumeRole → Role in Target Account
EC2 Instance Profile → AssumeRole → Role
OIDC Provider (EKS IRSA) → AssumeRoleWithWebIdentity → Role
These trust relationships map directly to RUNS_AS and ASSUMES edges in the SecurityV0 graph.
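As an illustration, a trust policy document can be flattened into one edge per trusted principal. This is a sketch under assumptions: the edge label TRUSTS follows §4.2, and the `principal_kind` and `cross_account` fields are hypothetical names.

```python
from typing import Optional

PRINCIPAL_KINDS = {"Service": "service", "AWS": "aws_principal", "Federated": "federated"}


def _account_of(value: str) -> Optional[str]:
    """Extract the 12-digit account ID from an ARN or a bare account ID."""
    if len(value) == 12 and value.isdigit():
        return value
    parts = value.split(":")
    if len(parts) >= 6 and parts[0] == "arn" and parts[4]:
        return parts[4]
    return None


def trust_policy_to_edges(role_arn: str, trust_policy: dict) -> list:
    """Emit one TRUSTS edge per principal allowed to assume the role."""
    role_account = _account_of(role_arn)
    statements = trust_policy.get("Statement", [])
    if isinstance(statements, dict):
        statements = [statements]
    edges = []
    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        principal = stmt.get("Principal", {})
        if principal == "*":  # wildcard principal: anyone may attempt to assume
            principal = {"AWS": "*"}
        for kind, values in principal.items():
            for value in values if isinstance(values, list) else [values]:
                account = _account_of(value)
                edges.append({
                    "edge": "TRUSTS",
                    "role": role_arn,
                    "principal": value,
                    "principal_kind": PRINCIPAL_KINDS.get(kind, "unknown"),
                    "cross_account": account is not None and account != role_account,
                })
    return edges
```

Service principals carry no account ID, so this sketch never marks them as cross-account.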
3. Workload Types to Model
3.1 AWS Lambda
What to collect:
- Function name, ARN, runtime, description, last modified
- Execution role ARN (RUNS_AS edge)
- Resource-based policy (who can invoke the function: cross-service / cross-account)
- VPC configuration (network isolation context)
- Package type: Zip vs Image; if Image, ECR repository URI + image digest
- Concurrency settings (reserved / provisioned)
- Environment variable keys (not values, because of PII risk; but presence of DB_URL, SECRET_ARN etc. informs destination inference)
- Layers (shared code; potential additional authority surface)
- Event source mappings (what triggers this function: SQS, DynamoDB Streams, Kinesis, EventBridge)
Graph edges:
Lambda Function → RUNS_AS → IAM Role (execution role)
Lambda Function → TRIGGERED_BY → EventBridge Rule / SQS Queue / SNS Topic
Lambda Function → DEPLOYED_FROM → ECR Repository (container image)
Lambda Function → READS_FROM → S3 Bucket / DynamoDB Table (env var inference)
IAM Role → HAS_POLICY → IAM Policy
IAM Policy → GRANTS → Permission (action + resource ARN)
Permission → APPLIES_TO → AWS Resource (S3, DynamoDB, RDS, etc.)
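The collection list above could map onto a transform along these lines. It is a sketch only: the input is one item from the Lambda ListFunctions response, while the output dict shape and the function name `lambda_to_graph` are assumptions. Environment variable values are dropped before anything leaves the function.

```python
def lambda_to_graph(fn: dict) -> dict:
    """Normalise one ListFunctions item into an entity plus its RUNS_AS edge."""
    variables = (fn.get("Environment") or {}).get("Variables") or {}
    entity = {
        "type": "workload",
        "subtype": "lambda_function",
        "external_id": fn["FunctionArn"],
        "name": fn["FunctionName"],
        "runtime": fn.get("Runtime"),          # absent for container images
        "package_type": fn.get("PackageType", "Zip"),
        "env_var_keys": sorted(variables),     # keys only, never values
        "layers": [layer["Arn"] for layer in fn.get("Layers", [])],
        "in_vpc": bool((fn.get("VpcConfig") or {}).get("VpcId")),
    }
    edges = [{"edge": "RUNS_AS", "from": fn["FunctionArn"], "to": fn["Role"]}]
    return {"entity": entity, "edges": edges}
```

Destination inference from keys such as SECRET_ARN would then run on `env_var_keys` alone.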
3.2 ECS (Elastic Container Service)
What to collect:
- Task Definition: family, revision, task role ARN, execution role ARN
- Container definitions: image URI (ECR reference), command, environment variable keys
- Services: cluster, desired count, launch type (Fargate vs EC2)
- Cluster configuration
Key distinction: ECS has two roles:
- Execution Role — used by ECS agent to pull ECR images, write CloudWatch logs
- Task Role — used by the application code inside the container
Both must be captured. The Task Role is the workload's runtime identity (authority source). The Execution Role is infrastructure-level (lower priority but relevant for ECR pull chain).
Graph edges:
ECS Task Definition → RUNS_AS → IAM Task Role
ECS Task Definition → DEPLOYED_FROM → ECR Repository (image URI)
ECS Task Definition → PULLS_VIA → IAM Execution Role (ECR pull authority)
ECS Service → RUNS → ECS Task Definition
3.3 ECR (Elastic Container Registry) — The Code-Deploy Chain
ECR is the source of execution artefacts for containerised workloads. It is the AWS equivalent of an artefact registry and sits at the start of the execution chain.
What to collect:
- Repository name, ARN, URI, account ID
- Repository policy (who can pull/push — cross-account access)
- Image tags and digests in use (link to live task definitions / Lambda functions)
- Lifecycle policy (image retention — affects drift detection)
- Encryption configuration (KMS key ARN)
- Scan findings summary (ECR enhanced scanning via Inspector)
Why ECR matters for authority paths: An ECR repository with a permissive resource-based policy is an injection point — a cross-account role that can push images can alter what code runs inside the task, bypassing IAM role controls entirely. This is a high-severity pattern: the authority chain runs through the artefact, not just the role.
Graph edges:
ECR Repository → HOSTS → Container Image (tag/digest)
Container Image → EXECUTED_BY → Lambda Function / ECS Task Definition
ECR Repository → PULL_ACCESS → IAM Role (from repository policy)
ECR Repository → PUSH_ACCESS → IAM Role (CI/CD service role — high sensitivity)
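One way to derive the PULL_ACCESS and PUSH_ACCESS edges is to match the repository policy's action patterns against the ECR actions involved in pulling and pushing. The sketch below assumes a plain-dict edge shape and a hypothetical `cross_account` flag; it ignores conditions, NotAction and Deny statements.

```python
from fnmatch import fnmatchcase

PUSH_ACTIONS = ("ecr:PutImage", "ecr:InitiateLayerUpload",
                "ecr:UploadLayerPart", "ecr:CompleteLayerUpload")
PULL_ACTIONS = ("ecr:BatchGetImage", "ecr:GetDownloadUrlForLayer")


def _as_list(value):
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def _matches(patterns, actions):
    """True if any policy pattern (wildcards allowed) covers any listed action."""
    return any(fnmatchcase(a.lower(), p.lower()) for p in patterns for a in actions)


def repo_policy_edges(repo_arn: str, repo_account_id: str, policy: dict) -> list:
    """Emit PUSH_ACCESS / PULL_ACCESS edges for AWS principals in a repository policy."""
    edges = []
    for stmt in _as_list(policy.get("Statement")):
        if stmt.get("Effect") != "Allow":
            continue
        patterns = _as_list(stmt.get("Action"))
        principal = stmt.get("Principal", {})
        principals = ["*"] if principal == "*" else _as_list(principal.get("AWS"))
        for who in principals:
            same_account = who == repo_account_id or f":{repo_account_id}:" in who
            for label, actions in (("PUSH_ACCESS", PUSH_ACTIONS), ("PULL_ACCESS", PULL_ACTIONS)):
                if _matches(patterns, actions):
                    edges.append({"edge": label, "repository": repo_arn,
                                  "principal": who, "cross_account": not same_account})
    return edges
```

A cross-account PUSH_ACCESS edge is the injection point described above and is a natural candidate for a high-severity finding.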
3.4 AWS Bedrock (AI Workloads)
Given increasing Bedrock adoption, this is high value for the AI-workload coverage story.
What to collect:
- Bedrock Agent: ID, name, execution role ARN, foundation model ID
- Agent Action Groups: Lambda function ARN (the code the agent executes)
- Knowledge Bases: ID, data source (S3 bucket), embedding model
- Guardrails: ID, topics blocked, PII filtering
- Model invocation logging: enabled/disabled (audit trail)
Graph edges:
Bedrock Agent → RUNS_AS → IAM Execution Role
Bedrock Agent → INVOKES → Lambda Function (action group)
Bedrock Agent → READS_FROM → S3 Bucket (knowledge base source)
Lambda Function → RUNS_AS → IAM Role (action group execution)
Why this matters: A Bedrock Agent is an autonomous workload. Its authority path — Agent → Role → S3 data → external API — is exactly the LLM egress pattern (egress_category: llm) already defined in the authority paths model. The data model supports it; we just need the connector.
3.5 Step Functions
What to collect:
- State machine: ARN, name, type (Standard / Express), IAM role
- Definition: states, resource ARNs invoked (Lambda, ECS, DynamoDB, etc.)
- Execution logging configuration
Graph edges:
Step Functions SM → RUNS_AS → IAM Role
Step Functions SM → INVOKES → Lambda Function / ECS Task / SNS / SQS
Step Functions are orchestrators — they chain workloads. A state machine with an over-privileged role can invoke Lambda, start ECS tasks, write to DynamoDB, and call external APIs in a single execution. Visibility into the full invocation graph is critical for authority path reconstruction.
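The invocation graph can be recovered from the state machine definition, which is Amazon States Language JSON. The sketch below walks Task states, including those nested in Parallel branches and Map iterators. The list of parameter keys it inspects is an assumption and not exhaustive, and dynamic references (keys ending in `.$`) are not resolved.

```python
import json

# Parameter keys that commonly carry the real target of a service integration
TARGET_KEYS = ("FunctionName", "TaskDefinition", "TopicArn", "QueueUrl", "TableName")


def invoked_resources(definition_json: str) -> set:
    """Collect resource identifiers referenced by Task states in a definition."""
    found = set()

    def walk(states: dict) -> None:
        for state in states.values():
            if state.get("Type") == "Task" and "Resource" in state:
                found.add(state["Resource"])
                params = state.get("Parameters") or {}
                for key in TARGET_KEYS:
                    if isinstance(params.get(key), str):
                        found.add(params[key])
            for branch in state.get("Branches", []):   # Parallel states
                walk(branch.get("States", {}))
            for key in ("Iterator", "ItemProcessor"):  # Map states (old and new form)
                if key in state:
                    walk(state[key].get("States", {}))

    walk(json.loads(definition_json).get("States", {}))
    return found
```

Each returned identifier would become the target of an INVOKES edge from the state machine.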
3.6 EventBridge
What to collect:
- Rules: name, event pattern or schedule, target ARNs, IAM role (if cross-account)
- Targets: Lambda, ECS, Step Functions, SNS, SQS, external API destinations
EventBridge rules are trigger sources for workloads. The TRIGGERED_BY edge completes the "what started this?" question in authority path lineage.
3.7 Sensitive Data Stores (Secrets Manager + SSM Parameter Store)
These are the highest-value destination types for sensitive credential data and were missing from the initial draft.
AWS Secrets Manager:
- Secret ARN, name, description, KMS key ID (encryption key)
- Resource policy (who can read cross-account)
- Rotation status (unrotated secrets = stale credential risk)
- CloudTrail GetSecretValue events (direct evidence of access)
AWS Systems Manager Parameter Store:
- Parameter name, ARN, type (String / SecureString / StringList)
- KMS key ID (for SecureString)
- GetParameter / GetParameters CloudTrail events
Both are resource entities with subtype: secrets_manager_secret / subtype: ssm_parameter. IAM policies granting secretsmanager:GetSecretValue or ssm:GetParameter create GRANTS edges to these destinations with sensitivity: restricted by default.
Graph edges:
IAM Policy → GRANTS → Secrets Manager Secret (sensitivity: restricted)
IAM Policy → GRANTS → SSM SecureString Parameter (sensitivity: restricted)
Lambda Function → READS_FROM → Secrets Manager Secret (env var `SECRET_ARN` inference)
Minimum policy additions (update §6):
"secretsmanager:ListSecrets", "secretsmanager:DescribeSecret", "secretsmanager:GetResourcePolicy",
"ssm:DescribeParameters", "ssm:GetParametersByPath", "ssm:ListTagsForResource"
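Note: do not add secretsmanager:GetSecretValue or ssm:GetParameter; the connector reads metadata only, never secret values. ssm:GetParametersByPath needs review against that rule, because it returns parameter values (SecureString values stay encrypted unless decryption is requested); ssm:DescribeParameters alone returns metadata without values.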
3.8 Lambda → Lambda Direct Invocation
Lambda functions can invoke other Lambda functions directly via lambda:InvokeFunction. This creates workload-to-workload authority edges that are distinct from the workload→identity→resource chain.
Graph edge:
Lambda Function A → INVOKES → Lambda Function B
Lambda Function B → RUNS_AS → IAM Role B → HAS_POLICY → ...
This is significant because a low-sensitivity Lambda with a permissive execution role can transitively reach high-sensitivity data by invoking a privileged Lambda. The auth_chain_depth must be incremented per hop.
How to detect: lambda:InvokeFunction permissions in identity-based policies + Lambda resource-based policies listing caller ARNs. Also collectible from CloudTrail Invoke events with a non-human caller identity.
Defer to Phase 2 but include in schema from Phase 1 to avoid a breaking change later.
4. Cross-Account and Multi-Account Modelling
AWS Organizations is the control plane for multi-account AWS environments. This is where Service Control Policies (SCPs) live — they are org-level authority ceilings that cannot be exceeded by any IAM policy in a member account.
4.1 What to collect (Organizations / IAM Identity Center)
- AWS Organizations: Management account, member accounts, OUs, applied SCPs
- IAM Identity Center (SSO): Permission sets, account assignments (user/group → permission set → account)
- Cross-account trust policies: IAM roles with sts:AssumeRole trust to external accounts
4.2 Graph model for cross-account authority
SCP (Org-level) → CONSTRAINS → AWS Account (ceiling on all roles in account)
IAM Role (Account A) → TRUSTS → IAM Role (Account B) [cross-account AssumeRole]
IAM Role (Account A) → TRUSTS → AWS Service Principal (service-to-service)
IAM Role → SUBJECT_TO → Permission Boundary (per-role ceiling)
The CONSTRAINS and SUBJECT_TO edges are ceiling edges — they reduce effective authority. The graph engine must apply these during authority path materialisation to compute the effective permission set, not just the granted set.
5. CloudTrail — Temporal Evidence
CloudTrail is the AWS equivalent of execution logs. It provides the observed execution evidence the authority paths model depends on.
5.1 Priority events to capture
| CloudTrail Event | Maps To |
|---|---|
| AssumeRole | Confirms a role assumption occurred; populates execution_30d, last_execution_at |
| Invoke (Lambda) | Confirms function execution; links to workload |
| GetSecretValue (Secrets Manager) | Sensitive data access; informs data_domain |
| GetObject / PutObject (S3) | Data read/write; informs actions[] |
| DescribeInstances / resource APIs | Resource enumeration; scope drift signal |
| CreateRole / AttachRolePolicy | Permission change event; drift trigger |
| SwitchRole (Console) | Human cross-account access |
5.2 Architecture — Why cloudtrail:LookupEvents is not viable
Do not use cloudtrail:LookupEvents as the primary ingestion path.
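The LookupEvents API is throttled to 2 requests per second per account per Region, with at most 50 events per response. A production account generating thousands of priority events per minute outpaces that budget within seconds, so the connector would be throttled and fall behind. LookupEvents also covers only the last 90 days, returns management events only (data events such as S3 GetObject and Lambda Invoke never appear), and accepts a single lookup attribute per request, so filtering for several event names means several paginated passes. It is useful for one-off debugging, not systematic ingestion.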
5.3 Recommended architecture: S3 + Athena
Option A — S3 direct read (recommended for MVP):
- Customer has a CloudTrail trail writing to an S3 bucket (standard setup in most enterprises)
- SV0 connector is granted s3:GetObject + s3:ListBucket on that bucket with a path prefix filter (e.g., AWSLogs/{account_id}/CloudTrail/)
- Connector downloads compressed JSON log files for the target 30-day window, decompresses, filters for priority events
- Aggregates AssumeRole + Invoke + GetSecretValue + GetObject calls per (principal, resource) pair → populates execution_30d / last_execution_at
Cost: S3 GET requests per scan. At ~10K log files per 30 days this is cents per scan. Customer controls the bucket.
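The decompress, filter and aggregate steps of Option A can be sketched as follows. It assumes the standard CloudTrail log file layout (a gzipped JSON object with a top-level Records array); the helper names and the fallback order for locating the target resource are illustrative choices. Selecting which files fall inside the 30-day window is left to the caller.

```python
import gzip
import json

PRIORITY_EVENTS = {"AssumeRole", "Invoke", "GetSecretValue", "GetObject"}


def _resource_of(record: dict) -> str:
    """Best-effort target of the call: resources[] first, then request parameters."""
    resources = record.get("resources") or []
    if resources and resources[0].get("ARN"):
        return resources[0]["ARN"]
    params = record.get("requestParameters") or {}
    for key in ("roleArn", "functionName", "secretId"):
        if params.get(key):
            return params[key]
    if params.get("bucketName"):
        return "arn:aws:s3:::" + params["bucketName"]
    return "unknown"


def aggregate_log_file(gz_bytes: bytes, stats: dict = None) -> dict:
    """Fold one gzipped CloudTrail log file into per-(principal, resource) counters."""
    stats = {} if stats is None else stats
    records = json.loads(gzip.decompress(gz_bytes)).get("Records", [])
    for record in records:
        if record.get("eventName") not in PRIORITY_EVENTS:
            continue
        identity = record.get("userIdentity") or {}
        principal = identity.get("arn") or identity.get("invokedBy") or "unknown"
        entry = stats.setdefault(
            (principal, _resource_of(record)),
            {"execution_30d": 0, "last_execution_at": ""},
        )
        entry["execution_30d"] += 1
        # eventTime is ISO 8601 in UTC, so string comparison orders correctly
        entry["last_execution_at"] = max(entry["last_execution_at"], record.get("eventTime", ""))
    return stats
```

Passing the same `stats` dict across files accumulates counts for the whole window.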
Option B — Athena query (recommended for large accounts):
- Customer creates an Athena table over the CloudTrail S3 prefix (AWS provides a standard DDL for this)
- SV0 connector runs a parameterised Athena query with a 30-day eventTime filter and eventName IN (...) clause
- Query results written to a customer-controlled S3 output bucket, connector reads results
Cost: Athena charges ~$5/TB scanned. For 30 days of typical CloudTrail data (~50 GB) this is ~$0.25 per scan.
Option C — EventBridge streaming (future): Customer enables CloudTrail → EventBridge; connector subscribes via SQS for near-real-time ingestion. More complex and requires customer-side infrastructure. Design Option A so this is addable as a configuration flag.
Recommendation: Start with Option A for MVP. Document Option B as the upgrade path for accounts producing >1 GB/day of CloudTrail logs.
6. Authentication Strategy
The connector needs AWS credentials. Options:
| Method | Pros | Cons |
|---|---|---|
| Cross-account IAM Role (recommended) | No long-lived credentials; customer creates role with read-only policies | Requires customer to create role; ARN config per tenant |
| IAM User + Access Key | Simple to set up | Long-lived credentials; rotation risk |
| IAM Roles Anywhere (cert-based) | Works from non-AWS hosts | PKI setup complexity |
Recommended approach: Cross-account read-only IAM Role with an external ID (ExternalId condition). Customer creates SecurityV0ReadOnlyRole in their account; connector assumes it via STS using the SV0 management account. ExternalId prevents confused-deputy attacks.
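For illustration, the trust policy the customer would attach to SecurityV0ReadOnlyRole can be generated as below. The function name and the example principal ARN are hypothetical; the statement shape (sts:AssumeRole with a StringEquals condition on sts:ExternalId) is the standard pattern for third-party cross-account access.

```python
import json


def build_trust_policy(sv0_principal_arn: str, external_id: str) -> dict:
    """Trust policy allowing one SV0 principal to assume the role with an ExternalId."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"AWS": sv0_principal_arn},
            "Action": "sts:AssumeRole",
            "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
        }],
    }


if __name__ == "__main__":
    # Example values are placeholders, not real account IDs
    policy = build_trust_policy(
        "arn:aws:iam::999988887777:user/sv0-connector-example", "example-external-id"
    )
    print(json.dumps(policy, indent=2))
```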
6.1 Bootstrap Credential Problem
The research previously left this unaddressed. This section resolves it.
The cross-account AssumeRole flow requires SV0 to have a starting AWS identity before it can call sts:AssumeRole into the customer account. The connector runs outside AWS (Docker on a Mac Mini M4), so it cannot use EC2 instance metadata or ECS task roles.
Resolved approach — SV0 Service Account IAM User per tenant:
- SecurityV0 maintains a dedicated IAM User (e.g., sv0-connector-{tenant_id}) in a SV0-owned AWS management account. This user has only one permission: sts:AssumeRole on the specific customer role ARN.
- The IAM User's access key + secret key are stored in 1Password per tenant, resolved at container start into env vars (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY).
- The connector calls sts:AssumeRole with RoleArn=customer_role_arn and ExternalId=tenant_specific_secret to get short-lived session credentials.
- All subsequent API calls use the short-lived session. Credentials expire after 1 hour (configurable up to 12h); connector refreshes before expiry.
This is structurally identical to how the ServiceNow connector handles Basic Auth credentials — a long-lived credential stored in 1Password bootstraps a shorter-lived session.
Security controls:
- The IAM User's access key has zero permissions except sts:AssumeRole on the one role ARN
- Customer's trust policy requires sts:ExternalId match; prevents impersonation even if the key leaks
- Keys are rotated on the same schedule as other secrets (monthly minimum)
- Key rotation is documented in the connector setup guide as a customer onboarding step
Future enhancement: Replace the IAM User with IAM Roles Anywhere + a certificate from the SV0 internal CA. Eliminates long-lived credentials entirely. Flag as a Phase 3 security hardening task.
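The bootstrap flow above could look roughly like this with boto3. Treat it as a sketch: the function names and the session-name format are assumptions. boto3 is imported lazily so the request builder can be exercised without AWS access; the base credentials are read by boto3 from AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY.

```python
import re


def build_assume_role_request(role_arn: str, external_id: str, tenant_id: str,
                              duration_seconds: int = 3600) -> dict:
    """Arguments for sts.assume_role; validates what STS would reject anyway."""
    if not 900 <= duration_seconds <= 43200:
        raise ValueError("duration_seconds must be between 900 and 43200")
    # RoleSessionName allows word characters and + = , . @ - (max 64 chars)
    session_name = re.sub(r"[^\w+=,.@-]", "-", "sv0-connector-" + tenant_id)[:64]
    return {
        "RoleArn": role_arn,
        "RoleSessionName": session_name,
        "ExternalId": external_id,
        "DurationSeconds": duration_seconds,
    }


def assume_customer_role(role_arn: str, external_id: str, tenant_id: str):
    """Exchange the long-lived IAM User key for a short-lived customer session."""
    import boto3  # imported here so the builder above has no boto3 dependency

    sts = boto3.client("sts")
    response = sts.assume_role(**build_assume_role_request(role_arn, external_id, tenant_id))
    creds = response["Credentials"]
    session = boto3.Session(
        aws_access_key_id=creds["AccessKeyId"],
        aws_secret_access_key=creds["SecretAccessKey"],
        aws_session_token=creds["SessionToken"],
    )
    # Expiration tells the caller when to refresh
    return session, creds["Expiration"]
```

Durations above one hour only succeed if the customer role's maximum session duration has been raised accordingly.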
6.2 Minimum Policy for the Customer Read-Only Role
Note: cloudtrail:LookupEvents has been removed (not viable at scale — see §5.2). CloudTrail access is now via S3 GetObject.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "iam:List*", "iam:Get*",
        "lambda:List*", "lambda:Get*",
        "ecs:List*", "ecs:Describe*",
        "ecr:List*", "ecr:Describe*", "ecr:GetRepositoryPolicy",
        "ecr:BatchGetImage",
        "bedrock:List*", "bedrock:Get*",
        "states:List*", "states:Describe*",
        "events:List*", "events:Describe*",
        "organizations:List*", "organizations:Describe*",
        "s3:ListAllMyBuckets", "s3:GetBucketPolicy", "s3:GetBucketAcl", "s3:GetBucketLocation",
        "secretsmanager:ListSecrets", "secretsmanager:DescribeSecret", "secretsmanager:GetResourcePolicy",
        "ssm:DescribeParameters", "ssm:GetParametersByPath", "ssm:ListTagsForResource",
        "kms:ListKeys", "kms:DescribeKey", "kms:GetKeyPolicy",
        "cloudtrail:GetTrailStatus", "cloudtrail:DescribeTrails",
        "athena:StartQueryExecution", "athena:GetQueryExecution", "athena:GetQueryResults"
      ],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:ListBucket"],
      "Resource": [
        "arn:aws:s3:::{customer-cloudtrail-bucket}",
        "arn:aws:s3:::{customer-cloudtrail-bucket}/*",
        "arn:aws:s3:::{customer-athena-results-bucket}",
        "arn:aws:s3:::{customer-athena-results-bucket}/*"
      ]
    }
  ]
}
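The CloudTrail and Athena bucket ARNs are provided by the customer during connector onboarding. Using Resource: "*" for S3 GetObject would be excessively broad and should be resisted even when customers offer it. If Option B (Athena) is enabled, the role also needs s3:PutObject on the results bucket and read access to the Glue Data Catalog, because Athena writes results and resolves table metadata with the caller's permissions.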
7. Connector Architecture — Implementation Sketch
Following the existing Extract → Transform → Diff → Load pattern:
7.1 Extract Phase
AWSExtractor
├── IAMExtractor → users, roles, groups, policies, trust docs
├── LambdaExtractor → functions, execution roles, event sources
├── ECSExtractor → task definitions, services, clusters
├── ECRExtractor → repositories, repository policies, image tags
├── BedrockExtractor → agents, knowledge bases, action groups
├── StepFunctionsExtractor → state machines, definitions
├── EventBridgeExtractor → rules, targets
├── OrgsExtractor → accounts, OUs, SCPs, SSO assignments
└── CloudTrailExtractor → recent AssumeRole, Invoke, data-access events
All extractors use boto3 with the assumed cross-account role session. Pagination is mandatory: most AWS list and describe APIs return results in pages, and reading only the first page silently truncates the inventory.
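A small helper built on boto3's paginator interface keeps pagination in one place for every extractor. The helper name `paginate` is an assumption; `get_paginator` and `paginate` are the standard boto3 client calls.

```python
from typing import Any, Iterator


def paginate(client: Any, operation: str, result_key: str, **kwargs) -> Iterator[dict]:
    """Yield every item across all pages of a paginated boto3 operation.

    client      a boto3 client created from the assumed-role session
    operation   the snake_case operation name, e.g. "list_functions"
    result_key  the response key holding the items, e.g. "Functions"
    """
    paginator = client.get_paginator(operation)
    for page in paginator.paginate(**kwargs):
        yield from page.get(result_key, [])
```

Typical use would be `paginate(session.client("lambda"), "list_functions", "Functions")` or `paginate(session.client("iam"), "list_roles", "Roles")`.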
7.2 Transform Phase
Normalise raw AWS API responses to NormalizedGraph entities:
# Entity type mapping
IAM Role → Entity(type="identity", subtype="iam_role", source_system="aws_iam")
Lambda Function → Entity(type="workload", subtype="lambda_function", source_system="aws_lambda")
ECS Task Def → Entity(type="workload", subtype="ecs_task", source_system="aws_ecs")
ECR Repository → Entity(type="resource", subtype="ecr_repository", source_system="aws_ecr")
S3 Bucket → Entity(type="resource", subtype="s3_bucket", source_system="aws_s3")
Secrets Manager → Entity(type="resource", subtype="secrets_manager_secret", source_system="aws_secretsmanager")
SSM Parameter → Entity(type="resource", subtype="ssm_parameter", source_system="aws_ssm")
IAM Policy → Entity(type="role", subtype="iam_policy", source_system="aws_iam")
AWS Account → Entity(type="tenant", subtype="aws_account", source_system="aws_orgs")
On IAM Policy → type="role": This mapping requires explicit justification against the SecurityV0 9-type data model. IAM Policies are not "roles" in the human sense; however, in the SecurityV0 entity model, type="role" denotes a permission assignment object: an entity that sits between an identity and a permission grant, analogous to an Azure RBAC role assignment or a ServiceNow OAuth scope. An IAM Managed Policy occupies this structural position: it binds to an identity (via HAS_POLICY) and grants permissions (via GRANTS). Using type="role" preserves the existing materialiser logic unchanged. An alternative mapping, type="permission_set" with a new subtype, would be cleaner semantically but requires a data model ADR and a materialiser change. Recommendation: file an ADR to introduce type="permission_set" before Phase 1 ships; use type="role" for the initial prototype only, and flag it with a _type_provisional: true annotation so the migration path is traceable.
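The mapping table and the provisional flag can be sketched together. The `Entity` dataclass here is a stand-in for the real NormalizedGraph entity, whose fields are not shown in this document; `make_entity` and `SUBTYPE_MAP` are hypothetical names.

```python
from dataclasses import dataclass, field

# subtype -> (entity type, source_system), taken from the mapping above
SUBTYPE_MAP = {
    "iam_role": ("identity", "aws_iam"),
    "lambda_function": ("workload", "aws_lambda"),
    "ecs_task": ("workload", "aws_ecs"),
    "ecr_repository": ("resource", "aws_ecr"),
    "s3_bucket": ("resource", "aws_s3"),
    "secrets_manager_secret": ("resource", "aws_secretsmanager"),
    "ssm_parameter": ("resource", "aws_ssm"),
    "iam_policy": ("role", "aws_iam"),
    "aws_account": ("tenant", "aws_orgs"),
}


@dataclass
class Entity:
    type: str
    subtype: str
    source_system: str
    external_id: str
    annotations: dict = field(default_factory=dict)


def make_entity(subtype: str, external_id: str) -> Entity:
    """Build a normalised entity; marks the IAM Policy mapping as provisional."""
    entity_type, source_system = SUBTYPE_MAP[subtype]
    annotations = {"_type_provisional": True} if subtype == "iam_policy" else {}
    return Entity(entity_type, subtype, source_system, external_id, annotations)
```

Keeping the flag on the entity means a later migration to type="permission_set" can select exactly the affected rows.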
7.3 Authority Path Materialisation Hooks
AWS authority paths follow the same materialisation chain as Azure:
Workload → RUNS_AS → IAM Role → HAS_POLICY → IAM Policy → GRANTS → Permission → APPLIES_TO → Resource
But AWS has additional complexity layers the materialiser must handle:
- Effective permissions = identity policy ∩ permission boundary ∩ SCP, minus any explicit Deny. Boundaries and SCPs are ceilings: an action must be allowed by every layer that applies. See §2.3 on the structural caveat; conditions are not evaluated in Phase 1.
- Resource policies: S3 bucket policies and Lambda resource policies can grant access independently of the identity policy; these must generate their own GRANTS edges from the resource side, not just the identity side.
- KMS key policies, an implicit authority layer (warning): Many AWS resources (S3 objects, RDS clusters, Secrets Manager secrets, EBS volumes) are encrypted with KMS. Reading such a resource requires kms:Decrypt on its key in addition to whatever the resource's own policy grants, so the key policy acts as a second gate. This creates false-positive paths if the connector materialises S3 → IAM Role paths without also checking whether the role can decrypt the bucket's KMS key. In Phase 1, the materialiser should emit a kms_not_evaluated annotation on any path where the destination resource has a KMS key ARN. Phase 2 should collect KMS key policies and add DECRYPTABLE_BY edges so the materialiser can filter paths that would fail at the KMS layer.
All three require modelling as constraint/annotation edges evaluated at materialisation time.
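AWS treats permission boundaries and SCPs as ceilings: an action is usable only when the identity policy and every attached ceiling allow it, and an explicit Deny overrides everything. A minimal sketch of that check for a single action follows. It compares action patterns only, ignores resources and conditions, and all names are illustrative.

```python
from fnmatch import fnmatchcase


def _allows(patterns, action: str) -> bool:
    """IAM action names are case-insensitive and may contain * and ? wildcards."""
    return any(fnmatchcase(action.lower(), p.lower()) for p in patterns)


def structurally_allowed(action: str, identity_actions, boundary_actions=None,
                         scp_actions=None, denied_actions=()) -> bool:
    """Identity policy, intersected with each ceiling that is present, minus Deny."""
    if _allows(denied_actions, action):
        return False  # explicit Deny always wins
    if not _allows(identity_actions, action):
        return False
    # None means no ceiling of that kind is attached
    if boundary_actions is not None and not _allows(boundary_actions, action):
        return False
    if scp_actions is not None and not _allows(scp_actions, action):
        return False
    return True
```

For example, an identity policy granting s3:* under a boundary of s3:Get* leaves s3:GetObject usable but not s3:PutObject.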
8. Graph Enhancement Opportunities
8.1 Cross-Cloud Authority Paths (Azure → AWS)
The cross-connector correlation research identified the need for unified paths across platform boundaries. With an AWS connector live, the following cross-cloud patterns become detectable:
Azure Logic App → (HTTP) → API Gateway → Lambda → DynamoDB
Azure Foundry Agent → (HTTP) → Lambda (action group) → S3
GitHub Actions OIDC → AssumeRole → IAM Role → ECR push
These require the cross-connector entity resolution mechanism described in the correlation research. The AWS connector should emit entities with consistent ARN-based external_id values to enable correlation.
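A consistent external_id starts with parsing ARNs the same way everywhere. The sketch below splits an ARN into its six colon-separated fields and derives an account-scoped source system string of the form aws_lambda:{account_id}. The helper names are hypothetical, and the `aws_` + service naming is an assumption that would need a lookup table where the connector uses a different short name (for example aws_orgs).

```python
from typing import NamedTuple


class Arn(NamedTuple):
    partition: str
    service: str
    region: str
    account_id: str
    resource: str


def parse_arn(arn: str) -> Arn:
    """Split arn:partition:service:region:account-id:resource into fields."""
    parts = arn.split(":", 5)  # the resource part may itself contain colons
    if len(parts) != 6 or parts[0] != "arn":
        raise ValueError(f"not an ARN: {arn!r}")
    return Arn(*parts[1:])


def scoped_source_system(arn: str, default_account_id: str = "") -> str:
    """Account-scoped source system; S3 ARNs carry no account, so a default is needed."""
    parsed = parse_arn(arn)
    account = parsed.account_id or default_account_id
    return f"aws_{parsed.service}:{account}"
```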
8.2 CI/CD → ECR → Runtime (Supply Chain Chain)
GitHub Actions → (OIDC) → IAM Role (CI role)
IAM Role (CI) → PUSH_TO → ECR Repository
ECR Repository → IMAGE_RUNS_IN → Lambda / ECS Task
Lambda / ECS Task → RUNS_AS → IAM Task Role
IAM Task Role → HAS_POLICY → IAM Policy → GRANTS → S3/RDS/Secrets
This chain reveals that a misconfigured CI/CD pipeline (or compromised GitHub Actions workflow) can transitively own production data access. This is a supply chain authority path — a new finding type worth introducing.
8.3 Bedrock LLM Egress Paths
Bedrock Agent → RUNS_AS → IAM Role → GRANTS → S3 (knowledge base)
Bedrock Agent → INVOKES → Lambda (action group) → RUNS_AS → IAM Role → GRANTS → RDS
The existing egress_category: llm field in authority paths already anticipates this. The AWS connector enables the first real population of LLM egress paths.
8.4 Cross-Account Trust Amplification
Role in Dev Account → TRUSTS → Role in Prod Account
Role in Prod Account → HAS_POLICY → Production S3 / RDS
This is a trust amplification pattern — a lower-trust account can reach higher-trust resources via cross-account AssumeRole. The auth_chain_depth field already tracks hop count; cross-account hops should increment this and trigger a finding if depth exceeds threshold (e.g., > 2 hops into a restricted domain).
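Hop counting for this pattern can be sketched as a breadth-first walk over reachability edges. The edge representation (source, target pairs), the function names and the default cap are assumptions for illustration; the real materialiser may track depth differently.

```python
from collections import deque


def chain_depths(edges, start: str, max_hops: int = 5) -> dict:
    """Shortest hop count from start to every node reachable within max_hops."""
    graph = {}
    for source, target in edges:
        graph.setdefault(source, []).append(target)
    depths = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if depths[node] >= max_hops:
            continue  # cap reached: do not expand further
        for neighbour in graph.get(node, []):
            if neighbour not in depths:
                depths[neighbour] = depths[node] + 1
                queue.append(neighbour)
    return depths


def deep_chain_nodes(depths: dict, threshold: int = 2) -> list:
    """Nodes whose auth_chain_depth exceeds the finding threshold."""
    return sorted(node for node, depth in depths.items() if depth > threshold)
```

Because the walk is breadth-first, each node gets its shortest chain, which avoids inflating depth when several trust routes exist.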
9. Gaps and Open Questions
| Question | Recommended Resolution |
|---|---|
| Should SCPs be modelled as constraint edges or as a separate "ceiling" entity? | Separate ceiling entity with CONSTRAINS edge — mirrors how permission boundaries are handled |
| CloudTrail costs: who pays for the S3 export? | Customer responsibility; document in setup guide; S3 batch + Athena option keeps cost near zero for SV0 |
| How to handle assume-role chains > 3 hops? | Cap at 5 hops in materialiser; emit a deep_trust_chain finding |
| EKS / Kubernetes identity (IRSA, Pod Identity)? | Defer to v2 of the connector — IRSA maps to IAM Roles and can be added incrementally |
| RDS / Aurora as destination: what metadata to collect? | Instance ARN, engine, VPC, security groups; no row-level data |
| Multi-region: how to handle? | Connector iterates all enabled regions; ARN uniquely identifies resources cross-region |
| How to distinguish aws_lambda as source_system per customer account? | Tenant-scoped source systems: aws_lambda:{account_id}; promote to first-class architectural decision; affects cross-connector entity correlation and UI display |
| Lambda→Lambda direct invocation (§3.8)? | Include INVOKES edge schema in Phase 1; implement detection in Phase 2 via policy inspection + CloudTrail |
| KMS key policies causing false-positive paths? | Phase 1: annotate paths with kms_not_evaluated: true when destination has a KMS key; Phase 2: collect key policies, add DECRYPTABLE_BY edges |
| IAM Access Analyzer as free signal? | IAM Access Analyzer already flags externally-accessible resources in customer accounts. Connector can call accessanalyzer:ListFindings to seed scope_drift findings without needing to re-derive them from policy text. Add to Phase 3 scope. |
| IAM Policy → type="role" provisional mapping? | File an ADR before Phase 1 ships to introduce type="permission_set"; current mapping is a prototype convenience only |
| ECR Inspector scan findings to graph edge? | ECR Repository → HAS_FINDING → VulnerabilityFinding adds a supply chain risk signal. ecr:DescribeImageScanFindings is already covered by ecr:Describe* in the minimum policy (§6.2); implement in Phase 2 alongside ECR pull chain. |
10. Phased Delivery
Phase 1 — IAM + Lambda Baseline (MVP)
- IAM: roles, policies, trust relationships
- Lambda: functions, execution roles, event sources
- Secrets Manager + SSM Parameter Store as destination resource types
- CloudTrail ingestion via S3 direct read (Option A) for AssumeRole + Invoke events (30-day window)
- Bootstrap credential mechanism: SV0 IAM User per tenant + ExternalId (§6.1)
- File ADR for type="permission_set" before shipping
- All AWS paths carry conditions_not_evaluated: true annotation; surface in UI
Phase 2 — Container, ECR, and Lambda→Lambda Chain
- ECR: repositories, repository policies, image-to-workload linkage
- ECR Inspector scan findings → HAS_FINDING graph edges
- ECS: task definitions, task roles, execution roles
- CI/CD → ECR → runtime chain edges (supply chain authority paths)
- Lambda→Lambda invocation detection
- KMS key policy collection + DECRYPTABLE_BY edges; suppress false-positive paths
Phase 3 — Multi-Account, Bedrock, and IAM Access Analyzer
- AWS Organizations: accounts, SCPs, OUs
- IAM Identity Center: permission sets, assignments
- Bedrock agents, knowledge bases, action groups
- Cross-account trust amplification findings
- IAM Access Analyzer findings as seeded scope_drift signals
- IAM Roles Anywhere as credential bootstrap replacement (security hardening)
Phase 4 — Deep Temporal Evidence and EKS
- Full CloudTrail data-access event processing (S3, DynamoDB, Secrets Manager)
- Athena upgrade for large accounts (Option B)
- EventBridge streaming option (Option C)
- EKS / IRSA / Pod Identity support
11. Related Documents
- Connector Framework (05-connectors) — interface contract, Extract→Transform→Diff→Load pattern
- Access Paths (10-access-paths) — path data model and API
- Cross-Connector Entity Correlation Research — unified path materialisation across connectors
- Azure Foundry Connector Plan — reference implementation patterns
- Inetum Auth Integration Response — customer context driving this work
Next Action
Status: research-complete
Decision needed from: PO (Ivan)
Options:
- Adopt: create GitHub issue in sv0-connectors to implement Phase 1 (IAM + Lambda MVP): AWSExtractor, IAM Roles/Policies, Lambda functions, CloudTrail AssumeRole temporal evidence, S3 batch export
- Defer: revisit after current Inetum engagement closes (Q2 2026)
- Reject: not applicable given Inetum and other customers confirmed AWS workload coverage requirement
GitHub Discussion: not yet created