Security

Scope

This file covers what security controls an architecture must address — compliance, identity, secrets, encryption, network security, access management, audit, and incident response — regardless of cloud provider or on-premises platform. For provider-specific how (IAM policies, security groups, managed services), see the provider security files linked in See Also.

Checklist

Why This Matters

Security breaches are existential risks that extend far beyond technical remediation. A single compromised credential, unpatched vulnerability, or misconfigured security group can lead to data exfiltration, ransomware deployment, or regulatory action. The average cost of a data breach exceeds $4 million, and regulated industries face additional penalties: HIPAA violations carry fines up to roughly $2 million per violation category per year, PCI DSS non-compliance can result in transaction processing bans, and GDPR fines can reach 4% of global annual turnover or EUR 20 million, whichever is higher.

The most common security architecture failures are not exotic zero-day exploits but fundamental design gaps: overly permissive IAM roles that grant admin access to application workloads, secrets stored in environment variables or committed to source control, security groups that allow 0.0.0.0/0 inbound access "temporarily" and never get tightened, and audit logging that is either disabled or sends logs to a destination the attacker can delete. These gaps persist because security controls are often treated as a post-deployment checklist rather than an integral part of the architecture design.
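One of the gaps above, inbound rules left open to 0.0.0.0/0, is easy to check mechanically. The following is a minimal sketch, assuming a hypothetical provider-neutral `InboundRule` record and an assumed set of intentionally public ports (443 here); real rule data would come from whatever inventory or IaC tooling the environment uses.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class InboundRule:
    """Hypothetical, provider-neutral view of one inbound firewall rule."""
    cidr: str
    port: int
    description: str = ""


def is_world_open(rule: InboundRule) -> bool:
    """Return True when the rule's source range covers every address."""
    network = ipaddress.ip_network(rule.cidr, strict=False)
    # A prefix length of 0 means 0.0.0.0/0 (IPv4) or ::/0 (IPv6).
    return network.prefixlen == 0


def find_world_open(
    rules: list[InboundRule],
    public_ports: frozenset[int] = frozenset({443}),  # assumption: only HTTPS is public
) -> list[InboundRule]:
    """List world-open rules on ports that are not meant to be public."""
    return [r for r in rules if is_world_open(r) and r.port not in public_ports]
```

Running a check like this in CI or on a schedule is one way to keep a "temporary" open rule from becoming permanent.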

A well-designed security architecture assumes breach (zero trust), limits blast radius through segmentation and least privilege, detects anomalies through comprehensive logging and monitoring, and enables rapid response through pre-planned incident procedures. Security decisions made during architecture design — IAM model, network topology, secrets management approach, encryption strategy — are extremely expensive to change after deployment, making it critical to get them right early.

Common Decisions (ADR Triggers)

ADR: Compliance Framework Selection

Context: The organization must identify which compliance standards apply and how they shape the technical architecture.

Options:

Framework	Applicability	Key Technical Requirements	Certification Effort
SOC 2 Type II	SaaS companies, any org handling customer data	Access controls, logging, encryption, change management, availability monitoring	6-12 months initial, annual audit
PCI DSS	Any system that stores, processes, or transmits cardholder data	Network segmentation, encryption, access logging, vulnerability scanning, penetration testing	3-6 months, annual assessment (QSA or SAQ)
HIPAA	Healthcare data (PHI)	Encryption at rest and in transit, access controls, audit trails, BAAs with vendors, breach notification	Ongoing, no formal certification — OCR audits
GDPR	Any system processing EU personal data	Data minimization, right to erasure, consent management, DPO appointment, cross-border transfer controls	Ongoing, supervisory authority enforcement
FedRAMP	Cloud services used by US federal agencies	NIST 800-53 controls, continuous monitoring, boundary definition, 3PAO assessment	12-18 months, continuous monitoring

Decision drivers: Customer requirements, data classification, geographic scope (EU data requires GDPR), industry vertical, and whether the organization plans to sell to government or regulated industries in the future.
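The applicability column of the table can be read as a set of simple predicates over the data an organization handles. The sketch below encodes that reading; the `DataProfile` fields and the `candidate_frameworks` function are hypothetical names introduced for illustration, and the output is only a starting list for the ADR discussion, not a legal determination.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataProfile:
    """Hypothetical summary of what the system handles and who it serves."""
    handles_customer_data: bool = False
    handles_cardholder_data: bool = False
    handles_phi: bool = False
    processes_eu_personal_data: bool = False
    serves_us_federal_agencies: bool = False


def candidate_frameworks(profile: DataProfile) -> list[str]:
    """Return frameworks whose applicability condition in the table is met."""
    checks = [
        (profile.handles_customer_data, "SOC 2 Type II"),
        (profile.handles_cardholder_data, "PCI DSS"),
        (profile.handles_phi, "HIPAA"),
        (profile.processes_eu_personal_data, "GDPR"),
        (profile.serves_us_federal_agencies, "FedRAMP"),
    ]
    return [name for applies, name in checks if applies]
```

For example, a payment service with EU customers would be described as `DataProfile(handles_cardholder_data=True, processes_eu_personal_data=True)`, which yields PCI DSS and GDPR as candidates.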

ADR: Secrets Management Approach

Context: Application workloads need access to database credentials, API keys, TLS certificates, and encryption keys without exposing them in code, config files, or environment variables.

Options:

- HashiCorp Vault: Platform-agnostic, supports dynamic secrets (generates short-lived credentials on demand), integrates with Kubernetes via sidecar injector or CSI driver. Requires operating and securing the Vault cluster itself (or using HCP Vault managed service). Best for multi-cloud or on-premises environments.
- Cloud-native secrets manager (AWS Secrets Manager, Azure Key Vault, GCP Secret Manager): Fully managed, integrates with provider IAM, supports automatic rotation for supported secret types (RDS credentials, service account keys). Simplest to operate. Creates provider lock-in for the secrets layer.
- Kubernetes Secrets with external sync (External Secrets Operator): Syncs secrets from an external source (Vault, AWS SM, Azure KV) into Kubernetes Secrets. Applications consume standard K8s secrets without knowing the backend. Adds a reconciliation layer that must be monitored.
- Mounted files from sealed secrets or SOPS: Secrets encrypted in Git (Sealed Secrets, Mozilla SOPS), decrypted at deploy time and mounted as files. Enables GitOps workflow for secrets. Rotation requires redeployment.

Decision drivers: Multi-cloud vs. single-cloud, whether dynamic/short-lived secrets are needed, team operational capacity to manage Vault, compliance requirements for secret auditability, and whether a GitOps workflow is in use.
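Several of these options end with the application reading a secret from a mounted file rather than from code, config, or an environment variable. The sketch below shows one way the application side might look; the mount directory `/var/run/secrets/app`, the `read_mounted_secret` function, and the `SecretNotFound` exception are assumptions made for illustration, not part of any of the tools named above.

```python
from pathlib import Path


class SecretNotFound(Exception):
    """Raised when the requested secret file does not exist."""


def read_mounted_secret(name: str, mount_dir: str = "/var/run/secrets/app") -> str:
    """Read one secret from a directory of mounted secret files.

    mount_dir is a hypothetical default; use the path your platform mounts.
    """
    base = Path(mount_dir).resolve()
    path = (base / name).resolve()
    # Reject names such as "../other" that would escape the mount directory.
    if base not in path.parents:
        raise ValueError(f"secret name escapes mount directory: {name!r}")
    try:
        # Strip the trailing newline that editors and tools often append.
        return path.read_text(encoding="utf-8").rstrip("\n")
    except FileNotFoundError as exc:
        raise SecretNotFound(name) from exc
```

Reading the file on each use, instead of caching the value at startup, lets the application pick up a rotated secret when the backend refreshes the mounted file.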

ADR: Administrative Access Model

Context: Engineers and operators need secure access to production infrastructure for troubleshooting and maintenance without exposing management interfaces to the internet.

Options:

- Bastion host (jump box): Dedicated hardened instance in a DMZ subnet. SSH or RDP through the bastion to reach internal resources. Requires managing the bastion instance, patching it, and controlling SSH keys. Audit trail depends on session logging configuration.
- Cloud session manager (AWS SSM Session Manager, Azure Bastion): No inbound ports required, no SSH keys to manage. Full session logging to CloudTrail/CloudWatch or Azure Monitor. Requires cloud provider IAM for access control. No connectivity without cloud API access.
- VPN with MFA: Site-to-site or client VPN (WireGuard, OpenVPN, cloud-native VPN) places users on internal network. Broader access than a bastion, so it must be combined with strict security group rules. MFA is mandatory.
- Zero trust network access (ZTNA): Identity-aware proxy (Cloudflare Access, Zscaler, BeyondCorp) verifies user identity, device posture, and context before granting per-resource access. No VPN or bastion required. Highest security posture, most complex initial setup.

Recommendation: For cloud environments, prefer cloud session manager (SSM/Azure Bastion) for its built-in audit trail and elimination of SSH key management. For hybrid or on-premises environments, deploy a hardened bastion with session recording, or evaluate ZTNA for organizations with mature identity infrastructure.

ADR: Encryption Key Management

Context: Encryption at rest requires key management — who generates keys, where they are stored, how they are rotated, and who can access them.

Options:

- Provider-managed keys (default encryption): Cloud provider generates and manages keys (SSE-S3, Azure SSE with platform keys). Zero operational overhead. No customer control over key lifecycle. Acceptable for most workloads.
- Customer-managed keys in cloud KMS: Customer creates keys in AWS KMS, Azure Key Vault, or GCP Cloud KMS. Full control over key rotation, access policies, and audit logging. Keys never leave the provider's HSM boundary. Small per-key and per-API-call cost.
- Customer-managed keys in external HSM (BYOK/HYOK): Keys generated in on-premises HSM (Thales, Entrust) and imported into cloud KMS, or used directly via HSM-as-a-service. Required by some regulatory frameworks (banking, government). Highest operational complexity.

Decision drivers: Regulatory requirements for key custody, whether the organization needs to revoke cloud provider access to data, multi-cloud key portability needs, and operational capacity to manage HSM infrastructure.
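With customer-managed keys, rotation becomes the customer's responsibility, so it helps to have a routine check for keys that have outlived the rotation policy. The sketch below assumes a hypothetical `KeyRecord` inventory and a 365-day maximum age; both are illustrative choices, and the actual rotation period should come from the applicable compliance requirements.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass(frozen=True)
class KeyRecord:
    """Hypothetical inventory entry for one encryption key."""
    key_id: str
    created_at: datetime  # timezone-aware creation time of the current key version
    customer_managed: bool


def keys_due_for_rotation(
    keys: list[KeyRecord],
    max_age: timedelta = timedelta(days=365),  # assumption: annual rotation policy
    now: Optional[datetime] = None,
) -> list[str]:
    """Return IDs of customer-managed keys older than max_age.

    Provider-managed keys are skipped because the provider owns their lifecycle.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return [
        k.key_id
        for k in keys
        if k.customer_managed and now - k.created_at > max_age
    ]
```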

ADR: Network Security Architecture

Context: The architecture must define how network traffic is segmented, filtered, and monitored between tiers and between services.

Options:

- Traditional tier-based segmentation: Web tier, application tier, data tier on separate subnets with security groups/NACLs controlling traffic between tiers. Well-understood, maps to compliance frameworks. Coarse-grained: all services in a tier can communicate freely.
- Micro-segmentation with service mesh: Per-service network policies (Kubernetes NetworkPolicy, Calico, Cilium) combined with service mesh mTLS (Istio, Linkerd). Every service-to-service connection is explicitly authorized. Fine-grained but operationally complex.
- Zero trust with identity-aware proxy: All access, user and service, is authenticated and authorized per request regardless of network location. Eliminates implicit trust within the network perimeter. Requires mature identity infrastructure and comprehensive policy definition.

Decision drivers: Number of services (micro-segmentation benefits increase with service count), compliance requirements for network isolation, team familiarity with service mesh operations, and whether east-west traffic poses a significant threat in the environment.
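The difference between tier-based segmentation and micro-segmentation comes down to the default: micro-segmentation denies every service-to-service flow unless it is explicitly listed. The following is a minimal sketch of that default-deny model; the `Flow` type, the service names, and the ports are hypothetical examples, not a representation of any specific policy engine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    """One directed service-to-service connection (hypothetical model)."""
    source: str
    destination: str
    port: int


# Hypothetical allowlist: anything not listed here is denied.
ALLOWED_FLOWS = frozenset({
    Flow("web", "api", 8443),
    Flow("api", "db", 5432),
})


def is_allowed(flow: Flow, allowed: frozenset[Flow] = ALLOWED_FLOWS) -> bool:
    """Default deny: permit a flow only if it is explicitly allowlisted."""
    return flow in allowed
```

Under this allowlist, `web` can reach `api` on 8443, but `web` cannot reach `db` directly even though `api` can, which is the blast-radius limit that tier-wide rules do not provide.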