CI/CD Pipeline Design¶

Scope¶

This file covers cloud-agnostic CI/CD pipeline architecture: platform selection, pipeline design patterns, artifact management, testing stages, deployment integration, and operational concerns like build performance and secret handling. For deployment strategy decisions (rolling, blue-green, canary), see Deployment. For provider-specific CI/CD tooling, see Azure DevOps, GCP DevOps, and OpenShift CI/CD. For infrastructure-as-code decisions, see IaC Planning.

Checklist¶

Why This Matters¶

CI/CD pipelines are the central nervous system of software delivery. A poorly designed pipeline either slows teams down — 30-minute builds, flaky tests, manual deployment steps — or creates risk by shipping untested or unsigned artifacts to production. Organizations that treat CI/CD as an afterthought end up with fragile, snowflake pipelines that only one person understands, and deployments that require a war room.

Supply chain attacks (SolarWinds, Codecov, xz-utils) have elevated CI/CD security from a nice-to-have to a board-level concern. Pipelines that use long-lived credentials, pull unverified dependencies, or skip artifact signing are attack surfaces. A compromised build pipeline can inject malicious code into every deployment across the organization. OIDC federation, signed artifacts, pinned dependencies, and ephemeral runners are now baseline requirements for any serious production pipeline.

The difference between high-performing and low-performing teams often comes down to CI/CD maturity. Teams that deploy multiple times per day with automated rollback recover from failures in minutes. Teams that deploy monthly with manual steps accumulate risk and spend weekends on failed releases. The pipeline design decisions made early — platform, branching strategy, testing stages, deployment integration — determine which category an organization falls into.

Common Decisions (ADR Triggers)¶

ADR: CI/CD Platform Selection¶

Context: The organization needs a CI/CD platform to build, test, and deploy applications across multiple environments.

Options:

Criterion	GitHub Actions	GitLab CI	Jenkins	Azure DevOps	Tekton	CircleCI
Source Control Integration	Native (GitHub)	Native (GitLab)	Plugin-based (any SCM)	Native (Azure Repos), GitHub	Any (event-driven)	GitHub, Bitbucket
Pipeline Definition	YAML workflows	`.gitlab-ci.yml`	Jenkinsfile (Groovy)	YAML pipelines	Kubernetes CRDs	YAML config
Runner Model	GitHub-hosted or self-hosted	SaaS or self-managed runners	Always self-hosted (controller + agents)	Microsoft-hosted or self-hosted	Kubernetes-native pods	SaaS or self-hosted (runner)
Ecosystem	Marketplace actions (reuse risk)	Built-in registry, security scanning	Plugin ecosystem (1800+ plugins)	Boards, artifacts, test plans integrated	Cloud-native, composable	Orbs marketplace
Security Features	OIDC federation, Dependabot, code scanning	SAST/DAST/container scanning built-in	Plugin-dependent, high maintenance	Service connections, managed identity	Chains (SLSA provenance)	Contexts, OIDC
Scaling	Auto-scales (SaaS), ARC for self-hosted	Auto-scale runners, Kubernetes executor	Manual scaling, Kubernetes plugin	Auto-scale agents (VMSS)	Kubernetes-native auto-scale	Auto-scales (SaaS)
Cost Model	Free tier + per-minute (SaaS), free self-hosted	Free tier + per-minute (SaaS), free self-managed	Free (OSS), infrastructure cost only	Free tier + per-minute, free self-hosted	Free (OSS), infrastructure cost only	Free tier + per-minute
Best Fit	GitHub-native teams, OSS projects	GitLab-native teams, integrated DevSecOps	Complex enterprise with existing investment	Microsoft/Azure shops	Kubernetes-native, supply chain security	SaaS-first small-to-mid teams

Decision drivers: Existing source control platform, security scanning requirements, self-hosted vs. SaaS preference, Kubernetes-native requirement, team familiarity, and budget model.

ADR: GitOps vs. Push-Based Deployment¶

Context: The team needs to decide how deployments are triggered and how desired state is reconciled with actual state.

Options: - GitOps pull-based (ArgoCD, Flux): A controller in the cluster watches a Git repository and reconciles desired state continuously. Provides drift detection, audit trail via Git history, and declarative rollback (revert a commit). Requires separating application code repos from deployment manifest repos. Adds operational complexity (controller HA, multi-cluster sync). - Push-based CI-triggered: The CI pipeline runs kubectl apply, helm upgrade, or a cloud CLI command after tests pass. Simpler to set up, single pipeline definition. No drift detection — manual changes go unnoticed. Credentials must be available to the CI system. - Hybrid: CI pipeline builds and tests artifacts, then updates a manifest repo (image tag bump), which ArgoCD/Flux picks up. Combines CI strengths (build/test) with GitOps strengths (deployment reconciliation).

Recommendation: Use the hybrid approach for production workloads — CI handles build/test/sign, GitOps handles deployment and drift detection. Use pure push-based for non-production environments where simplicity matters more than drift control.

ADR: Branching and Promotion Strategy¶

Context: The team must standardize how code flows from development to production.

Options: - Trunk-based development: All developers commit to main (or short-lived feature branches merged within 1-2 days). Releases cut from main. Requires feature flags for incomplete work. Fastest flow, lowest merge complexity, best for continuous deployment. - GitFlow: Long-lived develop and main branches, plus release/* and hotfix/* branches. Clear release process with explicit versioning. Higher merge complexity, slower flow. Suitable for packaged software with scheduled releases. - Environment branching: Branches map to environments (dev, staging, prod). Code promoted by merging between branches. Simple mental model but creates merge debt and cherry-pick nightmares. Generally discouraged.

Decision drivers: Release cadence (continuous vs. scheduled), team size, regulatory requirements for release traceability, and whether feature flags are in place.

ADR: Self-Hosted vs. SaaS Runners¶

Context: The team must decide where CI/CD jobs execute.

Options: - SaaS-hosted runners: Zero maintenance, auto-scaled, pay-per-minute. Cannot access private networks without tunneling. Compute constrained by vendor tiers. Data leaves your network boundary. - Self-hosted runners (persistent): Full network access, custom tooling pre-installed. Require patching, monitoring, and scaling. Risk of contamination between jobs if not properly isolated. - Self-hosted ephemeral runners (Kubernetes-based): Each job gets a fresh pod. Network access plus clean environment. Requires Kubernetes cluster and runner operator (Actions Runner Controller, GitLab Kubernetes executor). Higher infrastructure complexity, best isolation.

Recommendation: Use SaaS runners for open-source and public-facing projects. Use self-hosted ephemeral runners (Kubernetes-based) for enterprise workloads requiring private network access or strict isolation. Never use persistent self-hosted runners without job isolation for security-sensitive workloads.

ADR: Monorepo vs. Polyrepo Pipeline Design¶

Context: The organization has multiple services and must decide how CI/CD pipelines are structured relative to repository layout.

Options: - Monorepo with path-based triggers: All services in one repository, pipelines triggered by changed paths. Atomic cross-service changes, shared tooling. Requires build system intelligence (Bazel, Nx, Turborepo) to avoid rebuilding everything on every commit. Pipeline complexity grows with service count. - Polyrepo with per-service pipelines: Each service has its own repository and pipeline. Clear ownership boundaries, independent release cycles. Cross-service changes require coordinated PRs. Shared pipeline logic via templates or reusable workflows.

Decision drivers: Number of services, frequency of cross-service changes, team structure (Conway's Law), and build system maturity.