HashiCorp Terraform¶

Scope¶

HashiCorp Terraform: remote state backends (S3, GCS, Terraform Cloud), state locking, provider version constraints, module versioning, workspace and directory-based environment separation, lifecycle rules, import/moved blocks, and CI/CD integration with linting and plan review.

Checklist¶

Why This Matters¶

Terraform manages the actual infrastructure. A state file out of sync with reality causes resource duplication, accidental destruction, or drift that is expensive to reconcile. State locking prevents two engineers or CI pipelines from applying conflicting changes simultaneously, which can corrupt state or create partial deployments. Unpinned provider versions mean that terraform init on Tuesday may pull a different provider version than Monday, changing plan behavior silently. Module versioning without pinning means that a module author's breaking change propagates to all consumers on their next init. Lifecycle rules prevent Terraform from destroying a production database during a refactor. These are not edge cases; they are the primary failure modes teams encounter when scaling Terraform usage beyond a single operator.

License¶

HashiCorp transitioned all products from MPL 2.0 to BSL 1.1 in August 2023. The BSL restricts competitive use of the software — you cannot use it to build a product that competes with HashiCorp's commercial offerings. For internal infrastructure use, the BSL is functionally equivalent to open source. IBM completed its acquisition of HashiCorp in late 2024, which may affect product direction, licensing terms, and commercial offerings over time. Community forks under MPL 2.0 exist for organizations requiring open-source licensing: OpenTofu (Terraform fork, maintained by the Linux Foundation) and OpenBao (Vault fork). Evaluate license terms and IBM's product roadmap for your specific use case before adoption.

Common Decisions (ADR Triggers)¶

OpenTofu vs Terraform: The BSL 1.1 license change in Terraform 1.6 prompted the creation of OpenTofu, a community-maintained fork under MPL 2.0. OpenTofu tracks Terraform feature parity with some divergences (e.g., client-side state encryption). Evaluate whether the BSL terms affect your use case (competitive SaaS offerings are restricted; internal infrastructure use is unaffected). ADR required when adopting Terraform 1.6+ or migrating an existing Terraform codebase.
State backend selection: S3 + DynamoDB is the most common for AWS shops (cheap, reliable, well-documented). Terraform Cloud provides state management, locking, run history, and policy enforcement in one service but adds cost and vendor dependency. GCS and Azure Blob are natural choices for their respective clouds. Document the backend, who has access, and the backup/recovery strategy.
Workspace-based vs directory-based environment separation: Workspaces share the same code with different state files, requiring conditionals (terraform.workspace == "prod") for environment-specific values. Directory-based separation (envs/dev/, envs/prod/) duplicates backend config but makes each environment's configuration explicit. Workspaces are simpler for small differences; directories are clearer for large divergence. Terragrunt offers a third approach with DRY configuration and automatic backend config generation.
Module granularity: One module per logical component (VPC, EKS cluster, RDS instance) vs large modules that bundle related resources (entire application stack). Fine-grained modules are more reusable but require more wiring. Coarse modules are faster to deploy but harder to customize. Decide based on team size and reuse requirements.
Terraform Cloud/Enterprise vs self-managed CI/CD: Terraform Cloud provides remote execution, state management, Sentinel policies, cost estimation, and a UI. Self-managed (GitHub Actions, GitLab CI, Jenkins) provides more control and avoids per-resource pricing. The decision affects workflow, auditability, and policy enforcement capabilities.
Terragrunt adoption: Terragrunt adds DRY configuration, automatic remote state setup, dependency management between modules, and run-all for multi-module operations. It adds a tool to the chain (Terragrunt wraps Terraform) but significantly reduces boilerplate in large, multi-environment, multi-account setups. Adopt early or not at all; retrofitting is painful.
Import strategy for existing resources: terraform import (CLI) vs import blocks (declarative, plannable). Import blocks are preferred in Terraform 1.5+ because they appear in plan output and can be code-reviewed. Document the process for adopting brownfield infrastructure.
Sensitive data handling: Terraform state stores all attribute values in plaintext. Options include encrypting the state backend (S3 SSE-KMS), using Vault for dynamic secrets that rotate automatically, or restructuring to keep secrets outside Terraform entirely. The choice affects the security model and operational complexity.

AI and GenAI Capabilities¶

HCP Terraform AI — AI-assisted features within HashiCorp Cloud Platform Terraform. Provides code generation suggestions for Terraform HCL, error diagnosis and remediation guidance when plans or applies fail, and module recommendations from the Terraform Registry based on the configuration context. Available in HCP Terraform (cloud), not in Terraform OSS CLI. Represents an early-stage AI integration — capabilities are expanding but are not yet comparable to full copilot experiences in other platforms.

Reference Architectures¶

Version Notes¶

Feature	1.5	1.6	1.7	1.8	1.9
`import` block (declarative)	GA	GA	GA	GA	GA
`removed` block	Not available	GA	GA	GA	GA
`check` block (assertions)	GA	GA	GA	GA	GA
`terraform test` framework	Experimental	GA	GA (improved mocking)	GA	GA
Provider-defined functions	Not available	Not available	GA	GA	GA
`ephemeral` resources	Not available	Not available	Not available	Not available	GA
`moved` block (refactoring)	GA	GA	GA	GA	GA
S3 state backend native locking	Not available	Not available	Not available	GA (no DynamoDB needed)	GA
`.tf.json` generation improvements	GA	GA	GA	GA	GA
Provider installation caching	GA	GA	GA (improved)	GA	GA
`terraform plan -generate-config-out`	GA (import config gen)	GA	GA	GA	GA
HCP Terraform (cloud block)	GA	GA	GA	GA	GA
License	MPL 2.0	BSL 1.1	BSL 1.1	BSL 1.1	BSL 1.1

Key changes across versions: - 1.5 -- Import blocks: Introduced declarative import blocks that appear in terraform plan output and can be code-reviewed, replacing the imperative terraform import CLI command. Combined with terraform plan -generate-config-out, this enables generating HCL configuration for existing resources during import. - 1.6 -- Removed blocks and testing GA: The removed block allows declaring that a resource should be removed from state without destroying the actual infrastructure (the inverse of import). The terraform test framework reached GA, enabling integration tests with .tftest.hcl files that create real infrastructure, run assertions, and clean up. - 1.7 -- Provider functions and mock testing: Provider-defined functions allow providers to expose custom functions callable in HCL expressions (e.g., provider::aws::arn_parse()). The testing framework added mock providers and override capabilities for faster, isolated testing without provisioning real infrastructure. - 1.8 -- S3 native locking and refactoring: The S3 backend gained native state locking using S3 conditional writes, eliminating the need for a DynamoDB table for lock management. This simplifies the most common backend configuration. - 1.9 -- Ephemeral resources: Ephemeral resources are created, used during the plan/apply cycle, and then discarded -- they are never stored in state. This is designed for sensitive data like temporary credentials, reducing the security risk of secrets persisting in state files. - License change (1.6): Terraform changed from Mozilla Public License 2.0 (open source) to Business Source License 1.1 starting with version 1.6. This prompted the creation of OpenTofu, a community fork maintaining the MPL 2.0 license. Evaluate licensing implications for your organization when choosing between Terraform 1.6+ and OpenTofu.

Multi-Account AWS with Terragrunt¶

repo-structure/
  |-- terragrunt.hcl              (root config: remote_state, generate provider)
  |-- _envcommon/                  (shared module configurations)
  |     |-- vpc.hcl
  |     |-- eks.hcl
  |     |-- rds.hcl
  |
  |-- accounts/
       |-- dev/
       |    |-- account.hcl        (account_id, region defaults)
       |    |-- us-east-1/
       |         |-- vpc/
       |         |    |-- terragrunt.hcl  (include root + envcommon/vpc)
       |         |-- eks/
       |         |    |-- terragrunt.hcl  (dependency on vpc)
       |         |-- rds/
       |              |-- terragrunt.hcl  (dependency on vpc)
       |
       |-- staging/
       |    |-- (same structure, different values)
       |
       |-- prod/
            |-- account.hcl
            |-- us-east-1/
            |    |-- (same structure, production sizing)
            |-- eu-west-1/
                 |-- (same structure, EU region)

Remote State per component:
  s3://company-terraform-state/dev/us-east-1/vpc/terraform.tfstate
  s3://company-terraform-state/prod/us-east-1/eks/terraform.tfstate

DynamoDB: company-terraform-locks (single table, all states)

Terraform Cloud Organization¶

Terraform Cloud Organization: company-infra
  |
  +-- Project: Platform
  |     |-- Workspace: platform-vpc-dev        (auto-apply, VCS: main branch, path: modules/vpc)
  |     |-- Workspace: platform-vpc-prod       (manual apply, VCS: main branch, path: modules/vpc)
  |     |-- Workspace: platform-eks-dev        (run trigger: after vpc-dev)
  |     |-- Workspace: platform-eks-prod       (run trigger: after vpc-prod, Sentinel policies)
  |
  +-- Project: Applications
  |     |-- Workspace: app-frontend-dev
  |     |-- Workspace: app-frontend-prod
  |     |-- Workspace: app-backend-dev
  |     |-- Workspace: app-backend-prod
  |
  +-- Variable Sets:
  |     |-- AWS Credentials (applied to all workspaces, env vars: AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY)
  |     |-- Common Tags (applied to all, TF vars: project, owner, cost_center)
  |     |-- Production Overrides (applied to *-prod, TF vars: instance sizes, replica counts)
  |
  +-- Policy Sets (Sentinel):
        |-- cost-control: deny if monthly_cost_delta > $500
        |-- security-baseline: deny if aws_s3_bucket without encryption
        |-- tagging-enforcement: deny if required tags missing

CI/CD Pipeline (GitHub Actions)¶

Workflow: terraform.yml
  |
  +-- On: pull_request (paths: terraform/**)
  |     |-- Job: validate
  |     |     terraform fmt -check
  |     |     terraform init -backend=false
  |     |     terraform validate
  |     |     tflint --recursive
  |     |     checkov -d . --framework terraform
  |     |
  |     |-- Job: plan
  |           terraform init (with backend)
  |           terraform plan -out=tfplan
  |           Post plan summary as PR comment (tfcmt or custom script)
  |
  +-- On: push to main (paths: terraform/**)
        |-- Job: apply
              terraform init
              terraform apply tfplan
              (tfplan artifact from approved PR)
              Post apply summary to Slack

Reference Links¶

Terraform Documentation -- CLI, configuration language, state management, backends, and provider development
Terraform Registry -- official provider and module registry with versioned releases
Terraform Tutorials -- hands-on guides for state management, modules, CI/CD, and Terraform Cloud