Testing IaC

Infrastructure code powers the systems your users depend on. Unlike application code, testing it means provisioning real cloud resources - which is slow, costs money, and requires careful orchestration. This page covers the theory, tools, and workflows that make IaC testing practical.

Why IaC Testing Is Different

The Case for Continual Testing

Building infrastructure is rarely a one-off task. Systems require continuous patching, upgrading, and improvement. Integrating testing directly into the daily development cycle - not deferring it to a “QA phase” - delivers three key advantages:

Painless go-lives - the first production deployment uses the same automation that ran throughout development, removing deployment drama
Avoiding technical debt - catching problems immediately is cheaper than investigating and rewriting broken code later
Tight feedback loops - validating each change as early as possible is one of the strongest differentiators of high-performing teams

What to Test

A comprehensive IaC testing strategy reaches well beyond “does the resource get created”:

Validation target	Example
Code quality	Syntax, formatting, complexity
Functionality	An HTTPS route reaches the web servers end-to-end
Security	Ports, permissions, exposed secrets
Compliance	Regulatory, contractual, and internal policy adherence
Provenance	Supply-chain checks - vulnerabilities, license compatibility, SBOM
Performance	Network latency, connection throughput
Scalability	Autoscaling triggers and the desired outcome (capacity actually increases)
Availability	Destroy a node → verify automatic replacement
Operability	Logging, monitoring, scheduled maintenance tasks

What Not to Test

Unique Challenges

Challenge	Why it matters	Mitigation
Declarative code is low-value to unit-test	A test that restates a hardcoded value is redundant	Test variable logic, combinations of declarations, and functional outcomes instead
Testing is slow	Provisioning takes minutes to hours	Divide into small stacks, shift tests offline, use progressive pipelines
Testing costs money	Real cloud resources incur charges	Destroy immediately, use isolated test accounts, automate cleanup
Complex dependencies	Components depend on upstream/downstream stacks	Use test fixtures, mocks, and local emulators

Unit vs. Integration Testing

Terraform’s entire purpose is configuring external systems, which makes pure unit testing inherently difficult. The vast majority of IaC testing is integration testing - verifying how resources, modules, and systems interact. Newer features like native mocks (v1.7+) are beginning to close this gap, but integration tests remain the backbone.

Progressive Testing

Progressive testing runs suites in sequence - fast and narrow first, slow and broad later - optimising for fast, accurate feedback. Only after the quick checks pass does the pipeline invest in expensive cloud provisioning.
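
The ordering described above can be sketched as a small runner. This is a minimal illustration in Python, not part of any tool: `run_progressively` and the stage names are hypothetical, and each stage is assumed to be a callable that returns True on success.

```python
from typing import Callable, List, Tuple

# A stage is a label plus a check that returns True when it passes.
Stage = Tuple[str, Callable[[], bool]]


def run_progressively(stages: List[Stage]) -> List[Tuple[str, str]]:
    """Run stages in the given order and stop investing after the first failure.

    Stages after a failure are never executed; they are reported as "skipped".
    """
    results: List[Tuple[str, str]] = []
    failed = False
    for name, check in stages:
        if failed:
            results.append((name, "skipped"))
            continue
        ok = check()
        results.append((name, "passed" if ok else "failed"))
        failed = not ok
    return results


# Placeholder checks: a real pipeline would wrap the offline tools first
# and the cloud-provisioning suites last.
demo = [
    ("validate", lambda: True),
    ("lint", lambda: False),
    ("stack test", lambda: True),
]
for stage_name, status in run_progressively(demo):
    print(f"{stage_name}: {status}")
```

Listing the stages cheapest-first means a lint failure costs seconds and the expensive stack test is never started.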

Conceptual Models

The Test Pyramid

A large base of fast, offline checks; a middle layer of integration tests; a small top of full-system tests. Higher levels test emergent behaviour, not individual components.

The Infrastructure Test Diamond

For declarative codebases the pyramid reshapes into a diamond: low-level unit tests are less valuable (they restate the code), so the bulk of testing sits in the middle layer - stack tests. The diamond tapers to fewer offline checks at the bottom and fewer system-wide tests at the top.

The Swiss Cheese Model

No single layer catches everything. Stack multiple layers so that the “holes” in one are covered by the next - like stacked slices of Swiss cheese, where each extra slice leaves fewer gaps that line up.

Pipeline Stages

Stage	Environment	Scope	Speed
Build test (offline)	Local / CI agent	Syntax, linting, supply-chain, policy	Seconds
Stack test (online)	IaaS - isolated stack only	The stack itself, using test fixtures for dependencies	Minutes
Integrated infra test	IaaS - multiple stacks	Cross-stack integration points	Minutes–hours
Product test	IaaS - full stack + workload	Application behaviour on top of infrastructure	Hours

Testing in Production

Pre-release testing covers known unknowns. Production environments have characteristics that cannot be replicated:

Data - real-world volume and variety
Users - creative, unpredictable interactions
Traffic - long-term compounding effects (e.g. logs filling storage)
Concurrency - rare timing-dependent interactions

Safe production testing techniques: monitoring and observability, zero-downtime deployment, progressive rollout, dedicated test data records, and chaos engineering.

Offline Testing (Static Analysis)

Offline tests run on a developer workstation or CI agent - no cloud provisioning required. They execute in seconds.

Syntax Checking

terraform init -backend=false   # initialise without a backend
terraform validate              # parse code, catch typos and naming errors

terraform validate does not need variables set, but it does require an initialised workspace. Use -backend=false for automated pipelines without a state backend.

Static Code Analysis (Linting)

TFLint evaluates code for bugs, errors, and style violations without running it:

plugin "terraform" {
  enabled = true
  preset  = "all"   # "recommended" is the default; "all" adds rules like
                     # requiring descriptions on variables and outputs
}

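plugin "aws" {
  enabled = true
  # Add `source` and a pinned `version` here if `tflint --init`
  # should download and install the ruleset.
}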

Feature	Detail
Plugin system	Core `terraform` plugin + cloud-specific plugins (AWS, GCP, Azure) + OPA plugin
Exceptions	Disable rules globally in `.tflint.hcl` or per-line with `# tflint-ignore: rule_name`
Autofix	`tflint --fix` auto-corrects simple issues (e.g. comment style) - review findings first

Connected Static Analysis

Some tools connect to the cloud API to verify that referenced resources (AMIs, VM sizes) actually exist, catching errors that pure offline analysis misses.

Supply-Chain Checks

Infrastructure code imports third-party modules, base images, and libraries. Supply-chain tools:

Check components against public vulnerability databases
Verify license compatibility with organisational policy
Generate a Software Bill of Materials (SBOM) for future vulnerability tracking

Local Infrastructure Emulators

Platform	Emulator
AWS	LocalStack, Moto
Azure	Azurite

Security Scanning

Security validation lets teams catch vulnerabilities before infrastructure is deployed.

Open-Source Scanners

Tool	Notes
Checkov	Evaluates Terraform, plans, Helm, CloudFormation. CLI-only, no central service required.
Trivy	Successor to TFSec. Broader provider coverage. Minimal configuration to start.

Both are free and run locally. Running both adds redundancy - Trivy covers more providers, but Checkov contains unique rules Trivy may miss.

Handling Exceptions

When a finding is intentional, create an explicit, documented exception:

Checkov:

resource "aws_s3_bucket" "public_assets" {
  #checkov:skip=CKV_AWS_20:Intentionally public - serves static marketing assets
  bucket = "acme-public-assets"
}

Trivy:

# Public bucket for static marketing assets
#trivy:ignore:AVD-AWS-0086
resource "aws_s3_bucket" "public_assets" {
  bucket = "acme-public-assets"
}

Commercial Solutions

Platforms like Snyk, Checkmarx, and Mend offer centralised dashboards, cross-project issue tracking, and SBOM generation. Even when using a commercial scanner, keep Checkov running as a free, extra layer.

Custom Policies

Checkov (YAML) - the recommended approach for most teams:

metadata:
  id: "CUSTOM_001"
  name: "Block p3 GPU instance types"
  category: "FINANCE"
definition:
  cond_type: "attribute"
  resource_types:
    - "aws_instance"
  attribute: "instance_type"
  operator: "not_starting_with"
  value: "p3."

Store YAML policies in a central Git repository. Pull them at runtime with --external-checks-git. Checkov also supports Python for policies requiring API calls.

OPA with TFLint - uses the Rego policy language via TFLint’s OPA plugin. Powerful but Rego has a steep learning curve; prefer Checkov YAML unless your organisation already uses OPA elsewhere.

Preview Testing: `terraform plan` in CI

Some tools can preview changes before applying them (`terraform plan`, `pulumi preview`). You can write automated assertions against the preview output:

Fail if the preview would destroy a database
Fail if a deprecated resource type would be created
Fail if a security-sensitive change is detected
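
The first assertion can be sketched against the JSON form of a saved plan, which lists each planned change under `resource_changes` with an `actions` array. This is a minimal sketch: `destroyed_protected`, `PROTECTED_TYPES`, and the sample plan are illustrative assumptions, and which resource types count as a database is a team decision.

```python
import json
from typing import Iterable, List

# Assumption: the resource types a team treats as "databases".
PROTECTED_TYPES = frozenset({"aws_db_instance", "aws_rds_cluster"})


def destroyed_protected(plan: dict, protected: Iterable[str] = PROTECTED_TYPES) -> List[str]:
    """Return addresses of protected resources the plan would delete or replace."""
    protected_set = set(protected)
    offenders = []
    for rc in plan.get("resource_changes", []):
        actions = rc.get("change", {}).get("actions", [])
        # A replacement appears as ["delete", "create"] or ["create", "delete"].
        if rc.get("type") in protected_set and "delete" in actions:
            offenders.append(rc.get("address", "<unknown>"))
    return offenders


# Hand-written, trimmed sample in the shape of the plan JSON.
SAMPLE_PLAN = json.loads("""
{
  "resource_changes": [
    {"address": "aws_db_instance.main", "type": "aws_db_instance",
     "change": {"actions": ["delete", "create"]}},
    {"address": "aws_instance.web", "type": "aws_instance",
     "change": {"actions": ["update"]}}
  ]
}
""")

print(destroyed_protected(SAMPLE_PLAN))  # prints ['aws_db_instance.main']
```

In a pipeline, the input would come from `terraform plan -out=plan.out` followed by `terraform show -json plan.out`, and the job would fail whenever the returned list is non-empty.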

Terratest

Terratest (by Gruntwork, since 2018) is the long-standing standard for IaC testing. Tests are written in Go.

Structure

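package test

import (
    "testing"
    "time"

    http_helper "github.com/gruntwork-io/terratest/modules/http-helper"
    "github.com/gruntwork-io/terratest/modules/random"
    "github.com/gruntwork-io/terratest/modules/terraform"
)

func TestWebServer(t *testing.T) {
    opts := &terraform.Options{
        TerraformDir: "../examples/web-server",
        Vars: map[string]interface{}{
            "environment": "test",
            "name_suffix": random.UniqueId(), // avoid name collisions
        },
    }
    defer terraform.Destroy(t, opts) // runs even if a later step fails
    terraform.InitAndApply(t, opts)

    // Expect HTTP 200 with body "OK"; retry up to 10 times, 5 s apart.
    url := terraform.Output(t, opts, "endpoint_url")
    http_helper.HttpGetWithRetry(t, url, nil, 200, "OK", 10, 5*time.Second)
}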

Strengths

Maturity - 20+ helper packages for AWS, Azure, DNS, HTTP, SSH, Docker, Kubernetes, etc.
AI support - LLMs generate Terratest/Go code effectively because of abundant training data
Full language power - loops, error handling, retries, custom assertions

Limitations

Requires learning Go
Reads Terraform values only through outputs (or by querying the provisioned resources themselves) - cannot reference a module's internal locals or resource attributes directly

Terraform Native Testing Framework

Introduced in Terraform/OpenTofu v1.6, the native framework lets you write tests in HCL - no second language required. Tests live in `.tftest.hcl` files and run with `terraform test` (or `tofu test`).

Structure

variables {
  environment = "test"
}

run "create_web_server" {
  command = apply

  assert {
    condition     = aws_instance.web.tags["Environment"] == "test"
    error_message = "Instance was not tagged with the correct environment."
  }

  assert {
    condition     = output.endpoint_url != ""
    error_message = "Endpoint URL output must not be empty."
  }
}

Strengths

No Go required - tests are pure HCL
Internal module access - can reference local values, internal resource attributes, and data sources directly (no need to expose them as outputs)
Mocks (v1.7+) - replace real providers with fakes; computed attributes get generated defaults (0 for numbers, false for bools, a random string for strings) unless you inject specific values via override_resource / override_data blocks

Mocks Example

mock_provider "aws" {
  override_resource {
    target = aws_instance.web
    values = {
      id        = "i-mock123"
      public_ip = "203.0.113.10"
    }
  }
}

run "plan_only" {
  command = plan

  assert {
    condition     = aws_instance.web.instance_type == "t3.micro"
    error_message = "Expected t3.micro instance type."
  }
}

Mocks avoid provisioning real infrastructure entirely - tests run in seconds and cost nothing.

Limitations

Version-locked - cannot test code targeting older Terraform versions (e.g., you cannot use mocks with v1.6, and the framework itself doesn’t exist before v1.6)
AI support is weak - LLMs currently struggle with the native framework due to insufficient training data. Teams relying heavily on AI-generated tests should use Terratest until models catch up.

Test Instance Lifecycles

How you manage the cloud environments where tests run has a major impact on speed, reliability, and cost.

Pattern	How it works	Speed	Reliability	Trade-off
Persistent stack	Always running; pipeline updates it	✅ Fast	⚠️ Can get “wedged”	Failed updates may block the entire team
Ephemeral stack	Created from scratch, destroyed after every run	❌ Slow	✅ Clean every time	Too slow for rapid development; `destroy` can fail, requiring cleanup tools
Dual persistent + ephemeral	Both run in parallel	Mixed	Mixed	Antipattern - combines the worst of both; teams still unwedge persistent and wait for ephemeral
Periodic rebuild	Persistent during the day; destroyed and rebuilt nightly	✅ Fast (daytime)	⚠️ Masks design issues	Hides accumulated state problems that could cause production outages
Continuous reset	After each successful test, a background job destroys and rebuilds using the last production version	✅ Fast	✅ Clean	If the background rebuild fails silently, the next developer discovers a broken environment

Test Fixtures

When provisioning a full dependency graph is too slow or introduces unrelated failures, use test fixtures - lightweight stand-ins:

Replacing upstream providers:

Instead of deploying a production-grade VPC with security controls and logging, deploy a minimal fixture with just a VPC and two subnets.

Replacing downstream consumers:

Deploy a lightweight serverless function and an external client. Assert that the client can connect through the infrastructure you’re testing (gateway, routing rules) and receive a 200 OK.
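
The external client in such a fixture can be very small. The sketch below uses only the Python standard library and polls an endpoint until it answers 200; `fetch_status` and `wait_for_ok` are hypothetical helper names, and the retry count and delay are arbitrary defaults rather than recommendations.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def fetch_status(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status of a GET request, or 0 if no response arrived."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status (4xx/5xx).
        return err.code
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout, and similar.
        return 0


def wait_for_ok(
    url: str,
    attempts: int = 10,
    delay: float = 5.0,
    fetch: Callable[[str], int] = fetch_status,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll the URL until it returns 200 or the attempts run out."""
    for attempt in range(attempts):
        if fetch(url) == 200:
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

Retrying matters because gateways and routing rules often take a short while to become reachable after provisioning. The `fetch` and `sleep` parameters exist so the helper itself can be tested without a network.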

Test Orchestration

Orchestrate the full lifecycle: create fixtures → provision stack → run validations → consolidate results → destroy everything.

Guidelines:

Support local testing - developers must be able to run the same test scripts on their workstation before pushing. Use the exact same orchestration scripts locally and in CI.
Don’t couple to the CI tool - write orchestration in standalone scripts (Bash, Python, Make). The CI stage should do nothing but call the script. This keeps tests portable across CI platforms and runnable locally.
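
A standalone orchestration script along these lines might look like the following sketch. It assumes each step is wrapped in a plain callable (for example one that shells out to the Terraform CLI); `orchestrate` and its parameter names are illustrative, not an existing API.

```python
from typing import Callable, List


def orchestrate(
    create_fixtures: Callable[[], None],
    provision_stack: Callable[[], None],
    run_validations: Callable[[], List[str]],
    destroy_all: Callable[[], None],
) -> List[str]:
    """Run the test lifecycle and always attempt teardown.

    Returns the list of validation failures; an empty list means success.
    If a step raises, teardown still runs and the exception propagates.
    """
    try:
        create_fixtures()
        provision_stack()
        return run_validations()
    finally:
        # Executed on success, on validation failure, and on exceptions.
        destroy_all()
```

Because the function knows nothing about the CI system, the CI stage and a developer's terminal can both call the same script.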

Authentication for Tests

Tests interact with real cloud APIs and need credentials.

Approach	Recommendation
OIDC	✅ Preferred - short-lived, automatically rotated, no secrets to leak
Static credentials	⚠️ Avoid in CI - use only for local development if OIDC is impractical

Concurrency and Name Collisions

Multiple PRs running tests simultaneously will collide if they create identically-named resources. Inject randomness:

resource "random_string" "suffix" {
  length  = 6
  special = false
  upper   = false
}

For resources whose names stay reserved after deletion (e.g., AWS Secrets Manager secrets, which keep their name during the deletion recovery window), add the random suffix directly in the module, not just in tests.
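
When the suffix is generated by the test harness instead of in HCL, it can be handed to Terraform through a `TF_VAR_` environment variable. The sketch below assumes the module declares a `name_suffix` input variable; `unique_suffix` and `env_with_suffix` are hypothetical helper names.

```python
import os
import secrets
import string
from typing import Dict, Mapping, Optional

# Lowercase letters and digits only, since many cloud resource names
# reject uppercase and special characters.
ALPHABET = string.ascii_lowercase + string.digits


def unique_suffix(length: int = 6) -> str:
    """Return a random lowercase alphanumeric suffix."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def env_with_suffix(base_env: Optional[Mapping[str, str]] = None, length: int = 6) -> Dict[str, str]:
    """Copy the environment and add TF_VAR_name_suffix for this test run."""
    env = dict(os.environ if base_env is None else base_env)
    env["TF_VAR_name_suffix"] = unique_suffix(length)
    return env
```

The returned mapping can be passed as the `env` argument of `subprocess.run` when invoking Terraform, so that every run in a concurrent pipeline gets its own suffix.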

Timeout Management

Automated Account Cleanup

Tests will inevitably crash before cleanup runs. Protect your budget:

Never test in production accounts - use isolated test accounts
Schedule nuke jobs - run tools like aws-nuke or azure-nuke nightly via a scheduled CI job to erase everything in test accounts
Use resource groups (Azure) or dedicated projects (GCP) - group test infrastructure and delete the entire group or project for guaranteed cleanup

Test Code Quality

Test suites are code. Apply the same standards:

Clear variable names and extensive comments explaining each test’s goal
Review test code in PRs with the same rigour as infrastructure code
Refactor test suites when they become harder to maintain than the infrastructure they validate
