Testing IaC
Infrastructure code powers the systems your users depend on. Unlike application code, testing it means provisioning real cloud resources - which is slow, costs money, and requires careful orchestration. This page covers the theory, tools, and workflows that make IaC testing practical.
Why IaC Testing Is Different
The Case for Continual Testing
Building infrastructure is rarely a one-off task. Systems require continuous patching, upgrading, and improvement. Integrating testing directly into the daily development cycle - not deferring it to a “QA phase” - delivers three key advantages:
- Painless go-lives - the first production deployment uses the same automation that ran throughout development, removing deployment drama
- Avoiding technical debt - catching problems immediately is cheaper than investigating and rewriting broken code later
- Tight feedback loops - validating each change as early as possible is one of the strongest differentiators of high-performing teams
What to Test
A comprehensive IaC testing strategy reaches well beyond “does the resource get created”:
| Validation target | Example |
|---|---|
| Code quality | Syntax, formatting, complexity |
| Functionality | An HTTPS route reaches the web servers end-to-end |
| Security | Ports, permissions, exposed secrets |
| Compliance | Regulatory, contractual, and internal policy adherence |
| Provenance | Supply-chain checks - vulnerabilities, license compatibility, SBOM |
| Performance | Network latency, connection throughput |
| Scalability | Autoscaling triggers and the desired outcome (capacity actually increases) |
| Availability | Destroy a node → verify automatic replacement |
| Operability | Logging, monitoring, scheduled maintenance tasks |
What Not to Test
Unique Challenges
| Challenge | Why it matters | Mitigation |
|---|---|---|
| Declarative code is low-value to unit-test | A test that restates a hardcoded value is redundant | Test variable logic, combinations of declarations, and functional outcomes instead |
| Testing is slow | Provisioning takes minutes to hours | Divide into small stacks, shift tests offline, use progressive pipelines |
| Testing costs money | Real cloud resources incur charges | Destroy immediately, use isolated test accounts, automate cleanup |
| Complex dependencies | Components depend on upstream/downstream stacks | Use test fixtures, mocks, and local emulators |
Unit vs. Integration Testing
Terraform’s entire purpose is configuring external systems, which makes pure unit testing inherently difficult. The vast majority of IaC testing is integration testing - verifying how resources, modules, and systems interact. Newer features like native mocks (v1.7+) are beginning to close this gap, but integration tests remain the backbone.
Progressive Testing
Progressive testing runs suites in sequence - fast and narrow first, slow and broad later - optimising for fast, accurate feedback. Only after the quick checks pass does the pipeline invest in expensive cloud provisioning.
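The ordering can be sketched as a small runner that executes stages from cheapest to most expensive and stops at the first failure. This is a minimal sketch rather than a prescribed implementation: the stage names and the `run_progressive` helper are hypothetical, and each check is a plain callable standing in for a real tool invocation.

```python
from typing import Callable, List, Tuple

Stage = Tuple[str, Callable[[], bool]]


def run_progressive(stages: List[Stage]) -> List[str]:
    """Run stages in order and stop at the first failure.

    Returns the names of the stages that were actually executed.
    """
    executed = []
    for name, check in stages:
        executed.append(name)
        if not check():
            # A cheap stage failed: skip every slower, costlier stage after it.
            break
    return executed


# Hypothetical stages, ordered fast/narrow -> slow/broad.
stages = [
    ("build-test", lambda: True),       # offline: syntax, lint, policy
    ("stack-test", lambda: False),      # online: one isolated stack (fails here)
    ("integrated-test", lambda: True),  # never reached in this example
]

executed = run_progressive(stages)
print(executed)
```

Because the stack test fails in this example, the integrated stage is never started, so no cloud resources are provisioned for it.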
Conceptual Models
The Test Pyramid
A large base of fast, offline checks; a middle layer of integration tests; a small top of full-system tests. Higher levels test emergent behaviour, not individual components.
The Infrastructure Test Diamond
For declarative codebases the pyramid reshapes into a diamond: low-level unit tests are less valuable (they restate the code), so the bulk of testing sits in the middle layer - stack tests. The diamond tapers to fewer offline checks at the bottom and fewer system-wide tests at the top.
The Swiss Cheese Model
No single layer catches everything. Stack multiple layers so that the “holes” in one are covered by the next - just like stacking Swiss cheese slices eventually blocks every gap.
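As a rough illustration of why layering helps, suppose each layer misses a given defect with some probability and that layers fail independently. Independence is a simplifying assumption (real layers often share blind spots), and the miss rates below are invented for the example.

```python
from math import prod


def escape_probability(miss_rates):
    """Probability that a defect slips past every layer, assuming independence."""
    return prod(miss_rates)


# Hypothetical miss rates: offline checks, stack tests, integrated tests.
layers = [0.5, 0.3, 0.2]
p = escape_probability(layers)
print(f"{p:.3f}")  # 0.5 * 0.3 * 0.2
```

Each extra layer multiplies the remaining risk down, but under this model it never reaches exactly zero, which is one reason production monitoring still matters.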
Pipeline Stages
| Stage | Environment | Scope | Speed |
|---|---|---|---|
| Build test (offline) | Local / CI agent | Syntax, linting, supply-chain, policy | Seconds |
| Stack test (online) | IaaS - isolated stack only | The stack itself, using test fixtures for dependencies | Minutes |
| Integrated infra test | IaaS - multiple stacks | Cross-stack integration points | Minutes–hours |
| Product test | IaaS - full stack + workload | Application behaviour on top of infrastructure | Hours |
Testing in Production
Pre-release testing covers known unknowns. Production environments have characteristics that cannot be replicated:
- Data - real-world volume and variety
- Users - creative, unpredictable interactions
- Traffic - long-term compounding effects (e.g. logs filling storage)
- Concurrency - rare timing-dependent interactions
Safe production testing techniques: monitoring and observability, zero-downtime deployment, progressive rollout, dedicated test data records, and chaos engineering.
Offline Testing (Static Analysis)
Offline tests run on a developer workstation or CI agent - no cloud provisioning required. They execute in seconds.
Syntax Checking
```bash
terraform init -backend=false   # initialise without a backend
terraform validate              # parse code, catch typos and naming errors
```

`terraform validate` does not need variables set, but it does require an initialised working directory. Use `-backend=false` for automated pipelines without a state backend.
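In CI it can help to consume a machine-readable result rather than scrape text. `terraform validate -json` prints a JSON object with a `valid` flag and a `diagnostics` list; the sketch below only parses such output. The `summarise_validate` helper and the sample payload are illustrative and not part of Terraform.

```python
import json


def summarise_validate(raw: str):
    """Return (valid, messages) from `terraform validate -json` output."""
    report = json.loads(raw)
    messages = [
        f"{d.get('severity', 'unknown')}: {d.get('summary', '')}"
        for d in report.get("diagnostics", [])
    ]
    return bool(report.get("valid", False)), messages


# Hand-written sample payload in the shape Terraform emits (trimmed).
sample = json.dumps({
    "valid": False,
    "error_count": 1,
    "warning_count": 0,
    "diagnostics": [
        {"severity": "error", "summary": "Unsupported argument"},
    ],
})

valid, messages = summarise_validate(sample)
print(valid, messages)
```

In a pipeline, `raw` would typically be the captured standard output of the `terraform validate -json` command, and the job would fail whenever `valid` is false.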
Static Code Analysis (Linting)
TFLint evaluates code for bugs, errors, and style violations without running it:

```hcl
plugin "terraform" {
  enabled = true
  preset  = "all" # "recommended" is the default; "all" adds rules like
                  # requiring descriptions on variables and outputs
}

plugin "aws" {
  enabled = true
  # Third-party ruleset: also set `version` and `source` here if you want
  # `tflint --init` to download it instead of installing it manually.
}
```

| Feature | Detail |
|---|---|
| Plugin system | Core terraform plugin + cloud-specific plugins (AWS, GCP, Azure) + OPA plugin |
| Exceptions | Disable rules globally in .tflint.hcl or per-line with # tflint-ignore: rule_name |
| Autofix | tflint --fix auto-corrects simple issues (e.g. comment style) - review findings first |
Connected Static Analysis
Some tools (for example, the TFLint AWS ruleset with deep checking enabled) connect to the cloud API to verify that referenced resources (AMIs, VM sizes) actually exist, catching errors that pure offline analysis misses.
Supply-Chain Checks
Infrastructure code imports third-party modules, base images, and libraries. Supply-chain tools:
- Check components against public vulnerability databases
- Verify license compatibility with organisational policy
- Generate a Software Bill of Materials (SBOM) for future vulnerability tracking
Local Infrastructure Emulators
| Platform | Emulator |
|---|---|
| AWS | LocalStack, Moto |
| Azure | Azurite |
Security Scanning
Security validation lets teams catch vulnerabilities before infrastructure is deployed.
Open-Source Scanners
| Tool | Notes |
|---|---|
| Checkov | Evaluates Terraform, plans, Helm, CloudFormation. CLI-only, no central service required. |
| Trivy | Successor to TFSec. Broader provider coverage. Minimal configuration to start. |
Both are free and run locally. Running them simultaneously adds redundancy - Trivy covers more providers, but Checkov contains unique rules Trivy may miss.
Handling Exceptions
When a finding is intentional, create an explicit, documented exception:
Checkov:
```hcl
resource "aws_s3_bucket" "public_assets" {
  # CKV_AWS_20 is Checkov's public-read ACL check, which matches the stated reason
  #checkov:skip=CKV_AWS_20:Intentionally public - serves static marketing assets
  bucket = "acme-public-assets"
}
```

Trivy:

```hcl
# Public bucket for static marketing assets
#trivy:ignore:AVD-AWS-0086
resource "aws_s3_bucket" "public_assets" {
  bucket = "acme-public-assets"
}
```

Commercial Solutions
Platforms like Snyk, Checkmarx, and Mend offer centralised dashboards, cross-project issue tracking, and SBOM generation. Even when using a commercial scanner, keep Checkov running as a free, extra layer.
Custom Policies
Checkov (YAML) - the recommended approach for most teams:
```yaml
metadata:
  id: "CUSTOM_001"
  name: "Block GPU instance families"
  category: "FINANCE"
definition:
  cond_type: "attribute"
  resource_types:
    - "aws_instance"
  attribute: "instance_type"
  operator: "not_starting_with"
  value: "p3."
```

Store YAML policies in a central Git repository. Pull them at runtime with `--external-checks-git`. Checkov also supports Python for policies requiring API calls.
OPA with TFLint - uses the Rego policy language via TFLint’s OPA plugin. Powerful but Rego has a steep learning curve; prefer Checkov YAML unless your organisation already uses OPA elsewhere.
Preview Testing: terraform plan in CI
Some tools can preview changes before applying (Terraform plan, Pulumi preview). You can write automated assertions against the preview output:
- Fail if the preview would destroy a database
- Fail if a deprecated resource type would be created
- Fail if a security-sensitive change is detected
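For Terraform, `terraform show -json <planfile>` renders a saved plan as JSON with a `resource_changes` list, where each entry carries its `address`, `type`, and `change.actions`. A minimal sketch of the first two checks above follows; the protected and deprecated type lists and the `plan_violations` name are assumptions chosen for illustration.

```python
# Assumption: resource types that must never be destroyed by a pipeline run.
PROTECTED_TYPES = {"aws_db_instance"}
# Assumption: resource types your organisation treats as deprecated.
DEPRECATED_TYPES = {"aws_launch_configuration"}


def plan_violations(plan: dict) -> list:
    """Return human-readable violations found in a plan's resource_changes."""
    violations = []
    for rc in plan.get("resource_changes", []):
        actions = rc["change"]["actions"]
        # A replacement is reported as both "delete" and "create",
        # so this check also catches destroy-and-recreate.
        if "delete" in actions and rc["type"] in PROTECTED_TYPES:
            violations.append(f"{rc['address']}: would be destroyed")
        if "create" in actions and rc["type"] in DEPRECATED_TYPES:
            violations.append(f"{rc['address']}: deprecated type would be created")
    return violations


# Hand-written sample in the shape of the plan JSON (trimmed).
sample_plan = {
    "resource_changes": [
        {"address": "aws_db_instance.main", "type": "aws_db_instance",
         "change": {"actions": ["delete", "create"]}},
        {"address": "aws_instance.web", "type": "aws_instance",
         "change": {"actions": ["update"]}},
    ]
}

violations = plan_violations(sample_plan)
print(violations)
```

In CI, the plan dictionary would come from loading the JSON file with `json.load`, and a non-empty `violations` list would fail the job before `apply` runs.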
Terratest
Terratest (by Gruntwork, since 2018) is the long-standing standard for IaC testing. Tests are written in Go.
Structure
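```go
package test

import (
	"testing"
	"time"

	http_helper "github.com/gruntwork-io/terratest/modules/http-helper"
	"github.com/gruntwork-io/terratest/modules/random"
	"github.com/gruntwork-io/terratest/modules/terraform"
)

func TestWebServer(t *testing.T) {
	opts := &terraform.Options{
		TerraformDir: "../examples/web-server",
		Vars: map[string]interface{}{
			"environment": "test",
			"name_suffix": random.UniqueId(), // avoid name collisions
		},
	}
	// Registered before apply so cleanup runs even if the test fails.
	defer terraform.Destroy(t, opts)
	terraform.InitAndApply(t, opts)

	url := terraform.Output(t, opts, "endpoint_url")
	// Expect HTTP 200 with body "OK"; retry up to 10 times, 5 seconds apart.
	http_helper.HttpGetWithRetry(t, url, nil, 200, "OK", 10, 5*time.Second)
}
```

Strengths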
- Maturity - 20+ helper packages for AWS, Azure, DNS, HTTP, SSH, Docker, Kubernetes, etc.
- AI support - LLMs generate Terratest/Go code effectively because of abundant training data
- Full language power - loops, error handling, retries, custom assertions
Limitations
- Requires learning Go
- Can only validate via Terraform outputs - cannot access internal module state directly
Terraform Native Testing Framework
Introduced in Terraform/OpenTofu v1.6, the native framework lets you write tests in HCL - no second language required.
Structure
```hcl
variables {
  environment = "test"
}

run "create_web_server" {
  command = apply

  assert {
    condition     = aws_instance.web.tags["Environment"] == "test"
    error_message = "Instance was not tagged with the correct environment."
  }

  assert {
    condition     = output.endpoint_url != ""
    error_message = "Endpoint URL output must not be empty."
  }
}
```

Strengths
- No Go required - tests are pure HCL
- Internal module access - can reference local values, internal resource attributes, and data sources directly (no need to expose them as outputs)
- Mocks (v1.7+) - replace real providers with fakes; computed attributes get default values (`0` for numbers, `false` for bools, a random string for strings), or you can inject specific values via `override_resource` / `override_data` blocks
Mocks Example
```hcl
mock_provider "aws" {
  override_resource {
    target = aws_instance.web
    values = {
      id        = "i-mock123"
      public_ip = "10.0.0.1"
    }
  }
}

run "plan_only" {
  command = plan

  assert {
    condition     = aws_instance.web.instance_type == "t3.micro"
    error_message = "Expected t3.micro instance type."
  }
}
```

Mocks avoid provisioning real infrastructure entirely - tests run in seconds and cost nothing.
Limitations
- Version-locked - cannot test code targeting older Terraform versions (e.g., you cannot use mocks with v1.6, and the framework itself doesn’t exist before v1.6)
- AI support is weak - LLMs currently struggle with the native framework due to insufficient training data. Teams relying heavily on AI-generated tests should use Terratest until models catch up.
Test Instance Lifecycles
How you manage the cloud environments where tests run has a major impact on speed, reliability, and cost.
| Pattern | How it works | Speed | Reliability | Trade-off |
|---|---|---|---|---|
| Persistent stack | Always running; pipeline updates it | ✅ Fast | ⚠️ Can get “wedged” | Failed updates may block the entire team |
| Ephemeral stack | Created from scratch, destroyed after every run | ❌ Slow | ✅ Clean every time | Too slow for rapid development; destroy can fail, requiring cleanup tools |
| Dual persistent + ephemeral | Both run in parallel | Mixed | Mixed | Antipattern - combines the worst of both; teams still unwedge persistent and wait for ephemeral |
| Periodic rebuild | Persistent during the day; destroyed and rebuilt nightly | ✅ Fast (daytime) | ⚠️ Masks design issues | Hides accumulated state problems that could cause production outages |
| Continuous reset | After each successful test, a background job destroys and rebuilds using the last production version | ✅ Fast | ✅ Clean | If the background rebuild fails silently, the next developer discovers a broken environment |
Test Fixtures
When provisioning a full dependency graph is too slow or introduces unrelated failures, use test fixtures - lightweight stand-ins:
Replacing upstream providers:
Instead of deploying a production-grade VPC with security controls and logging, deploy a minimal fixture with just a VPC and two subnets.
Replacing downstream consumers:
Deploy a lightweight serverless function and an external client. Assert that the client can connect through the infrastructure you’re testing (gateway, routing rules) and receive a 200 OK.
Test Orchestration
Orchestrate the full lifecycle: create fixtures → provision stack → run validations → consolidate results → destroy everything.
Guidelines:
- Support local testing - developers must be able to run the same test scripts on their workstation before pushing. Use the exact same orchestration scripts locally and in CI.
- Don’t couple to the CI tool - write orchestration in standalone scripts (Bash, Python, Make). The CI stage should do nothing but call the script. This keeps tests portable across CI platforms and runnable locally.
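The lifecycle above maps naturally onto a `try`/`finally` structure, so teardown runs even when provisioning or validation fails. The sketch below is one possible shape for such a standalone script; the step names and the `run_step` callback are placeholders (in a real script the callback might wrap calls to your own tooling).

```python
from typing import Callable, List


def orchestrate(run_step: Callable[[str], bool]) -> bool:
    """Run the test lifecycle and always attempt teardown.

    Returns True only if every step, including destroy, succeeded.
    """
    ok = False
    try:
        # `and` short-circuits: a failed step skips the steps after it.
        ok = (
            run_step("create-fixtures")
            and run_step("provision-stack")
            and run_step("run-validations")
        )
    finally:
        # Results are consolidated and everything is destroyed even on failure.
        run_step("consolidate-results")
        destroyed = run_step("destroy")
    return ok and destroyed


# Demonstration with a fake runner that simulates a provisioning failure.
log: List[str] = []


def fake_step(name: str) -> bool:
    log.append(name)
    return name != "provision-stack"


result = orchestrate(fake_step)
print(result, log)
```

Because the function takes the runner as a parameter, the same script can be called unchanged from a workstation and from a CI stage.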
Authentication for Tests
Tests interact with real cloud APIs and need credentials.
| Approach | Recommendation |
|---|---|
| OIDC | ✅ Preferred - short-lived, automatically rotated, no secrets to leak |
| Static credentials | ⚠️ Avoid in CI - use only for local development if OIDC is impractical |
Concurrency and Name Collisions
Multiple PRs running tests simultaneously will collide if they create identically-named resources. Inject randomness:
```hcl
resource "random_string" "suffix" {
  length  = 6
  special = false
  upper   = false
}
```

For resources whose names stay reserved for a while after deletion (e.g., AWS Secrets Manager secrets during their recovery window), add the random suffix directly in the module, not just in tests.
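When the suffix is generated by the test harness rather than inside Terraform, the same idea looks like this. It is a sketch only: the `unique_suffix` helper is hypothetical, and `name_suffix` mirrors the variable name used in the Terratest example earlier on this page.

```python
import secrets
import string

# Lowercase letters and digits only, matching the random_string settings above.
ALPHABET = string.ascii_lowercase + string.digits


def unique_suffix(length: int = 6) -> str:
    """Return a random lowercase alphanumeric suffix of the given length."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


suffix = unique_suffix()
# Arguments that a wrapper script could append to a terraform command line.
tf_var_args = ["-var", f"name_suffix={suffix}"]
print(tf_var_args)
```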
Timeout Management
Automated Account Cleanup
Tests will inevitably crash before cleanup runs. Protect your budget:
- Never test in production accounts - use isolated test accounts
- Schedule nuke jobs - run tools like `aws-nuke` or `azure-nuke` nightly via a scheduled CI job to erase everything in test accounts
- Use resource groups (Azure) or projects (GCP) - group test infrastructure and delete the entire group for guaranteed cleanup
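Where wiping a whole account is too blunt, a scheduled job can instead select resources that have outlived a time-to-live. The sketch below shows only the selection logic on plain dictionaries; the `created_at` field, the 24-hour TTL, and the `expired` helper are assumptions for illustration, and listing or deleting real resources is left to the cloud SDK or CLI you already use.

```python
from datetime import datetime, timedelta, timezone


def expired(resources, now, ttl=timedelta(hours=24)):
    """Return ids of resources created more than `ttl` before `now`."""
    return [r["id"] for r in resources if now - r["created_at"] > ttl]


# Fixed reference time so the example is deterministic.
now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
inventory = [
    {"id": "stack-old", "created_at": now - timedelta(hours=30)},
    {"id": "stack-new", "created_at": now - timedelta(hours=2)},
]

stale = expired(inventory, now)
print(stale)
```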
Test Code Quality
Test suites are code. Apply the same standards:
- Clear variable names and extensive comments explaining each test’s goal
- Review test code in PRs with the same rigour as infrastructure code
- Refactor test suites when they become harder to maintain than the infrastructure they validate