Governance & Compliance

Governance, compliance, and security cannot be afterthoughts bolted onto a finished system - they must be built into the system’s foundations. When infrastructure is managed as code through automated pipelines, governance stops being an opposing force that slows delivery and becomes a routine part of the workflow.

Shift Left

Building Security In

The central goal of governance is ensuring that “the system we built is built right” - that it is correct, of good quality, and meets operational requirements such as performance, reliability, and cost. Requirements originate from:

Source	Examples
Organisational policies	Architectural guidelines, internal standards
Standards bodies	PCI compliance, SOC 2
Government regulators	GDPR, HIPAA, financial sector requirements

The Everyday Emergency Philosophy

Organisations often struggle with compliance during emergencies. Engineers bypass slow pipelines to apply manual fixes, creating “special” emergency processes that relax security controls.

The solution: make your normal workflow fast enough to push a fix during a crisis.

Design the most stripped-down, minimal process that can still reliably deliver changes and meet governance requirements. Then make it your default workflow:

If a process is good and safe enough to use in an emergency, it should be good and safe enough to use every day.

Continuous Compliance Strategies

Strategy	What it achieves
Automated validation	Every change validated by default - no reliance on human attention to catch errors
Live system monitoring	Continuous, automatic compliance validation of running infrastructure
System visibility	Deep operational insight for investigating incidents
Audit readiness	All changes and activities automatically recorded for evidence generation

DevSecOps and Team Topologies

The DevSecOps Philosophy

Development, infrastructure, and governance teams must jointly own responsibility for governance - security and compliance are not a siloed concern for a single, separate team.

In a modern workflow where developers build quality and compliance directly into code using automated validations, governance specialists shift from strict gatekeepers to enablers who support stream-aligned development teams.

Key Team Interactions

Interaction	Role
Security enablement	Help development teams implement secure code - documentation, training, close collaboration on tricky security aspects
Security scanning services	Provide automated tools that teams integrate into their pipelines - code scanning, dependency checks, automated penetration testing
Security research	Handle unfamiliar situations (e.g., security controls for generative AI); develop approaches that eventually feed into standard platform services

Compliance as Code

Compliance controls are written as code and automatically enforced within delivery pipelines, enabling the short feedback loops necessary for shifting left.

Control Types

Type	Behaviour	Example
Detection	Identify and report violations for human evaluation	Detect a cyberattack attempt and alert security staff, rather than auto-blocking (which could cause a denial-of-service outage)
Prevention	Actively disallow noncompliant actions	Pipeline blocks deployment if code contains a known vulnerability
Correction	Detect + automatically remediate	Instantly disable an unauthorised user account discovered on a server
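
The three behaviours can be sketched as handlers for the same violation. This is a minimal illustration only: the `Violation` type, the handler names, and the in-memory `accounts` set are hypothetical stand-ins for real alerting and remediation systems.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Violation:
    resource: str
    rule: str


class DeploymentBlocked(Exception):
    """Raised by a prevention control to stop a noncompliant action."""


def detect(violation: Violation, alerts: list[str]) -> None:
    # Detection: report for human evaluation, change nothing.
    alerts.append(f"{violation.rule}: {violation.resource}")


def prevent(violation: Violation) -> None:
    # Prevention: refuse to proceed.
    raise DeploymentBlocked(f"{violation.rule}: {violation.resource}")


def correct(violation: Violation, accounts: set[str], alerts: list[str]) -> None:
    # Correction: detect, then remediate automatically.
    detect(violation, alerts)
    accounts.discard(violation.resource)
```

Note that correction builds on detection: the remediation still leaves an alert behind, so the automatic fix is visible to humans afterwards.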

Controls by Component Design Layer

Controls are applied at different levels of the infrastructure hierarchy:

Layer	Scope	Example
Global	Foundation-level, broadest restrictions	Deny all inbound traffic on all ports
Environment	Environment-specific policies	Enforce encryption at rest for all resources in production
Shared infrastructure	Shared components (clusters, networking)	Require private subnets for databases
Workload infrastructure	Infrastructure supporting a specific workload	Allow HTTPS on port 443 for this service only
Workload	The application itself	Enforce minimum TLS version

Rule of thumb:

Lower levels (Global, Environment) → broad restrictions that apply to everything built on top of them
Higher levels (Workload infrastructure, Workload) → precise exceptions to those broad rules

This ensures policies are implemented at the most specific, appropriate level while aligning with the dependency structure of the infrastructure.
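
One way to picture the layering is a lookup that walks from the broadest layer to the most specific one and lets the last layer with an opinion win. The sketch below is an assumption-laden illustration, not how any particular platform evaluates policy: the layer names, the `is_allowed` function, and the use of port `0` as an "all ports" wildcard are all invented for this example.

```python
# Layers ordered from broadest (lowest) to most specific (highest).
LAYERS = ["global", "environment", "shared", "workload_infra", "workload"]


def is_allowed(port: int, rules: dict[str, dict[int, bool]]) -> bool:
    """Return the decision from the most specific layer that covers the port.

    `rules` maps layer name -> {port: allowed}. The port key 0 is used
    here as a wildcard meaning "all ports" (an assumption of this sketch).
    """
    decision = False  # deny by default if no layer says anything
    for layer in LAYERS:
        layer_rules = rules.get(layer, {})
        if 0 in layer_rules:
            decision = layer_rules[0]
        if port in layer_rules:
            decision = layer_rules[port]
    return decision


rules = {
    "global": {0: False},           # deny all inbound traffic on all ports
    "workload_infra": {443: True},  # allow HTTPS for this service only
}
```

With these rules, port 443 is allowed because the workload infrastructure layer overrides the global deny, while every other port still falls back to the global restriction.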

Controls by Workflow Stage

Controls are integrated at four points in the delivery lifecycle:

Platform Controls

Infrastructure code that deploys and configures policies natively supported by the IaaS platform (e.g., IAM restrictions on who can modify resources).

Treat like standard infrastructure code - tested, delivered, and applied consistently
Write automated tests that deliberately attempt restricted actions to prove the platform denies them
Apply to all environments, not just production, to catch issues early
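
A negative test of this kind attempts the forbidden action and fails if it succeeds. The sketch below uses a hypothetical in-memory `FakePlatform` so that it runs on its own; in a real suite, `perform` would call the platform's API in a test environment using credentials for the role under test.

```python
class AccessDenied(Exception):
    pass


class FakePlatform:
    """Hypothetical in-memory stand-in for an IaaS API with IAM-style rules."""

    def __init__(self, allowed: set[tuple[str, str]]):
        self._allowed = allowed  # (role, action) pairs that are permitted

    def perform(self, role: str, action: str) -> str:
        if (role, action) not in self._allowed:
            raise AccessDenied(f"{role} may not {action}")
        return "ok"


def assert_denied(platform: FakePlatform, role: str, action: str) -> None:
    """Fail if the platform allows an action the policy should forbid."""
    try:
        platform.perform(role, action)
    except AccessDenied:
        return
    raise AssertionError(f"expected {role!r} to be denied {action!r}")


platform = FakePlatform(allowed={("deployer", "modify_network")})
assert_denied(platform, "developer", "modify_network")  # passes: denied as intended
```

The important property is that the test fails loudly when a restriction is accidentally loosened, rather than only checking that permitted actions still work.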

Delivery Controls

Automated tests running inside the delivery pipeline that validate compliance before deployment:

Static code analysis for unsafe patterns
Dependency vulnerability scanning
Functional tests proving audit logging works correctly
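
As a rough illustration of the first item, a static check can be as simple as matching source lines against known-unsafe patterns. The two patterns and the `scan` function below are assumptions made for this sketch; a real pipeline would normally rely on a dedicated scanner rather than hand-written regular expressions.

```python
import re

# Patterns this sketch treats as unsafe; both are illustrative assumptions.
UNSAFE_PATTERNS = {
    "open-ingress": re.compile(r'cidr_blocks\s*=\s*\[\s*"0\.0\.0\.0/0"\s*\]'),
    "hardcoded-secret": re.compile(r'(password|secret_key)\s*=\s*"[^"]+"'),
}


def scan(source: str) -> list[tuple[int, str]]:
    """Return (line_number, rule_name) for each line matching an unsafe pattern."""
    findings = []
    for number, line in enumerate(source.splitlines(), start=1):
        for rule, pattern in UNSAFE_PATTERNS.items():
            if pattern.search(line):
                findings.append((number, rule))
    return findings
```

A pipeline stage would fail the build whenever `scan` returns a non-empty list, which gives the author feedback before anything is deployed.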

Deployment Controls

Checks that execute at the moment code is being applied to an environment:

Can block a noncompliant deployment (prevention)
Or record change details to create an audit evidence trail (detection)
Should be enforced across all environments
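
Both behaviours can live in one gate: record every attempt, then block the noncompliant ones. The following is a minimal sketch under stated assumptions: `deploy_gate`, the shape of the `change` dictionary, and the in-memory `audit_log` list are hypothetical, and a real gate would write to durable, access-controlled storage.

```python
import json
from datetime import datetime, timezone


class NoncompliantDeployment(Exception):
    pass


def deploy_gate(change: dict, findings: list[str], audit_log: list[str]) -> None:
    """Record every attempt (detection) and block noncompliant ones (prevention)."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "environment": change["environment"],
        "commit": change["commit"],
        "findings": findings,
        "allowed": not findings,
    }
    # Write the evidence first so that blocked attempts are also recorded.
    audit_log.append(json.dumps(record, sort_keys=True))
    if findings:
        raise NoncompliantDeployment("; ".join(findings))
```

Because the gate takes the environment as data rather than branching on it, the same check runs unchanged in every environment.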

Monitoring Controls

Checks that run continuously against live infrastructure after deployment:

Confirm the system remains compliant over time
Catch manual modifications that bypass the pipeline
Operate out of band to detect unexpected changes
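
At its core, this kind of monitoring compares what the code declares with what actually exists. The sketch below assumes both states are already available as plain dictionaries keyed by resource name; `detect_drift` and that data shape are illustrative, and obtaining the live state from a real platform is left out.

```python
def detect_drift(desired: dict[str, dict], actual: dict[str, dict]) -> list[str]:
    """Compare declared state with observed state and describe each difference."""
    drift = []
    for name in sorted(desired.keys() | actual.keys()):
        if name not in actual:
            drift.append(f"{name}: missing from live system")
        elif name not in desired:
            drift.append(f"{name}: exists but is not defined in code")
        else:
            for key in sorted(desired[name].keys() | actual[name].keys()):
                want, have = desired[name].get(key), actual[name].get(key)
                if want != have:
                    drift.append(f"{name}.{key}: expected {want!r}, found {have!r}")
    return drift
```

Resources that exist but are not defined in code are reported as well, since those are the typical trace of a manual change made outside the pipeline.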

Security Scanning Tools

Open-Source Scanners

Use multiple scanners - the overlap costs little beyond some pipeline time and duplicate findings, and one tool may catch vulnerabilities that another misses.

Checkov:

Free, open-source, runs entirely locally (no central service dependency)
Scans Terraform code, generated plans, Helm charts, CloudFormation, and more
Inline exceptions with documented reasons:

resource "aws_s3_bucket" "public_data" {
  #checkov:skip=CKV_AWS_19:Public dataset intentionally unencrypted
  bucket = "open-research-data"
}

Trivy:

Maintained by Aqua Security, which has folded the now-deprecated tfsec into it
Covers more providers than Checkov
Exceptions via project-wide .trivyignore file or inline comments

Commercial Scanners

Tool	Differentiator
Snyk	Centralised vulnerability dashboard, SBOM generation
Checkmarx	Enterprise code security platform
Mend	Open-source risk management

Even when using a commercial tool, run a free scanner like Checkov alongside it for an extra layer of protection.

Custom Policy Enforcement

Standard security scanners check for known vulnerabilities. Custom policies enforce organisational rules - technical, compliance, financial, or contractual:

OPA with TFLint

Enforce policies using the Open Policy Agent (OPA) framework and its Rego language via an experimental TFLint plugin:

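# Deny expensive GPU instances (P and G families).
# Assumes the parsed configuration is available as input.resource.<type>[name];
# adapt the lookup to the data your policy runner provides.
deny[msg] {
  instance := input.resource.aws_instance[name]
  # regex.match replaces the deprecated re_match built-in
  regex.match("^(p|g)[0-9]", instance.instance_type)
  msg := sprintf("GPU instance type '%s' is not allowed for '%s'", [instance.instance_type, name])
}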
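
Policy rules are code and benefit from their own tests. As a rough check of the pattern `^(p|g)[0-9]` used in the rule above, this Python sketch shows which instance type strings it matches. The function name `is_gpu_instance` is hypothetical, and the pattern is a prefix heuristic, not an authoritative list of GPU families.

```python
import re

# Same pattern as the policy rule: a "p" or "g" followed by a generation digit.
GPU_FAMILY = re.compile(r"^(p|g)[0-9]")


def is_gpu_instance(instance_type: str) -> bool:
    """Return True when the instance type starts with a P or G family prefix."""
    return GPU_FAMILY.match(instance_type) is not None
```

Running sample values such as `p3.2xlarge`, `g4dn.xlarge`, and `t3.micro` through a helper like this is a cheap way to confirm that a policy blocks what it should and nothing else.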

Custom Checkov Rules (YAML)

Checkov offers a simpler approach - define custom policies as YAML files:

metadata:
  id: "CUSTOM_001"
  name: "Deny GPU instance types"
  severity: "HIGH"
definition:
  # Fail any aws_instance whose type is in the P or G families (p3, g4dn, ...)
  not:
    cond_type: "attribute"
    resource_types:
      - "aws_instance"
    attribute: "instance_type"
    operator: "regex_match"
    value: "^(p|g)[0-9]"
For advanced cases requiring API calls, Checkov also supports Python-based rules.

Centralising Policies

Avoid copying policy files into every module. Load policies from a shared Git repository:

checkov -d . --external-checks-git https://github.com/org/iac-policies

This ensures all projects automatically use the latest organisational policies.

Audit and Compliance Automation

Automated pipelines naturally produce the artefacts that auditors need:

Artefact	Source
Change history	Git commits, PR reviews, merge approvals
Test evidence	Pipeline test results, security scan reports
Deployment records	Pipeline logs, deployment timestamps, applied plan output
Compliance validation	Checkov/Trivy scan results at every stage
Live monitoring	Continuous drift detection alerts, reconciliation logs

By making compliance a continuous, automated process, every routine deployment generates audit evidence - there’s no scramble to produce documentation before a certification review.