
Governance & Compliance

Governance, compliance, and security cannot be afterthoughts bolted onto a finished system - they must be built directly into the system’s foundations. When infrastructure is managed as code through automated pipelines, governance transforms from an opposing force that slows delivery into an inherent, natural part of the workflow.


The central goal of governance is ensuring that “the system we built is built right” - covering quality, correctness, and operational qualities like performance, reliability, and cost. Requirements originate from:

| Source | Examples |
| --- | --- |
| Organisational policies | Architectural guidelines, internal standards |
| Standards bodies | PCI compliance, SOC 2 |
| Government regulators | GDPR, HIPAA, financial sector requirements |

Organisations often struggle with compliance during emergencies. Engineers bypass slow pipelines to apply manual fixes, creating “special” emergency processes that relax security controls.

The solution: make your normal workflow fast enough to push a fix during a crisis.

Design the most stripped-down, minimal process that can still reliably deliver changes and meet governance requirements. Then make it your default workflow:

If a process is good and safe enough to use in an emergency, it should be good and safe enough to use every day.

| Strategy | What it achieves |
| --- | --- |
| Automated validation | Every change validated by default - no reliance on human attention to catch errors |
| Live system monitoring | Continuous, automatic compliance validation of running infrastructure |
| System visibility | Deep operational insight for investigating incidents |
| Audit readiness | All changes and activities automatically recorded for evidence generation |

Development, infrastructure, and governance teams must jointly own responsibility for governance - security and compliance are not a siloed concern for a single, separate team.

In a modern workflow where developers build quality and compliance directly into code using automated validations, governance specialists shift from strict gatekeepers to enablers who support stream-aligned development teams.

| Interaction | Role |
| --- | --- |
| Security enablement | Help development teams implement secure code - documentation, training, close collaboration on tricky security aspects |
| Security scanning services | Provide automated tools that teams integrate into their pipelines - code scanning, dependency checks, automated penetration testing |
| Security research | Handle unfamiliar situations (e.g., security controls for generative AI); develop approaches that eventually feed into standard platform services |

Compliance controls are written as code and automatically enforced within delivery pipelines, enabling the short feedback loops necessary for shifting left.

| Type | Behaviour | Example |
| --- | --- | --- |
| Detection | Identify and report violations for human evaluation | Detect a cyberattack attempt and alert security staff, rather than auto-blocking (which could cause a denial-of-service outage) |
| Prevention | Actively disallow noncompliant actions | Pipeline blocks deployment if code contains a known vulnerability |
| Correction | Detect + automatically remediate | Instantly disable an unauthorised user account discovered on a server |
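
The three control types differ only in what happens once a violation is found. A minimal sketch of that difference, where `ControlType`, `Outcome`, `ChangeBlocked`, and `handle_violation` are hypothetical names chosen for illustration rather than part of any particular tool:

```python
from dataclasses import dataclass, field
from enum import Enum


class ControlType(Enum):
    DETECTION = "detection"
    PREVENTION = "prevention"
    CORRECTION = "correction"


class ChangeBlocked(Exception):
    """Raised when a prevention control rejects an action."""


@dataclass
class Outcome:
    alerts: list = field(default_factory=list)
    remediations: list = field(default_factory=list)


def handle_violation(control: ControlType, violation: str, outcome: Outcome) -> None:
    """Apply the behaviour that matches the control type."""
    if control is ControlType.DETECTION:
        # Report only; a human evaluates and decides what to do.
        outcome.alerts.append(violation)
    elif control is ControlType.PREVENTION:
        # Refuse the action outright.
        raise ChangeBlocked(violation)
    elif control is ControlType.CORRECTION:
        # Report and remediate without waiting for a human.
        outcome.alerts.append(violation)
        outcome.remediations.append(f"remediated: {violation}")
```

Choosing the type per control, rather than per tool, keeps risky automation (such as auto-blocking traffic) a deliberate decision.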

Controls are applied at different levels of the infrastructure hierarchy:

| Layer | Scope | Example |
| --- | --- | --- |
| Global | Foundation-level, broadest restrictions | Deny all inbound traffic on all ports |
| Environment | Environment-specific policies | Enforce encryption at rest for all resources in production |
| Shared infrastructure | Shared components (clusters, networking) | Require private subnets for databases |
| Workload infrastructure | Infrastructure supporting a specific workload | Allow HTTPS on port 443 for this service only |
| Workload | The application itself | Enforce minimum TLS version |

Rule of thumb:

  • Lower levels → broad, globally scoped restrictions
  • Higher levels → precise exceptions to those broad rules

This ensures policies are implemented at the most specific, appropriate level while aligning with the dependency structure of the infrastructure.
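
One way to picture the rule of thumb is a lookup that walks the layers from broad to specific and lets the most specific layer with an opinion win. This is a sketch under assumptions: the layer names, the `"*"` wildcard, and the `resolve` function are invented for illustration, and real platforms may resolve conflicts differently (some treat an explicit deny as final).

```python
# Ordered from broadest (foundation) to most specific.
LAYERS = [
    "global",
    "environment",
    "shared_infrastructure",
    "workload_infrastructure",
    "workload",
]


def resolve(policies: dict[str, dict[str, str]], key: str) -> tuple[str, str]:
    """Return (decision, layer) from the most specific layer that covers `key`.

    Each layer maps a rule key (or the wildcard "*") to "allow" or "deny".
    """
    decision, source = "deny", "implicit default"
    for layer in LAYERS:
        rules = policies.get(layer, {})
        # An exact rule beats the layer's wildcard.
        match = rules.get(key, rules.get("*"))
        if match is not None:
            # Later (more specific) layers override earlier ones.
            decision, source = match, layer
    return decision, source


policies = {
    "global": {"*": "deny"},                               # broad restriction
    "workload_infrastructure": {"inbound:443": "allow"},   # precise exception
}
```

With these policies, `resolve(policies, "inbound:443")` is decided by the workload infrastructure layer, while any other port falls through to the global deny.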


Controls are integrated at four points in the delivery lifecycle:

The first is infrastructure code that deploys and configures policies natively supported by the IaaS platform (e.g., IAM restrictions on who can modify resources):

  • Treat like standard infrastructure code - tested, delivered, and applied consistently
  • Write automated tests that deliberately attempt restricted actions to prove the platform denies them
  • Apply to all environments, not just production, to catch issues early
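
A test that deliberately attempts a restricted action might look like the sketch below. `FakePlatform` is a hypothetical stand-in for a real IaaS client so the example runs on its own; in practice the test would call the actual platform API using a restricted identity.

```python
class AccessDenied(Exception):
    """Raised when the platform refuses an action."""


class FakePlatform:
    """Hypothetical stand-in for an IaaS API client."""

    def __init__(self, allowed: set[tuple[str, str]]):
        self._allowed = allowed

    def perform(self, role: str, action: str) -> str:
        if (role, action) not in self._allowed:
            raise AccessDenied(f"{role} may not {action}")
        return "ok"


def assert_denied(platform: FakePlatform, role: str, action: str) -> bool:
    """Pass only if the platform refuses the action."""
    try:
        platform.perform(role, action)
    except AccessDenied:
        return True
    # Reaching this point means the restriction is not in force.
    raise AssertionError(f"{role} was allowed to {action}")


# Only the pipeline role may modify networking in this example.
platform = FakePlatform({("pipeline", "modify_network")})
assert_denied(platform, "developer", "modify_network")
```

The test fails when the action succeeds, which is the point: it proves the policy is enforced rather than merely declared.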

The second is automated tests running inside the delivery pipeline that validate compliance before deployment:

  • Static code analysis for unsafe patterns
  • Dependency vulnerability scanning
  • Functional tests proving audit logging works correctly

The third is checks that execute at the moment code is being applied to an environment. These checks:

  • Can block a noncompliant deployment (prevention)
  • Or record change details to create an audit evidence trail (detection)
  • Should be enforced across all environments
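
Both behaviours can live in one deployment gate: record every decision, and block when a check fails. The sketch below assumes a change is a plain dictionary and each check is a function returning an error message or `None`; `deployment_gate` and `require_encryption` are hypothetical names.

```python
import json
from datetime import datetime, timezone


class DeploymentBlocked(Exception):
    """Raised when a change fails one or more checks."""


def deployment_gate(change: dict, checks: list, audit_log: list, now=None) -> None:
    """Run every check, record the result, and block on any failure."""
    failures = [msg for check in checks if (msg := check(change)) is not None]
    # Detection: the record is written whether or not the change is allowed.
    audit_log.append(json.dumps({
        "timestamp": (now or datetime.now(timezone.utc)).isoformat(),
        "change_id": change["id"],
        "environment": change["environment"],
        "failures": failures,
        "allowed": not failures,
    }, sort_keys=True))
    # Prevention: a failed check stops the deployment.
    if failures:
        raise DeploymentBlocked("; ".join(failures))


def require_encryption(change: dict):
    """Hypothetical check: every resource must set encrypted=True."""
    bad = [r["name"] for r in change["resources"] if not r.get("encrypted")]
    return f"unencrypted resources: {', '.join(bad)}" if bad else None
```

Running the same gate in every environment means a blocked change surfaces in test long before it reaches production.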

The fourth is checks that run continuously against live infrastructure after deployment. These checks:

  • Confirm the system remains compliant over time
  • Catch manual modifications that bypass the pipeline
  • Operate out of band to detect unexpected changes
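
At its core, an out-of-band check compares what the code declares with what is actually running. A minimal sketch, assuming both states are available as dictionaries keyed by resource ID (the `detect_drift` function and the sample resource names are illustrative only):

```python
def detect_drift(desired: dict, actual: dict) -> dict:
    """Compare declared state with observed state, keyed by resource ID."""
    drift = {"missing": [], "unmanaged": [], "changed": {}}
    for rid, want in desired.items():
        if rid not in actual:
            # Declared in code but absent from the live system.
            drift["missing"].append(rid)
            continue
        have = actual[rid]
        diffs = {
            key: (want.get(key), have.get(key))
            for key in sorted(want.keys() | have.keys())
            if want.get(key) != have.get(key)
        }
        if diffs:
            # Attribute values differ: (declared, observed).
            drift["changed"][rid] = diffs
    # Present in the live system but never declared, e.g. created by hand.
    drift["unmanaged"] = sorted(set(actual) - set(desired))
    return drift


desired = {"sg-web": {"port": 443, "cidr": "10.0.0.0/8"}, "db-main": {"encrypted": True}}
actual = {"sg-web": {"port": 443, "cidr": "0.0.0.0/0"}, "vm-debug": {"size": "small"}}
report = detect_drift(desired, actual)
```

Here the report flags a widened CIDR range on `sg-web`, a missing `db-main`, and an unmanaged `vm-debug`, which are the kinds of manual modifications a pipeline alone would never see.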

Use multiple scanners - there’s no harm in redundancy, and one tool may catch vulnerabilities that another misses.
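
Redundant scanners produce overlapping reports, so it helps to merge them before review. The sketch below assumes each scanner's output has already been normalised to a `resource` and an `issue` label, which is a simplification: real scanners use their own rule IDs and report formats, and `merge_findings` is a hypothetical helper.

```python
from collections import defaultdict


def merge_findings(reports: dict[str, list[dict]]) -> list[dict]:
    """Combine findings from several scanners, merging duplicates.

    Findings are treated as the same when resource and issue label match.
    """
    seen = defaultdict(set)
    for scanner, findings in reports.items():
        for finding in findings:
            seen[(finding["resource"], finding["issue"])].add(scanner)
    return [
        {"resource": resource, "issue": issue, "reported_by": sorted(scanners)}
        for (resource, issue), scanners in sorted(seen.items())
    ]


reports = {
    "checkov": [{"resource": "aws_s3_bucket.logs", "issue": "unencrypted"}],
    "trivy": [
        {"resource": "aws_s3_bucket.logs", "issue": "unencrypted"},
        {"resource": "aws_security_group.web", "issue": "open-ingress"},
    ],
}
merged = merge_findings(reports)
```

The `reported_by` field shows which findings only one tool caught, which is the practical argument for running more than one.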

Checkov:

  • Free, open-source, runs entirely locally (no central service dependency)
  • Scans Terraform code, generated plans, Helm charts, CloudFormation, and more
  • Inline exceptions with documented reasons:
resource "aws_s3_bucket" "public_data" {
  #checkov:skip=CKV_AWS_19:Public dataset intentionally unencrypted
  bucket = "open-research-data"
}

Trivy:

  • Maintained by Aqua Security, which also owns tfsec; tfsec development has since moved into Trivy
  • Covers more providers than Checkov
  • Exceptions via project-wide .trivyignore file or inline comments

Commercial tools:

| Tool | Differentiator |
| --- | --- |
| Snyk | Centralised vulnerability dashboard, SBOM generation |
| Checkmarx | Enterprise code security platform |
| Mend | Open-source risk management |

Even when using a commercial tool, run a free scanner like Checkov alongside it for an extra layer of protection.


Standard security scanners check for known vulnerabilities. Custom policies enforce organisational rules - technical, compliance, financial, or contractual:

Enforce policies using the Open Policy Agent (OPA) framework and its Rego language via an experimental TFLint plugin:

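# Deny expensive GPU instances (P and G families).
# Generic Rego sketch: the shape of `input` depends on how the calling tool supplies Terraform data.
deny[msg] {
  instance := input.resource.aws_instance[name]
  # regex.match replaces the deprecated re_match built-in
  regex.match("^(p|g)[0-9]", instance.instance_type)
  msg := sprintf("GPU instance type '%s' is not allowed for '%s'", [instance.instance_type, name])
}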

Checkov offers a simpler approach - define custom policies as YAML files:

metadata:
  id: "CUSTOM_001"
  name: "Deny GPU instance types"
  severity: "HIGH"
definition:
  # Matches the P family only; add a second condition to cover G instances.
  not:
    cond_type: "attribute"
    resource_types:
      - "aws_instance"
    attribute: "instance_type"
    operator: "starting_with"
    value: "p"

For advanced cases requiring API calls, Checkov also supports Python-based rules.
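
To show the kind of logic a code-based rule can express, here is the GPU restriction as a standalone Python function over the JSON that `terraform show -json` produces for a plan. This is a sketch of the rule itself, not Checkov's Python check API: `gpu_violations` is a hypothetical function, and a real Checkov rule would wrap similar logic in a Checkov check class.

```python
import re

# Same pattern as the Rego rule: P and G instance families.
GPU_FAMILY = re.compile(r"^(p|g)[0-9]")


def gpu_violations(plan: dict) -> list[str]:
    """Return one message per planned aws_instance that uses a GPU family."""
    messages = []
    for rc in plan.get("resource_changes", []):
        if rc.get("type") != "aws_instance":
            continue
        # "after" is null when the resource is being destroyed.
        after = (rc.get("change") or {}).get("after") or {}
        instance_type = after.get("instance_type", "")
        if isinstance(instance_type, str) and GPU_FAMILY.match(instance_type):
            messages.append(
                f"GPU instance type '{instance_type}' is not allowed for '{rc['address']}'"
            )
    return messages


plan = {
    "resource_changes": [
        {"address": "aws_instance.train", "type": "aws_instance",
         "change": {"after": {"instance_type": "p3.2xlarge"}}},
        {"address": "aws_instance.web", "type": "aws_instance",
         "change": {"after": {"instance_type": "t3.micro"}}},
        {"address": "aws_instance.old", "type": "aws_instance",
         "change": {"after": None}},
    ]
}
violations = gpu_violations(plan)
```

Checking the plan rather than the source code means values resolved from variables and modules are evaluated too, although attributes that are unknown until apply will not appear in `after`.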

Avoid copying policy files into every module. Load policies from a shared Git repository:

checkov -d . --external-checks-git https://github.com/org/iac-policies

This ensures all projects automatically use the latest organisational policies.


Automated pipelines naturally produce the artefacts that auditors need:

| Artefact | Source |
| --- | --- |
| Change history | Git commits, PR reviews, merge approvals |
| Test evidence | Pipeline test results, security scan reports |
| Deployment records | Pipeline logs, deployment timestamps, applied plan output |
| Compliance validation | Checkov/Trivy scan results at every stage |
| Live monitoring | Continuous drift detection alerts, reconciliation logs |

By making compliance a continuous, automated process, every routine deployment generates audit evidence - there’s no scramble to produce documentation before a certification review.
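
One way a pipeline could package that evidence is a manifest that ties each artefact to the commit and pipeline run that produced it, with a digest so later tampering is detectable. This is a sketch under assumptions: the manifest layout and the `build_evidence_manifest` function are invented for illustration, not a format any auditor or tool prescribes.

```python
import hashlib
import json


def build_evidence_manifest(commit: str, pipeline_run: str, artefacts: dict[str, bytes]) -> str:
    """Return a JSON manifest listing each artefact with its SHA-256 digest."""
    entries = [
        {
            "name": name,
            "sha256": hashlib.sha256(data).hexdigest(),
            "bytes": len(data),
        }
        for name, data in sorted(artefacts.items())
    ]
    return json.dumps(
        {"commit": commit, "pipeline_run": pipeline_run, "artefacts": entries},
        indent=2,
        sort_keys=True,
    )


# Hypothetical artefact names and contents.
manifest = build_evidence_manifest(
    commit="abc1234",
    pipeline_run="run-42",
    artefacts={
        "plan.txt": b"1 to add, 0 to change, 0 to destroy",
        "checkov-report.json": b'{"failed": 0}',
    },
)
```

Storing the manifest alongside the pipeline logs gives an auditor one file per deployment to start from.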