Art of Writing a Dockerfile

A Dockerfile is a plain text file containing ordered instructions that Docker executes top-to-bottom to build an image. Each instruction that modifies the filesystem creates a new layer — and layer order matters more than most people realize.

The mental model: write a Dockerfile the same way you’d set up a server manually, then optimize for caching and size.

Docker caches each layer. If a layer hasn’t changed since the last build, Docker reuses it — skipping that step entirely. The moment a layer changes, everything below it rebuilds.

The rule: put things that change least at the top, things that change most at the bottom.

# ❌ Bad order — COPY happens before dependency install
# Any file change invalidates the pip install layer
FROM python:3.12-slim
COPY . /app
WORKDIR /app
RUN pip install -r requirements.txt
# ✅ Good order — dependencies install from cache unless requirements.txt changes
FROM python:3.12-slim
WORKDIR /app
# Copy only the dep manifest first
COPY requirements.txt .
# This layer is cached until requirements.txt changes
RUN pip install -r requirements.txt
# Source code changes don't bust the dep cache
COPY . .

This single change can cut rebuild times from minutes to seconds on most projects.

These are the ones you’ll use in almost every Dockerfile:

| Instruction | When to use it |
| --- | --- |
| FROM | Always first. Sets the base image. |
| WORKDIR | Sets the working directory. Creates it if it doesn’t exist. Prefer over RUN mkdir && cd. |
| COPY | Copy local files into the image. Prefer over ADD for local files. |
| ADD | Like COPY but also handles remote URLs and auto-extracts tarballs. Use only when you need those features. |
| RUN | Execute commands at build time. Each RUN is a layer — chain related commands with &&. |
| ENV | Set environment variables that persist into the running container. |
| ARG | Build-time variables only. Not available at runtime. Safe for build config, not secrets. |
| EXPOSE | Documents which port the app listens on. Does not actually publish the port. |
| CMD | Default command when the container starts. Overridable at docker run. |
| ENTRYPOINT | Sets the container’s executable. CMD becomes its default arguments. |
| USER | Switch to a non-root user. Should be near the end, after installs. |
| HEALTHCHECK | Define a command Docker uses to check if the container is healthy. |
| LABEL | Attach metadata (maintainer, version, etc.) using OCI-standard keys. |
| VOLUME | Declare mount points. Signals intent to operators — doesn’t configure mounts. |
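To see how these fit together, here is a minimal sketch of a Dockerfile using the most common instructions (the image, file, and port names are illustrative assumptions, not a specific project):

```dockerfile
# Base image — always the first instruction
FROM python:3.12-slim
# Persist a runtime setting into the container environment
ENV PORT=8000
# Create /app if needed and make it the working directory
WORKDIR /app
# Copy local files into the image (preferred over ADD for local files)
COPY requirements.txt .
# Execute a build-time command; each RUN becomes one layer
RUN pip install -r requirements.txt
COPY . .
# Document the listening port (does not publish it)
EXPOSE 8000
# Default command, overridable at docker run
CMD ["python3", "app.py"]
```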

This trips up almost everyone at least once.

|  | CMD | ENTRYPOINT |
| --- | --- | --- |
| Role | Default arguments / command | The executable itself |
| Overridable? | Yes — any args after docker run image replace it | Only with --entrypoint flag |
| When combined | CMD provides default args to ENTRYPOINT | ENTRYPOINT receives CMD as args |
# Standalone CMD — the whole command is replaceable
CMD ["python3", "app.py"]
# docker run myimage python3 other.py → runs other.py instead
# ENTRYPOINT + CMD — idiomatic pattern for "this image IS a tool"
ENTRYPOINT ["nginx"]
CMD ["-g", "daemon off;"]
# docker run myimage -c /etc/nginx/custom.conf → passes -c as arg to nginx
# docker run --entrypoint sh myimage → replaces nginx entirely

Use ENTRYPOINT when your image wraps a specific executable (a CLI tool, a server). Use CMD alone when you want full flexibility. Always use the exec form (["executable", "arg"]) over shell form (executable arg) — exec form doesn’t spawn a shell process and handles signals correctly.
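The signal-handling difference between the two forms comes down to what ends up as PID 1 — a sketch:

```dockerfile
# Shell form — Docker wraps this in /bin/sh -c, so the shell is PID 1.
# SIGTERM goes to the shell, which may not forward it to python3,
# forcing Docker to wait out the stop timeout and SIGKILL the container.
CMD python3 app.py

# Exec form — python3 itself is PID 1 and receives SIGTERM directly,
# so graceful-shutdown handlers actually run.
CMD ["python3", "app.py"]
```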

# ARG: build-time only, not in the final image environment
ARG NODE_VERSION=20
FROM node:${NODE_VERSION}-alpine
# ENV: available at build AND runtime
ENV NODE_ENV=production
ENV PORT=3000
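A small sketch of the difference in scope (the BUILD_REF variable and base image here are hypothetical):

```dockerfile
FROM alpine:3.19
# ARG: exists only while `docker build` runs;
# override with: docker build --build-arg BUILD_REF=abc123 .
ARG BUILD_REF=unknown
# ENV: baked into the image, visible to the running container
ENV APP_ENV=production
# The ARG is usable in build-time commands...
RUN echo "building at ref ${BUILD_REF}"
# ...but `docker run image env` will list APP_ENV, not BUILD_REF
```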

Multi-stage builds are the most impactful optimization for compiled languages or apps with heavy build tooling. The idea: use a fat image to build, copy only the binary/artifact into a minimal final image.

# Stage 1: Build
FROM golang:1.22-alpine AS builder
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download # Cache dependencies separately
COPY . .
RUN CGO_ENABLED=0 go build -o /server ./cmd/server
# Stage 2: Run — minimal image, no Go toolchain
FROM scratch
COPY --from=builder /server /server
EXPOSE 8080
ENTRYPOINT ["/server"]

The final image here contains only the compiled binary — no Go compiler, no source code, no package manager. Result: a 5–15MB image instead of 300MB+.
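To see the effect yourself, build the multi-stage Dockerfile and compare sizes (the tag name here is an assumption):

```shell
# Build the multi-stage image; only the final stage is tagged
docker build -t server .
# The SIZE column reflects the scratch stage alone, not the Go builder
docker image ls server
```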

For interpreted languages, multi-stage still helps by separating build dependencies from runtime:

# Stage 1: Install deps (includes build tools for native modules)
FROM node:20-alpine AS deps
WORKDIR /app
COPY package*.json ./
# --only=production is deprecated in modern npm; --omit=dev is the replacement
RUN npm ci --omit=dev
# Stage 2: Runtime image
FROM node:20-alpine AS runtime
WORKDIR /app
COPY --from=deps /app/node_modules ./node_modules
COPY . .
# Drop privileges before the app starts
USER node
EXPOSE 3000
CMD ["node", "server.js"]

By default, Docker runs container processes as root (uid 0). If an attacker escapes the container, they’re root on the host — or close to it. Switching to a non-root user is the single most impactful security change you can make to a Dockerfile.

FROM python:3.12-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Create a dedicated user and group with no login shell
RUN groupadd -r appuser && useradd -r -g appuser appuser
# Set ownership at copy time
COPY --chown=appuser:appuser . .
# Switch before CMD/ENTRYPOINT
USER appuser
CMD ["python3", "app.py"]
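A quick way to confirm the switch took effect (the image tag is hypothetical):

```shell
docker build -t myapp .
# Runs `id` inside the container as the default user
docker run --rm myapp id
# Expect something like uid=999(appuser) gid=999(appuser), not uid=0(root)
```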

Don’t COPY credential files or use ARG for secrets — both leak into image history. Use RUN --mount=type=secret:

# syntax=docker/dockerfile:1
FROM python:3.12-slim
# Mount a secret at build time — it never lands in any image layer
RUN --mount=type=secret,id=pip_token \
    pip install --extra-index-url \
    "https://$(cat /run/secrets/pip_token)@pypi.company.com/simple/" \
    private-package
# Pass the secret at build time — not stored in image
docker build --secret id=pip_token,src=./pip_token.txt -t myapp .
  • Choose a smaller base image. alpine variants are typically 5–10x smaller than debian-based ones. distroless or scratch images are smaller still and have no shell — minimal attack surface.

  • Combine RUN commands that modify the same set of files, so intermediate files don’t persist in layers:

    # ❌ Each RUN is a separate layer — cache files persist between layers
    RUN apt-get update
    RUN apt-get install -y curl
    RUN rm -rf /var/lib/apt/lists/*
    # ✅ Single layer — cache cleaned in the same step
    RUN apt-get update && \
        apt-get install -y --no-install-recommends curl && \
        rm -rf /var/lib/apt/lists/*
  • Use .dockerignore to exclude node_modules, .git, build artifacts, and local env files from the build context. A large build context slows every build even if the files aren’t COPY’d.
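A typical .dockerignore for a Node project might look like this (entries are illustrative — adjust to your build artifacts):

```
node_modules
.git
dist
*.log
.env
.env.*
Dockerfile
.dockerignore
```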

# ❌ Unpinned — base image can change without warning
FROM node:latest
# ✅ Tag-pinned — predictable, but tags can be re-pushed
FROM node:20.12-alpine
# ✅✅ Digest-pinned — immutable, guaranteed the same image every time
FROM node:20.12-alpine@sha256:abc123...

Pin to at least a minor version tag in CI. For production base images, digest pinning removes any ambiguity.
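To find the digest for a tag you already trust, one option (assuming Buildx is installed) is:

```shell
# Print the manifest digest for a tag without pulling the image
docker buildx imagetools inspect node:20.12-alpine
# Or, for images already pulled locally:
docker images --digests node
```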

A production-grade Python API Dockerfile combining everything above:

# syntax=docker/dockerfile:1
FROM python:3.12-slim AS base
# Image metadata (OCI standard labels)
LABEL org.opencontainers.image.authors="[email protected]"
LABEL org.opencontainers.image.source="https://github.com/company/myapi"
# Build-time config — not in final environment
ARG APP_VERSION=dev
# Runtime environment
ENV PYTHONDONTWRITEBYTECODE=1 \
    PYTHONUNBUFFERED=1 \
    PORT=8000
WORKDIR /app
# --- Dependency layer (cached until requirements.txt changes) ---
COPY requirements.txt .
# Note: no --no-cache-dir here — the cache mount is the point; pip's cache
# lives in the mount, not in an image layer
RUN --mount=type=cache,target=/root/.cache/pip \
    pip install -r requirements.txt
# --- App layer ---
COPY . .
# Create non-root user and set ownership
RUN groupadd -r api && useradd -r -g api api && \
    chown -R api:api /app
USER api
EXPOSE 8000
HEALTHCHECK --interval=30s --timeout=5s --start-period=10s --retries=3 \
    CMD python3 -c "import urllib.request; urllib.request.urlopen('http://localhost:8000/health')"
CMD ["python3", "-m", "uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]
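To build and run it, something like the following (tag and version are assumptions):

```shell
# Build, passing the build-time version through the ARG
docker build -t myapi --build-arg APP_VERSION=1.2.3 .
# Run as the non-root api user, publishing the documented port
docker run --rm -p 8000:8000 myapi
```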