Art of Writing a Dockerfile
A Dockerfile is a plain text file containing ordered instructions that Docker executes top-to-bottom to build an image. Each instruction that modifies the filesystem creates a new layer — and layer order matters more than most people realize.
The mental model: write a Dockerfile the same way you’d set up a server manually, then optimize for caching and size.
Instruction Order Matters (Layer Cache)
Docker caches each layer. If a layer hasn’t changed since the last build, Docker reuses it and skips that step entirely. The moment a layer changes, every layer after it rebuilds.
The rule: put things that change least at the top, things that change most at the bottom.
```dockerfile
# ❌ Bad order: COPY happens before dependency install
# Any file change invalidates the pip install layer
FROM python:3.12-slim
COPY . /app
WORKDIR /app
RUN pip install -r requirements.txt
```

```dockerfile
# ✅ Good order: dependencies install from cache unless requirements.txt changes
FROM python:3.12-slim
WORKDIR /app
# Copy only the dependency manifest first
COPY requirements.txt .
# This layer is cached until requirements.txt changes
RUN pip install -r requirements.txt
# Source code changes don't bust the dependency cache
COPY . .
```

Dockerfile comments must sit on their own line; a `#` after `COPY` or `USER` arguments is parsed as part of the instruction. On projects with a heavy dependency set, this single reordering can cut rebuild times from minutes to seconds.
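The same "least-changing first" rule extends to more than two tiers. The sketch below is illustrative only: it assumes a build context containing `requirements.txt` and `app.py`, and `curl` is just a stand-in for whatever OS packages a project needs.

```dockerfile
FROM python:3.12-slim

# Tier 1: OS packages. These change rarely, so they sit at the top.
# (curl is a placeholder for your real system dependencies.)
RUN apt-get update && \
    apt-get install -y --no-install-recommends curl && \
    rm -rf /var/lib/apt/lists/*

WORKDIR /app

# Tier 2: language dependencies. Rebuilt only when requirements.txt changes.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Tier 3: application source. Changes on almost every commit, so it goes last.
COPY . .

CMD ["python3", "app.py"]
```

With this layout, editing `app.py` reruns only the final `COPY`, while editing `requirements.txt` reruns the `pip install` step but still reuses the OS package layer.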
Core Instructions
These are the ones you’ll use in almost every Dockerfile:
| Instruction | When to use it |
|---|---|
| `FROM` | Always first. Sets the base image. |
| `WORKDIR` | Sets the working directory. Creates it if it doesn’t exist. Prefer over `RUN mkdir && cd`. |
| `COPY` | Copies local files into the image. Prefer over `ADD` for local files. |
| `ADD` | Like `COPY`, but also handles remote URLs and auto-extracts local tarballs. Use only when you need those features. |
| `RUN` | Executes commands at build time. Each `RUN` is a layer, so chain related commands with `&&`. |
| `ENV` | Sets environment variables that persist into the running container. |
| `ARG` | Build-time variables only. Not available at runtime. Safe for build config, not secrets. |
| `EXPOSE` | Documents which port the app listens on. Does not actually publish the port. |
| `CMD` | Default command when the container starts. Overridable at `docker run`. |
| `ENTRYPOINT` | Sets the container’s executable. `CMD` becomes its default arguments. |
| `USER` | Switches to a non-root user. Should be near the end, after installs. |
| `HEALTHCHECK` | Defines a command Docker uses to check if the container is healthy. |
| `LABEL` | Attaches metadata (maintainer, version, etc.) using OCI-standard keys. |
| `VOLUME` | Declares a mount point. Docker creates an anonymous volume there at run time unless the operator mounts something else; it does not configure host paths. |
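Two rows of the table are easier to see than to describe: why `WORKDIR` beats `RUN mkdir && cd`, and what `EXPOSE` does and does not do. The following is a minimal sketch; `demo-image` is a placeholder tag, and the built-in `http.server` module stands in for a real application.

```dockerfile
FROM python:3.12-slim

# `cd` inside a RUN only affects that one RUN. The next instruction
# starts from the image's working directory again.
RUN mkdir -p /opt/demo && cd /opt/demo && touch created-here
RUN pwd

# WORKDIR persists for every later RUN, COPY, CMD, and ENTRYPOINT,
# and creates the directory if it is missing.
WORKDIR /app
RUN pwd

# EXPOSE is documentation only. Publishing happens at run time, e.g.
#   docker run -p 8080:8000 demo-image
EXPOSE 8000

CMD ["python3", "-m", "http.server", "8000"]
```

Building with `docker build --progress=plain .` shows the output of both `pwd` steps, which makes the difference between the two approaches visible.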
CMD vs ENTRYPOINT
This trips up almost everyone at least once.
| | `CMD` | `ENTRYPOINT` |
|---|---|---|
| Role | Default arguments / command | The executable itself |
| Overridable? | Yes: any args after `docker run image` replace it | Only with the `--entrypoint` flag |
| When combined | `CMD` provides default args to `ENTRYPOINT` | `ENTRYPOINT` receives `CMD` as args |
```dockerfile
# Standalone CMD: the whole command is replaceable
CMD ["python3", "app.py"]
# docker run myimage python3 other.py → runs other.py instead
```

```dockerfile
# ENTRYPOINT + CMD: idiomatic pattern for "this image IS a tool"
ENTRYPOINT ["nginx"]
CMD ["-g", "daemon off;"]
# docker run myimage -c /etc/nginx/custom.conf → replaces the CMD args, so nginx
#   gets -c ... and no longer gets -g "daemon off;"
# docker run --entrypoint sh myimage → replaces nginx entirely
```

Use `ENTRYPOINT` when your image wraps a specific executable (a CLI tool, a server). Use `CMD` alone when you want full flexibility. Prefer the exec form (`["executable", "arg"]`) over the shell form (`executable arg`): exec form runs the program directly instead of under `/bin/sh -c`, so the program itself receives signals such as the SIGTERM sent by `docker stop`.
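To make the exec-versus-shell distinction concrete, here is a small sketch that puts both forms side by side. It uses Python's built-in `http.server` module as a stand-in for a real application, so it builds without any files in the context.

```dockerfile
FROM python:3.12-slim

# Shell form (left commented out): Docker runs it as
#   /bin/sh -c "python3 -m http.server 8000"
# so a shell sits between Docker and the server and may not pass SIGTERM on.
# CMD python3 -m http.server 8000

# Exec form: Docker starts python3 directly, with no shell in between,
# so signals sent by `docker stop` reach the Python process itself.
CMD ["python3", "-m", "http.server", "8000"]
```

Two caveats apply. Exec form performs no shell processing, so something like `$PORT` inside the JSON array is passed literally rather than expanded. And whether the process then exits promptly still depends on the application handling SIGTERM.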
ARG vs ENV
```dockerfile
# ARG: build-time only, not in the final image environment
# An ARG declared before FROM is only usable in FROM lines;
# redeclare it after FROM to use it inside the stage.
ARG NODE_VERSION=20
FROM node:${NODE_VERSION}-alpine

# ENV: available at build AND runtime
ENV NODE_ENV=production
ENV PORT=3000
```

Multi-Stage Builds
Multi-stage builds are the most impactful optimization for compiled languages or apps with heavy build tooling. The idea: build in a full-featured image, then copy only the binary or artifact into a minimal final image.
```dockerfile
# Stage 1: Build
FROM golang:1.22-alpine AS builder
WORKDIR /app
COPY go.mod go.sum ./
# Cache dependencies separately
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 go build -o /server ./cmd/server

# Stage 2: Run on a minimal image with no Go toolchain
FROM scratch
COPY --from=builder /server /server
EXPOSE 8080
ENTRYPOINT ["/server"]
```

The final image here contains only the compiled binary: no Go compiler, no source code, no package manager. Result: typically a 5–15MB image instead of 300MB+.
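Because `scratch` is completely empty, it also lacks a CA certificate bundle and an `/etc/passwd` file. A service that makes outbound HTTPS calls or should run unprivileged needs both handled explicitly. The variant below is a sketch under those assumptions; the UID `65534` is an arbitrary unprivileged number chosen for illustration, not something the image requires.

```dockerfile
# Stage 1: build, as in the example above
FROM golang:1.22-alpine AS builder
# Make sure the CA bundle exists in the builder so it can be copied out
RUN apk add --no-cache ca-certificates
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 go build -o /server ./cmd/server

# Stage 2: scratch plus the two things a network service commonly needs
FROM scratch
# CA bundle so outbound TLS connections can verify certificates
COPY --from=builder /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/
COPY --from=builder /server /server
# scratch has no /etc/passwd, so the user must be given as numeric UID:GID
USER 65534:65534
EXPOSE 8080
ENTRYPOINT ["/server"]
```

If the binary never opens TLS connections, the certificate line can be dropped and the image stays a single-file image.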
For interpreted languages, multi-stage still helps by separating build dependencies from runtime:
```dockerfile
# Stage 1: Install production deps (add build tools here if native modules need them)
FROM node:20-alpine AS deps
WORKDIR /app
COPY package*.json ./
# --omit=dev replaces the deprecated --only=production flag
RUN npm ci --omit=dev

# Stage 2: Runtime image
FROM node:20-alpine AS runtime
WORKDIR /app
COPY --from=deps /app/node_modules ./node_modules
COPY . .
# Drop privileges before the app starts
USER node
EXPOSE 3000
CMD ["node", "server.js"]
```

Running as a Non-Root User
By default, Docker runs container processes as root (uid 0). If an attacker escapes the container, they’re root on the host, or close to it. Running as a non-root user is one of the most impactful security changes you can make to a Dockerfile.
```dockerfile
FROM python:3.12-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Create a dedicated system user and group with no login shell
RUN groupadd -r appuser && useradd -r -g appuser -s /usr/sbin/nologin appuser

# Set ownership at copy time
COPY --chown=appuser:appuser . .

# Switch before CMD/ENTRYPOINT
USER appuser
CMD ["python3", "app.py"]
```

Handling Build-Time Secrets
Don’t `COPY` credential files or use `ARG` for secrets: copied files stay in the image layers, and build arguments can show up in `docker history`. Use `RUN --mount=type=secret` instead:
```dockerfile
# syntax=docker/dockerfile:1
FROM python:3.12-slim

# Mount a secret at build time; it never lands in any image layer
RUN --mount=type=secret,id=pip_token \
    pip install --extra-index-url \
    "https://$(cat /run/secrets/pip_token)@pypi.company.com/simple/" \
    private-package
```

```bash
# Pass the secret at build time; it is not stored in the image
docker build --secret id=pip_token,src=./pip_token.txt -t myapp .
```

Reducing Image Size
- Choose a smaller base image. `alpine` variants are typically 5–10x smaller than `debian`-based ones. `distroless` or `scratch` images are smaller still and have no shell, which keeps the attack surface minimal.
- Combine `RUN` commands that modify the same set of files, so intermediate files don’t persist in layers:

  ```dockerfile
  # ❌ Each RUN is a separate layer, so cache files persist between layers
  RUN apt-get update
  RUN apt-get install -y curl
  RUN rm -rf /var/lib/apt/lists/*

  # ✅ Single layer, with the cache cleaned in the same step
  RUN apt-get update && \
      apt-get install -y --no-install-recommends curl && \
      rm -rf /var/lib/apt/lists/*
  ```
- Use `.dockerignore` to exclude `node_modules`, `.git`, build artifacts, and local env files from the build context. A large build context slows every build even if the files aren’t `COPY`’d.
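The `.dockerignore` point is easiest to see next to a broad `COPY . .`. The sketch below shows a hypothetical ignore file as comments, followed by a Dockerfile that relies on it. The listed patterns are illustrative and should be adapted to the project's actual layout.

```dockerfile
# Assumes a hypothetical .dockerignore next to this Dockerfile containing:
#
#   .git
#   node_modules
#   dist
#   .env
#   *.log
#
# With that file in place, the build context sent to Docker excludes those
# paths, and the broad COPY below cannot pull them into the image.
FROM node:20-alpine
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev
COPY . .
USER node
CMD ["node", "server.js"]
```

Excluding `node_modules` matters twice here: it shrinks the context, and it keeps a locally installed dependency tree from overwriting the one produced by `npm ci` inside the image.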
Pinning Base Images
```dockerfile
# ❌ Unpinned: base image can change without warning
FROM node:latest

# ✅ Tag-pinned: predictable, but tags can be re-pushed
FROM node:20.12-alpine

# ✅✅ Digest-pinned: immutable, guaranteed the same image every time
FROM node:20.12-alpine@sha256:abc123...
```

Pin to at least a minor version tag in CI. For production base images, digest pinning removes any ambiguity.
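One possible way to combine the two pinning styles is to keep a tag-pinned default in the Dockerfile and let CI supply a digest through a build argument. This is a sketch rather than a recommendation: `BASE_IMAGE` is a hypothetical argument name, and `<digest>` is a placeholder you would look up yourself.

```dockerfile
# BASE_IMAGE is a hypothetical build argument. The default is tag-pinned,
# so a plain `docker build .` still works.
ARG BASE_IMAGE=node:20.12-alpine
FROM ${BASE_IMAGE}

WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev
COPY . .
USER node
CMD ["node", "server.js"]

# Look up the digest the tag currently points at:
#   docker buildx imagetools inspect node:20.12-alpine
# Then pass it in from CI (replace <digest> with the real value):
#   docker build --build-arg BASE_IMAGE="node:20.12-alpine@sha256:<digest>" -t myapp .
```

The trade-off is that the exact pin then lives in the CI configuration rather than in the Dockerfile, so hard-coding the digest in `FROM` remains the more self-documenting option.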
Complete Example
A production-grade Python API Dockerfile combining most of the practices above:

```dockerfile
# syntax=docker/dockerfile:1
FROM python:3.12-slim AS base

# Image metadata (OCI standard labels)
LABEL org.opencontainers.image.source="https://github.com/company/myapi"

# Runtime environment
ENV PYTHONDONTWRITEBYTECODE=1 \
    PYTHONUNBUFFERED=1 \
    PORT=8000

WORKDIR /app

# Create the non-root user up front so files can be owned by it at copy time
RUN groupadd -r api && useradd -r -g api api

# --- Dependency layer (cached until requirements.txt changes) ---
# The cache mount keeps pip's download cache outside the image, so
# --no-cache-dir is omitted here (it would make the mount pointless).
COPY requirements.txt .
RUN --mount=type=cache,target=/root/.cache/pip \
    pip install -r requirements.txt

# --- App layer ---
COPY --chown=api:api . .

# Build-time config, not in the final environment. Declared late so a new
# version value cannot invalidate the dependency layer above.
ARG APP_VERSION=dev
LABEL org.opencontainers.image.version="${APP_VERSION}"

USER api

EXPOSE 8000

HEALTHCHECK --interval=30s --timeout=5s --start-period=10s --retries=3 \
    CMD python3 -c "import urllib.request; urllib.request.urlopen('http://localhost:8000/health')"

CMD ["python3", "-m", "uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]
```