A well-written Dockerfile is the foundation of efficient, secure, and reproducible container images. This comprehensive guide covers Dockerfile best practices — from basic instruction syntax and layer caching to multi-stage builds, security hardening, and production-ready examples for Node.js, Python, Go, and Java.
Dockerfile Basics
A Dockerfile is a text file containing instructions that Docker reads to build an image. Each instruction creates a new layer in the image. Here are the most important instructions:
FROM — Base Image
Every Dockerfile starts with FROM. It sets the base image for subsequent instructions.
# Use an official base image
FROM node:20-alpine
# Use a specific digest for reproducibility
FROM node:20-alpine@sha256:abcdef...
# Use a variant tag
FROM python:3.12-slim
RUN — Execute Commands
RUN executes commands during the image build process. Each RUN creates a new layer.
# Shell form (runs in /bin/sh -c)
RUN apt-get update && apt-get install -y curl
# Exec form (no shell processing)
RUN ["apt-get", "install", "-y", "curl"]
COPY — Copy Files
COPY transfers files and directories from the build context into the image.
# Copy a single file
COPY package.json /app/
# Copy a directory
COPY src/ /app/src/
# Copy with ownership (avoids extra chown layer)
COPY --chown=node:node . /app/
CMD — Default Command
CMD sets the default command that runs when a container starts. Only the last CMD takes effect.
# Exec form (preferred)
CMD ["node", "server.js"]
# Shell form
CMD node server.js
ENTRYPOINT — Fixed Command
ENTRYPOINT sets a command that always runs. CMD arguments are appended to ENTRYPOINT.
# ENTRYPOINT + CMD pattern
ENTRYPOINT ["python", "manage.py"]
CMD ["runserver", "0.0.0.0:8000"]
# docker run myapp migrate → python manage.py migrate
# docker run myapp → python manage.py runserver 0.0.0.0:8000
WORKDIR — Working Directory
WORKDIR sets the working directory for RUN, CMD, ENTRYPOINT, COPY, and ADD instructions.
WORKDIR /app
# All subsequent commands run in /app
COPY package.json .
RUN npm install
COPY . .
EXPOSE — Document Ports
EXPOSE documents which ports the container listens on. It does not actually publish the port.
EXPOSE 3000
EXPOSE 8080/tcp
EXPOSE 8125/udp
Layer Caching — Order Matters
Docker caches each layer. If a layer has not changed, Docker reuses the cached version. This means the order of instructions dramatically affects build speed. Put instructions that change less frequently at the top.
Golden rule: Copy dependency files first, install dependencies, then copy source code. This way, dependency installation is cached unless package files change.
Bad — Cache busted on every code change
FROM node:20-alpine
WORKDIR /app
# BAD: Copying everything first means ANY file change
# invalidates the npm install cache
COPY . .
RUN npm install
CMD ["node", "server.js"]
Good — Dependencies cached separately
FROM node:20-alpine
WORKDIR /app
# GOOD: Copy only dependency files first
COPY package.json package-lock.json ./
RUN npm ci --omit=dev
# Then copy source code (this layer changes often)
COPY . .
CMD ["node", "server.js"]
With this structure, changing your source code does not trigger a reinstall of all dependencies. Only the final COPY layer and beyond are rebuilt.
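Beyond instruction ordering, BuildKit cache mounts can persist a package manager's download cache across builds, even when the lock file changes and the install layer must rerun. A sketch (requires BuildKit; the syntax directive pins the Dockerfile frontend):

```dockerfile
# syntax=docker/dockerfile:1
FROM node:20-alpine
WORKDIR /app
COPY package.json package-lock.json ./
# Mount a persistent cache for npm's download directory;
# it survives rebuilds but is not baked into the image
RUN --mount=type=cache,target=/root/.npm \
    npm ci --omit=dev
COPY . .
CMD ["node", "server.js"]
```

The cache mount exists only during that RUN step, so it speeds up reinstalls without adding anything to the final image.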
Multi-Stage Builds
Multi-stage builds let you use multiple FROM statements in a single Dockerfile. Each FROM starts a new build stage. You can copy artifacts from one stage to another, leaving behind everything you do not need in the final image.
Benefits: smaller final images, no build tools in production, better security, cleaner separation of concerns.
Node.js Multi-Stage Example
Build stage compiles TypeScript, production stage runs only the compiled JavaScript:
# ===== Stage 1: Build =====
FROM node:20-alpine AS builder
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci
COPY tsconfig.json ./
COPY src/ ./src/
RUN npm run build
# ===== Stage 2: Production =====
FROM node:20-alpine AS production
WORKDIR /app
# Install only production dependencies
COPY package.json package-lock.json ./
RUN npm ci --omit=dev && npm cache clean --force
# Copy compiled output from builder
COPY --from=builder /app/dist ./dist
# Security: run as non-root
RUN addgroup -S appgroup && adduser -S appuser -G appgroup
USER appuser
EXPOSE 3000
CMD ["node", "dist/server.js"]
Size Comparison
| Approach | Image Size |
|---|---|
| Single stage (node:20) | ~1.1 GB |
| Single stage (node:20-alpine) | ~400 MB |
| Multi-stage (build + alpine) | ~150 MB |
| Multi-stage (distroless) | ~120 MB |
.dockerignore
The .dockerignore file excludes files from the build context, reducing build time and preventing sensitive files from being included in images.
Without .dockerignore, Docker sends every file in your project to the daemon — including node_modules, .git history, .env secrets, and build artifacts.
# .dockerignore
# Dependencies (will be installed in container)
node_modules
npm-debug.log*
# Version control
.git
.gitignore
# Environment / secrets
.env
.env.*
*.pem
# IDE and OS files
.vscode
.idea
.DS_Store
Thumbs.db
# Build output
dist
build
coverage
# Docker files (not needed inside container)
Dockerfile*
docker-compose*
.dockerignore
# Documentation
README.md
CHANGELOG.md
docs/
Base Image Selection
Choosing the right base image affects image size, security, and compatibility. Here is a comparison of common base image variants:
| Variant | Size | Packages | Use Case |
|---|---|---|---|
| node:20 | ~1.1 GB | Full Debian, all common packages | Development, debugging |
| node:20-slim | ~200 MB | Minimal Debian, essential packages | Production (good balance) |
| node:20-alpine | ~130 MB | musl libc, BusyBox — smallest | Production (smallest size) |
| gcr.io/distroless/nodejs20 | ~120 MB | No shell, no package manager | Production (maximum security) |
Recommendation: Start with alpine or slim for most applications. Use distroless for maximum security when you do not need shell access for debugging.
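As a sketch of the distroless option: the distroless Node.js images already invoke node as their entrypoint, so CMD takes only the script path. The `builder` stage name and paths below are assumptions, not part of the original:

```dockerfile
FROM gcr.io/distroless/nodejs20-debian12
WORKDIR /app
# Distroless has no shell or package manager, so everything
# must be copied in from a build stage
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
# The image's entrypoint is already "node"
CMD ["dist/server.js"]
```

The trade-off: no shell means no `docker exec` debugging, so many teams use a `:debug` tag variant of distroless during development.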
Security Best Practices
Container security starts with the Dockerfile. Follow these practices to minimize the attack surface.
Run as Non-Root User
By default, containers run as root. Always create and switch to a non-root user:
FROM node:20-alpine
WORKDIR /app
COPY --chown=node:node . .
RUN npm ci --omit=dev
# Switch to the built-in non-root user
USER node
CMD ["node", "server.js"]
# For Debian-based images, create a user explicitly
FROM python:3.12-slim
WORKDIR /app
RUN groupadd -r appgroup && useradd -r -g appgroup -d /app appuser
COPY --chown=appuser:appgroup . .
RUN pip install --no-cache-dir -r requirements.txt
USER appuser
CMD ["python", "app.py"]
Never Put Secrets in Build Args
Build args are visible in the image history. Use runtime environment variables or Docker secrets instead.
# BAD — secret visible in image history
ARG DB_PASSWORD
RUN echo "db_pass=$DB_PASSWORD" > /app/config
# GOOD — use runtime environment variables
ENV DB_PASSWORD=""
# Set at runtime: docker run -e DB_PASSWORD=secret myapp
# BEST — use Docker BuildKit secrets (not stored in layers)
RUN --mount=type=secret,id=db_pass \
    cat /run/secrets/db_pass > /app/config
# Build: docker build --secret id=db_pass,src=./password.txt .
Scan Images for Vulnerabilities
Use tools like Trivy, Snyk, or Docker Scout to scan your images:
# Trivy — popular open-source scanner
trivy image myapp:latest
# Docker Scout (built into Docker Desktop)
docker scout cves myapp:latest
# Snyk
snyk container test myapp:latest
# Scan during CI/CD pipeline
# GitHub Actions example:
# - name: Scan image
#   uses: aquasecurity/trivy-action@master
#   with:
#     image-ref: myapp:latest
#     severity: CRITICAL,HIGH
RUN Optimization
Each RUN instruction creates a new layer. Combine commands and clean up in the same layer to reduce image size.
Combine RUN Commands
Combine related commands with && and clean up package manager caches:
Bad — 3 layers, cache remains
# Each RUN = new layer; apt cache stays in first layer
RUN apt-get update
RUN apt-get install -y curl wget git
RUN rm -rf /var/lib/apt/lists/*
Good — 1 layer, cache cleaned
# Single layer, cache cleaned in same layer
RUN apt-get update \
 && apt-get install -y --no-install-recommends \
    curl \
    wget \
    git \
 && rm -rf /var/lib/apt/lists/* \
 && apt-get purge -y --auto-remove
COPY vs ADD
Both COPY and ADD transfer files into the image, but they behave differently:
| Instruction | Behavior | When to Use |
|---|---|---|
| COPY | Simple file/directory copy | Default choice — use for most cases |
| ADD | Copy + URL download + tar extraction | Only when you need tar auto-extraction |
# COPY β simple and predictable
COPY ./config /app/config
# ADD β auto-extracts tar archives
ADD app.tar.gz /app/
# For downloading files, prefer RUN + curl
RUN curl -fsSL https://example.com/file.tar.gz | tar xz -C /app/
Best practice: Always use COPY unless you specifically need ADD's tar extraction feature. For downloading files, use RUN with curl or wget instead — this gives you better control over caching and error handling.
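If you do let ADD fetch a URL, newer Dockerfile syntax (1.6+, with BuildKit) can at least pin the download to a digest so a changed remote file fails the build. The digest below is a placeholder, not a real value:

```dockerfile
# syntax=docker/dockerfile:1
# Verify the remote file against a pinned digest (placeholder shown)
ADD --checksum=sha256:<expected-digest> \
    https://example.com/file.tar.gz /tmp/
```

Note that ADD does not auto-extract archives fetched from URLs — extraction only applies to local tar files.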
HEALTHCHECK
HEALTHCHECK tells Docker how to test whether the container is still working. Docker uses this to determine if the container needs to be restarted.
HTTP Health Check
# HTTP health check using curl
HEALTHCHECK --interval=30s --timeout=5s --start-period=10s --retries=3 \
  CMD curl -f http://localhost:3000/health || exit 1
# HTTP health check using wget (for Alpine without curl)
HEALTHCHECK --interval=30s --timeout=5s --retries=3 \
  CMD wget --no-verbose --tries=1 --spider http://localhost:3000/health || exit 1
# TCP port check (no HTTP endpoint needed)
HEALTHCHECK --interval=15s --timeout=3s --retries=5 \
  CMD nc -z localhost 3000 || exit 1
# Database health check
HEALTHCHECK --interval=10s --timeout=5s --retries=5 \
  CMD pg_isready -U postgres || exit 1
Options Explained
--interval — Time between checks (default: 30s)
--timeout — Maximum time to wait for a check (default: 30s)
--retries — Consecutive failures before marking unhealthy (default: 3)
--start-period — Grace period for container startup (default: 0s)
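The CMD inside a HEALTHCHECK also accepts exec form, which is the only option for base images without a shell (distroless, scratch). A sketch assuming the image ships its own `/app/healthcheck` binary — that path is hypothetical:

```dockerfile
# Exec form: no /bin/sh needed inside the image
HEALTHCHECK --interval=30s --timeout=5s --retries=3 \
  CMD ["/app/healthcheck"]
```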
ARG vs ENV
ARG and ENV both define variables, but they are available at different times and have different lifetimes:
| Feature | ARG | ENV |
|---|---|---|
| Available | Build time only | Build time + runtime |
| In final image | No | Yes |
| Override | --build-arg flag | -e flag or .env file |
| Default value | ARG NAME=default | ENV NAME=default |
# ARG β only available during build
ARG NODE_VERSION=20
FROM node:${NODE_VERSION}-alpine
# ARG after FROM must be re-declared
ARG APP_VERSION=1.0.0
RUN echo "Building version $APP_VERSION"
# ENV β available during build AND in running container
ENV NODE_ENV=production
ENV PORT=3000
# Common pattern: ARG → ENV (build-time default, runtime override)
ARG DEFAULT_PORT=3000
ENV PORT=${DEFAULT_PORT}
# Override at build: docker build --build-arg DEFAULT_PORT=8080 .
# Override at run: docker run -e PORT=8080 myapp
Real-World Multi-Stage Dockerfiles
Node.js (Express/NestJS)
# ===== Build Stage =====
FROM node:20-alpine AS builder
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci
COPY tsconfig.json ./
COPY src/ ./src/
COPY public/ ./public/
RUN npm run build
# ===== Production Stage =====
FROM node:20-alpine
WORKDIR /app
ENV NODE_ENV=production
COPY package.json package-lock.json ./
RUN npm ci --omit=dev \
&& npm cache clean --force
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/public ./public
RUN addgroup -S app && adduser -S app -G app
USER app
EXPOSE 3000
HEALTHCHECK --interval=30s --timeout=5s --retries=3 \
CMD wget --no-verbose --tries=1 --spider http://localhost:3000/health || exit 1
CMD ["node", "dist/server.js"]
Python (FastAPI/Django)
# ===== Build Stage =====
FROM python:3.12-slim AS builder
WORKDIR /app
# Install build dependencies
RUN apt-get update \
&& apt-get install -y --no-install-recommends gcc libpq-dev \
&& rm -rf /var/lib/apt/lists/*
COPY requirements.txt .
RUN pip install --no-cache-dir --prefix=/install -r requirements.txt
# ===== Production Stage =====
FROM python:3.12-slim
WORKDIR /app
# Install runtime dependencies only
RUN apt-get update \
&& apt-get install -y --no-install-recommends libpq5 \
&& rm -rf /var/lib/apt/lists/*
# Copy installed packages from builder
COPY --from=builder /install /usr/local
COPY . .
RUN groupadd -r app && useradd -r -g app -d /app app \
&& chown -R app:app /app
USER app
EXPOSE 8000
HEALTHCHECK --interval=30s --timeout=5s --retries=3 \
CMD python -c "import urllib.request; urllib.request.urlopen('http://localhost:8000/health')" || exit 1
CMD ["gunicorn", "app.main:app", "-w", "4", "-k", "uvicorn.workers.UvicornWorker", "-b", "0.0.0.0:8000"]
Go (Gin/Fiber)
# ===== Build Stage =====
FROM golang:1.22-alpine AS builder
WORKDIR /app
# Cache dependencies
COPY go.mod go.sum ./
RUN go mod download
COPY . .
# Build static binary with CGO disabled
RUN CGO_ENABLED=0 GOOS=linux go build \
-ldflags="-w -s" \
-o /app/server ./cmd/server
# ===== Production Stage =====
FROM gcr.io/distroless/static-debian12
WORKDIR /app
COPY --from=builder /app/server .
COPY --from=builder /app/config ./config
EXPOSE 8080
USER nonroot:nonroot
ENTRYPOINT ["/app/server"]
Go compiles to a single static binary, making it perfect for distroless or scratch base images. The final image is typically under 20 MB.
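A variant on the final stage, if you prefer scratch over distroless: scratch ships nothing at all, so CA certificates must be copied in for outbound TLS. The sketch assumes the builder image has the ca-certificates package installed (the golang Alpine images do):

```dockerfile
FROM scratch
# CA bundle for outbound HTTPS calls
COPY --from=builder /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/
COPY --from=builder /app/server /server
# Numeric UID, since scratch has no /etc/passwd
USER 65534:65534
ENTRYPOINT ["/server"]
```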
Java (Spring Boot)
# ===== Build Stage =====
FROM eclipse-temurin:21-jdk-alpine AS builder
WORKDIR /app
# Cache Gradle/Maven dependencies
COPY build.gradle settings.gradle gradlew ./
COPY gradle/ ./gradle/
RUN ./gradlew dependencies --no-daemon
COPY src/ ./src/
RUN ./gradlew bootJar --no-daemon
# Extract Spring Boot layers so dependencies and app code cache separately
RUN java -Djarmode=layertools -jar build/libs/*.jar extract --destination extracted
# ===== Production Stage =====
FROM eclipse-temurin:21-jre-alpine
WORKDIR /app
# Copy layers in order of change frequency (dependencies first)
COPY --from=builder /app/extracted/dependencies/ ./
COPY --from=builder /app/extracted/spring-boot-loader/ ./
COPY --from=builder /app/extracted/snapshot-dependencies/ ./
COPY --from=builder /app/extracted/application/ ./
RUN addgroup -S app && adduser -S app -G app
USER app
EXPOSE 8080
HEALTHCHECK --interval=30s --timeout=5s --retries=3 \
  CMD wget --no-verbose --tries=1 --spider http://localhost:8080/actuator/health || exit 1
# Spring Boot 3.2+; older versions use org.springframework.boot.loader.JarLauncher
ENTRYPOINT ["java", "org.springframework.boot.loader.launch.JarLauncher"]
For Java, the JRE base image is required at runtime. Using jlink to create a custom JRE can reduce the image size further.
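A sketch of the jlink route mentioned above — the module list is app-specific (jdeps can compute it for your jar) and the modules shown here are illustrative only:

```dockerfile
# Build a trimmed runtime containing only the modules the app needs
FROM eclipse-temurin:21-jdk-alpine AS jre-builder
RUN jlink \
    --add-modules java.base,java.logging,java.sql,java.naming,java.management \
    --strip-debug --no-man-pages --no-header-files \
    --output /custom-jre
# Run on plain Alpine with the custom JRE instead of a full JRE image
FROM alpine:3.19
COPY --from=jre-builder /custom-jre /opt/jre
ENV PATH="/opt/jre/bin:${PATH}"
```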
Frequently Asked Questions
What is a multi-stage Docker build?
A multi-stage build uses multiple FROM statements in a single Dockerfile. Each FROM starts a new stage. You can selectively copy artifacts from one stage to another, leaving behind build tools, source code, and dependencies that are not needed at runtime. This results in much smaller, more secure production images.
How do I reduce Docker image size?
Use multi-stage builds to separate build and runtime environments. Choose smaller base images (alpine, slim, or distroless). Combine RUN commands and clean up caches in the same layer. Use .dockerignore to exclude unnecessary files. Remove development dependencies from the final image.
Should I use Alpine or Debian slim as a base image?
Alpine is smaller (about 5MB vs 80MB for slim) and has a smaller attack surface. However, Alpine uses musl libc instead of glibc, which can cause compatibility issues with some native Node.js modules or Python packages. If you encounter issues, switch to slim. For Go, Alpine works perfectly since Go compiles to static binaries.
Why should I not run containers as root?
Running as root inside a container means that if an attacker exploits a vulnerability, they have root privileges within the container. Combined with a container escape vulnerability, this could give them root access to the host. Running as a non-root user limits the damage of any exploit.
What is the difference between CMD and ENTRYPOINT?
CMD sets a default command that can be overridden when running the container (docker run myimage /bin/sh replaces CMD). ENTRYPOINT sets a fixed command that always runs — arguments passed to docker run are appended to ENTRYPOINT. Use ENTRYPOINT when the container should always run a specific executable, and CMD for default arguments that users might override.