DevOps · January 23, 2026 · 13 min read

Docker Image Hardening for Production — Distroless, Non-Root Users, and Layer Optimization

Stripe Systems Engineering

A default Docker image built from node:18 or python:3.11 ships with hundreds of packages you do not need in production — compilers, package managers, shells, debug utilities. Each unnecessary package is a potential CVE. This post covers the specific techniques for reducing attack surface, shrinking image size, and enforcing runtime security constraints.

Why Image Hardening Matters

Three concerns drive image hardening:

Attack surface: A container image with 400 installed packages has 400 packages' worth of potential vulnerabilities. The node:18 image (Debian Bookworm-based) ships with apt, curl, wget, gcc, make, perl, and hundreds of libraries. An attacker who gains code execution inside the container has a full toolkit available.

CVE exposure: Vulnerability scanners match every package in your image against CVE databases. More packages mean more CVE matches. Most of these CVEs are in packages your application never uses, but they still appear in compliance reports and trigger alerts.

Compliance: SOC 2, PCI DSS, and HIPAA audits typically expect evidence that production systems minimize unnecessary software. An auditor looking at a 1.2GB image containing a C compiler will ask why.

Base Image Selection

Comparison

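| Base Image                 | Size   | Package Manager    | Shell | Packages | Use Case                   |
|----------------------------|--------|--------------------|-------|----------|----------------------------|
| node:20                    | ~950MB | apt                | bash  | ~400     | Development only           |
| node:20-slim               | ~200MB | apt                | bash  | ~100     | When you need apt          |
| node:20-alpine             | ~130MB | apk                | sh    | ~30      | General production         |
| gcr.io/distroless/nodejs20 | ~130MB | None               | None  | ~10      | Hardened production        |
| cgr.dev/chainguard/node    | ~90MB  | None (apk in -dev) | None  | ~5       | Hardened production        |
| scratch                    | 0MB    | None               | None  | 0        | Static binaries (Go, Rust) |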

Alpine

Alpine uses musl libc instead of glibc. This matters for:

  • Node.js native modules: Packages with native bindings (e.g., bcrypt, sharp) may need to be compiled against musl. Use npm rebuild in the build stage.
  • DNS resolution: musl's DNS resolver behaves differently from glibc. It does not support search directives in /etc/resolv.conf the same way. In Kubernetes, this can cause service discovery issues unless ndots is configured correctly in the pod spec.
  • Performance: musl's malloc implementation is simpler than glibc's. For memory-intensive workloads, benchmark before committing.

FROM node:20-alpine AS builder
RUN apk add --no-cache python3 make g++  # For native modules
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev
# Native modules are compiled against musl here

Distroless

Google's distroless images contain only the language runtime and its dependencies. No package manager, no shell, no ls, no cat. You cannot docker exec -it container sh into a distroless container — there is no shell.

What is included in gcr.io/distroless/nodejs20-debian12:

  • Node.js 20 binary
  • Required shared libraries (libc, libstdc++, etc.)
  • CA certificates
  • /etc/passwd with a nonroot user

What is NOT included:

  • Shell (bash, sh)
  • Package manager (apt, apk)
  • Coreutils (ls, cat, cp, mv)
  • curl, wget, netcat
  • Compilers, interpreters

Chainguard Images

Chainguard provides hardened base images rebuilt nightly with the latest package versions. They claim zero known CVEs at build time.

# Chainguard Node.js image
FROM cgr.dev/chainguard/node:latest
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
CMD ["dist/index.js"]

Chainguard images are slightly smaller than Google distroless and are updated more frequently. The tradeoff: they are a third-party dependency with a commercial model (free tier is limited).

Scratch

For statically compiled binaries (Go with CGO_ENABLED=0, Rust):

FROM golang:1.22-alpine AS builder
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 GOOS=linux go build -ldflags="-s -w" -o /server ./cmd/server

FROM scratch
COPY --from=builder /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/
COPY --from=builder /server /server
USER 65534:65534
ENTRYPOINT ["/server"]

The resulting image contains exactly one file (plus CA certs). Image size is typically 5-20MB.

Multi-Stage Builds

The key principle: build dependencies should never appear in the production image.

# Stage 1: Install ALL dependencies (including devDependencies for build tools)
FROM node:20-alpine AS deps
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci

# Stage 2: Build the application
FROM deps AS builder
COPY tsconfig.json ./
COPY src/ ./src/
RUN npm run build
# Prune devDependencies for the runtime stage
RUN npm prune --omit=dev

# Stage 3: Production runtime
FROM gcr.io/distroless/nodejs20-debian12 AS runtime
WORKDIR /app
# Copy only production artifacts
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
COPY --from=builder /app/package.json ./

EXPOSE 3000
USER nonroot:nonroot
CMD ["dist/index.js"]

What stays in the builder (not in production):

  • TypeScript compiler (typescript package)
  • Build tools (webpack, esbuild, swc)
  • Type definition packages (@types/*)
  • Test frameworks (jest, vitest)
  • Linters (eslint, prettier)

Non-Root Users

By default, Docker containers run as root (UID 0). If an attacker exploits a vulnerability in your application, they have root access inside the container. With certain misconfigurations (privileged mode, host PID namespace), this can escalate to root on the host.

Creating and Using a Non-Root User

# For Alpine-based images
FROM node:20-alpine
RUN addgroup -g 1001 -S appgroup && \
    adduser -u 1001 -S appuser -G appgroup
WORKDIR /app
COPY --chown=appuser:appgroup . .
USER appuser:appgroup
CMD ["node", "dist/index.js"]

# For Debian-based images
FROM node:20-slim
RUN groupadd -g 1001 appgroup && \
    useradd -u 1001 -g appgroup -m -s /bin/false appuser
WORKDIR /app
COPY --chown=appuser:appgroup . .
USER appuser:appgroup
CMD ["node", "dist/index.js"]
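
A quick way to confirm that a Dockerfile really ends up non-root is to look at the last USER instruction in its final stage. The sketch below is a hypothetical helper (the names final_user and runs_as_root are illustrative, not part of Docker). It only reads the Dockerfile text, so it cannot see a USER inherited from the base image.

```shell
# final_user FILE: print the argument of the last USER instruction in the
# final build stage, or "root" when that stage has no USER instruction.
# Assumption: a stage without USER runs as root (true unless the base image
# sets its own USER, which this text-only check cannot detect).
final_user() {
  awk 'toupper($1) == "FROM" { user = "" }
       toupper($1) == "USER" { user = $2 }
       END { print (user == "" ? "root" : user) }' "$1"
}

# runs_as_root FILE: succeed (exit 0) when the final user is root or UID 0.
runs_as_root() {
  case "$(final_user "$1")" in
    root|root:*|0|0:*) return 0 ;;
    *) return 1 ;;
  esac
}
```

In CI this could be used as `if runs_as_root Dockerfile; then echo "image runs as root"; exit 1; fi`.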

Common File Permission Issues

Problem: Application writes to /app/logs or /app/data at runtime, but these directories are owned by root.

# Create directories with correct ownership before switching user
RUN mkdir -p /app/logs /app/data && \
    chown -R appuser:appgroup /app/logs /app/data
USER appuser:appgroup

Problem: npm packages install global binaries to /usr/local/bin, which requires root.

Solution: Do not install global packages in the runtime image. Everything should be a local dependency in node_modules/.bin.

Problem: Application binds to port 80 or 443, which has traditionally required root or the CAP_NET_BIND_SERVICE capability.

Solution: Bind to a high port (3000, 8080) and use a Kubernetes Service or ingress controller for port mapping. There is no reason to run on privileged ports inside a container.

Distroless Already Provides Non-Root

Distroless images include a nonroot user (UID 65532):

FROM gcr.io/distroless/nodejs20-debian12
USER nonroot:nonroot
# That's it — the user already exists in the image

Layer Optimization

COPY Order Matters

Docker caches layers. When a layer's input changes, that layer and all subsequent layers are rebuilt. Order your instructions from least-frequently-changing to most-frequently-changing:

# GOOD: Dependencies change less often than source code
COPY package.json package-lock.json ./
RUN npm ci
COPY src/ ./src/
RUN npm run build

# BAD: Any source code change invalidates the npm install cache
COPY . .
RUN npm ci
RUN npm run build
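
The ordering rule can be checked mechanically. The following is a minimal sketch, assuming a single-stage Dockerfile and the plain `COPY . .` form (flags such as --chown are not handled). The function name copy_before_install is hypothetical.

```shell
# copy_before_install FILE: succeed (exit 0) when a broad "COPY . ." appears
# before the first "npm ci" / "npm install" step, which means any source
# change invalidates the dependency layer.
copy_before_install() {
  awk '
    toupper($1) == "COPY" && $2 == "." && !copy { copy = NR }
    toupper($1) == "RUN" && /npm (ci|install)/ && !install { install = NR }
    END { exit !(copy && install && copy < install) }
  ' "$1"
}
```

Run against the BAD example above, the function succeeds and flags the problem. Against the GOOD example it returns non-zero.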

Combine RUN Statements

Each RUN instruction creates a layer. Combining related commands reduces layer count and keeps files that are deleted later from persisting in an earlier layer:

# BAD: 3 layers. The apt cache from layer 1 persists in the image even
# though it's deleted in layer 3.
RUN apt-get update
RUN apt-get install -y curl
RUN rm -rf /var/lib/apt/lists/*

# GOOD: 1 layer. The apt cache is created and deleted in the same layer.
RUN apt-get update && \
    apt-get install -y --no-install-recommends curl && \
    rm -rf /var/lib/apt/lists/*
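
To make the layer count concrete, a small helper can count RUN instructions in a Dockerfile. This is only a sketch with a hypothetical name (count_run_layers). It counts instructions in the text and does not measure how large each layer is.

```shell
# count_run_layers FILE: count RUN instructions; each one produces its own
# filesystem layer. Continuation lines do not begin with RUN, so a command
# chained with "&& \" is counted once.
count_run_layers() {
  awk 'toupper($1) == "RUN" { n++ } END { print n + 0 }' "$1"
}
```

For the two examples above it prints 3 and 1 respectively. On a built image, `docker history <image>` shows the size each layer actually contributes.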

.dockerignore

Prevent unnecessary files from entering the build context:

# .dockerignore
node_modules
.git
.github
*.md
docs/
tests/
coverage/
.env
.env.*
dist/
*.log
playwright-report/
test-results/
.vscode
.idea

Without .dockerignore, the entire directory (including node_modules, .git, and test artifacts) is sent to the Docker daemon as build context. For a typical Node.js project, this can be 500MB+.
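
Because a missing entry silently bloats the build context (or leaks .env files into it), it can help to assert in CI that the critical patterns are present. The helper below is a sketch with a hypothetical name (missing_ignores). It matches exact lines only, so `node_modules` and `node_modules/` are treated as different entries.

```shell
# missing_ignores FILE PATTERN...: print each pattern that is not listed as an
# exact line in the given .dockerignore file; exit non-zero if any is missing.
# The body runs in a subshell so its variables do not leak into the caller.
missing_ignores() (
  file=$1
  shift
  status=0
  for pattern in "$@"; do
    if ! grep -qxF -e "$pattern" "$file"; then
      echo "missing: $pattern"
      status=1
    fi
  done
  exit "$status"
)
```

A typical call would be `missing_ignores .dockerignore node_modules .git .env`.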

Dependency Management

Pin Versions

# BAD: What version of curl is this? Will it change on next build?
RUN apk add curl

# GOOD: Pinned version. Reproducible builds.
RUN apk add --no-cache curl=8.5.0-r0

For Node.js dependencies, package-lock.json (used with npm ci) already ensures deterministic installs. For system packages, pin to specific versions.
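
Unpinned system packages are easy to miss in review. As a rough sketch, the hypothetical helper below (unpinned_apk_packages) lists packages passed to `apk add` without an `=version` suffix. It assumes each `apk add` command sits on one line and does not follow backslash continuations.

```shell
# unpinned_apk_packages FILE: print package names given to "apk add" that
# carry no "=version" pin. Scanning of a command stops at "&&", a trailing
# backslash, or a token starting with "#", ";" or "|".
unpinned_apk_packages() {
  awk '
    /apk add/ {
      seen = 0
      for (i = 1; i <= NF; i++) {
        if ($i == "&&" || $i == "\\" || $i ~ /^[#;|]/) seen = 0
        else if (seen && $i !~ /^-/ && $i !~ /=/) print $i
        if ($i == "add" && i > 1 && $(i - 1) == "apk") seen = 1
      }
    }
  ' "$1"
}
```

An empty result means every package on the inspected lines is pinned.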

Scanning with Trivy

# Scan an image
trivy image ghcr.io/org/payment-api:latest

# Scan with severity filter
trivy image --severity CRITICAL,HIGH ghcr.io/org/payment-api:latest

# Output as JSON for CI processing
trivy image --format json --output results.json ghcr.io/org/payment-api:latest

# Scan a Dockerfile (pre-build)
trivy config Dockerfile

Example Trivy output:

ghcr.io/org/payment-api:latest (debian 12.4)

Total: 0 (CRITICAL: 0, HIGH: 0)

Node.js (node_modules/package-lock.json)

Total: 2 (CRITICAL: 0, HIGH: 0, MEDIUM: 2)

┌──────────────┬───────────────┬──────────┬─────────┬──────────────────┐
│   Library    │ Vulnerability │ Severity │ Version │  Fixed Version   │
├──────────────┼───────────────┼──────────┼─────────┼──────────────────┤
│ semver       │ CVE-2022-xxxx │ MEDIUM   │ 7.3.7   │ 7.5.2            │
│ json5        │ CVE-2022-xxxx │ MEDIUM   │ 1.0.1   │ 1.0.2            │
└──────────────┴───────────────┴──────────┴─────────┴──────────────────┘
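
The JSON report is the easier format to post-process in CI. The sketch below counts findings of one severity. It assumes jq is installed on the runner and that the report uses the Results / Vulnerabilities / Severity fields of Trivy's JSON output. The function name count_severity is hypothetical.

```shell
# count_severity FILE SEVERITY: count vulnerabilities of one severity in a
# Trivy JSON report. The "?" operators tolerate targets with no findings
# (where the Vulnerabilities key is absent).
count_severity() {
  jq --arg sev "$2" \
    '[.Results[]?.Vulnerabilities[]? | select(.Severity == $sev)] | length' "$1"
}
```

For example, `count_severity results.json CRITICAL` prints the number of critical findings, which a later step can compare against a threshold or post as a build annotation.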

SBOM Generation with Syft

A Software Bill of Materials (SBOM) lists every component in your image:

# Generate SBOM in SPDX format
syft ghcr.io/org/payment-api:latest -o spdx-json > sbom.spdx.json

# Generate SBOM in CycloneDX format
syft ghcr.io/org/payment-api:latest -o cyclonedx-json > sbom.cdx.json

SBOMs enable downstream consumers to audit your dependencies without access to your source code. Some government contracts and enterprise procurement processes now require SBOMs.

Secrets in Docker Builds

The Wrong Way

# NEVER do this — the secret is baked into an image layer
COPY .env /app/.env
ENV DATABASE_URL=postgres://user:password@host/db
ARG NPM_TOKEN=abc123
RUN echo "//registry.npmjs.org/:_authToken=${NPM_TOKEN}" > .npmrc

Even if you delete the file in a later layer, it still exists in the earlier layer. ENV and ARG values are visible in docker history, and copied files can be recovered by unpacking the image layers (for example, after docker save).
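
These patterns are simple enough to catch with a text scan before the image is ever built. The following is a minimal sketch: the function name find_baked_secrets and the regular expressions are illustrative assumptions, not an exhaustive secret detector, and they will also flag harmless files such as .env.example.

```shell
# find_baked_secrets FILE: print (with line numbers) Dockerfile lines that
# look like they bake a secret into a layer. Exits non-zero when nothing
# matches. Checks: COPY/ADD of .env files, ENV/ARG names containing
# TOKEN/SECRET/PASSWORD/KEY with an inline value, and ENV URLs that embed
# user:password@ credentials.
find_baked_secrets() {
  grep -nEi \
    -e '^[[:space:]]*(COPY|ADD)[[:space:]].*\.env' \
    -e '^[[:space:]]*(ENV|ARG)[[:space:]]+[A-Z0-9_]*(TOKEN|SECRET|PASSWORD|KEY)[A-Z0-9_]*=' \
    -e '^[[:space:]]*ENV[[:space:]].*://[^[:space:]/]+:[^[:space:]@]+@' \
    "$1"
}
```

Run against the snippet above, it reports the COPY, ENV, and ARG lines. A dedicated secret scanner is still the better long-term choice.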

BuildKit Secret Mounts

# syntax=docker/dockerfile:1
FROM node:20-alpine AS builder
WORKDIR /app
COPY package*.json ./
# Mount the secret at build time — it is NOT stored in any layer
RUN --mount=type=secret,id=npmrc,target=/app/.npmrc \
    npm ci

# Build with the secret (run from your shell, not inside the Dockerfile)
docker build --secret id=npmrc,src=$HOME/.npmrc -t payment-api .

The secret is mounted into the build container's filesystem during that specific RUN instruction. It is never written to a layer.

In CI

# GitHub Actions
- name: Build with secrets
  uses: docker/build-push-action@v5
  with:
    context: .
    push: true
    tags: ghcr.io/org/payment-api:latest
    secrets: |
      npmrc=${{ secrets.NPM_RC }}

Image Signing with Cosign

Image signing proves that an image was built by your CI system and has not been tampered with.

Keyless Signing with Sigstore

# Install cosign
go install github.com/sigstore/cosign/v2/cmd/cosign@latest

# Sign an image (keyless — uses OIDC identity)
cosign sign ghcr.io/org/payment-api:latest

# Verify a signature
cosign verify \
  --certificate-identity=https://github.com/org/repo/.github/workflows/ci.yaml@refs/heads/main \
  --certificate-oidc-issuer=https://token.actions.githubusercontent.com \
  ghcr.io/org/payment-api:latest

In CI (GitHub Actions):

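- name: Sign image with Cosign
  # Cosign v2 signs keyless by default, so COSIGN_EXPERIMENTAL is not needed.
  # The job must grant the `id-token: write` permission for the OIDC token.
  run: |
    cosign sign --yes ghcr.io/org/payment-api@${{ steps.build.outputs.digest }}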

Keyless signing uses your CI system's OIDC token as identity. No private keys to manage — the signature attests that the image was built by a specific GitHub Actions workflow.

Admission Control with Sigstore Policy Controller

Enforce that only signed images can run in your cluster:

apiVersion: policy.sigstore.dev/v1beta1
kind: ClusterImagePolicy
metadata:
  name: require-signed-images
spec:
  images:
    - glob: "ghcr.io/org/**"
  authorities:
    - keyless:
        url: https://fulcio.sigstore.dev
        identities:
          - issuer: https://token.actions.githubusercontent.com
            subject: https://github.com/org/repo/.github/workflows/ci.yaml@refs/heads/main

Runtime Security

Read-Only Filesystem

# Kubernetes pod spec
apiVersion: v1
kind: Pod
metadata:
  name: payment-api
spec:
  containers:
    - name: payment-api
      image: ghcr.io/org/payment-api:abc123
      securityContext:
        readOnlyRootFilesystem: true
        runAsNonRoot: true
        runAsUser: 65532
        allowPrivilegeEscalation: false
        capabilities:
          drop:
            - ALL
      volumeMounts:
        - name: tmp
          mountPath: /tmp
        - name: logs
          mountPath: /app/logs
  volumes:
    - name: tmp
      emptyDir:
        sizeLimit: 100Mi
    - name: logs
      emptyDir:
        sizeLimit: 500Mi

readOnlyRootFilesystem: true makes the container's root filesystem read-only. Mount emptyDir volumes for directories that need writes (temp files, logs).

Seccomp Profiles

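Restrict which system calls the container can make. Treat the allowlist below as illustrative: a real Node.js process typically needs additional syscalls (execve to start the process, for example), so derive the final list by tracing your own workload: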

{
  "defaultAction": "SCMP_ACT_ERRNO",
  "architectures": ["SCMP_ARCH_X86_64"],
  "syscalls": [
    {
      "names": [
        "accept4", "bind", "clone", "close", "connect",
        "epoll_create1", "epoll_ctl", "epoll_wait",
        "exit", "exit_group", "fcntl", "fstat",
        "futex", "getpid", "getsockopt", "ioctl",
        "listen", "mmap", "mprotect", "munmap",
        "nanosleep", "openat", "pipe2", "read",
        "recvfrom", "rt_sigaction", "rt_sigprocmask",
        "sendto", "setsockopt", "socket", "write",
        "writev", "brk", "clock_gettime", "getuid",
        "getgid", "geteuid", "getegid"
      ],
      "action": "SCMP_ACT_ALLOW"
    }
  ]
}

Apply in the pod spec:

securityContext:
  seccompProfile:
    type: Localhost
    localhostProfile: profiles/node-api.json

Scanning in CI: Full Integration

# .github/workflows/security.yaml
name: Security Scan

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]
  schedule:
    - cron: '0 6 * * 1'  # Weekly scan of existing images

jobs:
  scan-image:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Build image
        run: docker build -t payment-api:scan .

      - name: Run Trivy vulnerability scan
        uses: aquasecurity/trivy-action@master
        with:
          image-ref: payment-api:scan
          format: table
          exit-code: 1
          severity: CRITICAL,HIGH
          ignore-unfixed: true

      - name: Run Trivy for SARIF (always, for GitHub Security tab)
        uses: aquasecurity/trivy-action@master
        if: always()
        with:
          image-ref: payment-api:scan
          format: sarif
          output: trivy-results.sarif

      - name: Upload SARIF
        uses: github/codeql-action/upload-sarif@v3
        if: always()
        with:
          sarif_file: trivy-results.sarif

      - name: Check image size
        run: |
          SIZE=$(docker image inspect payment-api:scan --format='{{.Size}}')
          SIZE_MB=$((SIZE / 1024 / 1024))
          echo "Image size: ${SIZE_MB}MB"
          if [ "$SIZE_MB" -gt 200 ]; then
            echo "::error::Image size ${SIZE_MB}MB exceeds 200MB budget"
            exit 1
          fi

The ignore-unfixed: true flag is important: it prevents failing builds on CVEs that have no available fix. You cannot patch what has not been patched upstream.

Image Size Budgets

Track image size over time to prevent regression:

#!/bin/bash
# scripts/check-image-size.sh
IMAGE=$1
MAX_SIZE_MB=${2:-200}

SIZE_BYTES=$(docker image inspect "$IMAGE" --format='{{.Size}}')
SIZE_MB=$((SIZE_BYTES / 1024 / 1024))

echo "Image: $IMAGE"
echo "Size: ${SIZE_MB}MB"
echo "Budget: ${MAX_SIZE_MB}MB"

if [ "$SIZE_MB" -gt "$MAX_SIZE_MB" ]; then
  echo "FAIL: Image exceeds size budget by $((SIZE_MB - MAX_SIZE_MB))MB"
  exit 1
fi

echo "PASS: Image is within size budget"

Case Study: Hardening a Node.js API Service

A Node.js API service for a financial data platform had been running in production for 18 months with an unhardened Docker image.

Before

# Original Dockerfile
FROM node:18
WORKDIR /app
COPY . .
RUN npm install
EXPOSE 3000
CMD ["node", "src/index.js"]

Problems:

  • Image size: 1.2GB (node:18 base + all dependencies including devDependencies)
  • CVE count: 47 critical, 182 high (mostly in base image packages)
  • Running as root: UID 0 — any code execution vulnerability gives root access
  • Full toolkit available: bash, curl, wget, apt, gcc — useful for attackers
  • Secrets in history: .env file was COPY'd into the image in an earlier build iteration; the layer persisted
  • No .dockerignore: Build context included .git (400MB), node_modules (300MB), test fixtures

After

The Stripe Systems engineering team rewrote the Dockerfile:

# syntax=docker/dockerfile:1
# Hardened Dockerfile (the syntax directive must stay on the first line)

# Stage 1: Dependencies
FROM node:20-alpine AS deps
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci --omit=dev && npm cache clean --force

# Stage 2: Build
FROM node:20-alpine AS builder
WORKDIR /app
COPY package.json package-lock.json tsconfig.json ./
RUN npm ci
COPY src/ ./src/
RUN npm run build

# Stage 3: Production
FROM gcr.io/distroless/nodejs20-debian12

LABEL org.opencontainers.image.source="https://github.com/org/payment-api"
LABEL org.opencontainers.image.description="Payment API Service"

WORKDIR /app

# Copy only production dependencies and compiled output
COPY --from=deps /app/node_modules ./node_modules
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/package.json ./

EXPOSE 3000
USER nonroot:nonroot
CMD ["dist/index.js"]

With .dockerignore:

node_modules
.git
.github
tests/
coverage/
*.md
.env*
docker-compose*.yml
.vscode
.idea
playwright-report/
test-results/

Results

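| Metric             | Before       | After               | Change                       |
|--------------------|--------------|---------------------|------------------------------|
| Image size         | 1.2GB        | 89MB                | -93%                         |
| Critical CVEs      | 47           | 0                   | -100%                        |
| High CVEs          | 182          | 0                   | -100%                        |
| Medium CVEs        |              | 2                   | (npm deps, no fix available) |
| Running as         | root (UID 0) | nonroot (UID 65532) | Non-root                     |
| Shell available    | Yes (bash)   | No                  | Removed                      |
| Package manager    | Yes (apt)    | No                  | Removed                      |
| Build context size | 1.1GB        | 2.3MB               | -99.8%                       |
| Cold start (K8s)   | 2.1s         | 1.7s                | -400ms                       |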

The cold start improvement comes from two factors: smaller image means faster pull from the container registry, and the distroless runtime has less filesystem to initialize.
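
The percentage figures in the table follow from simple arithmetic. As a worked check, the sketch below recomputes them, assuming 1.2GB is taken as 1200MB and 1.1GB as 1100MB. The function name percent_reduction is hypothetical.

```shell
# percent_reduction BEFORE AFTER: print the size reduction as a percentage,
# rounded to one decimal place. Both arguments must use the same unit.
percent_reduction() {
  awk -v before="$1" -v after="$2" \
    'BEGIN { printf "%.1f\n", (before - after) / before * 100 }'
}

percent_reduction 1200 89    # image size in MB (assumes 1.2GB = 1200MB)
percent_reduction 1100 2.3   # build context in MB (assumes 1.1GB = 1100MB)
```

This prints 92.6 and 99.8, consistent with the rounded -93% and -99.8% reported above.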

Trivy Scan Comparison

Before:

payment-api:before (debian 11.8)
Total: 229 (CRITICAL: 47, HIGH: 182)

┌──────────────┬────────────────┬──────────┬─────────────┐
│   Library    │ Vulnerability  │ Severity │   Status     │
├──────────────┼────────────────┼──────────┼─────────────┤
│ openssl      │ CVE-2023-xxxxx │ CRITICAL │ fixed        │
│ curl         │ CVE-2023-xxxxx │ CRITICAL │ fixed        │
│ glibc        │ CVE-2023-xxxxx │ HIGH     │ fixed        │
│ ... (226 more rows)                                     │
└──────────────┴────────────────┴──────────┴─────────────┘

After:

payment-api:after (distroless)
Total: 0 (CRITICAL: 0, HIGH: 0)

Node.js (package-lock.json)
Total: 2 (MEDIUM: 2)

┌──────────────┬────────────────┬──────────┬─────────┬──────────────┐
│   Library    │ Vulnerability  │ Severity │ Version │ Fixed Version│
├──────────────┼────────────────┼──────────┼─────────┼──────────────┤
│ semver       │ CVE-2022-25883 │ MEDIUM   │ 7.3.7   │ 7.5.2        │
│ json5        │ CVE-2022-46175 │ MEDIUM   │ 1.0.1   │ 1.0.2        │
└──────────────┴────────────────┴──────────┴─────────┴──────────────┘

CI Pipeline Integration

The final CI step for every build:

- name: Build and push
  id: build
  uses: docker/build-push-action@v5
  with:
    context: .
    push: true
    tags: ghcr.io/org/payment-api:${{ github.sha }}
    cache-from: type=gha
    cache-to: type=gha,mode=max

- name: Trivy scan
  uses: aquasecurity/trivy-action@master
  with:
    image-ref: ghcr.io/org/payment-api:${{ github.sha }}
    exit-code: 1
    severity: CRITICAL,HIGH
    ignore-unfixed: true

- name: Check image size budget
  run: |
    docker pull ghcr.io/org/payment-api:${{ github.sha }}
    SIZE=$(docker image inspect ghcr.io/org/payment-api:${{ github.sha }} --format='{{.Size}}')
    SIZE_MB=$((SIZE / 1024 / 1024))
    echo "Image size: ${SIZE_MB}MB"
    if [ "$SIZE_MB" -gt 150 ]; then
      echo "::error::Image size exceeds 150MB budget"
      exit 1
    fi

- name: Sign image
  # Cosign v2 signs keyless by default, so COSIGN_EXPERIMENTAL is not needed
  run: cosign sign --yes ghcr.io/org/payment-api@${{ steps.build.outputs.digest }}

- name: Generate SBOM
  uses: anchore/sbom-action@v0
  with:
    image: ghcr.io/org/payment-api:${{ github.sha }}
    format: spdx-json
    output-file: sbom.spdx.json

Every image that reaches production is: scanned for vulnerabilities (build fails on critical/high), checked against a size budget, signed with a verifiable identity, and accompanied by an SBOM. This is not security theater — each step addresses a specific threat. Scanning catches known vulnerabilities before deployment. Size budgets prevent accidental inclusion of build tools. Signing prevents deployment of tampered images. SBOMs enable rapid response when a new CVE is disclosed in a transitive dependency.

The total effort to harden the image and integrate scanning into CI was approximately 2 days of engineering work. The ongoing cost is near zero — the pipeline runs automatically, and alerts fire only when action is needed.
