Docker Deployment Best Practices

A practical guide to secure, maintainable Docker deployments for self-hosted infrastructure

Overview

Docker enables reproducible deployments through containerization, but production usage requires careful attention to security, reliability, and maintainability. This guide covers best practices for teams running self-hosted infrastructure with custom builds and private registries—balancing simplicity with safety.

Key themes:

  • Build secure, minimal images
  • Deploy predictably with rollback capability
  • Manage secrets properly (never in environment variables)
  • Monitor health and enforce resource limits
  • Balance automation with control

Image Building

Multi-Stage Builds

Multi-stage builds are the foundation of secure, efficient images. They separate build-time dependencies from runtime, resulting in:

  • ~50% smaller images [1] (fewer layers, no build tools)
  • Faster builds (better caching, parallel stages)
  • Reduced attack surface (no compilers, dev tools, or secrets in final image)

Pattern:

# Stage 1: Build dependencies
FROM node:20-alpine AS deps
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci
 
# Stage 2: Build application
FROM node:20-alpine AS builder
WORKDIR /app
COPY --from=deps /app/node_modules ./node_modules
COPY . .
RUN npm run build
 
# Stage 3: Production runtime (minimal)
FROM node:20-alpine AS runtime
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY package.json ./
RUN npm ci --omit=dev && npm cache clean --force
EXPOSE 3000
CMD ["node", "dist/index.js"]

Key techniques:

  • Name stages clearly (AS builder, AS runtime) for readability
  • Copy stable files first (package.json before source code) to maximize cache hits
  • Use minimal base images (alpine, slim variants, or distroless)
  • Combine cleanup in same layer: RUN npm install && npm cache clean --force
  • Use .dockerignore to exclude tests, docs, temp files
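
A minimal .dockerignore for the Node example above might look like this (illustrative patterns—adjust to your repo layout):

```
node_modules
dist
.git
*.md
docs/
test/
.env*
```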

For compiled languages (Go, Rust), the final stage can be even leaner:

# Builder
FROM golang:1.24-alpine AS builder
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 GOOS=linux go build -o /app/bin/app
 
# Runtime (no Go toolchain)
FROM alpine:3.21
WORKDIR /app
COPY --from=builder /app/bin/app .
CMD ["./app"]
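
With CGO_ENABLED=0 the binary is statically linked, so the runtime stage can even be scratch—a sketch, assuming the app's TLS needs are known (uncomment the CA certificate copy if it makes outbound HTTPS calls):

```dockerfile
FROM scratch
# Static binary only; no shell, package manager, or libc in the image.
COPY --from=builder /app/bin/app /app
# Uncomment if the app needs outbound TLS:
# COPY --from=builder /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/
ENTRYPOINT ["/app"]
```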

Security Scanning

Scan images regularly for vulnerabilities using tools like Trivy, Docker Scout, or Clair. Integrate into CI/CD pipelines to fail builds on high-severity CVEs.

Pattern:

docker buildx build --target runtime -t myapp:latest .
trivy image --severity HIGH,CRITICAL myapp:latest

Guidelines:

  • Scan all stages (builder and final) during development
  • Use minimal base images to reduce known CVEs
  • Rebuild frequently to incorporate upstream security patches
  • Consider image signing (Docker Content Trust, Cosign) for supply chain verification
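
In CI, Trivy's --exit-code flag turns findings into a failed build. A hedged pipeline-step sketch in GitHub-Actions-style syntax (as used by Forgejo Actions; adapt to your runner):

```yaml
- name: Build and scan image
  run: |
    docker buildx build --target runtime -t myapp:${GITHUB_SHA} --load .
    # Fail the pipeline if HIGH/CRITICAL CVEs are found
    trivy image --exit-code 1 --severity HIGH,CRITICAL myapp:${GITHUB_SHA}
```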

Non-Root Users

Run containers as non-root users to limit privilege escalation risks.

FROM alpine:3.21
RUN addgroup -S appgroup && adduser -S appuser -G appgroup
USER appuser
WORKDIR /home/appuser
COPY --chown=appuser:appgroup . .
CMD ["./app"]

Deployment Strategies

Version Pinning

Never use latest tags in production. Pin specific versions with tags or digests for reproducibility and rollback capability.

Bad:

services:
  web:
    image: nginx:latest  # unpredictable updates

Good:

services:
  web:
    image: nginx:1.28.2  # stable as of Feb 2026, or use SHA256 digest for absolute guarantee

Keep separate compose files for environments:

  • docker-compose.dev.yml (latest tags, fast iteration)
  • docker-compose.prod.yml (pinned versions, stability)
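
For the strongest guarantee, pin by digest rather than tag. The digest of a pulled image can be read with docker inspect (the digest below is a placeholder—substitute your own):

```yaml
services:
  web:
    # Obtain via: docker inspect --format '{{index .RepoDigests 0}}' nginx:1.28.2
    image: nginx@sha256:<digest>
```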

Watchtower (Automated Updates) ⚠️ DISCONTINUED

⚠️ Important: The original containrrr/watchtower project is being archived [2] (development slowed through 2024, formal archival announced 2025). This section is kept for reference, but use one of the alternatives below for new deployments.

Legacy Watchtower Overview

Watchtower automated container updates by monitoring registries and restarting containers with newer images. It was useful for low-stakes environments, but required careful configuration.

Best practices (historical):

  • Exclude critical services (databases, stateful apps) using labels: com.centurylinklabs.watchtower.enable=false
  • Schedule updates during low-traffic windows: WATCHTOWER_SCHEDULE=0 0 4 * * SUN (4 AM Sundays)
  • Enable cleanup: WATCHTOWER_CLEANUP=true to remove old images
  • Use label-based filtering: WATCHTOWER_LABEL_ENABLE=true to update only tagged containers
  • Consider notification-only mode for high-stakes setups—get alerts, apply manually

Example (selective updates):

version: '3.8'
services:
  watchtower:
    image: containrrr/watchtower  # DEPRECATED
    container_name: watchtower
    restart: unless-stopped
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    environment:
      - WATCHTOWER_SCHEDULE=0 0 4 * * 0  # Sundays at 4 AM
      - WATCHTOWER_CLEANUP=true
      - WATCHTOWER_LABEL_ENABLE=true
  
  web:
    image: nginx:1.28.2
    labels:
      - "com.centurylinklabs.watchtower.enable=true"  # auto-update OK
  
  database:
    image: postgres:15.16
    labels:
      - "com.centurylinklabs.watchtower.enable=false"  # manual only

Security concerns:

  • Mounting /var/run/docker.sock gives full Docker control—isolate Watchtower in its own network
  • Auto-pulling from public registries risks supply chain attacks—prefer private registries with verified images
  • No built-in vulnerability scanning—pair with Trivy or Docker Scout

Active Alternatives

For notification-only (recommended for safety):

Diun (Docker Image Update Notifier)

  • Lightweight (~10MB image)
  • Checks registries for updates without auto-pulling
  • Sends notifications (email, Slack, Discord, webhooks)
  • Good for control-focused deployments where updates require approval
  • Actively maintained (4.4k stars on GitHub as of Feb 2026) [3]

Example:

version: '3.8'
services:
  diun:
    image: crazymax/diun:latest
    container_name: diun
    restart: unless-stopped
    volumes:
      - "./data:/data"
      - "/var/run/docker.sock:/var/run/docker.sock"
    environment:
      - "TZ=America/Los_Angeles"
      - "LOG_LEVEL=info"
      - "DIUN_WATCH_WORKERS=20"
      - "DIUN_WATCH_SCHEDULE=0 */6 * * *"  # Check every 6 hours
      - "DIUN_PROVIDERS_DOCKER=true"
      - "DIUN_NOTIF_DISCORD_WEBHOOKURL=https://discord.com/api/webhooks/..."
    labels:
      - "diun.enable=true"

For auto-update with web UI:

What’s Up Docker (WUD)

  • Actively maintained (2024-2025 commits)
  • Web interface for version tracking and management
  • Label-based update policies
  • Supports multiple registries (Docker Hub, GHCR, private)
  • Triggers (webhooks, MQTT) for custom automation

Example:

version: '3.8'
services:
  wud:
    image: fmartinou/whats-up-docker:latest
    container_name: wud
    restart: unless-stopped
    ports:
      - "3000:3000"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    environment:
      - WUD_WATCHER_LOCAL_SOCKET=/var/run/docker.sock

Access web UI at http://localhost:3000 to see available updates and trigger manually.

For drop-in Watchtower replacement:

nickfedor/watchtower

  • Active community fork (2024-2025 commits)
  • Same behavior and config as original Watchtower
  • Drop-in replacement: just change image name

Example:

services:
  watchtower:
    image: nickfedor/watchtower:latest  # Active fork
    # ... rest of config identical to original

Recommendation: For production or critical environments, use Diun (notification-only) to alert you of updates, then apply manually with CI/CD. For low-stakes/homelab setups, WUD offers a good balance of automation and control.

CI/CD Deploy Pattern

For production systems, prefer controlled CI/CD over full automation. See Forgejo for our self-hosted CI/CD workflows and deployment patterns.

SSH with restricted keys:

  1. Generate deploy-only SSH key (no shell access)
  2. Restrict with authorized_keys options:
    no-port-forwarding,no-X11-forwarding,no-agent-forwarding,no-pty ssh-rsa AAAAB3...
    
  3. Allow only specific commands: docker compose pull && docker compose up -d
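
Putting steps 2 and 3 together, a restricted authorized_keys entry can force a single deploy command (sketch; key truncated as above, path illustrative):

```
command="cd /opt/myapp && docker compose pull && docker compose up -d",no-port-forwarding,no-X11-forwarding,no-agent-forwarding,no-pty ssh-rsa AAAAB3...
```

With command= set, any SSH session using this key runs only the deploy command, regardless of what the client requests.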

Pipeline pattern:

# CI builds and pushes to registry
docker build -t registry.example.com/myapp:$GIT_SHA .
docker push registry.example.com/myapp:$GIT_SHA
 
# Deploy job (SSH to host)
ssh deploy@production.example.com << 'EOF'
  cd /opt/myapp
  docker compose pull
  docker compose up -d
EOF

Benefits over automated update tools:

  • Explicit deployment events (audit trail)
  • Test in staging first
  • Controlled rollout timing
  • Easy rollback (just deploy previous SHA)

See Static Site Hosting for examples of CI/CD deployment to self-hosted infrastructure.


Security Hardening

Secrets Management

Never use environment variables for secrets. They leak via docker inspect, logs, process lists, and image metadata.

Use file-based mounts instead:

| Method | Use Case | Implementation |
|--------|----------|----------------|
| Docker Secrets (Swarm) | Production clusters | `docker secret create db_pass db_pass.txt`; `--secret db_pass` mounts at `/run/secrets/db_pass` |
| Docker Compose Secrets | Development/stacks | Define in compose under `secrets:` with `file: ./secrets/db_pass.txt` |
| BuildKit Build Secrets | Build-time credentials | `docker build --secret id=mysecret,src=secret.txt .` (temporary mount) |
| External managers | Advanced (Vault, etc.) | Sidecar fetches secrets as files |

Example (Compose secrets):

version: '3.8'
services:
  web:
    image: myapp:latest
    secrets:
      - db_password
    environment:
      - DB_PASSWORD_FILE=/run/secrets/db_password  # app reads file
secrets:
  db_password:
    file: ./secrets/db_password.txt  # gitignored
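
On the application side, the *_FILE convention reads the secret from the mounted file at startup. A minimal POSIX-shell sketch of the pattern used in many official-image entrypoints (the helper name file_env and the demo path are illustrative):

```shell
#!/bin/sh
# If VAR_FILE is set, load that file's contents into VAR and export it.
file_env() {
  var="$1"
  file_var="${var}_FILE"
  eval "file=\${$file_var:-}"
  if [ -n "$file" ]; then
    eval "export $var=\"\$(cat \"\$file\")\""
  fi
}

# Demo with a stand-in for /run/secrets/db_password:
echo "s3cret" > /tmp/db_password
export DB_PASSWORD_FILE=/tmp/db_password
file_env DB_PASSWORD
echo "$DB_PASSWORD"   # the secret read from the file, never passed as an env value at launch
```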

Rules:

  • Never hard-code secrets in Dockerfiles or compose files
  • Exclude secret files from Git (.gitignore)
  • Rotate secrets regularly
  • Grant secrets only to services that need them
  • Scan for leaks with tools like gitleaks or trufflehog

For broader credential management patterns and integration with self-hosted Vaultwarden, see Credential Management. For a real-world deployment using these patterns, see MCP Gateway.

Network Isolation

Use Docker networks to isolate services:

services:
  web:
    networks:
      - frontend
  api:
    networks:
      - frontend
      - backend
  database:
    networks:
      - backend  # not exposed to web
 
networks:
  frontend:
  backend:
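
To keep the backend network off the internet entirely, mark it internal—containers on it can still reach each other, but lose any outbound route:

```yaml
networks:
  frontend:
  backend:
    internal: true  # no route to the outside world
```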

Resource Limits

Prevent runaway containers from exhausting host resources:

services:
  web:
    image: nginx:1.28.2
    deploy:
      resources:
        limits:
          cpus: '1.5'
          memory: 512M
        reservations:
          cpus: '0.5'
          memory: 256M

Or via command line:

docker run -d \
  --name prod-app \
  --memory=512m \
  --memory-reservation=256m \
  --cpus=2 \
  nginx:1.28.2

Guidelines:

  • Always set memory limits to prevent OOM crashes
  • Set reservation (typical usage) and limit (max allowed)
  • Disable swap for predictable performance: --memory-swap=512m (same as --memory)
  • Reserve 1+ core for host OS
  • Monitor with docker stats and adjust based on real usage
  • Load test under stress before production

Read-Only Filesystems

Use read-only root filesystems where possible:

services:
  web:
    image: myapp:latest
    read_only: true
    volumes:
      - /app/data  # writable exception
    tmpfs:
      - /tmp
      - /var/run

Self-Hosted Registry

Authentication

Secure your private registry with authentication:

Basic auth (simple):

# Generate htpasswd file
htpasswd -Bbn username password > /path/to/htpasswd
 
# Run registry with auth
docker run -d -p 5000:5000 \
  -v /path/to/htpasswd:/auth/htpasswd \
  -e REGISTRY_AUTH=htpasswd \
  -e REGISTRY_AUTH_HTPASSWD_REALM="Registry Realm" \
  -e REGISTRY_AUTH_HTPASSWD_PATH=/auth/htpasswd \
  registry:2

Token-based auth (advanced): For RBAC and fine-grained permissions, implement JWT token service with REGISTRY_AUTH=token.

TLS Encryption

Always use TLS to encrypt registry traffic:

docker run -d -p 5000:5000 \
  -v /certs:/certs \
  -e REGISTRY_HTTP_TLS_CERTIFICATE=/certs/domain.crt \
  -e REGISTRY_HTTP_TLS_KEY=/certs/domain.key \
  registry:2

Use valid certificates (Let’s Encrypt) for production; self-signed certs work for internal networks but require client configuration.

Image Signing (Docker Content Trust)

Verify image integrity with content trust:

# Enable for all pulls
export DOCKER_CONTENT_TRUST=1
 
# Sign on push
docker trust signer add --key public.key mysigner myregistry.com/myimage
docker push myregistry.com/myimage:v1.0.0

Prevents MITM attacks and ensures images haven’t been tampered with.

Registry Hardening

  • Run registry container as non-root (-u 1000)
  • Use read-only filesystem where possible
  • Isolate with firewall (expose only port 5000)
  • Audit regularly: delete unused images, monitor access logs
  • Enable automated vulnerability scanning on push (JFrog Xray, etc.)

Health Checks & Monitoring

Health Checks

Define health checks for automatic restarts and orchestration awareness:

In Dockerfile:

HEALTHCHECK --interval=30s --timeout=3s --start-period=30s --retries=3 \
  CMD curl -f http://localhost:3000/health || exit 1

In Docker Compose:

services:
  web:
    image: myapp:latest
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:3000/health"]
      interval: 10s
      timeout: 5s
      retries: 3
      start_period: 120s
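
Health status can also gate startup order: with the long-form depends_on, a dependent service waits until the checked service reports healthy (the worker service here is illustrative):

```yaml
services:
  worker:
    image: myworker:latest
    depends_on:
      web:
        condition: service_healthy  # wait for web's healthcheck to pass
```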

Guidelines:

  • Test real dependencies: HTTP endpoints, database connections, critical files
  • Tune parameters: 10-30s interval (avoid <10s overhead), 3-5s timeout, 3 retries
  • Use start_period for slow-starting apps (30-120s)
  • Monitor status: docker ps shows healthy/unhealthy, docker inspect for details
  • Don’t rely on superficial checks—simulate user operations

Note: Health checks don’t apply in Kubernetes (use liveness/readiness probes instead).

Logging

Configure log rotation to prevent disk exhaustion:

services:
  web:
    image: nginx:1.28.2
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "3"

For centralized logging, consider:

  • Log drivers: syslog, journald, fluentd, gelf
  • External aggregators: ELK stack, Grafana Loki, Seq
  • Health check log inspection: grep -q "FATAL" /var/log/app.log && exit 1

Rollback Strategies

Version Pinning Foundation

Maintain detailed deployment records for rapid recovery:

  • Docker image SHAs (exact versions)
  • Database schema versions (migration state)
  • Dependency hashes (lock files)
  • Infrastructure state (Terraform, Ansible)

This prevents environmental drift (the common scenario where production diverges from testing/staging due to inconsistent versions, untracked config changes, or manual interventions).

Rollback Approaches

| Strategy | Rollback Time | Use Case | Pros/Cons |
|----------|---------------|----------|-----------|
| Blue/Green | <2 minutes | Mission-critical | Instant switch; requires 2x resources |
| Canary | <30 seconds | Microservices | Gradual rollout; complex routing |
| Shadow Traffic | Instant | High-risk changes | Zero user impact; requires duplication |
| Direct rollback | Variable | Simple deployments | Easy; may need data rollback |

Docker Swarm rollback:

docker service rollback my_service

Compose rollback (manual):

# Update compose file to previous image version
docker compose pull
docker compose up -d
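
The manual step can be scripted: rewrite the pinned tag to the previous known-good SHA, then redeploy. A hedged sketch (registry path and SHAs are illustrative; the compose file is generated here only to demonstrate the rewrite):

```shell
#!/bin/sh
# Stand-in compose file pinned to the "bad" release:
printf 'services:\n  web:\n    image: registry.example.com/myapp:def5678\n' > docker-compose.yml

# Roll back: pin the previous known-good SHA (from your deploy log).
PREV_SHA=abc1234
sed -i.bak -E "s|(registry\.example\.com/myapp:)[[:alnum:]]+|\1${PREV_SHA}|" docker-compose.yml
grep 'image:' docker-compose.yml

# Then redeploy the pinned version:
# docker compose pull && docker compose up -d
```

The .bak copy left by sed doubles as a record of what was running before the rollback.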

Pre-Deployment Preparation

  • Database snapshots: Point-in-time recovery with Liquibase/Flyway
  • Tag metrics with release versions for correlation
  • Pre-configure rollback dashboards for rapid decision-making
  • Test rollback quarterly (“rollback fire drills”)
  • Validate post-rollback: Ensure database migrations rolled back correctly

Common pitfall: Application code rolls back before database changes, causing data corruption. Always coordinate schema and code rollbacks.


Production Checklist

Do:

  • ✅ Multi-stage builds with minimal base images
  • ✅ Pin image versions (tags or SHAs, never latest)
  • ✅ Run as non-root users
  • ✅ Use secrets files (not environment variables)
  • ✅ Enable health checks
  • ✅ Set resource limits (CPU, memory)
  • ✅ Scan images for vulnerabilities
  • ✅ Use TLS for registries
  • ✅ Log rotation configured
  • ✅ Test rollback procedures
  • ✅ Network isolation
  • ✅ Read-only filesystems where possible

Don’t:

  • ❌ Use latest tags in production
  • ❌ Store secrets in ENV variables or images
  • ❌ Run containers as root
  • ❌ Give CI/CD full Docker daemon access on prod servers
  • ❌ Skip health checks
  • ❌ Ignore resource limits
  • ❌ Deploy without rollback plan
  • ❌ Use privileged mode (--privileged)
  • ❌ Mount Docker socket without isolation
  • ❌ Forget to rotate secrets

Context: Commune Sandbox Deployment

This guide was written in response to discussion on commune/sandbox issue #25 about deploying sandbox environments to hosts. Key considerations for our use case:

  • Custom builds: We build our own images, so multi-stage builds and caching matter
  • Self-hosted registry: Authentication, TLS, and access control are our responsibility
  • Minimal infrastructure: Simplicity matters—Watchtower is reasonable for non-critical services
  • Controlled deployments: For production, CI/CD with explicit deploy steps beats full automation

Watchtower decision: Brad chose Watchtower for simplicity, with host-side registry pulls for safety. This is appropriate for the sandbox environment with proper label filtering.


Sources

Footnotes

  1. Empirical examples show Flask apps shrinking from 523MB to 273MB (~48%), Django apps achieving similar reductions, and Go apps compiling to <20MB from much larger build images. Source: NickJanetakis.com, OneUpTime Docker optimization guide (2026-01-25)

  2. See containrrr/watchtower#2135 (archival discussion) and containrrr/watchtower#1993 (inactive development concerns, July 2024). Community fork maintained by nicholas-fedor remains active as of May 2025.

  3. Diun maintained by crazy-max with regular releases. See crazy-max/diun on GitHub.