# Docker Deployment Best Practices

A practical guide to secure, maintainable Docker deployments for self-hosted infrastructure.

## Overview
Docker enables reproducible deployments through containerization, but production usage requires careful attention to security, reliability, and maintainability. This guide covers best practices for teams running self-hosted infrastructure with custom builds and private registries—balancing simplicity with safety.
Key themes:
- Build secure, minimal images
- Deploy predictably with rollback capability
- Manage secrets properly (never in environment variables)
- Monitor health and enforce resource limits
- Balance automation with control
## Image Building

### Multi-Stage Builds

Multi-stage builds are the foundation of secure, efficient images. They separate build-time dependencies from runtime, resulting in:
- ~50% smaller images[^1] (fewer layers, no build tools)
- Faster builds (better caching, parallel stages)
- Reduced attack surface (no compilers, dev tools, or secrets in final image)
Pattern:

```dockerfile
# Stage 1: Build dependencies
FROM node:20-alpine AS deps
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci

# Stage 2: Build application
FROM node:20-alpine AS builder
WORKDIR /app
COPY --from=deps /app/node_modules ./node_modules
COPY . .
RUN npm run build

# Stage 3: Production runtime (minimal)
FROM node:20-alpine AS runtime
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY package.json package-lock.json ./
RUN npm ci --omit=dev && npm cache clean --force
EXPOSE 3000
CMD ["node", "dist/index.js"]
```

Key techniques:
- Name stages clearly (`AS builder`, `AS runtime`) for readability
- Copy stable files first (`package.json` before source code) to maximize cache hits
- Use minimal base images (`alpine`, `slim` variants, or distroless)
- Combine cleanup in the same layer: `RUN npm install && npm cache clean --force`
- Use `.dockerignore` to exclude tests, docs, and temp files
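A `.dockerignore` file keeps the build context small and prevents local artifacts and secrets from leaking into layers. The entries below are a typical starting point (illustrative; adjust to the project):

```
# .dockerignore — keep the build context minimal
.git
node_modules
dist
tests/
*.md
.env
secrets/
```

Note that `node_modules` and `dist` are safe to exclude here because the multi-stage build above installs and builds inside the container.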
For compiled languages (Go, Rust), the final stage can be even leaner:
```dockerfile
# Builder
FROM golang:1.24-alpine AS builder
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 GOOS=linux go build -o /app/bin/app

# Runtime (no Go toolchain; pinned base, per the version-pinning advice below)
FROM alpine:3.21
WORKDIR /root/
COPY --from=builder /app/bin/app .
CMD ["./app"]
```

### Security Scanning
Scan images regularly for vulnerabilities using tools like Trivy, Docker Scout, or Clair. Integrate into CI/CD pipelines to fail builds on high-severity CVEs.
Pattern:

```shell
docker buildx build --target runtime -t myapp:latest .
trivy image --severity HIGH,CRITICAL myapp:latest
```

- Scan all stages (builder and final) during development
- Use minimal base images to reduce known CVEs
- Rebuild frequently to incorporate upstream security patches
- Consider image signing (Docker Content Trust, Cosign) for supply chain verification
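As a sketch of the CI integration, the job below (GitHub-Actions-compatible syntax, which Forgejo Actions also accepts) fails the build on high-severity CVEs via the real `aquasecurity/trivy-action` inputs `image-ref`, `severity`, and `exit-code`; the image name `myapp` is illustrative:

```yaml
jobs:
  scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: docker build --target runtime -t myapp:${{ github.sha }} .
      - uses: aquasecurity/trivy-action@master
        with:
          image-ref: myapp:${{ github.sha }}
          severity: HIGH,CRITICAL
          exit-code: '1'   # non-zero exit fails the pipeline on findings
```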
### Non-Root Users
Run containers as non-root users to limit privilege escalation risks.
```dockerfile
FROM alpine:3.21
RUN addgroup -S appgroup && adduser -S appuser -G appgroup
USER appuser
WORKDIR /home/appuser
COPY --chown=appuser:appgroup . .
CMD ["./app"]
```

## Deployment Strategies
### Version Pinning

Never use the `latest` tag in production. Pin specific versions with tags or digests for reproducibility and rollback capability.
Bad:

```yaml
services:
  web:
    image: nginx:latest  # unpredictable updates
```

Good:

```yaml
services:
  web:
    image: nginx:1.28.2  # stable as of Feb 2026; use a SHA256 digest for an absolute guarantee
```

Keep separate compose files for environments:

- `docker-compose.dev.yml` (latest tags, fast iteration)
- `docker-compose.prod.yml` (pinned versions, stability)
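Pinning by digest looks like the sketch below. The digest shown is a placeholder, not a real nginx digest; resolve the actual value for a pulled image with `docker inspect --format '{{index .RepoDigests 0}}' nginx:1.28.2`:

```yaml
services:
  web:
    # A digest pin survives tag reuse; the value below is illustrative only.
    image: nginx@sha256:0000000000000000000000000000000000000000000000000000000000000000
```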
### Watchtower (Automated Updates) ⚠️ DISCONTINUED

⚠️ Important: The original containrrr/watchtower project is being archived[^2] (development slowed through 2024, formal archival announced 2025). This section is kept for reference; use one of the alternatives below for new deployments.
#### Legacy Watchtower Overview
Watchtower automated container updates by monitoring registries and restarting containers with newer images. It was useful for low-stakes environments, but required careful configuration.
Best practices (historical):

- Exclude critical services (databases, stateful apps) using labels: `com.centurylinklabs.watchtower.enable=false`
- Schedule updates during low-traffic windows: `WATCHTOWER_SCHEDULE=0 0 4 * * SUN` (4 AM Sundays)
- Enable cleanup: `WATCHTOWER_CLEANUP=true` removes old images
- Use label-based filtering: `WATCHTOWER_LABEL_ENABLE=true` updates only tagged containers
- Consider notification-only mode for high-stakes setups: get alerts, apply manually
Example (selective updates):

```yaml
version: '3.8'
services:
  watchtower:
    image: containrrr/watchtower  # DEPRECATED
    container_name: watchtower
    restart: unless-stopped
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    environment:
      - WATCHTOWER_SCHEDULE=0 0 4 * * 0  # Sundays at 4 AM
      - WATCHTOWER_CLEANUP=true
      - WATCHTOWER_LABEL_ENABLE=true
  web:
    image: nginx:1.28.2
    labels:
      - "com.centurylinklabs.watchtower.enable=true"  # auto-update OK
  database:
    image: postgres:15.16
    labels:
      - "com.centurylinklabs.watchtower.enable=false"  # manual only
```

Security concerns:
- Mounting `/var/run/docker.sock` gives full Docker control; isolate Watchtower in its own network
- Auto-pulling from public registries risks supply chain attacks; prefer private registries with verified images
- No built-in vulnerability scanning; pair with Trivy or Docker Scout
#### Active Alternatives
For notification-only (recommended for safety): **Diun** (Docker Image Update Notifier)

- Lightweight (~10MB image)
- Checks registries for updates without auto-pulling
- Sends notifications (email, Slack, Discord, webhooks)
- Good for control-focused deployments where updates require approval
- Actively maintained (4.4k stars on GitHub as of Feb 2026)[^3]
```yaml
version: '3.8'
services:
  diun:
    image: crazymax/diun:latest
    container_name: diun
    restart: unless-stopped
    volumes:
      - "./data:/data"
      - "/var/run/docker.sock:/var/run/docker.sock"
    environment:
      - "TZ=America/Los_Angeles"
      - "LOG_LEVEL=info"
      - "DIUN_WATCH_WORKERS=20"
      - "DIUN_WATCH_SCHEDULE=0 */6 * * *"  # Check every 6 hours
      - "DIUN_PROVIDERS_DOCKER=true"
      - "DIUN_NOTIF_DISCORD_WEBHOOKURL=https://discord.com/api/webhooks/..."
    labels:
      - "diun.enable=true"
```

For auto-update with web UI: **What's up Docker (WUD)**
- Actively maintained (2024-2025 commits)
- Web interface for version tracking and management
- Label-based update policies
- Supports multiple registries (Docker Hub, GHCR, private)
- Triggers (webhooks, MQTT) for custom automation
```yaml
version: '3.8'
services:
  wud:
    image: fmartinou/whats-up-docker:latest
    container_name: wud
    restart: unless-stopped
    ports:
      - "3000:3000"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    environment:
      - WUD_WATCHER_LOCAL_SOCKET=/var/run/docker.sock
```

Access the web UI at http://localhost:3000 to see available updates and trigger them manually.
For a drop-in Watchtower replacement: **nickfedor/watchtower** (community fork)
- Active community fork (2024-2025 commits)
- Same behavior and config as original Watchtower
- Drop-in replacement: just change image name
```yaml
services:
  watchtower:
    image: nickfedor/watchtower:latest  # Active fork
    # ... rest of config identical to original
```

Recommendation: For production or critical environments, use Diun (notification-only) to alert you of updates, then apply them manually with CI/CD. For low-stakes/homelab setups, WUD offers a good balance of automation and control.
### CI/CD Deploy Pattern
For production systems, prefer controlled CI/CD over full automation. See Forgejo for our self-hosted CI/CD workflows and deployment patterns.
SSH with restricted keys:

- Generate a deploy-only SSH key (no shell access)
- Restrict with `authorized_keys` options: `no-port-forwarding,no-X11-forwarding,no-agent-forwarding,no-pty ssh-rsa AAAAB3...`
- Allow only specific commands, e.g. `docker compose pull && docker compose up -d`
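A forced-command entry ties the key to a single operation. This is a sketch: the wrapper path `/opt/myapp/deploy.sh` (a script that runs `docker compose pull && docker compose up -d`) and the key comment are hypothetical, and the key material is truncated:

```
# ~/.ssh/authorized_keys on the production host (one line per key)
command="/opt/myapp/deploy.sh",no-port-forwarding,no-X11-forwarding,no-agent-forwarding,no-pty ssh-ed25519 AAAAC3... deploy@ci
```

With `command=` set, sshd ignores whatever command the client requests and runs the wrapper instead, so even a leaked key can only trigger a deploy.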
Pipeline pattern:

```shell
# CI builds and pushes to registry
docker build -t registry.example.com/myapp:$GIT_SHA .
docker push registry.example.com/myapp:$GIT_SHA

# Deploy job (SSH to host)
ssh deploy@production.example.com << 'EOF'
cd /opt/myapp
docker compose pull
docker compose up -d
EOF
```

Benefits over automated update tools:
- Explicit deployment events (audit trail)
- Test in staging first
- Controlled rollout timing
- Easy rollback (just deploy previous SHA)
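One way to make "deploy the previous SHA" a one-line operation is to parameterize the image tag. This is a sketch; `APP_TAG` and the registry path are hypothetical names:

```yaml
# docker-compose.yml (fragment)
services:
  web:
    image: registry.example.com/myapp:${APP_TAG:?set APP_TAG to a git SHA}
```

Deploying then becomes `APP_TAG=$GIT_SHA docker compose up -d`, and rolling back is the same command with a previously deployed SHA.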
See Static Site Hosting for examples of CI/CD deployment to self-hosted infrastructure.
## Security Hardening

### Secrets Management

Never use environment variables for secrets. They leak via `docker inspect`, logs, process lists, and image metadata.
Use file-based mounts instead:
| Method | Use Case | Implementation |
|---|---|---|
| Docker Secrets (Swarm) | Production clusters | `docker secret create db_pass db_pass.txt`; `--secret db_pass` mounts at `/run/secrets/db_pass` |
| Docker Compose Secrets | Development/stacks | Define in compose: `secrets: db_password: file: ./secrets/db_pass.txt` |
| BuildKit Build Secrets | Build-time credentials | `docker build --secret id=mysecret,src=secret.txt .` (temporary mount) |
| External managers | Advanced (Vault, etc.) | Sidecar fetches secrets as files |
Example (Compose secrets):

```yaml
version: '3.8'
services:
  web:
    image: myapp:latest
    secrets:
      - db_password
    environment:
      - DB_PASSWORD_FILE=/run/secrets/db_password  # app reads file
secrets:
  db_password:
    file: ./secrets/db_password.txt  # gitignored
```

Rules:
- Never hard-code secrets in Dockerfiles or compose files
- Exclude secret files from Git (`.gitignore`)
- Rotate secrets regularly
- Grant secrets only to services that need them
- Scan for leaks with tools like `gitleaks` or `trufflehog`
For broader credential management patterns and integration with self-hosted Vaultwarden, see Credential Management. For a real-world deployment using these patterns, see MCP Gateway.
### Network Isolation
Use Docker networks to isolate services:
```yaml
services:
  web:
    networks:
      - frontend
  api:
    networks:
      - frontend
      - backend
  database:
    networks:
      - backend  # not exposed to web
networks:
  frontend:
  backend:
```

### Resource Limits
Prevent runaway containers from exhausting host resources:
```yaml
services:
  web:
    image: nginx:1.28.2
    deploy:
      resources:
        limits:
          cpus: '1.5'
          memory: 512M
        reservations:
          cpus: '0.5'
          memory: 256M
```

Or via the command line:
```shell
docker run -d \
  --name prod-app \
  --memory=512m \
  --memory-reservation=256m \
  --cpus=2 \
  nginx:1.28.2
```

Guidelines:
- Always set memory limits to prevent OOM crashes
- Set a reservation (typical usage) and a limit (max allowed)
- Disable swap for predictable performance: `--memory-swap=512m` (same as `--memory`)
- Reserve 1+ core for the host OS
- Monitor with `docker stats` and adjust based on real usage
- Load test under stress before production
### Read-Only Filesystems
Use read-only root filesystems where possible:
```yaml
services:
  web:
    image: myapp:latest
    read_only: true
    volumes:
      - /app/data  # writable exception
    tmpfs:
      - /tmp
      - /var/run
```

## Self-Hosted Registry
### Authentication
Secure your private registry with authentication:
Basic auth (simple):

```shell
# Generate htpasswd file
htpasswd -Bbn username password > /path/to/htpasswd

# Run registry with auth
docker run -d -p 5000:5000 \
  -v /path/to/htpasswd:/auth/htpasswd \
  -e REGISTRY_AUTH=htpasswd \
  -e REGISTRY_AUTH_HTPASSWD_REALM="Registry Realm" \
  -e REGISTRY_AUTH_HTPASSWD_PATH=/auth/htpasswd \
  registry:2
```

Token-based auth (advanced):
For RBAC and fine-grained permissions, implement a JWT token service with `REGISTRY_AUTH=token`.
### TLS Encryption
Always use TLS to encrypt registry traffic:
```shell
docker run -d -p 5000:5000 \
  -v /certs:/certs \
  -e REGISTRY_HTTP_TLS_CERTIFICATE=/certs/domain.crt \
  -e REGISTRY_HTTP_TLS_KEY=/certs/domain.key \
  registry:2
```

Use valid certificates (Let's Encrypt) for production; self-signed certs work for internal networks but require client configuration.
### Image Signing (Docker Content Trust)
Verify image integrity with content trust:
```shell
# Enable content trust for all pulls and pushes
export DOCKER_CONTENT_TRUST=1

# Add a signer to the repository (signer name "mysigner" is illustrative)
docker trust signer add --key public.key mysigner myregistry.com/myimage

# Pushes are signed automatically while DOCKER_CONTENT_TRUST=1
docker push myregistry.com/myimage:v1.0.0
```

This prevents MITM attacks and ensures images haven't been tampered with.
### Registry Hardening
- Run the registry container as non-root (`-u 1000`)
- Use a read-only filesystem where possible
- Isolate with a firewall (expose only port 5000)
- Audit regularly: delete unused images, monitor access logs
- Enable automated vulnerability scanning on push (JFrog Xray, etc.)
## Health Checks & Monitoring

### Health Checks
Define health checks for automatic restarts and orchestration awareness:
In a Dockerfile:

```dockerfile
HEALTHCHECK --interval=30s --timeout=3s --start-period=30s --retries=3 \
  CMD curl -f http://localhost:3000/health || exit 1
```

In Docker Compose:
```yaml
services:
  web:
    image: myapp:latest
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:3000/health"]
      interval: 10s
      timeout: 5s
      retries: 3
      start_period: 120s
```

Guidelines:
- Test real dependencies: HTTP endpoints, database connections, critical files
- Tune parameters: 10-30s interval (avoid the overhead of <10s), 3-5s timeout, 3 retries
- Use `start_period` for slow-starting apps (30-120s)
- Monitor status: `docker ps` shows healthy/unhealthy; `docker inspect` for details
- Don't rely on superficial checks; simulate user operations
Note: Health checks don’t apply in Kubernetes (use liveness/readiness probes instead).
### Logging
Configure log rotation to prevent disk exhaustion:
```yaml
services:
  web:
    image: nginx:1.28.2
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "3"
```

For centralized logging, consider:
- Log drivers: `syslog`, `journald`, `fluentd`, `gelf`
- External aggregators: ELK stack, Grafana Loki, Seq
- Health check log inspection: `grep -q "FATAL" /var/log/app.log && exit 1`
## Rollback Strategies

### Version Pinning Foundation
Maintain detailed deployment records for rapid recovery:
- Docker image SHAs (exact versions)
- Database schema versions (migration state)
- Dependency hashes (lock files)
- Infrastructure state (Terraform, Ansible)
This prevents environmental drift (the common scenario where production diverges from testing/staging due to inconsistent versions, untracked config changes, or manual interventions).
### Rollback Approaches
| Strategy | Rollback Time | Use Case | Pros/Cons |
|---|---|---|---|
| Blue/Green | <2 minutes | Mission-critical | Instant switch; requires 2x resources |
| Canary | <30 seconds | Microservices | Gradual rollout; complex routing |
| Shadow Traffic | Instant | High-risk changes | Zero user impact; requires duplication |
| Direct rollback | Variable | Simple deployments | Easy; may need data rollback |
Docker Swarm rollback:

```shell
docker service rollback my_service
```

Compose rollback (manual):

```shell
# Update the compose file to the previous image version, then:
docker compose pull
docker compose up -d
```

### Pre-Deployment Preparation
- Database snapshots: Point-in-time recovery with Liquibase/Flyway
- Tag metrics with release versions for correlation
- Pre-configure rollback dashboards for rapid decision-making
- Test rollback quarterly (“rollback fire drills”)
- Validate post-rollback: Ensure database migrations rolled back correctly
Common pitfall: Application code rolls back before database changes, causing data corruption. Always coordinate schema and code rollbacks.
## Production Checklist
Do:
- ✅ Multi-stage builds with minimal base images
- ✅ Pin image versions (tags or SHAs, never `latest`)
- ✅ Run as non-root users
- ✅ Use secrets files (not environment variables)
- ✅ Enable health checks
- ✅ Set resource limits (CPU, memory)
- ✅ Scan images for vulnerabilities
- ✅ Use TLS for registries
- ✅ Log rotation configured
- ✅ Test rollback procedures
- ✅ Network isolation
- ✅ Read-only filesystems where possible
Don’t:
- ❌ Use `latest` tags in production
- ❌ Store secrets in ENV variables or images
- ❌ Run containers as root
- ❌ Give CI/CD full Docker daemon access on prod servers
- ❌ Skip health checks
- ❌ Ignore resource limits
- ❌ Deploy without rollback plan
- ❌ Use privileged mode (`--privileged`)
- ❌ Mount the Docker socket without isolation
- ❌ Forget to rotate secrets
## Context: Commune Sandbox Deployment
This guide was written in response to discussion on commune/sandbox issue #25 about deploying sandbox environments to hosts. Key considerations for our use case:
- Custom builds: We build our own images, so multi-stage builds and caching matter
- Self-hosted registry: Authentication, TLS, and access control are our responsibility
- Minimal infrastructure: Simplicity matters—Watchtower is reasonable for non-critical services
- Controlled deployments: For production, CI/CD with explicit deploy steps beats full automation
Watchtower decision: Brad chose Watchtower for simplicity, with host-side registry pulls for safety. This is appropriate for the sandbox environment with proper label filtering.
## Sources

### Primary Sources
- Docker official documentation: Multi-stage builds, Best practices, Swarm secrets, Resource constraints
- Docker Hub official images: nginx (1.28.2 stable, 1.29.5 mainline as of Feb 2026), node (20.20.0-alpine), postgres (15.16 latest in 15.x series)
- Watchtower GitHub: containrrr/watchtower (archived), nickfedor/watchtower (community fork)
- Diun (Docker Image Update Notifier): crazy-max/diun
### Secondary Sources
- Multi-stage build size reduction: Nick Janetakis “Shrink Your Docker Images by 50 Percent” (Flask 523MB→273MB example), OneUpTime “Optimize Docker Image Size” (Jan 2026)
- Docker deployment best practices: Blog posts from 2024-2025 on Docker security, deployment patterns, monitoring
- Container security: SentinelOne, GitGuardian, Wiz, Last9, Lumigo guides (2024-2025)
- Rollback strategies: QABrains “Ultimate Guide to Rollback Strategy” (blue/green, canary patterns)
### Further Reading
- Docker 2024 highlights (Build Cloud, MSI/PKG installers)
- Docker secrets management (file-based approaches)
- Container security best practices (runtime hardening)
## Footnotes

[^1]: Empirical examples show Flask apps shrinking from 523MB to 273MB (~48%), Django apps achieving similar reductions, and Go apps compiling to <20MB from much larger build images. Source: NickJanetakis.com, OneUpTime Docker optimization guide (2026-01-25)

[^2]: See containrrr/watchtower#2135 (archival discussion) and containrrr/watchtower#1993 (inactive development concerns, July 2024). Community fork maintained by nicholas-fedor remains active as of May 2025.

[^3]: Diun maintained by crazy-max with regular releases. See crazy-max/diun on GitHub.