Best Way To Run Docker On Proxmox

#Virtualization #Containers #Cloud #Docker #Proxmox #LXC

Optimizing Docker Deployment on Proxmox: Practical Approaches for Performance and Reliability

A recurring scenario: a team aims to consolidate bare-metal workloads, but the Docker-on-Proxmox debate resurfaces. LXC for footprint, KVM for true isolation—but what’s the optimal setup?

Below—field-tested strategies, recommended configurations, and relevant caveats for running Docker under Proxmox VE (≥7.4) in production-like environments. Skip generic walkthroughs; focus on technical trade-offs, non-obvious tips, and operational best practices.


Why Layer Docker on Proxmox?

On physical hosts, Docker’s own container isolation suffices. Under Proxmox VE, running Docker in a VM or an LXC brings:

  • Process isolation for multi-tenant, regulated, or lab environments.
  • Resource capping and guarantees via Proxmox’s CPU/mem limits.
  • Backup and snapshot management (notably, near-instant ZFS snapshots).
  • Mixed kernel stacks: critical for workloads such as Kubernetes-in-Docker or GPU-accelerated ML.

Running Docker “bare-metal” on Proxmox (i.e., directly on the host), while possible, is discouraged: it complicates upgrades, backup strategies, and node composability.


LXC vs. KVM VM for Docker – Technical Analysis

LXC Containerized Docker (Linux ≥5.15)

Advantages:

  • Minimal overhead; LXC containers share Proxmox’s kernel.
  • Fast startup, efficient resource sharing.
  • Quicker snapshots/restore (good with ZFS/BTRFS).

Drawbacks:

  • Kernel is fixed by Proxmox host—limits use of certain modules, AppArmor, or custom drivers (e.g., NVIDIA kernel DKMS).
  • Nested containerization is finicky; unprivileged LXCs block many Docker subfeatures due to user namespace mapping.

KVM VM

Advantages:

  • Full guest kernel flexibility—run any distro, enable custom drivers (SR-IOV, GPU passthrough, etc.).
  • Better isolation: mitigates “noisy neighbor” effects and secures against LXC kernel exploits.
  • Friendlier for commercial support scenarios (VM-level DR, migration, etc.).

Drawbacks:

  • Overhead: ≈10% CPU/memory penalty vs LXC in synthetic testing.
  • Slower backup and restore, though mitigated by VM snapshots.

Conclusion:
Routine containerized services (CI runners, stateless apps, minor databases): LXC (privileged).
Demanding workloads, complex network/storage, or device passthrough (CUDA, ROCm): use KVM-based VMs.


Practical: Configuring Docker in LXC

Proxmox caveats:

  • Unprivileged LXC (unprivileged=1) + Docker = permission errors (failed to create shim task: OCI runtime error: permission denied).
  • Use privileged LXC (unprivileged=0) with nesting enabled.

Example LXC Config

# Create privileged container from Ubuntu 22.04 template
pct create 122 local:vztmpl/ubuntu-22.04-standard_22.04-1_amd64.tar.zst \
  -unprivileged 0 -features nesting=1 -cores 4 -memory 8192 \
  -net0 name=eth0,bridge=vmbr0,ip=dhcp
pct start 122
pct exec 122 -- bash

Install Docker (inside LXC):

apt update
apt install -y docker.io
systemctl enable --now docker
docker run --rm hello-world

Note:
docker-compose (legacy v1) or docker compose (the Compose v2 plugin) may require additional packages; a sketch follows.
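
A minimal sketch for Compose v2, assuming Docker’s official apt repository is configured (docker-compose-plugin is its package name there; it is not part of Ubuntu’s docker.io):

# Install Compose v2 as a CLI plugin and verify
apt install -y docker-compose-plugin
docker compose version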

Known Issue:
Running the LXC with a strict AppArmor profile may block Docker’s networking. Disable AppArmor or loosen the profile as needed (a sketch follows), but document this change for auditing.
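
One option is to run the CT unconfined via a raw LXC key in its config; this weakens isolation, so record the change:

# /etc/pve/lxc/122.conf: append to drop the AppArmor profile for this CT
lxc.apparmor.profile: unconfined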


VM-Based Docker: When More Isolation is Needed

Provision VM (Debian or Ubuntu 22.04 LTS) via Proxmox GUI/qm CLI. Ensure:

  • VirtIO SCSI storage controller.
  • Sufficient CPU flags passed for nested virtualization if running dind (docker-in-docker) or Kubernetes.
  • Optional: PCI(e) passthrough for GPU/DPU use.

A VM with a cloud-init drive enables seamless, repeatable CI deployments; a sketch follows.
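
A minimal sketch using a Debian cloud image; the VM ID, storage name, image file, and user are placeholders to adapt:

# Create the VM shell and import a cloud image (IDs/storage are examples)
qm create 200 --name docker-vm --memory 8192 --cores 4 \
  --net0 virtio,bridge=vmbr0 --scsihw virtio-scsi-pci
qm importdisk 200 debian-12-genericcloud-amd64.qcow2 local-lvm
qm set 200 --scsi0 local-lvm:vm-200-disk-0 --boot order=scsi0
# Attach a cloud-init drive and seed credentials/network
qm set 200 --ide2 local-lvm:cloudinit
qm set 200 --ciuser ci --sshkeys ~/.ssh/id_ed25519.pub --ipconfig0 ip=dhcp
qm start 200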


Storage Strategy: Data Resiliency and I/O Optimization

  • Data directory separation: Always mount /var/lib/docker on a dedicated disk or ZFS dataset.
  • ZFS examples:
# On Proxmox host:
zfs create rpool/data/docker-lxc-122
pct set 122 -mp0 /rpool/data/docker-lxc-122,mp=/var/lib/docker
  • Avoid layering Docker’s overlay2 storage driver on top of a directory backed by a ZFS dataset: overlayfs over ZFS has historically been unsupported or slow, which surfaces as hard-to-diagnose performance issues.

  • Note: Docker’s default overlay2 driver is adequate for most use cases, but kernel version and backing filesystem can force a different driver (aufs is long deprecated); verify what the daemon actually selected, as shown below.
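
A quick way to pin and verify the driver; writing daemon.json via a heredoc is just one convention:

# Pin the storage driver explicitly, then confirm what the daemon chose
cat >/etc/docker/daemon.json <<'EOF'
{ "storage-driver": "overlay2" }
EOF
systemctl restart docker
docker info --format '{{.Driver}}'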


Network Considerations – Deconflicting Bridges and Subnets

  • The default Docker bridge (docker0, 172.17.0.0/16 by default) may clash with subnets already in use on Proxmox bridges (vmbr0, etc.).
  • For production: define a user-defined macvlan network so containers are routable on the physical LAN, bypassing VM-level NAT.
# parent must be the guest’s uplink NIC; note that macvlan containers
# cannot reach the host’s own IP by default
docker network create -d macvlan \
  --subnet=10.10.10.0/24 --gateway=10.10.10.1 \
  -o parent=eth0 pub_net
  • Alternatively, assign a static IP to the LXC/VM and rely on Proxmox’s firewall/SDN for macro-segmentation (see the sketch below).
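
A minimal sketch of the static-IP route; addresses are placeholders:

# Pin a static address on the CT, then enforce policy via the Proxmox firewall
pct set 122 -net0 name=eth0,bridge=vmbr0,ip=10.10.10.50/24,gw=10.10.10.1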

Gotcha:
TCP port forwarding at multiple layers (Proxmox NAT, LXC, Docker) creates convoluted network traces—document all mappings early.


Backup, Snapshot, and DR

  • Use Proxmox’s native live snapshot (for LXC/VM) in tandem with application-aware Docker volume backups.
  • For persistent data (Postgres, etc.), regularly export volumes out-of-band (docker cp or bind-mount external volumes).
  • Avoid relying only on filesystem-level snapshots: restoring a snapshot of a “running” database is at best crash-consistent and can yield corruption. A combined sketch follows.
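
A sketch combining both layers; the vzdump flags are standard, while the postgres container and appdb database are hypothetical names:

# Host: snapshot-mode backup of the whole CT
vzdump 122 --mode snapshot --compress zstd --storage local
# Host: application-aware dump pulled out of the guest, out-of-band
pct exec 122 -- docker exec postgres pg_dump -U postgres appdb > /root/appdb.sql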

Monitoring and Troubleshooting

Host:

  • pveperf, proxmox-backup-client, zfs list for I/O health.
  • “Disk full” in Proxmox presents as:
    failed to start container: mkdir /var/lib/docker/containers/xxxx: no space left on device
    

LXC/VM:

  • docker stats, journalctl -u docker, htop.
  • For high inode churn (CI runners, artifact builds), monitor df -i.

Non-obvious tip:
Track Proxmox CT/VM disk images (qcow2, raw, or ZFS volumes) and trim regularly to avoid silent disk ballooning.
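
A minimal trim routine; the CT ID is an example:

# Inside the guest: discard unused blocks so thin storage can reclaim them
fstrim -av
# From the host, per container (for VMs, use the QEMU guest agent's fstrim)
pct fstrim 122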


Automation: IaC for Repeatable Deployments

Leverage a Terraform provider for Proxmox (e.g., Telmate/proxmox, whose proxmox_lxc resource the example below uses, or danitso/terraform-provider-proxmox), paired with Ansible playbooks for Docker provisioning.
Parameterize LXC/VM resource sizing and cloud-init SSH keys.
Example:

resource "proxmox_lxc" "dockerhost" {
  hostname = "docker-lxc"
  ...
  features {
    nesting = true
  }
}

Then in Ansible:

- name: Install Docker
  become: true
  apt:
    name: docker.io
    state: present
    update_cache: true

Physical network IDs, bridge assignments, and storage constraints should be codified, not left implicit.


Summary Table

Scenario                           Recommendation
Routine apps, light services       LXC, privileged, nesting enabled
BYO kernel, GPU, custom modules    KVM VM, hardware passthrough
Data durability, migration         Dedicated ZFS/ext4 for Docker data
Networking                         Macvlan or routed static IPs
Monitoring                         docker stats, log aggregation
Backups                            Snapshots + explicit volume export
Automation                         Terraform Proxmox + Ansible

Final Notes

  • Privileged LXC is best for simplicity in homelabs, but audit its security implications in enterprise environments.
  • Always verify kernel version compatibility: Docker’s overlay2 driver depends on a reasonably recent kernel, and older Proxmox base images may force fallback drivers.
  • Consider alternatives: Containerd or Podman run comparably, but not all orchestration stacks are ready for them.
  • For GPU workloads, PCIe passthrough plus the NVIDIA Container Toolkit (formerly nvidia-docker) in a KVM VM is most reliable as of Docker 24.x and Proxmox 8; a verification sketch follows the list.
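
A minimal check inside such a VM, assuming the NVIDIA driver and nvidia-container-toolkit are installed; the CUDA image tag is only an example:

# Confirm the container runtime sees the passed-through GPU
docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi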

For advanced deployment patterns—such as ephemeral CI/CD runners, or GPU batch compute nodes—additional nuances apply. Further optimization: avoid storing image caches in the same ZFS pool as containers’ writable layers if snapshot frequency is high.