Integrating Datadog Agent with Dockerized Workloads

Operational visibility is non-negotiable for containerized applications, especially with the ephemeral nature of Docker containers. Datadog provides a robust telemetry pipeline for both system and application metrics, logs, and distributed traces. But collecting telemetry from containers isn’t plug-and-play—specific agent strategies are required to avoid blind spots or insecure practices.

Below, I’ll demonstrate a pragmatic approach: embedding the Datadog Agent directly within an Ubuntu-based container image, with pointers on clean runtime configuration, integration enabling, and operational gotchas.

Approaches for Container Monitoring

A quick breakdown:

Approach	Isolation	Complexity	Image size impact	Coverage
Sidecar or host agent	High	Moderate	Low	Per host/node
Embedded in app image	Low	Low	Higher	Per container

Why this matters: A separate agent container removes code/agent coupling and, when run once per host, captures host-wide telemetry. Embedding the agent gives up that clean separation in exchange for a simpler single-container deployment. The focus here is on the embedded pattern, which suits legacy monoliths or images already customized for ops.

Core Prerequisites

Datadog API key. Use a secure secret store (Vault, SSM, etc.) in production; avoid hardcoding.
Docker (20.10+ recommended—recent versions fix mount/volume permission quirks).
Familiarity with Dockerfile and multi-stage builds.
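
The first two prerequisites are easy to check before a build. Here is a minimal preflight sketch; the function names version_ge and preflight are my own, and the 20.10 threshold simply mirrors the recommendation above.

```shell
#!/usr/bin/env bash
# Preflight sketch: checks the prerequisites listed above before building.

# Returns 0 when version $1 is greater than or equal to version $2.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Prints each unmet prerequisite; returns non-zero if any is unmet.
# $1: Docker server version string, $2: API key value (may be empty).
preflight() {
  local docker_version="$1" api_key="$2" status=0
  if ! version_ge "$docker_version" "20.10"; then
    echo "docker $docker_version is older than 20.10"
    status=1
  fi
  if [ -z "$api_key" ]; then
    echo "DD_API_KEY is not set"
    status=1
  fi
  return "$status"
}

# Example wiring (requires a running Docker daemon):
# preflight "$(docker version --format '{{.Server.Version}}')" "${DD_API_KEY:-}"
```

The comparison relies on sort -V, so it assumes GNU coreutils or another sort with version ordering.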

Example: Embedding Datadog Agent (Ubuntu 20.04)

Dockerfile fragment:

FROM ubuntu:20.04

ENV DEBIAN_FRONTEND=noninteractive

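# ca-certificates is needed for HTTPS once recommended packages are skipped.
# The signing key goes into a dedicated keyring because apt-key is deprecated.
RUN apt-get update \
    && apt-get install -y --no-install-recommends \
         ca-certificates curl gnupg apt-transport-https \
    && curl -fsSL https://keys.datadoghq.com/DATADOG_APT_KEY_CURRENT.public \
         | gpg --dearmor -o /usr/share/keyrings/datadog-archive-keyring.gpg \
    && echo "deb [signed-by=/usr/share/keyrings/datadog-archive-keyring.gpg] https://apt.datadoghq.com/ stable 7" > /etc/apt/sources.list.d/datadog.list \
    && apt-get update \
    && apt-get install -y --no-install-recommends datadog-agent \
    && rm -rf /var/lib/apt/lists/*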

# Placeholder: Copy app code after agent is installed
COPY . /app
WORKDIR /app

# Entrypoint: start agent, then your app
CMD service datadog-agent start && exec ./start-my-app.sh

Notes:

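The "stable 7" component of the repository line installs Datadog Agent v7; change the channel if your integration policy requires v6 or a different agent package.
service datadog-agent start backgrounds the agent, so nothing restarts it if it exits. If you require fine-grained agent process management (systemd, supervisord), additional tweaks are necessary. On Debian-based images, service also launches init scripts with a sanitized environment, so DD_* variables passed to the container may not reach the agent; confirm with datadog-agent status.
Linux capabilities (--cap-add=SYS_PTRACE, etc.) may be required in strict container runtimes for full process monitoring or profiling.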
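
If the agent does not pick up DD_API_KEY from the container environment, one option is an entrypoint that writes the settings into /etc/datadog-agent/datadog.yaml before starting the agent. What follows is a sketch under assumptions, not the official startup method: the file name entrypoint.sh and the function render_agent_config are hypothetical, and start-my-app.sh is the placeholder from the Dockerfile above.

```shell
#!/usr/bin/env bash
# entrypoint.sh (hypothetical name): write agent settings, start agent, exec app.
set -eu

# Writes a minimal datadog.yaml to the path in $1 using DD_API_KEY and DD_SITE.
render_agent_config() {
  local target="$1"
  if [ -z "${DD_API_KEY:-}" ]; then
    echo "DD_API_KEY is not set; refusing to write $target" >&2
    return 1
  fi
  # Subshell keeps the restrictive umask local to this write.
  ( umask 077
    printf 'api_key: "%s"\nsite: "%s"\n' \
      "$DD_API_KEY" "${DD_SITE:-datadoghq.com}" > "$target" )
}

main() {
  local conf=/etc/datadog-agent/datadog.yaml
  render_agent_config "$conf"
  # The agent runs as the dd-agent user, so it must be able to read the file.
  chown dd-agent:dd-agent "$conf"
  service datadog-agent start
  exec ./start-my-app.sh
}

# Only start things when the agent is actually installed in this image.
if command -v datadog-agent >/dev/null 2>&1; then
  main
else
  echo "datadog-agent not found; skipping startup" >&2
fi
```

To use it, copy the script into the image and replace the CMD with it. The key still arrives at runtime, so nothing secret is baked into a layer.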

Runtime: Secure Environment Configuration

Never bake the API key into your image. Supply it through the environment at runtime. In production, use orchestrator secrets rather than a literal DD_API_KEY=... entry in a compose file or a -e flag typed on the command line.

Sample docker-compose snippet:

version: '3.8'
services:
  myapp:
    build: .
    environment:
      - DD_API_KEY=${DD_API_KEY}
      - DD_SITE=datadoghq.com
    # Optional: mount additional agent configurations or logs
    # volumes:
    #   - ./datadog-conf:/etc/datadog-agent/conf.d

Runtime invocation:

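# -e DD_API_KEY (no value) forwards the variable from the calling shell,
# which keeps the key out of the command line and shell history.
docker run -d \
  -e DD_API_KEY \
  my-app-with-datadog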

Common mistake: a container started without an API key logs an error along these lines (exact wording varies by agent version):

[ERROR] API key is missing - datadog-agent will not report data

If you see this, check secret injection or environment variable spelling.
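
Both failure modes can be caught before the agent starts. The sketch below resolves the key from the environment or from a mounted secret file and checks its shape. API_KEY_SECRET_FILE is a hypothetical variable name for this example (it could point at something like /run/secrets/dd_api_key), and the 32-hexadecimal-character format is an assumption about what Datadog API keys typically look like.

```shell
#!/usr/bin/env bash
# Resolves the API key from the environment or a mounted secret file.

# Prints the key from DD_API_KEY, falling back to the file named by
# API_KEY_SECRET_FILE (hypothetical variable). Returns 1 if no key is found.
resolve_api_key() {
  local key="${DD_API_KEY:-}"
  if [ -z "$key" ] && [ -r "${API_KEY_SECRET_FILE:-}" ]; then
    # Strip the trailing newline and any stray whitespace from the file.
    key="$(tr -d '[:space:]' < "$API_KEY_SECRET_FILE")"
  fi
  [ -n "$key" ] && printf '%s\n' "$key"
}

# Returns 0 when $1 looks like a key: 32 hexadecimal characters (assumed format).
looks_like_api_key() {
  printf '%s' "$1" | grep -Eq '^[0-9a-fA-F]{32}$'
}
```

A typical call is key="$(resolve_api_key)" followed by looks_like_api_key "$key". A failure of the first points at secret injection, a failure of the second at a truncated or mistyped value.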

Extending with Integrations

Datadog's modular agent supports pluggable integrations: customize via /etc/datadog-agent/conf.d/<integration>.d/conf.yaml.

Pattern:

COPY nginx.d/conf.yaml /etc/datadog-agent/conf.d/nginx.d/conf.yaml

Minimal Nginx integration config:

# File: nginx.d/conf.yaml
init_config:
instances:
  - nginx_status_url: http://localhost/nginx_status/

Gotcha: The agent needs network access to the integration’s endpoint. In the embedded pattern, localhost works only when Nginx runs in the same container as the agent. If Nginx runs in another container, point the check at its service name or hostname on the shared network instead.
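
When the hostname differs between environments, the check config can be generated at startup rather than copied at build time. A small sketch: render_nginx_check and status_endpoint_reachable are illustrative names of mine, and NGINX_HOST in the usage note is a hypothetical variable.

```shell
#!/usr/bin/env bash
# Generates the Nginx check config for a given host and probes the endpoint.

# Prints an Nginx check config pointing at host $1 (default: localhost).
render_nginx_check() {
  local host="${1:-localhost}"
  printf 'init_config:\ninstances:\n  - nginx_status_url: http://%s/nginx_status/\n' "$host"
}

# Returns 0 if the status endpoint on host $1 answers within 3 seconds.
status_endpoint_reachable() {
  curl -fsS --max-time 3 -o /dev/null "http://${1:-localhost}/nginx_status/"
}
```

In an entrypoint this might look like render_nginx_check "${NGINX_HOST:-localhost}" > /etc/datadog-agent/conf.d/nginx.d/conf.yaml, with status_endpoint_reachable run first to fail fast on a wrong hostname.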

Inspection & Diagnostics

After boot, exec into the container. Run:

datadog-agent status

You should see the status of each configured check along with forwarder and host information. Errors usually trace back to a missing API key, missing host volumes, or a misconfigured network.

Illustrative log output from a healthy start (exact wording varies by agent version):

Starting Datadog Agent (v7.53.1)...
Conf: Loaded 89 check configurations
Forwarder: successfully posted payload to https://app.datadoghq.com

No metrics visible? Examine /var/log/datadog/agent.log inside the container.
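
A quick filter saves scrolling through the whole file. This sketch assumes the agent's pipe-delimited log layout (timestamp | component | LEVEL | source | message); agent_log_problems is a name I made up for it.

```shell
#!/usr/bin/env bash
# Prints the most recent ERROR and WARN lines from the agent log.
# $1: log path (default /var/log/datadog/agent.log), $2: max lines (default 20).
# Assumes the pipe-delimited layout "timestamp | component | LEVEL | ...".
agent_log_problems() {
  local log="${1:-/var/log/datadog/agent.log}"
  local limit="${2:-20}"
  grep -E '\| (ERROR|WARN) \|' "$log" | tail -n "$limit"
}
```

From the host, something like docker exec <container> grep -E '\| (ERROR|WARN) \|' /var/log/datadog/agent.log gives the same view without an interactive shell.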

Alternative: Host-level Agent (Recommended for Multi-container Hosts)

The embedded pattern increases image size and can duplicate agent processes. For a cleaner approach, especially with multiple containers, run a dedicated agent as a sidecar or host service. This is standard in Kubernetes (via DaemonSet) and works equally well as a long-running Docker-managed container. In the Docker case, mount the Docker socket read-only; the agent requires it for live container metadata collection:

      - /var/run/docker.sock:/var/run/docker.sock:ro

Sample service definition:

datadog-agent:
  image: gcr.io/datadoghq/agent:latest
  environment:
    - DD_API_KEY=${DD_API_KEY}
    - DD_SITE=datadoghq.com
  volumes:
    - /var/run/docker.sock:/var/run/docker.sock:ro
    - /proc/:/host/proc/:ro
    - /sys/fs/cgroup/:/host/sys/fs/cgroup:ro

Known issue: If Docker daemon uses user namespaces, you may hit permission errors mounting /proc or /sys/fs/cgroup.
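
Without Compose, the same service definition translates to a single docker run call. The wrapper below is a sketch: start_host_agent and the DOCKER override are my own additions for dry runs, and the container name is arbitrary.

```shell
#!/usr/bin/env bash
# Starts a standalone agent container with the same mounts as the service
# definition above. Set DOCKER=echo to print the command instead of running it.
start_host_agent() {
  "${DOCKER:-docker}" run -d --name datadog-agent \
    -e DD_API_KEY \
    -e DD_SITE="${DD_SITE:-datadoghq.com}" \
    -v /var/run/docker.sock:/var/run/docker.sock:ro \
    -v /proc/:/host/proc/:ro \
    -v /sys/fs/cgroup/:/host/sys/fs/cgroup:ro \
    gcr.io/datadoghq/agent:latest
}
```

Here -e DD_API_KEY without a value forwards the variable from the calling shell, so export it before calling start_host_agent.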

Trade-offs, Edge cases, and Recommendations

Image Bloat: The agent adds 200MB or more to your image, depending on agent version, base layer, and dependencies; compare docker image ls output before and after to get the real number.
PID Namespace: Embedded agents see the container’s PID namespace only—system-wide process checks require privileged deployment or host agent.
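Telemetry Duplication: Multiple agents on a single host can each report host-level metrics, so the same machine may be counted more than once. DD_DOGSTATSD_NON_LOCAL_TRAFFIC does not prevent this; it only controls whether DogStatsD accepts packets from outside the agent's own container. Limit duplication by running one agent per host where possible.
Log Collection: Log collection is off by default (enable it with logs_enabled: true in datadog.yaml or DD_LOGS_ENABLED=true). Use explicit mounts for application logs. A host-level agent with the right mounts can read logs from every container; an embedded agent sees only files inside its own container.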

For broader fleet monitoring and minimal overhead, prefer the sidecar or host agent pattern. Embed only when isolated, self-reporting containers are mandatory.

Reference

Datadog official container agent docs: https://docs.datadoghq.com/agent/docker/
Example agent config repo: https://github.com/DataDog/integrations-core

For advanced workflows (e.g., Alpine, distroless, dist-specific integrations), adapt the install process—official agent packages target Debian/Ubuntu/CentOS by default. Building a minimal custom agent layer for slim containers? Not trivial: see Datadog Agent base images for baselines.
