Copying Files from Docker Containers to the Host: Engineer’s Reference
Extracting files from a running or stopped Docker container is a routine necessity: pulling logs while troubleshooting, retrieving data artifacts after a job finishes, or copying application results for analysis outside the container. Bind mounts are ideal for repeatable workflows, but sometimes the need to grab data surfaces only after the container has run, typically when the job didn't finish as planned.
Consider the following example. An analytics service (`my-analytics`) writes its CSV report to `/reports/latest/results.csv` inside the container. Post-run inspection is required, but the container has no preconfigured shared volume.
```shell
docker cp my-analytics:/reports/latest/results.csv ./out/results.csv
```
Result: The file is now available on the host. Permission bits are preserved; ownership of the host copy is set to the user who ran `docker cp`.
Note: The container does not need to be running—Docker accesses the filesystem layer directly.
`docker cp`: Core Command Details
The `docker cp` utility bridges the container's filesystem namespace and your host, supporting both copy-in and copy-out.
Syntax:
```shell
docker cp [OPTIONS] CONTAINER:SRC_PATH DEST_PATH|-
docker cp [OPTIONS] SRC_PATH|- CONTAINER:DEST_PATH
```
- Works both ways: container → host, and host → container.
- Requires Docker Engine v1.8+ (recommended: Docker 20.10+ for stability).
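The two directions can be wrapped in a small helper script. A minimal sketch: `copy_out` and `copy_in` are hypothetical names, and `DOCKER` can be pointed at `echo docker` for a dry run.

```shell
#!/bin/sh
# Sketch: thin wrappers over the two docker cp directions.
# DOCKER defaults to the real CLI; set DOCKER="echo docker" to dry-run.
DOCKER="${DOCKER:-docker}"

# copy_out CONTAINER SRC_PATH DEST_PATH  (container -> host)
copy_out() {
  $DOCKER cp "$1:$2" "$3"
}

# copy_in SRC_PATH CONTAINER DEST_PATH   (host -> container)
copy_in() {
  $DOCKER cp "$1" "$2:$3"
}
```

For example, `copy_out my-analytics /reports/latest/results.csv ./out/results.csv` reproduces the command from the introduction.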
Real-World Usage Patterns
Audit logs for a crashed Nginx container:
```shell
docker cp nginx-prod:/var/log/nginx/error.log /tmp/web-error.log
```
Check `/tmp/web-error.log` for hints on the failure. The host copy is owned by the user who invoked `docker cp`, not by the UID of the container's process.
Bulk copy: capture a result directory from a data pipeline
```shell
docker cp pipeline-batch:/data/output /var/tmp/pipeline-results
ls /var/tmp/pipeline-results/
```
Directory structure and files are transferred recursively. Note that when the destination directory does not yet exist, `docker cp` creates it and copies the *contents* of the source directory into it, so the files land directly in `/var/tmp/pipeline-results/` rather than under an `output/` subdirectory.
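Relatedly, a source path ending in `/.` tells `docker cp` to copy a directory's contents (rather than the directory itself) into the destination. A dry-runnable sketch; the helper name is ours:

```shell
#!/bin/sh
# Sketch: copy the *contents* of a container directory into DEST_DIR.
# A SRC_PATH ending in "/." makes docker cp copy contents, not the
# directory itself. Set DOCKER="echo docker" to dry-run.
DOCKER="${DOCKER:-docker}"

# copy_dir_contents CONTAINER SRC_DIR DEST_DIR
copy_dir_contents() {
  mkdir -p "$3"               # ensure the destination directory exists
  $DOCKER cp "$1:$2/." "$3"
}
```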
Offload selective files: working around missing wildcard support
`docker cp` does not support shell globs. To work around:
```shell
docker exec pipeline-batch sh -c "tar czf /tmp/all-json.tar.gz /data/output/*.json"
docker cp pipeline-batch:/tmp/all-json.tar.gz .
tar xf all-json.tar.gz
```
Compress in-place, copy archive, expand locally. Not elegant, but robust.
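The same workaround can avoid leaving a temporary archive in the container by streaming the tar over a pipe. A sketch reusing the container and paths above; `EXEC` is overridable so the function can be exercised without Docker (set `EXEC=""` to run the archiving step locally):

```shell
#!/bin/sh
# Sketch: stream matching files out of the container in one pipe,
# avoiding both the temp file and docker cp's lack of glob support.
# EXEC defaults to running inside the container; leave it unset for
# real use, or set EXEC="" to run the archiving step locally.
EXEC="${EXEC-docker exec pipeline-batch}"

# stream_out SRC_DIR GLOB DEST_DIR
stream_out() {
  mkdir -p "$3"
  # cd first so the archive holds relative paths; the glob expands
  # in the inner shell, after the cd.
  $EXEC sh -c "cd '$1' && tar czf - $2" | tar xzf - -C "$3"
}
```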
Practical Issues and Gotchas
| Concern | Impact/Remediation |
|---|---|
| File ownership/UID | Copies to the host are owned by the invoking user; copies into a container are owned by root by default. Use `chown` if needed. |
| Path does not exist | Fails with `no such file or directory`. Verify paths; running `docker exec ... ls ...` before copying is recommended. |
| Symbolic links | By default the link itself is copied, not its target; pass `-L` to follow a symlink given as the source path. Relative symlinks can break outside the container. |
| Container state | Stopped containers are fine. Removed containers are not: the filesystem layer disappears after `docker rm`. |
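The path check can be rolled into a small pre-flight wrapper. A sketch with a hypothetical `safe_cp` helper; note that `docker exec` requires a *running* container, so the pre-check does not apply to stopped ones:

```shell
#!/bin/sh
# Sketch: verify the source path exists before copying, so failures
# give a clear message instead of a bare "no such file or directory".
# Set DOCKER="echo docker" to dry-run. Caveat: docker exec needs a
# running container, while docker cp also works on stopped ones.
DOCKER="${DOCKER:-docker}"

# safe_cp CONTAINER SRC_PATH DEST_PATH
safe_cp() {
  if ! $DOCKER exec "$1" test -e "$2"; then
    echo "safe_cp: $1:$2 not found (or container not running)" >&2
    return 1
  fi
  $DOCKER cp "$1:$2" "$3"
}
```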
Alternate Approaches (and their pitfalls)
- Bind mounts: easiest for anticipated export, but not feasible post hoc.
- `docker exec` + output redirection: good for stdout, impractical for binary blobs or directories.
- `docker export`: dumps the whole filesystem; heavy-handed for targeted file transfer. Not covered here for brevity.
Non-obvious Tips
- The `-L` flag follows symlinks inside the container, copying the target, not just the link:
```shell
docker cp -L container:/path/to/link /host/path
```
- Build archives inside the container (`tar`, `gzip`) if you need atomic multi-file transfers, especially when targeting hosts with different OS users or permissions.
- On some systems (notably macOS with older Docker Desktop), ACLs and extended file attributes may be lost during copy. Not always an issue for ephemeral artifacts, but critical for security-sensitive files.
Summary
When rapid file extraction from containers is needed, `docker cp` is precise and robust, provided you account for path accuracy, permission quirks, and the lack of shell globbing. Where logging or persistent data volumes aren't preconfigured, this workflow is essential. There's no substitute for knowing how to triage containers by reaching in and extracting exactly what you need, when you need it.
If you routinely work with short-lived or one-shot containers, consider integrating `docker cp` as a post-processing step in your CI pipeline. It's not perfect, but it avoids the overhead of volume configuration for transient jobs.
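A sketch of that CI pattern, with a hypothetical helper, image, and paths; `DOCKER` can be pointed at `echo docker` for a dry run:

```shell
#!/bin/sh
# Sketch: one-shot job container whose artifacts are collected with
# docker cp after it exits. Set DOCKER="echo docker" to dry-run.
DOCKER="${DOCKER:-docker}"

# run_and_collect IMAGE SRC_PATH DEST_PATH
run_and_collect() {
  name="job-$$"
  $DOCKER run --name "$name" "$1"   # job runs to completion, then stops
  $DOCKER cp "$name:$2" "$3"        # copying from a stopped container is fine
  $DOCKER rm "$name"                # the layer is gone after this; copy first
}
```

Ordering matters: the copy must happen before `docker rm`, since the container's filesystem layer disappears with the container.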