Mastering Directory Migration: Efficient Linux Directory Moves Without Data Loss
Moving directories on a live Linux system isn’t as simple as running mv. Poorly executed migrations interrupt running processes, break symlinks, and risk losing hidden configs. Production incidents often trace back to a naive directory move.
Problem Scenario
Consider a running PostgreSQL instance with active WAL logging: moving /var/lib/pgsql to a bigger disk means that even a brief inconsistency can break the database. Are you really willing to gamble on mv alone?
Evaluate What Needs Moving
Minimum prep:
- Identify hidden files (ls -la)
- Inventory symlinks and special files (sockets, devices)
- Audit open files:
sudo lsof +D /data/legacy-metrics/
Don’t skip this. A file that is still open for writing during migration is a textbook cause of corruption, especially with databases or log directories. Stale symlinks are another migration hazard and quickly turn into maintenance debt.
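The prep list above can be condensed into a quick inventory. This is a minimal sketch, assuming GNU find; `inventory_dir` is a hypothetical helper name, not a standard tool.

```shell
# inventory_dir is a hypothetical helper (GNU find assumed).
# It counts the entries that naive copies tend to mishandle.
inventory_dir() {
    local dir=$1 hidden symlinks special
    if [ ! -d "$dir" ]; then
        echo "not a directory: $dir" >&2
        return 1
    fi
    # Print one dot per match and count bytes, so filenames containing
    # newlines cannot skew the totals.
    hidden=$(find "$dir" -mindepth 1 -name '.*' -printf '.' | wc -c)
    symlinks=$(find "$dir" -type l -printf '.' | wc -c)
    # Sockets, named pipes, block devices, and character devices.
    special=$(find "$dir" \( -type s -o -type p -o -type b -o -type c \) -printf '.' | wc -c)
    echo "hidden=$((hidden)) symlinks=$((symlinks)) special=$((special))"
}
```

Run it against the source before the migration and against the destination afterwards; the two lines of output should match.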
Filesystem Boundaries Change the Rules
Within a Single Filesystem
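Ext4, XFS, ZFS: it doesn’t matter. If source and destination are on the same filesystem (the same mount, not merely the same block device), mv is a rename that updates directory entries and copies no data. The operation is atomic and nearly instantaneous, typically well under a second even for large trees: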
mv /srv/images /mnt/newspace/
Validation (after the move):
ls -la /mnt/newspace/images
Data is actually copied only if the move crosses a filesystem boundary. Same-filesystem moves are safe, provided nobody is writing to the source during the operation.
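Whether a move stays inside one filesystem can be checked up front. The sketch below assumes GNU coreutils stat; `same_filesystem` is a hypothetical helper that compares the device IDs of two existing paths.

```shell
# same_filesystem is a hypothetical helper (GNU coreutils stat assumed).
# Succeeds when both paths report the same device ID; returns 2 if
# either path cannot be inspected.
same_filesystem() {
    local a b
    a=$(stat -c %d -- "$1") || return 2
    b=$(stat -c %d -- "$2") || return 2
    [ "$a" = "$b" ]
}
```

Pass the destination's parent directory, since the destination itself may not exist yet, e.g. `same_filesystem /srv/images /mnt/newspace && echo "rename, no copy"`. Treat the result as a heuristic: rename does not work across two mount points even when the same filesystem is mounted at both, so mv still falls back to copying in that case.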
Across Different Filesystems
A surprise for some: mv can silently degrade into a copy-then-delete operation. Larger trees, e.g. /opt/jenkins_home with gigabytes of artifacts, dramatically increase the risk of a partial state on CTRL-C or failure.
Preferred: Use rsync
rsync -aHAX --info=progress2 --delete /var/old_disks/jenkins/ /data/jenkins/
-a: recursive copy that preserves permissions, timestamps, and symlinks.
-HAX: preserves hard links, ACLs, and extended attributes (critical for SELinux, e.g. on RHEL ≥ 8).
--delete: removes destination files that no longer exist in the source, so the destination stays in sync on repeated runs.
--info=progress2: shows overall copy progress.
If interrupted, rerun the same rsync to resume efficiently. Tradeoff: temporary duplication of disk use, but worth the integrity.
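Because --delete removes files from the destination, it can help to preview a run first. The following is a sketch, not a required step; `preview_sync` is a hypothetical wrapper around the rsync invocation above that adds the standard --dry-run and --itemize-changes options.

```shell
# preview_sync is a hypothetical wrapper around the rsync command above.
# --dry-run reports what would happen without touching the destination;
# --itemize-changes prints one line per planned transfer or deletion.
preview_sync() {
    # Strip any trailing slash, then add exactly one, so the contents of
    # the source are synced into the destination directory.
    local src=${1%/} dst=${2%/}
    rsync -aHAX --delete --dry-run --itemize-changes "$src/" "$dst/"
}
```

Lines containing "deleting" name the files the real run would remove from the destination. An unexpectedly long list usually means the source or destination path is wrong.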
After successful sync, verify:
diff -rq /var/old_disks/jenkins/ /data/jenkins/
# or for large sets
find /var/old_disks/jenkins | wc -l
find /data/jenkins | wc -l
Once confirmed, and only then:
rm -rf /var/old_disks/jenkins
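The "verify, then delete" sequence can be wrapped so the rm cannot run unless the checks pass. This is a minimal sketch built from the same diff and find commands shown above; `remove_if_identical` is a hypothetical helper.

```shell
# remove_if_identical is a hypothetical guard: it deletes the source tree
# only when diff reports no differences and the entry counts match.
remove_if_identical() {
    local src=$1 dst=$2 n_src n_dst
    if [ ! -d "$src" ] || [ ! -d "$dst" ]; then
        echo "both arguments must be existing directories" >&2
        return 2
    fi
    # Refuse to compare a directory with itself; that would delete the only copy.
    if [ "$src" -ef "$dst" ]; then
        echo "source and destination are the same directory" >&2
        return 2
    fi
    if ! diff -rq "$src" "$dst" >/dev/null; then
        echo "trees differ, keeping $src" >&2
        return 1
    fi
    n_src=$(find "$src" | wc -l)
    n_dst=$(find "$dst" | wc -l)
    if [ "$n_src" -ne "$n_dst" ]; then
        echo "entry counts differ ($n_src vs $n_dst), keeping $src" >&2
        return 1
    fi
    rm -rf -- "$src"
}
```

Usage would look like `remove_if_identical /var/old_disks/jenkins /data/jenkins`. A non-zero exit status means the source was left in place.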
Symlinks and Edge Cases
Relative symlinks (./data/logs -> ../logs) usually remain valid after directory moves, as long as the link and its target move together. Watch for absolute symlinks, which break if the referenced path moves:
find /srv/migrating_dir -type l -exec ls -l {} +
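Noticed issues? Use readlink -f to audit external link destinations and report links whose target does not exist:
# Paths are passed as arguments rather than embedded in the script, so unusual filenames are not run as shell code
find /srv/migrating_dir -type l -exec bash -c \
'for l; do t=$(readlink -f "$l"); [ -e "$t" ] || echo "$l -> $t"; done' _ {} +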
Recreate broken links or consider a temporary bind-mount if you need a seamless transition.
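Absolute links that point back into the tree being moved are the ones that go stale once the old path disappears. A sketch for listing them before the move, assuming GNU find (`-lname` and `-printf`); `internal_absolute_links` is a hypothetical helper name.

```shell
# internal_absolute_links is a hypothetical helper (GNU find assumed).
# Lists symlinks whose stored target is an absolute path under the tree
# being moved; these keep pointing at the old location afterwards.
internal_absolute_links() {
    local root
    # Resolve the tree to its physical path so the pattern is unambiguous.
    root=$(cd "$1" && pwd -P) || return 1
    # -lname matches against the link's stored target; %l prints that target.
    find "$root" -type l -lname "$root/*" -printf '%p -> %l\n'
}
```

The match is purely textual: links that reach the tree through a different spelling of the path (for example via another symlink) are not reported, and directory names containing glob characters would need escaping.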
Dealing With Open or Live Files
Services such as databases and other systemd-managed daemons often hold key files open or locked. Copying while they run leaves the destination in an incomplete state, so stop them first:
sudo systemctl stop postgresql
# or for Nginx
sudo systemctl stop nginx
Verify closure:
sudo lsof +D /var/lib/pgsql
Note: Some daemons (e.g., rsyslog) reopen files on SIGHUP. A full stop is safer.
Validation (Don’t Skip)
- File count parity: catches missing hidden files or lost device nodes right away.
- Directory tree hashes (find | sort | sha256sum)
- For mission-critical data: full content hash audit.
Example for checksums (SHA256):
cd /var/app/
find . -type f -exec sha256sum '{}' \; | sort > /tmp/src.sha
cd /data/newspace/app/
find . -type f -exec sha256sum '{}' \; | sort > /tmp/dest.sha
diff /tmp/src.sha /tmp/dest.sha
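The cheaper structure-only check from the list above (find | sort | sha256sum) can be sketched as a small function. `tree_hash` is a hypothetical helper; it hashes the sorted list of relative paths, so it detects missing or extra entries but not changed file contents.

```shell
# tree_hash is a hypothetical helper: it prints a SHA-256 over the sorted
# list of relative paths under a directory (names only, no file contents).
tree_hash() {
    # Run in a subshell so the caller's working directory is unchanged.
    # LC_ALL=C makes the sort order independent of the locale.
    (cd "$1" && find . | LC_ALL=C sort | sha256sum | cut -d' ' -f1)
}
```

Comparing `tree_hash /var/app` with `tree_hash /data/newspace/app` gives a fast first signal before running the full content audit.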
Checklist: Engineering-grade Linux Directory Migration
Step | Notes / Gotchas |
---|---|
Full contents audit | Include dotfiles, symlinks, device files |
Check for open file handles | lsof +D before/after |
Use native mv within FS | Instant, but only if not live-written |
Use rsync, not mv, cross-FS | Preserves permissions, ACLs, safer resumes |
Validate symlink targets | Manually repair broken links as needed |
Compare counts/hashes | No verification, no delete |
Coordinate with live processes | Plan application/service downtime if needed |
Known Issues and Pro Tips
- SELinux contexts often cause permission errors post-migration (ls -Z helps debug; always copy with rsync -X or apply restorecon -Rv).
- For large servers, UIDs/GIDs must match on both source and destination. Mismatches create subtle bugs (seen on RHEL, Ubuntu 20.04).
- rsync version matters: features like --info=progress2 require 3.1.0+; always check rsync --version.
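The rsync version check can be scripted rather than eyeballed. This is a sketch assuming GNU sort (for -V) and that the first line of `rsync --version` has the form "rsync  version 3.2.7  protocol version 31"; both helper names are hypothetical.

```shell
# version_ge A B succeeds when version A is greater than or equal to B.
# Hypothetical helper; relies on GNU sort's -V (version sort).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# rsync_supports_progress2 is a hypothetical helper. It assumes the third
# field on the first line of `rsync --version` is the version number.
rsync_supports_progress2() {
    local v
    v=$(rsync --version 2>/dev/null | awk 'NR == 1 { print $3 }')
    version_ge "$v" 3.1.0
}
```

A migration script could then fall back to plain -a output when `rsync_supports_progress2` fails, instead of aborting on an unknown option.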
The reality: directory moves are a risk multiplier in production, not a routine task. A thorough audit, the right tool, and post-migration validation are non-negotiable. For critical migrations, always schedule a maintenance window and keep a recent backup or snapshot; anything else is gambling.
No command is perfect; prudence makes the difference.