Seamless CentOS-to-AlmaLinux Server Migrations: Zero Downtime Tactics
CentOS’s abrupt move from stable downstream to a rolling upstream shook enterprise Linux. Security and regulatory teams suddenly faced unsupported nodes, while platform teams needed a replacement that didn’t disrupt business-critical workloads. Reactive patching isn't a sustainable model once trusted patch cycles vanish.
AlmaLinux: Pragmatic RHEL Parity
AlmaLinux v8.x, fully binary-compatible with RHEL, stepped in as a community-maintained drop-in. Key features:
- Continuous RHEL parity: package for package, ABI for ABI. No strange deltas after yum update.
- Maintained free of license encumbrances or vendor audit clauses.
- Accelerating ecosystem adoption—ISVs, managed hosting, and CSPs now routinely certify AlmaLinux builds.
Note: as of Q3 2023, most RHEL8-era workloads (kernel-4.18, systemd-239, openssl-1.1.1) run unmodified, but always check your stack for hard-coded release checks.
Migration Blueprint: Preserve Uptime, Ensure Integrity
High-Level Sequence
| Stage | Purpose | Key Tools/Notes |
|---|---|---|
| 1. Inventory | Understand current workload/package landscape | rpm, systemctl |
| 2. Parallel Alma nodes | Build new systems that mirror prod configuration | VM snapshots, IaC templates |
| 3. Live sync/dry runs | Replicate state without stopping traffic | rsync, DB replication |
| 4. Cutover | Shift traffic using LB or DNS, preserve sessions | HAProxy, DNS TTL management |
| 5. Validation/rollback | Confirm with health checks, prep for fallback | Monitoring, backup snapshots |
1. Detailed Inventory—No Surprises
Dump all installed packages and running services. Miss a custom binary, and you risk post-migration support tickets.
rpm -qa > /root/rpms-centos.txt
systemctl list-units --type=service --state=running > /root/services-centos.txt
crontab -l > /root/cron-centos.txt
ss -tulpen > /root/ports-centos.txt
Extract /etc/*, DB config, non-standard drivers, and application state. Note: Some workloads have dependencies not captured by RPM (pip, NPM). Search /usr/local and /opt, too.
2. Build Mirror AlmaLinux Nodes—Don’t Upgrade in Place
The approach is platform-agnostic: provision dedicated hardware, clone VMs, or deploy to cloud. Match CPU arch, kernel line, storage types, and interface names whenever possible.
IaC Example: For VMware, compare output from lshw -short and replicate disk layout via Terraform or Ansible.
For classic rpm-based reprovisioning:
scp /root/rpms-centos.txt alma:/root/
xargs -a /root/rpms-centos.txt yum install -y # Expect errors: rpm -qa lists exact versions; resolve version mismatches and package renames manually
Caveat: Not all RPMs have direct analogs between CentOS and AlmaLinux, especially for EPEL or custom builds—run a dry diff.
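One way to run that dry diff is to compare bare package names from both nodes before installing anything. The sketch below assumes two hypothetical files with one package name per line (for example, produced on each host with rpm -qa --qf '%{NAME}\n'); the function name is illustrative, not part of any tool.

```shell
#!/bin/sh
# Sketch: list package names present in one inventory but absent from another.
# Assumption: both input files hold one bare package name per line,
# e.g. generated with: rpm -qa --qf '%{NAME}\n'

# missing_packages OLD_LIST NEW_LIST
# Prints names found in OLD_LIST that do not appear in NEW_LIST.
missing_packages() {
    old_sorted=$(mktemp) || return 1
    new_sorted=$(mktemp) || return 1
    sort -u "$1" > "$old_sorted"
    sort -u "$2" > "$new_sorted"
    # comm -23 keeps only the lines unique to the first (old) file.
    comm -23 "$old_sorted" "$new_sorted"
    rm -f "$old_sorted" "$new_sorted"
}
```

Calling missing_packages with the CentOS list first and the AlmaLinux list second yields the packages that still need a replacement, a rename, or a deliberate decision to drop them.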
3. State Sync and Smoke Tests
Example: rsync for Application Data
rsync -aAXvz --delete --stats /var/www/ alma:/var/www/
- Flags explained: -A preserves ACLs; -X preserves extended attributes.
- Watch for inode exhaustion if crossing between filesystems (e.g., ext4→xfs).
For MySQL/Postgres, set up replication:
- MySQL 5.7: initialize as replica (CHANGE MASTER TO ...), then wait for replication lag to drop below 10s.
- Validate with SHOW SLAVE STATUS\G, or with pg_stat_replication on Postgres.
- Application-level smoke tests: curl endpoints or trigger synthetic user flows.
Note: Filesystem timestamps can break differential syncs in some web frameworks. Run stat post-rsync, and fix as needed.
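To confirm that a sync actually converged, comparing content checksums is a useful complement to spot-checking with stat. The following is a minimal sketch, assuming sha256sum from GNU coreutils is available; the function names are illustrative. It compares two local directories, but the manifest function could equally be run on each host over ssh and the outputs diffed.

```shell
#!/bin/sh
# tree_manifest DIR
# Prints "checksum  ./relative/path" for every regular file under DIR,
# sorted by path so two manifests can be compared directly.
tree_manifest() {
    ( cd "$1" && find . -type f -exec sha256sum {} + | sort -k 2 )
}

# trees_match DIR_A DIR_B
# Exits 0 only when both trees hold the same files with identical content.
trees_match() {
    manifest_a=$(tree_manifest "$1") || return 1
    manifest_b=$(tree_manifest "$2") || return 1
    [ "$manifest_a" = "$manifest_b" ]
}
```

Checksums ignore timestamps and ownership on purpose; they answer only the question of whether the bytes match, which is the part a stale or interrupted sync gets wrong.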
4. Cutover: Session Integrity, Traffic Draining
If using HAProxy:
backend appservers
balance source
server centosnode 10.10.10.5:80 check backup
server almanode 10.10.10.6:80 check
- Mark the CentOS node as backup so it only receives new traffic if the AlmaLinux node fails its health checks (live draining).
- Monitor active connections via HAProxy stats socket:
echo "show stat" | socat stdio /var/run/haproxy.sock
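The stats socket returns CSV, so the number of sessions still attached to the old node can be pulled out with a small filter. This is a sketch only: it assumes the standard "show stat" header line (starting with "# pxname") and looks up the scur (current sessions) column by name; the function name is hypothetical.

```shell
#!/bin/sh
# current_sessions BACKEND SERVER
# Reads HAProxy "show stat" CSV on stdin and prints the scur (current
# sessions) value for the given backend/server pair. The column is found
# by its header name instead of a hard-coded position.
current_sessions() {
    awk -F, -v px="$1" -v sv="$2" '
        NR == 1 {
            sub(/^# /, "", $1)   # header line begins with "# pxname"
            for (i = 1; i <= NF; i++) if ($i == "scur") col = i
            next
        }
        col && $1 == px && $2 == sv { print $col }
    '
}
```

For example, piping the socat command above into current_sessions appservers centosnode prints the live session count for the CentOS node; once it stays at 0, the node is no longer serving anyone.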
For static IP migration, beware ARP cache on L2 networks—flush switches or use a short failover interval.
DNS Cutover:
Lower TTL well in advance:
old-record.example.com. 60 IN A 203.0.113.1
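Update to the new IP during a known low-traffic period and monitor propagation with tools like dig +trace. Resolvers that cached the record before the TTL was lowered can keep serving the old IP for up to the previous TTL, so plan for that window.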
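Propagation checks are easier to automate when the resolver answer is tested rather than eyeballed. The sketch below reads lines in the format printed by dig +noall +answer (name, TTL, class, type, data) and checks both the address and the TTL. The function name and the new address 203.0.113.7 used in the usage note are placeholders, not values from any real deployment.

```shell
#!/bin/sh
# answer_matches EXPECTED_IP MAX_TTL
# Reads "dig +noall +answer" style lines on stdin and exits 0 only if at
# least one A record is present and every A record points at EXPECTED_IP
# with a TTL no larger than MAX_TTL.
answer_matches() {
    awk -v ip="$1" -v max="$2" '
        $4 == "A" {
            seen = 1
            if ($5 != ip || $2 + 0 > max + 0) bad = 1
        }
        END {
            if (seen && !bad) exit 0
            exit 1
        }
    '
}
```

A possible invocation, run against each resolver you care about: dig +noall +answer old-record.example.com | answer_matches 203.0.113.7 60. A non-zero exit means that resolver still serves the old address or a TTL above the lowered value.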
5. Post-Migration Validation & Rollback
After traffic moves:
- Scrutinize logs:
journalctl -p err..alert -b
tail -n100 /var/log/nginx/error.log
tail -n100 /var/log/audit/audit.log
- Confirm all monitoring/alerting recovers to green.
- Validate backup jobs and scheduled tasks (restore test for at least one DB dump).
- Keep the CentOS nodes isolated but powered for 24–48 hours in case of regression.
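The service inventory captured in step 1 can double as a validation baseline. As a rough sketch, the function below compares two saved systemctl list-units outputs and prints services that ran on the old node but are not running on the new one. It assumes a second capture taken the same way on the AlmaLinux node (the file name services-alma.txt in the usage note is hypothetical), and the function name is illustrative.

```shell
#!/bin/sh
# missing_services OLD_FILE NEW_FILE
# Both files are saved output of:
#   systemctl list-units --type=service --state=running
# Prints unit names seen in OLD_FILE that do not appear in NEW_FILE.
missing_services() {
    awk -v new="$2" '
        {
            # The unit name is the first token ending in ".service";
            # header and legend lines contain no such token.
            unit = ""
            for (i = 1; i <= NF; i++) if ($i ~ /\.service$/) { unit = $i; break }
        }
        unit == "" { next }
        FILENAME == new { on_new[unit] = 1; next }
        { on_old[unit] = 1 }
        END { for (u in on_old) if (!(u in on_new)) print u }
    ' "$2" "$1" | sort
}
```

Running missing_services /root/services-centos.txt /root/services-alma.txt should print nothing when the new node is complete; any unit it lists is either intentionally retired or a gap to close before decommissioning the CentOS node.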
Known issue: Certain kernel modules (especially DKMS-managed drivers) can behave differently between CentOS and AlmaLinux. Test on lab hardware first if you require out-of-tree modules.
Alternative: In-Place Migration (almalinux-deploy.sh)
AlmaLinux’s almalinux-deploy.sh is tempting for test or dev, but for production, zero downtime is not guaranteed. Package mismatches and post-conversion kernel quirks aren’t hypothetical:
Example error seen after in-place conversion:
dracut-initqueue[591]: Warning: Could not boot.
Always snapshot VMs or perform full disk backups first:
curl -O https://raw.githubusercontent.com/AlmaLinux/almalinux-deploy/master/almalinux-deploy.sh
chmod +x almalinux-deploy.sh
sudo ./almalinux-deploy.sh
Even when the script completes with “Success”, check every critical service for subtle breakage.
Gotchas & Practical Lessons
- NFS v4 mounts sometimes hang on kernel change; validate via mount -o remount.
- SELinux relabels can take hours on large filesystems. Trigger the relabel proactively in a planned window: touch /.autorelabel && reboot
- Auditd rules may require manual re-import; check audit.rules after migration.
Final Word
Migrate with methodical inventory, staged parallelization, controlled cutover, and rigorous validation. Skip these, and downtime is nearly guaranteed. When done right, users won’t notice the shift—other than improved update cadence.
If you hit unexpected service failures or package gaps during your migration, capture those details. They're valuable not just for postmortem, but for your next lifecycle refresh—CentOS’s lesson is to plan not just for uptime, but for agility.