Bash Scripts: Orchestrating Command Sequences with Real-World Logic
Repetitive maintenance tasks (database dumps, log rotation, package updates) consume attention and carry high risk when steps run out of order. Bash scripting is the backbone of much Linux automation, but most production issues stem not from missing commands but from skipped checks. Control flow and embedded logic are what turn a bash script from a simple sequence into an adaptive, fault-tolerant tool.
Integrating Conditional Logic: Beyond the Linear Script
Typical Issue: A backup script is scheduled with cron; the target directory unmounts, the rsync operation runs anyway, and 100 GB floods /, taking the system down.
Cause: Insufficient preflight logic.
Solution: Bash's if, case, &&, and || constructs, when used properly, bridge the gap between human caution and reliable automation.
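As a minimal sketch of the two short-circuit operators in that list: || can bail out when a precondition fails, and && can chain steps that must all succeed. The function names require_dir and preflight, and the paths in the usage comment, are hypothetical and chosen only for illustration.

```bash
# Hypothetical helper: succeed only if the given directory exists.
require_dir() {
    [[ -d "$1" ]] || { echo "[ERROR] missing directory: $1" >&2; return 1; }
}

# '&&' runs the right-hand side only when the left-hand side succeeds;
# '||' runs it only when the left-hand side fails.
preflight() {
    require_dir "$1" && require_dir "$2"
}

# Usage (paths are placeholders):
# preflight /srv/data /mnt/backup || exit 2
```

Keeping each check in a small function lets the calling script read as a list of preconditions, each with its own failure message.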
Bash Control Flow Essentials
Directory Checks: A Real Backup Precondition
#!/bin/bash
BACKUP_DIR="/mnt/backup"
SRC_DIR="/srv/data"
LOG="/var/log/backup.log"
if [[ -d "$BACKUP_DIR" ]]; then
    echo "[INFO] $(date) - Backup target present." >> "$LOG"
    rsync -a --delete "$SRC_DIR"/ "$BACKUP_DIR"/ 2>>"$LOG"
    RC=$?
    if [[ $RC -ne 0 ]]; then
        echo "[ERROR] $(date) - rsync failed, exit code $RC" >>"$LOG"
        exit 10
    fi
else
    echo "[ERROR] $(date) - $BACKUP_DIR unavailable." >> "$LOG"
    exit 2
fi
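Notes:
- Avoid treating /mnt/backup as a static path in production. A -d test still passes when nothing is mounted, because the empty mount-point directory remains, so prefer a findmnt call to verify the mount.
- Logging both success and failure cases supports postmortem debugging.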
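The findmnt check recommended in the notes might look like the sketch below. It assumes the util-linux findmnt tool is installed; the function name is_mounted is hypothetical, and the path in the usage comment is a placeholder.

```bash
# Hypothetical helper: succeed only if "$1" is itself an active mount point.
# A plain -d test is not enough, because the mount-point directory still
# exists after the filesystem has been unmounted.
is_mounted() {
    findmnt -rn -o TARGET --mountpoint "$1" >/dev/null 2>&1
}

# Usage (path is a placeholder):
# is_mounted /mnt/backup || { echo "[ERROR] /mnt/backup not mounted" >&2; exit 2; }
```

Because the output is discarded, only the exit status matters, so the helper drops straight into an if or || guard.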
Command Exit Codes: Rely on Them, Don’t Ignore Them
Every command in bash exposes its completion status via $?. Often overlooked: some commands return a nonzero status for warnings or ordinary outcomes, not just errors (grep, for example, exits 1 when nothing matches). Check explicitly.
systemctl reload nginx
if [[ $? -ne 0 ]]; then
    echo "[WARNING] $(date) - nginx reload failed: $(systemctl status nginx | tail -20)"
fi
Or, in a more idiomatic style:
if ! systemctl reload nginx; then
    journalctl -u nginx -n 15 >&2
    exit 3
fi
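Side Note: Some systemctl subcommands use the exit status to report state rather than errors: is-active returns nonzero when the unit is not running, and is-failed returns 0 when the unit has failed. Understand them before embedding them into critical scripts.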
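To illustrate why a nonzero status is not always an error, here is a small sketch built on grep, which exits 0 on a match, 1 when nothing matches, and 2 or higher on a real error. The function name check_pattern is hypothetical.

```bash
# Hypothetical helper: report whether a pattern occurs in a file,
# treating grep's exit status 1 (no match) differently from a real error.
check_pattern() {
    local pattern=$1 file=$2 rc
    grep -q -- "$pattern" "$file" 2>/dev/null
    rc=$?
    case $rc in
        0) echo "found" ;;
        1) echo "not found" ;;   # a normal outcome, not a failure
        *) echo "grep error (exit $rc)" >&2; return "$rc" ;;
    esac
}

# Usage (file name is a placeholder):
# check_pattern "listen 443" /etc/nginx/nginx.conf
```

Saving $? into a variable immediately matters here, since any later command would overwrite it.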
Compound Conditions: AND/OR in Bash
Suppose patching should only proceed if both system load is reasonable and disk space is healthy:
LOAD=$(awk '{print int($1)}' /proc/loadavg)
FREE=$(df --output=avail / | tail -n 1)
if [[ $LOAD -lt 12 && $FREE -gt 1000000 ]]; then
    apt-get upgrade -y
else
    echo "Resource check failed. Load: $LOAD, Free KB: $FREE"
    exit 4
fi
Gotcha: Don't mix && and || operators without extra parentheses in [[ ... ]].
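A short sketch of that grouping, under an invented policy: patch when resources are healthy, or when the operator forces it. The function name ok_to_patch and the force argument are hypothetical; the thresholds are the ones from the example above.

```bash
# Hypothetical policy: patch when resources are healthy, or when forced.
# Inside [[ ... ]], && binds tighter than ||, so the parentheses make the
# intended grouping explicit rather than relying on precedence.
ok_to_patch() {
    local load=$1 free_kb=$2 force=$3
    [[ ( $load -lt 12 && $free_kb -gt 1000000 ) || $force == yes ]]
}

# Usage (values are placeholders):
# ok_to_patch "$LOAD" "$FREE" no && apt-get upgrade -y
```

Outside [[ ... ]], in an ordinary command list, && and || have equal precedence and are evaluated left to right, which is another reason to group explicitly.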
case: When Branching Becomes Unwieldy
Configuration management scripts often adapt their logic to an environment variable or a command-line argument.
case "$1" in
    test|ci)
        export ENV=staging
        ;;
    prod|production)
        export ENV=prod
        ;;
    *)
        # Unknown argument: print usage to stderr and fail
        echo "Usage: $0 {test|ci|prod|production}" >&2
        exit 64
        ;;
esac
Tip: Leverage pattern matching; don’t enumerate every permutation manually.
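As a sketch of that tip, case patterns accept shell globs, so one branch can cover a whole family of spellings. The function name resolve_env and the accepted spellings are hypothetical.

```bash
# Hypothetical helper: map a free-form argument to an environment name.
resolve_env() {
    case "$1" in
        [Tt]est*|ci)  echo "staging" ;;   # test, Testing, test-2, ci
        [Pp]rod*)     echo "prod" ;;      # prod, Production, prod-eu
        *)            return 64 ;;        # unknown argument
    esac
}

# Usage:
# ENV_NAME=$(resolve_env "$1") || { echo "Usage: $0 {test|ci|prod}" >&2; exit 64; }
```

Broad globs trade strictness for convenience: a pattern like [Pp]rod* also accepts typos such as "prodd", so tighter patterns may be preferable for destructive operations.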
Dynamic Loops: Retrying with Limits
Operations that can fail transiently, such as restarting daemons prone to race conditions, need bounded retries.
retries=0
max=3
while ! systemctl restart postgresql-12; do
    ((retries++))
    if (( retries >= max )); then
        echo "PostgreSQL failed to restart after $max attempts, manual intervention needed"
        exit 1
    fi
    sleep $((retries * 10))
done
Escalating backoff prevents log spam and avoids hammering services.
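The same pattern can be factored into a reusable function. This is one possible sketch, not the only way to structure it; the name retry and its argument order (maximum attempts, base delay in seconds, then the command) are assumptions made for illustration.

```bash
# Hypothetical helper: retry a command up to $1 times, sleeping
# attempt * $2 seconds after each failure (linear backoff).
retry() {
    local max=$1 base_delay=$2 attempt=1
    shift 2
    until "$@"; do
        if (( attempt >= max )); then
            echo "[ERROR] '$*' failed after $max attempts" >&2
            return 1
        fi
        sleep $(( attempt * base_delay ))
        attempt=$(( attempt + 1 ))
    done
}

# Usage, mirroring the loop above:
# retry 3 10 systemctl restart postgresql-12 || exit 1
```

Returning a status instead of calling exit leaves the decision about what to do after the final failure to the caller.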
Assembling a Production-Ready Automation Script
Below, a script targeting Ubuntu 22.04 LTS, combining mount checks, resource validation, and logging with conditional branches:
#!/bin/bash
# backup-html.sh
SRC="/var/www/html"
MOUNT="/mnt/backup"   # assumed mount point, as in the earlier example
DEST="$MOUNT/html-prod"
LOG="/var/log/backup-html.log"
REQ_SPACE=1000000 # KB
{
    echo ""
    echo "[$(date +'%F %T')] Initiating backup"
    # Check the mount point itself: DEST is a subdirectory, not a mount
    if ! findmnt -rno TARGET "$MOUNT" &>/dev/null; then
        echo "[FATAL] Backup mount $MOUNT not available"
        exit 70
    fi
    # Available space on the backup filesystem, in 1K blocks
    AVAIL=$(df --output=avail "$MOUNT" | tail -1)
    if (( AVAIL < REQ_SPACE )); then
        echo "[FATAL] Not enough space: ${AVAIL}KB available"
        exit 71
    fi
    cp -a "$SRC"/. "$DEST"/
    RC=$?
    if (( RC != 0 )); then
        echo "[ERROR] cp failed with exit code $RC"
        exit 73
    fi
    echo "[OK] Backup completed successfully"
    exit 0
} >>"$LOG" 2>&1
Practicality: Logs every outcome; uses the "dot slash" ("$SRC"/.) pattern to preserve hidden files; cleanly segregates error paths; uses distinct exit codes per failure type for CI monitoring.
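The "dot slash" behaviour can be checked in isolation with a sketch like the following. The function name copy_tree is hypothetical, and the mkdir -p step is an addition not present in the script above.

```bash
# Hypothetical helper: copy everything in "$1", including dotfiles, into "$2".
copy_tree() {
    local src=$1 dest=$2
    # "$src"/. names the directory's contents as a whole, so hidden files
    # come along; a glob such as "$src"/* would skip names starting with
    # a dot under bash's default settings.
    mkdir -p "$dest" && cp -a "$src"/. "$dest"/
}

# Usage (paths are placeholders):
# copy_tree /var/www/html /mnt/backup/html-prod
```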
Non-Obvious Pitfalls and Tips
- Always quote variables in tests: [ -n "$VAR" ] not [ -n $VAR ]. Unquoted expansion breaks with spaces or wildcards, and an empty unquoted $VAR collapses the test to [ -n ], which is true.
- Prefer [[ ... ]] for bash-specific features (regex, easier syntax); fall back to [ ... ] only for POSIX compliance.
- For production: lock scripts using flock(1) to avoid race conditions under cron.
- Avoid silent failures: log to syslog or an external aggregator if logs rotate aggressively.
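For the flock(1) tip, one common shape is to hold a lock on a dedicated file descriptor for the duration of the job. The sketch below assumes the util-linux flock command is available. The function name with_lock, the choice of descriptor 9, and the exit code 75 are arbitrary illustrations.

```bash
# Hypothetical helper: run a command only if the lock file "$1" is free.
with_lock() {
    local lockfile=$1
    shift
    (
        # Open the lock file on descriptor 9, then try to lock it
        # without waiting; the lock is released when the subshell exits.
        exec 9>"$lockfile"
        flock -n 9 || { echo "[WARN] another run holds $lockfile" >&2; exit 75; }
        "$@"
    )
}

# Usage (paths are placeholders):
# with_lock /var/lock/backup-html.lock /usr/local/bin/backup-html.sh
```

With -n, an overlapping cron run gives up immediately instead of queueing behind the previous one; dropping -n would make it wait.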
Known Issue:
Old NFS shares may return success for -d even when the remote link is stale; supplement with a write test if critical.
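A write test of that kind could be as small as the sketch below, which creates and removes a probe file in the target directory. The function name can_write and the probe file name are hypothetical.

```bash
# Hypothetical helper: succeed only if a file can be created and removed in "$1".
can_write() {
    local probe
    probe=$(mktemp "$1/.write-test.XXXXXX" 2>/dev/null) || return 1
    rm -f -- "$probe"
}

# Usage (path is a placeholder):
# can_write /mnt/backup || { echo "[FATAL] /mnt/backup not writable" >&2; exit 70; }
```

This confirms only that the directory accepts writes at that moment. On a hung mount the probe may block instead of failing, so a time limit around the check may also be needed.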
Summary
Robust bash scripting is less about inventing commands and more about disciplined control flow, defensive checks, and explicit error paths. Scripts that encode real operational rules—resource thresholds, retries with state awareness, branching logic—become high-leverage infrastructure tools. Documentation, logging, and error codes position them for use in CI/CD and scheduled automation.
The next time you automate a workflow, identify all the points of failure, preconditions, and decision criteria up front. Then, let your script enforce them—relentlessly and reliably.