Mastering the Linux find Command: Efficient File Discovery in Complex Environments
In large-scale Linux systems, searching for files by hand or relying on GUI tools is rarely practical, especially across multi-terabyte filesystems or when diagnosing via SSH on production servers. For real-time, precise queries, the find utility is unmatched.
Consider this: /var/log grows to 1M+ files overnight due to an application bug. Disk usage is critical. Which logs can you purge or compress based on age or size?
Fundamental Syntax
find [path...] [options] [expression]
- path: Search start point, e.g. /var/log, /srv/data, or simply . (dot) for CWD.
- options/expressions: Filters such as -name, -type, -mtime, and logical operators (-and, -or, !).
Targeted Searches and Filtering
Locate a File by Name, Case Sensitivity
find /var/log -name syslog
find /var/log -iname 'SYSLOG'
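-name is case-sensitive; -iname ignores case. Both match against the file's base name and accept shell-style globs, so quote any pattern that contains wildcards. Case folding follows the current locale, so results for non-ASCII names can differ between locales.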
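The difference between the two tests is easy to check in a throwaway directory. The sketch below assumes a case-sensitive Linux filesystem; the sandbox path and file names are invented for illustration.

```shell
# Throwaway sandbox; all names here are made up for the demo.
sandbox=$(mktemp -d)
touch "$sandbox/syslog" "$sandbox/SYSLOG.1" "$sandbox/Syslog"

# -name: exact, case-sensitive match on the base name.
exact=$(find "$sandbox" -name syslog | wc -l)

# -iname: same match with case folded; the quoted glob also picks up SYSLOG.1.
folded=$(find "$sandbox" -iname 'syslog*' | wc -l)

echo "exact=$exact folded=$folded"   # exact=1 folded=3
rm -rf "$sandbox"
```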
Filter by File Type
Option | Meaning | Example |
---|---|---|
-type f | Regular file | find /home/user -type f |
-type d | Directory | find /etc -type d |
-type l | Symbolic link | find /usr/bin -type l |
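Note: On filesystems that do not record the file type in directory entries (some older ext2/ext3 setups, for example), find has to stat each entry to evaluate -type, which is slower. If a result looks wrong, confirm with ls -l.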
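To see the three type tests from the table side by side, a small sandbox is enough. This is a sketch with invented names; -mindepth 1 (supported by GNU and BSD find) keeps the sandbox directory itself out of the counts.

```shell
sandbox=$(mktemp -d)
mkdir "$sandbox/dir"
touch "$sandbox/file"
ln -s file "$sandbox/link"   # symlink pointing at the regular file

# find does not follow symlinks by default, so the link counts as type l, not f.
files=$(find "$sandbox" -mindepth 1 -type f | wc -l)
dirs=$(find "$sandbox" -mindepth 1 -type d | wc -l)
links=$(find "$sandbox" -mindepth 1 -type l | wc -l)

echo "files=$files dirs=$dirs links=$links"   # files=1 dirs=1 links=1
rm -rf "$sandbox"
```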
Time-Based Location
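- Modified in last week:
find /srv/backup -mtime -7
- Modified three days ago:
find /home/dev -mtime 3
(Interpretation: age is counted in whole 24-hour periods back from now, with the fraction dropped, so this matches files modified between 72 and 96 hours ago. GNU find's -daystart measures from the start of today instead.)
- Accessed >30 days ago:
find /data/archive -atime +30
(Gotcha: atime updates may be disabled via mount -o noatime.)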
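The 24-hour-period arithmetic behind -mtime can be verified with files whose timestamps are set by hand. The sketch below assumes GNU touch, whose -d option accepts relative dates; the file names are hypothetical.

```shell
sandbox=$(mktemp -d)
touch -d '2 hours ago'  "$sandbox/fresh.log"        # 0 full periods old
touch -d '80 hours ago' "$sandbox/three_days.log"   # 3 full periods old
touch -d '40 days ago'  "$sandbox/old.log"          # 40 full periods old

# -mtime -7: fewer than 7 periods, so fresh.log and three_days.log.
recent=$(find "$sandbox" -type f -mtime -7 | wc -l)
# -mtime 3: exactly 3 periods, i.e. between 72 and 96 hours.
exact3=$(find "$sandbox" -type f -mtime 3 -exec basename {} \;)
# -mtime +30: more than 30 periods.
stale=$(find "$sandbox" -type f -mtime +30 -exec basename {} \;)

echo "recent=$recent exact3=$exact3 stale=$stale"
rm -rf "$sandbox"
```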
File Size Constraints
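- Files >100 MiB
find /var/tmp -size +100M
- Files <1 KiB
find /tmp -size -1024c
(Sizes are rounded up to whole units before comparison, so -size -1k matches only empty files; count in bytes with the c suffix instead.)
- Size range
find /data -size +1G -and -size -10G
(Files larger than 1 GiB and, because of the same rounding, no larger than 9 GiB.)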
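The unit rounding that -size applies is a common source of surprises, so it is worth checking on small files. A minimal sketch with invented file names, assuming GNU or BSD find and a head that supports -c:

```shell
sandbox=$(mktemp -d)
: > "$sandbox/empty"                          # 0 bytes
printf 'x' > "$sandbox/one_byte"              # 1 byte
head -c 2048 /dev/zero > "$sandbox/two_kib"   # 2048 bytes

# Each size is rounded UP to whole 1 KiB units first, so "-1k" means
# "fewer than one unit": only the empty file qualifies.
in_units=$(find "$sandbox" -type f -size -1k -exec basename {} \; | sort | xargs)

# Comparing in bytes gives the intended "smaller than 1 KiB".
in_bytes=$(find "$sandbox" -type f -size -1024c -exec basename {} \; | sort | xargs)

echo "units: $in_units"   # units: empty
echo "bytes: $in_bytes"   # bytes: empty one_byte
rm -rf "$sandbox"
```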
Orchestrating Compound Searches
- Find logs exceeding 10 MiB:
find /var/log -name '*.log' -size +10M
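- All recently modified or owned by alice:
find /srv -mtime -5 -o -user alice
(Caution: operator precedence. -o binds more loosely than the implicit AND between adjacent tests, so once you append an action such as -print or -delete, it attaches only to the right-hand side of the -o unless the alternatives are grouped with \( and \).)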
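The precedence trap shows up as soon as an explicit action follows an OR. The sketch below substitutes a -name test for -user alice, since a demo cannot assume that account exists; it also assumes GNU touch for the relative date.

```shell
sandbox=$(mktemp -d)
touch "$sandbox/a.log" "$sandbox/b.txt"       # both freshly modified
touch -d '10 days ago' "$sandbox/old.log"     # old, but matches *.log

# Parsed as: ( -type f -mtime -5 ) -o ( -name '*.log' -print )
# Fresh files satisfy the left side, so the right side (and -print) is never
# reached for them; only old.log is printed.
ungrouped=$(find "$sandbox" -type f -mtime -5 -o -name '*.log' -print | wc -l)

# Grouping makes -print apply to either alternative.
grouped=$(find "$sandbox" -type f \( -mtime -5 -o -name '*.log' \) -print | wc -l)

echo "ungrouped=$ungrouped grouped=$grouped"   # ungrouped=1 grouped=3
rm -rf "$sandbox"
```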
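Post-Processing with -exec and +
- Cleanup: Remove .tmp files older than 30 days.
find /tmp -name '*.tmp' -mtime +30 -exec rm -f {} \;
Use -exec ... {} + for batching:
find /tmp -name '*.tmp' -mtime +30 -exec rm -f {} +
(Reduces fork() overhead: find packs as many paths as fit under the system argument-length limit into each invocation and starts further invocations as needed, so this form does not fail with “Argument list too long”.)
- Content Preview: Print the first line of all .conf files.
find /etc -name '*.conf' -exec head -n 1 {} +
When head receives several files it prints a ==> name <== header before each one, so lines can be matched to files. A batch that happens to contain a single file gets no header; use GNU head's -v option or a per-file loop if every line must be labeled.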
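The difference between the two -exec terminators can be observed by letting echo stand in for the real command. A small sketch with hypothetical file names:

```shell
sandbox=$(mktemp -d)
for i in 1 2 3 4 5; do touch "$sandbox/file$i.tmp"; done

# With \; the command runs once per match: five echo calls, five lines.
per_file=$(find "$sandbox" -name '*.tmp' -exec echo {} \; | wc -l)

# With + the matches are appended to a single command line (larger sets
# would be split into several runs): one echo call, one line.
batched=$(find "$sandbox" -name '*.tmp' -exec echo {} + | wc -l)

echo "per_file=$per_file batched=$batched"   # per_file=5 batched=1
rm -rf "$sandbox"
```

Swapping echo back for rm -f gives the cleanup command from the list above with the same invocation pattern.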
Advanced Usage and Pitfalls
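- Directory Exclusion (-prune)
Skip /proc to avoid unnecessary permission errors:
find / -path /proc -prune -o -name '*.log' -print
(All of /proc is skipped. -path matches the path as given, so only the top-level /proc is pruned. To prune several directories, group them: find / \( -path /proc -o -path /sys \) -prune -o -print.)
- Regex Matching
find . -regex '.*\.\(jpg\|png\)$'
-regex is matched against the whole path, not just the file name, which is why the pattern starts with .*. Regular expression syntax is implementation-specific; check find --version (GNU findutils v4.4+ preferred). GNU find defaults to Emacs-style syntax and accepts -regextype to select another dialect.
- Permission Checks
find /srv -perm -o+w
Identifies world-writable entries, a key compliance and security scan. Directories and symlinks match too; add -type f to restrict the result to regular files.
- Empty Files and Directories
find /var/cache -empty
Useful for tidying up temp or spool locations. Cross-reference with du/df if space is unexpectedly low.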
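The -prune idiom is easier to trust after seeing it work on a miniature tree. The sketch below builds a fake layout (the proc, sys, and var/log directories here are ordinary sandbox folders, not the real ones) and prunes two of them.

```shell
sandbox=$(mktemp -d)
mkdir -p "$sandbox/proc/1" "$sandbox/sys" "$sandbox/var/log"
touch "$sandbox/proc/1/kernel.log" "$sandbox/sys/bus.log" "$sandbox/var/log/app.log"

# Left of -o: if the path is ./proc or ./sys, do not descend into it.
# Right of -o: everything else is tested against *.log and printed.
found=$(cd "$sandbox" && find . \( -path ./proc -o -path ./sys \) -prune \
        -o -name '*.log' -print)

echo "$found"   # ./var/log/app.log
rm -rf "$sandbox"
```

The explicit -print matters: without it, find would apply the default print to the whole expression and list the pruned directories themselves.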
Notes from Production
- Filesystem Performance: On very large filesystems (>10M files), expect latency. Index-based tools such as locate (mlocate) answer faster, but only from a periodically refreshed database, so use them only if up-to-date accuracy isn't mandatory. fd walks the live filesystem like find, so it is an alternative front end rather than an index.
- Errors: Permission denied errors are routine:
find: ‘/proc/45/task/45/fdinfo/4’: Permission denied
To suppress, append 2>/dev/null for cleaner output. Still, be wary of masking real issues during scripting.
- Operator Precedence: Always group logical operators in complex expressions using escaped parentheses \( and \).
Recommended References
- man 1 find: Authoritative syntax and nuance.
- GNU Findutils Manual
- find --version: Confirm exact version, as flags like -printf and -readable are not portable to BSD/macOS.
One More Trick: Integration With xargs
Pipelining can be more robust for certain tasks:
find /var/www -type f -name '*.php' -print0 | xargs -0 grep 'mysqli_'
This avoids issues with filenames containing whitespace or newlines.
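To see what the NUL-delimited form protects against, compare how xargs splits its input with and without it. This sketch uses invented PHP files and assumes the temporary directory path itself contains no whitespace.

```shell
sandbox=$(mktemp -d)
printf 'mysqli_connect();\n' > "$sandbox/plain.php"
printf 'mysqli_query();\n'   > "$sandbox/with space.php"
printf 'echo 1;\n'           > "$sandbox/other.php"

# Default xargs splits on whitespace: three files arrive as four arguments
# because "with space.php" is cut in two.
naive_args=$(find "$sandbox" -type f -name '*.php' | xargs -n 1 echo | wc -l)

# NUL-delimited names pass through intact, so grep sees three real files
# and reports the two that contain the pattern.
matches=$(find "$sandbox" -type f -name '*.php' -print0 \
          | xargs -0 grep -l 'mysqli_' | wc -l)

echo "naive_args=$naive_args matches=$matches"
rm -rf "$sandbox"
```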
This overview doesn't exhaust the find command's capabilities, but it covers core and advanced usage relevant to day-to-day system and application administration. For edge cases, combine find with tools like grep, awk, or custom scripts. Not every scenario is solved with a single command, but deliberate use of find saves time and preserves accuracy in complex Linux estates.