If pushed, I would say that find is the least-used-most-useful
general Unix command. Here we explore real-world uses of find,
either things I do or that I wish I could remember how to do :-)
find crawls over a directory tree, typically reporting the files,
directories, etc., that it encounters. It differs from ls in that
it recurses into sub-directories.
The general syntax is...
find path1 [path2 ...] predicate(s) action
Almost always, the "path" is . (current working directory). The
predicates are the fun part. The action (at least the way I use
find) is nearly always -print or -ls. (See a little mention of
-print0 later on.)
List what's in the current directory:
# short GNU-ish form:
find
# classic works-anywhere form:
find . -print
(In the GNU-ish minimal example, the "path" defaults to .; there is
no predicate -- so everything matches; and the action is indeed
-print.)
List the current directory, sorted:
find | sort
List only the things with "tommy" in the name:
find . -name '*tommy*'
# or, roughly (grep matches against the whole path, not just the name):
find | grep tommy
Only things without "tommy" in the name:
find . '(' '!' -name '*tommy*' ')'
# or...
find | grep -v tommy
The -name thing only matches against the "basename" of the paths
that find is chomping through. You can also match against the whole
paths with -wholename (a GNU spelling; -path is the portable one):
find . -wholename '*public/*tommy*'
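To make the basename-versus-whole-path distinction concrete, here is a minimal sketch run in a throwaway directory. The directory and file names are invented for the illustration, and -path is used as the portable equivalent of -wholename.

```shell
# Throwaway sandbox; every file and directory name here is made up.
demo="$(mktemp -d)"
trap 'rm -rf "$demo"' EXIT
mkdir -p "$demo/tommy" "$demo/public"
touch "$demo/tommy/notes.txt" "$demo/public/tommy.txt"
cd "$demo" || exit 1

# -name tests only the last path component.
by_name="$(find . -name '*tommy*' | LC_ALL=C sort)"

# grep sees the whole path, so everything under ./tommy matches as well.
by_grep="$(find . | grep tommy | LC_ALL=C sort)"

# -path tests the whole path (GNU find also accepts -wholename).
by_path="$(find . -path '*public/*tommy*')"

printf 'by -name:\n%s\n\nby grep:\n%s\n\nby -path:\n%s\n' \
    "$by_name" "$by_grep" "$by_path"
```

Here the grep form also reports ./tommy/notes.txt, which -name skips because the basename notes.txt does not contain "tommy".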
Choosing names with a regular expression, e.g. to find all files
under a svnwork directory ending in .c, .h, .cc, .cpp, or .C:
find . -regex '.*/svnwork/.*\.\([Cch]\|cc\|cpp\)'
(Note: the -regex thing matches against full paths, as with
-wholename. Also note: those are wacky Emacs-style regular
expressions; you can change that with -regextype.)
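As a sketch of what -regextype buys you, the same kind of match can be written without the backslash-heavy Emacs syntax. This assumes GNU find (the option is not in POSIX), and the file names are invented for the demonstration.

```shell
# Throwaway sandbox with made-up source files.
demo="$(mktemp -d)"
trap 'rm -rf "$demo"' EXIT
touch "$demo/a.c" "$demo/b.h" "$demo/c.cc" "$demo/d.cpp" "$demo/e.C" \
      "$demo/notes.txt" "$demo/g.cxx"

# GNU find only: -regextype must come before the -regex it affects.
# The pattern has to match the whole path, hence the leading '.*'.
sources="$(find "$demo" -type f -regextype posix-extended \
               -regex '.*\.([Cch]|cc|cpp)' | LC_ALL=C sort)"
printf '%s\n' "$sources"
```

With posix-extended, grouping and alternation are plain ( ) and | rather than \( \) and \|.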
Back to something more sane... Choosing names case-insensitively:
find . -iname '*tommy*'
(There's also -iregex, -iwholename, ...)
Alright, I admit it: it is super-rare for me to use -regex,
-iwholename, -iname, etc. However, I very commonly list all the files
(no directories, symbolic links, sockets, named pipes, special device
files, ...):
find . -type f
And I often want to know more about them than just their name; enter -ls:
find . -type f -ls
Very often, what I want to know is "What are the ten biggest files?"
That's:
find . -type f -ls | sort -k 7nr | head
(sort -k 7nr: sort on the 7th column [size, as it happens],
numerically, reverse order.)
Now, often in a case like that, I don't even want to bother with,
say, files smaller than 100KB; filter those out ahead of time with
-size:
find . -type f -size +100000c -ls | sort -k 7nr | head
(Note: with -size, and later with -mtime, +N means "greater than
N" and -N means "less than N". You almost never want -size
1000000c, which means "exactly one million characters [bytes]".)
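If counting -ls columns feels fragile, one alternative sketch is to have find print just the size and the path. This relies on GNU find's -printf, which is an assumption about your system; the three files and their sizes are invented for the demonstration.

```shell
# Throwaway sandbox with files of 10, 200 and 3000 bytes.
demo="$(mktemp -d)"
trap 'rm -rf "$demo"' EXIT
# printf pads an empty string to the requested width.
printf '%10s'   '' > "$demo/small"
printf '%200s'  '' > "$demo/medium"
printf '%3000s' '' > "$demo/big"

# GNU find only: -printf '%s %p\n' prints "size-in-bytes path" per file,
# so the sort key is simply the first column.
# -size +100c drops everything of 100 bytes or less before sorting.
biggest="$(find "$demo" -type f -size +100c -printf '%s %p\n' | sort -nr | head)"
printf '%s\n' "$biggest"
```

The 10-byte file never reaches sort, and the remaining two come out largest first.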
In case it isn't clear: the component parts of a predicate (-type f,
-size ...) get ANDed together. (Yes, there is an OR operator: -o,
which GNU find also accepts as -or.)
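A small sketch of OR in practice, using the portable -o spelling. The parentheses matter because the implicit AND binds tighter than OR. The file names are invented for the demonstration.

```shell
# Throwaway sandbox with made-up names.
demo="$(mktemp -d)"
trap 'rm -rf "$demo"' EXIT
mkdir "$demo/old.c"          # a directory with a misleading name
touch "$demo/main.c" "$demo/main.h" "$demo/README"

# Without the parentheses this would parse as
#   (-type f AND -name '*.c') OR -name '*.h'
# because the implicit AND binds tighter than -o.
c_or_h="$(find "$demo" -type f \( -name '*.c' -o -name '*.h' \) | LC_ALL=C sort)"
printf '%s\n' "$c_or_h"
```

Only the two regular files are listed; the directory old.c fails the -type f test.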
And, since I've mentioned it: -mtime checks the modification time
of a file, i.e. how old it is. So, for files over 20KB modified
sometime in the last month:
find . -type f -mtime -30 -size +20000c -ls | sort -k 7nr
There are a whole slew of find options to do with picking files by
their modification/status-change/access times (-mtime, -ctime, -atime)
and doing so in super-precise ways... The only thing I ever use
besides -mtime is...
find . -type f -newer ~/t/last-update -ls | sort -k 7nr
I.e. pick files by their relative age compared to a file. The nice
thing about that is you can create a file with the exact time you
care about and then find against that, e.g.:
touch -t 200904010000.01 ~/t/april-fool
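The touch-then-find trick can be tried safely in a scratch directory. This is only a sketch: the file names before and after are invented, and the timestamps are chosen to sit a month either side of the reference file.

```shell
# Throwaway sandbox; timestamps are set explicitly so the result is repeatable.
demo="$(mktemp -d)"
trap 'rm -rf "$demo"' EXIT
touch -t 200904010000.01 "$demo/april-fool"   # the reference point
touch -t 200903010000    "$demo/before"       # one month earlier
touch -t 200905010000    "$demo/after"        # one month later

# -newer is true for files modified more recently than the reference file;
# the reference file is not newer than itself, so it is not listed.
newer="$(find "$demo" -type f -newer "$demo/april-fool")"
printf '%s\n' "$newer"
```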
Another 'mtime-y' thing I often do is try to find the guilty among
recently-changed files. This usually takes the form:
ls -ltr `find . -type f -mtime -1`
Besides modification and access times, you can also go finding against
permissions, with -perm. Typically, the problem is "readable files
that shouldn't be", "unreadable files that should be", "files with
gratuitous execute permission", or even "generally horked directory
permissions". I always have to double-check the manual for this
stuff but it's usually things like:
# no read permission of any kind (even owner):
find . -type f \( \! -perm /444 \)
# no read permission of any kind for non-owner (group or other):
find . -type f \( \! -perm /044 \)
# write permission for group or other:
find . -type f -perm /022
# any non-directory with some kind of execute permission:
find . \( \! -type d \) -perm /111
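Since -perm is easy to get backwards, here is a sketch that checks the /MODE form against three files with known permissions. It assumes GNU find's -perm /MODE syntax; the file names and modes are invented for the demonstration.

```shell
# Throwaway sandbox with explicitly chosen modes.
demo="$(mktemp -d)"
trap 'rm -rf "$demo"' EXIT
touch "$demo/private" "$demo/shared" "$demo/script"
chmod 600 "$demo/private"   # rw-------
chmod 664 "$demo/shared"    # rw-rw-r--
chmod 755 "$demo/script"    # rwxr-xr-x

# -perm /MODE (GNU syntax) is true if ANY of the listed bits is set.
no_outside_read="$(find "$demo" -type f '!' -perm /044)"  # neither group nor other can read
outside_write="$(find "$demo" -type f -perm /022)"        # group or other can write
executable="$(find "$demo" -type f -perm /111)"           # some execute bit is set

printf 'no group/other read: %s\n' "$no_outside_read"
printf 'group/other write:   %s\n' "$outside_write"
printf 'executable:          %s\n' "$executable"
```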
Again, there is often a find-plus-grep cheap-and-cheerful equivalent:
# look for directories with permissions we don't like:
find . -type d -ls | grep 'drwx------'
# even more flexible, using regular-expression wildcards (any owner bits):
find . -type d -ls | grep 'd...------'
Times, permissions,... yes, you can also look for users and groups.
The obvious things: find stuff owned by a user (that perhaps shouldn't
be), or find stuff that isn't owned by a user (but that should be).
find . -user root -ls
find . \( \! -user tommyk \) -ls
Another tangent... Files (-type f) are not always the object of my
attentions. Directories sometimes feature; so...
# traverse in depth-first order, finding the empty directories:
find . -depth -type d -empty
(-depth is important if you want to do something rash like
'/bin/rmdir' on them [example later].)
Symbolic links also get singled out for attention, say, when they go
wonky (when copied, for example) and need to be fixed. Here's the
finding part...:
find . -type l
If you use -ls, you can grep on the symbolic links' values:
# find symlinks pointing to Acme tree:
find . -type l -ls | grep -- '-> /usr/local/acme'
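If you would rather stay inside find, the -lname test matches a shell pattern against the link's target. It is available in GNU and BSD find but is not in POSIX, so treat this as a sketch under that assumption; the link names and the second target are invented.

```shell
# Throwaway sandbox; the link targets need not exist for this demonstration.
demo="$(mktemp -d)"
trap 'rm -rf "$demo"' EXIT
ln -s /usr/local/acme/bin/tool "$demo/acme-tool"
ln -s /opt/other/tool          "$demo/other-tool"

# -lname matches a glob against what the symlink points to,
# not against the symlink's own name.
acme_links="$(find "$demo" -type l -lname '/usr/local/acme*')"
printf '%s\n' "$acme_links"
```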
OK, we're nearly done with the fun. Just a detail or two left.
If you have reason to find in the root directory, you run the risk
of walking through every mounted filesystem (when perhaps all you want
to know is "Why is my root filesystem 100% full?"). For this
situation, use -mount to avoid crossing filesystem boundaries:
find / -mount -type f -size +1000000c -ls
A word about -print0 (I promised...): If you are going to do
something with find output -- and I recommend xargs -- and if the
find output contains spaces (or other shell-significant characters),
then you need -print0 (separate with NUL bytes) instead of -print
(separate with newline bytes). So, to remove empty directories:
find . -depth -type d -empty -print0 | xargs -0 /bin/rmdir
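One thing worth knowing about that pipeline: rmdir only runs after find has finished looking, so a single pass removes just the directories that were already empty. The sketch below shows this on an invented a/b/c chain and compares it with GNU find's -delete action, which is an assumption about your find. -mindepth 1 (also a GNU/BSD extension) keeps the starting directory itself out of harm's way.

```shell
# Throwaway sandbox: a chain of nested empty directories plus one non-empty one.
demo="$(mktemp -d)"
trap 'rm -rf "$demo"' EXIT
mkdir -p "$demo/a/b/c" "$demo/keep"
touch "$demo/keep/file"

# One pass of the pipeline: only c was empty when find looked at it,
# so only c is removed; b becomes empty afterwards.
find "$demo" -mindepth 1 -depth -type d -empty -print0 | xargs -0 rmdir
after_pipeline="$(cd "$demo" && find . -mindepth 1 -type d | LC_ALL=C sort)"

# GNU find's -delete implies -depth and removes each directory as soon as
# it is found empty, so the now-empty chain a/b collapses in one run.
find "$demo" -mindepth 1 -type d -empty -delete
after_delete="$(cd "$demo" && find . -mindepth 1 -type d | LC_ALL=C sort)"

printf 'after pipeline:\n%s\n\nafter -delete:\n%s\n' "$after_pipeline" "$after_delete"
```

Re-running the pipeline until it removes nothing more gets the same end result without -delete.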
Finally, if you run find across big wads of files not all your own,
you may get lots of (expected) 'permission denied' errors which just
clutter up the output. Two ways around that:
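# the find way (GNU find): skip unreadable directories instead of
# entering them; -nowarn would not help here, since it only silences
# warnings about command-line usage:
find . -type d '!' -readable -prune -o -type f -name 'core.[0-9][0-9]*' -print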
# the olde (non-csh) Unix way:
find . -type f -name 'core.[0-9][0-9]*' 2> /dev/null
I hope the above illustrates that there are a lot of things you can
(and should) do with find. And the dirty little secret, as our
examples have shown: a simple find command in a straightforward shell
pipeline is often easier than deep find magic.
[An earlier version of this note appeared in Verilab's internal newsletter.]