by John Walker
Inside every Unix shop with more than a handful of machines, odds are there's a file system slowly growing toward 100% capacity, at which point all kinds of unpleasant events begin to transpire. System administrators who value the serenity and rejuvenation of a good night's sleep over the flattering feeling of being needed which comes from having your pager go off at four in the morning need tools which anticipate little problems before they mature into full-on screaming crises. This page presents a suite of such tools, all Perl programs, which warn of impending file-system-full disasters and aid in their remediation.
WatchFull consists of three independent Perl scripts which address different aspects of file system capacity management. In order to use these tools you must have Perl installed on your system. The programs were tested with Perl v5.22.1 and should work on any version of Perl from 5.003 onward. Note that while Perl has been ported to non-Unix platforms, these utilities are Unix-specific as they require other standard Unix programs not present on other systems. If you install these programs as cron jobs, be sure to verify that Perl can be found from the abbreviated search path used for such jobs and, if not, either add it to the path or explicitly specify the full pathname in the crontab entry.
Documentation for WatchFull follows, in Unix manual page style.
WatchFull - monitor file systems approaching capacity
perl WatchFull.pl [ -d ] [ -g name ] [ -m address ] [ -s /path=thresh,… ] [ -t thresh ] [ -u ] [ -x /path,… ]
WatchFull examines mounted file systems on the machine on which it is run and generates a report listing those which exceed a given capacity threshold. If any file systems are found to exceed their thresholds, a warning is mailed to the designated system administrator.
Here is an example of a warning message mailed by WatchFull to the administrator of a host named “pallas”.
    Greetings, carbon-based lifeform.

    The following file systems on pallas are approaching capacity.

    /dev/dsk/dks0d2s7  xfs  1960472  1804112  156360  93  /files1
WatchFull assumes the “df -k” command produces output in the format it expects and that the Mail command can be used to send mail to the designated recipient. If this isn't the case, you'll have to modify the Perl program accordingly. On some systems you'll have to replace Mail with mailx.
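If your df produces a different layout, the piece you would most likely need to adapt is the parsing of its output. The following is only a minimal sketch of one way such parsing could work, not the code WatchFull actually uses. It assumes the last field of each line is the mount point and the field before it is the percentage used, with or without a trailing “%”. The subroutine name over_threshold and the header line in the sample data are illustrative assumptions; the data line is the one from the warning message above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return [mount point, percent used] pairs for every df line whose
# capacity column exceeds $thresh. Assumes the last field is the mount
# point and the one before it is percent used, with or without a "%".
sub over_threshold {
    my ($thresh, @lines) = @_;
    my @full;
    for my $line (@lines) {
        my @f = split ' ', $line;
        next if @f < 2;                     # blank or wrapped line
        my ($pct, $mount) = @f[-2, -1];
        next unless $pct =~ /^(\d+)%?$/;    # also skips the header line
        my $used = $1;
        push @full, [$mount, $used] if $used > $thresh;
    }
    return @full;
}

# Sample input (hypothetical header plus the line from the message above).
# For live data, one could instead use:  my @sample = `df -k`;
my @sample = (
    'Filesystem Type kbytes use avail %use Mounted on',
    '/dev/dsk/dks0d2s7 xfs 1960472 1804112 156360 93 /files1',
);
printf "%s is %d%% full\n", @$_ for over_threshold(90, @sample);
```

Mount points containing spaces, and df implementations which wrap long device names onto a second line, would need additional handling.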
df(1), Mail(1)
One of the most common causes of file system exhaustion is system and server log files which grow without bound as entries are appended to them. If you don't keep an eye on these files, they can eat your disc alive. For example, let's take a peek at the HTTP log file directory on the www.fourmilab.ch server right now:
    /files/server/logs/http> ls -lt
    total 5012848
    -rw-r--r-- 1 root sys  477730174 Aug 22 15:21 agent_log
    -rw-r--r-- 1 root sys  957131968 Aug 22 15:21 referer_log
    -rw-r--r-- 1 root sys 1113544113 Aug 22 15:21 access_log
    -rw-r--r-- 1 root sys   18168734 Aug 22 15:19 error_log
Yikes! That's two and a half G- G- Gigabytes of log files—time to clean house! (Actually, the file system on which these files are kept has a capacity of 17 Gb and is only about 25% full, so I can go a long time before taking the garbage out….)
Anyway, LogJam will keep an eye on the log files on your system and E-mail warnings when one or more exceed size thresholds you define on a file-by-file basis. Amid the daily chaos of system administration, it's easy to overlook log files ratcheting up to absurd dimensions. LogJam lets you know when they need attending to.
Documentation for LogJam follows, in Unix manual page style.
LogJam - monitor size of continuously growing files
perl LogJam.pl [ -d ] [ -g name ] [ -m address ] [ -t ] [ -u ] filename threshold…
A common cause of file system full crises is system and server log files which grow without bound as transactions are added. Most modern Unix systems incorporate mechanisms to limit the space consumed by system files such as console message transcripts and login histories, but many server logs such as FTP and HTTP access logs must be manually “cycled” when they grow too large. This is a task easily overlooked amidst the quotidian alarums and diversions of system administration. LogJam keeps an eye on these log files and sends a warning when one or more exceeds a given size threshold.
On the command line, list one or more “filename threshold” pairs which specify a file to be checked and the size threshold which, when exceeded, will generate a warning for that file. The size may be specified in bytes, or with a suffix of “K” for kilobytes, “M” for megabytes, “G” for gigabytes, or “T” for terabytes. Suffixes denote powers of 1000 and may be either upper or lower case. For example, to generate a warning when an HTTP access log exceeds 500 megabytes, one would use:
perl LogJam.pl /files/server/logs/http/access_log 500M
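To make the threshold syntax concrete, here is a small sketch of how a specification such as “500M” could be turned into a byte count and compared with a file's size. It is an illustration under stated assumptions rather than LogJam's own code: the name parse_size is hypothetical, and the sketch accepts only whole numbers before the suffix, which the description above does not actually specify.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Decimal multipliers, per the description above (powers of 1000).
my %multiplier = (K => 1e3, M => 1e6, G => 1e9, T => 1e12);

# Convert a threshold such as "500M" or "2g" to a byte count.
# Assumption: only whole numbers are accepted before the suffix.
sub parse_size {
    my ($spec) = @_;
    my ($number, $suffix) = $spec =~ /^(\d+)([KMGT]?)$/i
        or die "Invalid size threshold: $spec\n";
    return $suffix eq '' ? $number : $number * $multiplier{uc $suffix};
}

# Walk the "filename threshold" pairs given on the command line.
while (my ($file, $limit) = splice(@ARGV, 0, 2)) {
    die "Missing threshold for $file\n" unless defined $limit;
    my $size = -s $file;
    next unless defined $size;              # file does not exist
    print "$file: $size bytes exceeds $limit\n"
        if $size > parse_size($limit);
}
```

With this reading of the suffixes, “500M” means 500,000,000 bytes, not 500 × 1,048,576.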
The size of a file is deemed to be whatever the Perl -s operator says it is. On systems which support and contain “holey” files—files in which all logical addresses do not correspond to allocated storage—the size reported may not correspond to the amount of storage actually occupied by the file.
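The difference between the two notions of size can be seen with a short experiment. This sketch creates a temporary file containing a hole and compares the length reported by -s with the storage reported by stat. The subroutine name sizes is hypothetical, and the code assumes stat counts allocated space in 512-byte blocks, which is common but not universal.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Return (logical size, allocated bytes) for a file, or an empty list
# if it cannot be examined. Assumes stat reports 512-byte blocks.
sub sizes {
    my ($file) = @_;
    my @st = stat($file) or return;
    return ($st[7], $st[12] * 512);
}

# Build a file with a hole: seek 1,000,000 bytes in, then write one byte.
my ($fh, $name) = tempfile(UNLINK => 1);
seek($fh, 1_000_000, 0) or die "seek: $!\n";
print {$fh} "x";
close($fh) or die "close: $!\n";

my ($logical, $allocated) = sizes($name);
print "-s reports $logical bytes; about $allocated bytes allocated\n";
```

On a file system which supports holes, the allocated figure will usually be far smaller than the logical one; elsewhere the two will be roughly equal.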
Sizes of the files named on the command line are determined with the “du -sk” command. If this command does not produce the expected format on your system, you'll have to modify the Perl program to specify the appropriate command and/or parse the results it returns.
LogJam assumes the Mail command can be used to send mail to the designated recipient. If this isn't the case, you'll have to modify the source code accordingly. On some systems you'll have to replace Mail with mailx.
du(1), Mail(1)
Once WatchFull has alerted you to a file system approaching capacity and you've dealt with any oversized log files fingered by LogJam, it's time to unleash the witch hunt for huge files lurking in less obvious locations. You know—those 275 megabyte core dumps from Netscrape in each of your users' home directories, the fellow with half a gigabyte of, shall we say, “non-work related” MPEG files, system crash core dumps and kernel images dating back to 1994, patch back-out directories from three operating system releases ago, etc. This is where Top40 comes in.
Top40 scans one or more directory trees and prepares a list of the 40 largest files found in them. (You can specify the number of files to be shown with a command line option.) These files are prime candidates for clean-up campaigns.
Documentation for Top40 follows, in Unix manual page style.
Top40 - show largest files in directory trees
Top40 [ -f ] [ -h ] [ -n count ] [ -s size ] [ -t ] [ -u ] rootdir…
Top40 scans one or more directory trees starting at its rootdir and displays a list of the largest files found, 40 by default, in descending order by size. The rootdir arguments need not be file system mount points—any directory may be scanned. If no rootdir is specified, the current directory is scanned.
The Unix find command is used to traverse the specified directory trees and pre-filter the files found therein. If the find command on your system behaves in a non-standard manner, you may have to modify the options supplied to it in the program source.
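For readers who need to adapt the scan, the general technique can be sketched as follows: run find, read the file names it prints, measure each with -s, and keep the largest. This is not Top40's source. The find options shown (-type f -print) and the name largest_files are assumptions made for illustration, and the list form of a piped open used here requires a newer Perl than the 5.003 minimum mentioned earlier.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return up to $count [size, path] pairs, largest first, for the plain
# files under @roots. The find options here are an assumption.
sub largest_files {
    my ($count, @roots) = @_;
    open(my $find, '-|', 'find', @roots, '-type', 'f', '-print')
        or die "Cannot run find: $!\n";
    my @files;
    while (my $path = <$find>) {
        chomp $path;
        my $size = -s $path or next;        # skip empty or vanished files
        push @files, [$size, $path];
    }
    close $find;
    my @sorted = sort { $b->[0] <=> $a->[0] } @files;
    splice(@sorted, $count) if @sorted > $count;
    return @sorted;
}

# Scan the directories named on the command line, or "." if none.
my @roots = @ARGV ? @ARGV : ('.');
printf "%12d  %s\n", @$_ for largest_files(40, @roots);
```

This sketch holds every file name in memory before sorting and will mishandle names containing newlines. On very large trees, letting find discard small files first, for example with its -size test, would reduce the work considerably.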
Directories which contain a multitude of small files will escape scrutiny by Top40. A separate scan which sums the size of top-level directory contents would be required to identify such perpetrators and Top40 does not presently do this.
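Should you need such a scan, one possible approach is sketched below using Perl's standard File::Find module. It is not part of Top40; the name dir_totals and the choice to lump files sitting directly in the root under “.” are assumptions of this sketch.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;
use File::Spec;

# Sum the sizes of plain files beneath each immediate child of $root.
# Files sitting directly in $root are counted under the key ".".
sub dir_totals {
    my ($root) = @_;
    my %total;
    find({
        no_chdir => 1,                      # keep $_ as the full path
        wanted   => sub {
            return if -l $_;                # ignore symbolic links
            return unless -f $_;
            my @parts = File::Spec->splitdir(File::Spec->abs2rel($_, $root));
            my $top = @parts > 1 ? $parts[0] : '.';
            $total{$top} += (-s _) || 0;
        },
    }, $root);
    return %total;
}

# Report totals for the directory named on the command line, or ".".
my %total = dir_totals(@ARGV ? $ARGV[0] : '.');
printf "%12d  %s\n", $total{$_}, $_
    for sort { $total{$b} <=> $total{$a} } keys %total;
```

As with Top40 itself, the totals reflect what -s reports, so holey files will be overstated.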
The size of a file is taken to be whatever the Perl -s operator says it is. On systems which support and contain “holey” files—files in which all logical addresses do not correspond to allocated storage—the size reported may not correspond to the amount of storage actually occupied by the file.
find(1)
To install WatchFull, download the archive using the link above, uncompress with gunzip, then extract the contents with tar. The resulting directory will contain Perl source code for each of the components of the package, this document in HTML format, and a Makefile used to create the release archive.
This software is in the public domain. Permission to use, copy, modify, and distribute this software and its documentation for any purpose and without fee is hereby granted, without any conditions or restrictions. This software is provided “as is” without express or implied warranty.
Absolutely no support or assistance of any kind whatsoever is available for WatchFull—you are entirely on your own.