Note: PageVisits is a Common Gateway Interface (CGI) program written
in the Perl language.
It requires image processing utilities from the Netpbm
image processing toolkit. Installing a CGI program requires detailed
knowledge of the Web server configuration of the system on which
it is to be installed, and may require administrative (super-user)
privilege to install. Programs and data directories must be installed with
the correct ownership attributes and permissions, and program and
library paths may need to be set to permit the CGI program to find
the utilities it requires. Since Web server configurations differ
widely from system to system, there's no cookbook approach to installing
a program such as this--you need to understand what you're
doing, and know how to track down and fix problems based on error
messages in the HTTP server error log.
It's not an exaggeration to say that most system administrators who can
install a program such as this without some difficulty and a few false
starts could probably write such a program themselves if they
wanted to. The whole point of this program is that they don't have to!
This program can serve as a point of departure for those who wish to
extend it into something much fancier. It is almost impossible to
troubleshoot CGI programs remotely, and even if it were possible, I
don't have the time to do so. Consequently, while you're free to use
this program in any way you wish without any restrictions, you're
entirely on your own--it is utterly unsupported.
Downloading and Installation
To install PageVisits on your Web server, perform the following steps.
- Download the PageVisits distribution archive: pagevisits.tar.gz
- Uncompress and extract the archive into a new empty directory.
- Determine where the following directories are located on your
Web server. In the Example column, I give the
directory used by the system-wide Apache Web server
installed with Red Hat Enterprise Linux 3. These
directories vary from system to system; it is
essential you determine the correct locations
for your installation! If you're installing
PageVisits as an individual user on a multi-user
server, the directories will usually be within your
own home directory tree--consult your grumpy, overworked
system administrator for information and admonishment.
I've provided blank items in the following table to write
in the directory paths for your system.
Directory | Variable | Example | Location on your system
CGI binaries | $CGI_Directory | /var/www/cgi-bin |
Netpbm utilities | $NETPBM_Directory | /usr/bin |
Perl interpreter | #! | /usr/bin/perl |
- Edit the PageVisits.pl program, locate the "Directory Configuration"
section at the top, and replace the values assigned to each of the
variables in the Variable column of the table above with the
correct value for your system. The last row in the table specifies the location
of the Perl interpreter on your system; on Unix-like systems you can
determine this with the shell command "which perl". This location
should be entered on the first line of PageVisits.pl, following the
"#! " characters.
- Save the modified PageVisits.pl file and test it with the
command:
./PageVisits.pl test
If you get a "bad interpreter" or "not executable" error, the location
of Perl in the first line of the program is probably incorrect or
the process of editing the program has caused it to lose execute
permission (or Perl is missing or improperly installed on your system).
If you get one or more missing directory or file messages, correct the
directory configuration accordingly until you get the message
"PageVisits configuration test passed.".
- Copy PageVisits.pl into your CGI binaries directory;
this is the directory you specified as $CGI_Directory. Make sure
the program has execute permission for all users (on Unix-like
systems, you can use the command "chmod 755 PageVisits.pl"
to set global execute permission).
- Create a subdirectory within your CGI binaries directory
to contain the digits for the counters and the files which
contain the visits count for each page with a counter. I
call this directory PageVisits on my system.
Copy all of the digit images from the "digits"
directory in the distribution into this directory. (If
you only plan to use a limited selection of fonts, feel
free to copy only the digits for the fonts you'll be
using.) The directory and the digit files within it should
be readable by all users.
- Create a test counter data file. This file is stored
in the directory you just created along with the digit
images. This file must be owned and writable by the
user ID under which your Web server runs CGI programs.
This differs from system to system; you'll need to
figure out how your server is configured to set
the proper ownership on this file. Note that to change
the file ownership, you must be logged in as the
super-user (root). One trick you can use to determine
the user ID of your HTTP server on Unix systems is to
enter the command "ps -ef | grep httpd"
which will list the processes belonging to the server,
giving the user name. On Red Hat Linux systems, the user
and group IDs are both "apache", and you could
create the test counter data file as follows, assuming
the current directory is the CGI binaries directory and
you called the digit subdirectory PageVisits.
# Create counter file and set count to 0
echo 0 >PageVisits/test.dat
chown apache:apache PageVisits/test.dat
- Install the test.html file included with the distribution
in a temporary location on your Web server and access it with
a browser. If everything's installed properly, you should see
the counter display as 1 the first time you load the page and
increment each time you reload it. If instead you see the text
"PageVisits Error", something went wrong. Consult your Web server's
error log to identify the source of the error and correct it.
- Now you're ready to add counters to Web pages on your site.
Remove the PageVisits/test.dat counter file and the
test.html test page from your server and follow
the instructions below to add counters to your own pages.
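The copying and permission steps above can be rehearsed in a scratch directory before touching the real server. What follows is a minimal sketch rather than the author's procedure: a one-line stub stands in for PageVisits.pl, empty files stand in for the digit images, and temporary directories stand in for $CGI_Directory and the unpacked distribution (all assumptions made for illustration). The chown step is left as a comment because it needs root and the user name of your own server.

```sh
#!/bin/sh
# Rehearse the install layout in scratch directories (hypothetical stand-ins).
CGI_Directory=$(mktemp -d)     # stand-in for e.g. /var/www/cgi-bin
DIST=$(mktemp -d)              # stand-in for the unpacked distribution

# Stub files so the sketch runs without the real archive.
printf '#! /usr/bin/perl\n' > "$DIST/PageVisits.pl"
mkdir "$DIST/digits"
for d in 0 1 2 3 4 5 6 7 8 9; do : > "$DIST/digits/times$d.ppm"; done

# Copy the program and give it execute permission for all users.
cp "$DIST/PageVisits.pl" "$CGI_Directory/"
chmod 755 "$CGI_Directory/PageVisits.pl"

# Create the counter directory and copy the digit images, readable by all.
mkdir "$CGI_Directory/PageVisits"
chmod 755 "$CGI_Directory/PageVisits"
cp "$DIST"/digits/*.ppm "$CGI_Directory/PageVisits/"
chmod 644 "$CGI_Directory"/PageVisits/*.ppm

# Create the test counter file with the count set to 0.
echo 0 > "$CGI_Directory/PageVisits/test.dat"
# On the real server, as root (substitute your server's user and group):
#   chown apache:apache PageVisits/test.dat

ls -l "$CGI_Directory" "$CGI_Directory/PageVisits"
```

Once the listing looks right, the same commands can be repeated against the real directories from the table above.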
Adding Counters to Your Web Pages
Now that your counter is working with the test page, you can add
counters to pages on your Web site. For each page you wish to
have a counter, create a counter data file in the PageVisits
directory (or whatever you used as the "dir=" specification
in the test HTML file). You can use as many different counter data
directories as you wish, but since each must contain the digit images,
proliferating data directories wastes space. It's usually best to
have a single directory with separate counter data files for each
page which maintains a counter. It's generally wise to give the counter
data files names derived from that of the page they're associated
with. For example, suppose you wish to add a counter to a page on your
site whose URL is:
http://www.ratburger.net/recipes/roquefort_ratsauce.html
You might add a file for this page to the PageVisits directory
named "roquefort_ratsauce.dat". Be sure
to set the permissions on this file so it can be read and written
by the user ID under which CGI programs run, as you did with the
test.dat file, and to initialise the counter value to
"0" in the first line of the file. Now you can add the counter
to your page with HTML like the following:
Visits to this page:
<img src="/cgi-bin/PageVisits.pl?dir=PageVisits&pageid=roquefort_ratsauce"
align="bottom" alt="Page Visit Counter" />
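The counter data file this HTML refers to could be created along the following lines. This is only a sketch: the function name new_counter and its arguments are made up for illustration, and the ownership change is skipped unless you supply an owner, since it needs root.

```sh
#!/bin/sh
# new_counter DIR PAGEID [OWNER]: create DIR/PAGEID.dat with a count of 0.
# The function name and arguments are hypothetical, chosen for this sketch.
new_counter() {
    dir=$1 pageid=$2 owner=${3:-}
    file="$dir/$pageid.dat"
    if [ -e "$file" ]; then
        echo "$file already exists; leaving it alone" >&2
        return 1
    fi
    echo 0 > "$file" || return 1
    # Changing ownership needs root; skip it when no owner is given.
    if [ -n "$owner" ]; then
        chown "$owner" "$file" || return 1
    fi
    echo "$file"
}

# Demonstration in a scratch directory standing in for the counter directory.
counters=$(mktemp -d)
new_counter "$counters" roquefort_ratsauce
```

On a Red Hat style server you might instead run, as root, something like: new_counter /var/www/cgi-bin/PageVisits roquefort_ratsauce apache:apache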
Test the page and verify that the counter appears and updates each
time you reload the page. If you get a "broken image" instead of the
counter, look at your Web server's error log to see what went wrong;
almost any error which causes PageVisits to fail will place a message
in the log. If your site is so busy that Web server error messages
scroll by faster than you can read them (as was the case here
during the distributed denial of service attack against Fourmilab
in early 2004), you can filter for lines containing the substring
"PageVisits" to exclude other error messages.
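Filtering the log for that substring might look like the sketch below. The function name is invented for this example, and the default log path is an assumption (the usual location for Apache on Red Hat systems); pass the path of your own error log if it differs.

```sh
#!/bin/sh
# pagevisits_errors [LOGFILE]: print only the log lines mentioning PageVisits.
# The function name is hypothetical; the default path is an assumption.
pagevisits_errors() {
    log=${1:-/var/log/httpd/error_log}
    if [ ! -r "$log" ]; then
        echo "cannot read $log" >&2
        return 2
    fi
    # -F matches the fixed string; grep exits 1 when nothing matches.
    grep -F 'PageVisits' "$log"
}
```

To watch messages as they arrive rather than after the fact, the same filter can be fed from tail, for example: tail -f /var/log/httpd/error_log | grep -F PageVisits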
Once the counter's working, you may wish to adjust its alignment on the page
by changing the "align=" specification in the <img> tag,
and/or choose a different font (see below) to better conform to your page
design.
Configuring Referer Validation
In an ideal world, or indeed, even the far from ideal Internet of a
few years ago, the above counter configuration would suffice.
Regrettably, the "faculty club" Internet of yore has devolved into
today's Internet slum,
and when adding an active resource to a Web server, however
rudimentary, one mustn't forget the bars on the windows, the
motion detectors, and the vicious dog.
In the case of a Web counter, the risk consists of a malicious
individual with too much time on their hands "running the counter"
to inflate its value to absurd numbers in the interest of making
it look like you're claiming bogus numbers of visits to your page
or, more seriously, using the counter to consume resources on
your Web server and thus keep it from processing legitimate
requests in a timely fashion (hence, a "denial of service attack").
The best way to guard against this is to bind each counter to the
complete URL of the page which references it. Since this
URL will contain the name of your site, which is extracted
from the page containing the reference to the counter, attackers
won't easily be able to "hijack" your counter. To bind the
counter to a page on your site, add a second line to the counter
data file, after the line with the number of page visits. This
line should begin with the string "HTTP_REFERER=" followed
by the complete URL of the page which references the counter,
for example:
HTTP_REFERER=http://www.ratburger.net/recipes/roquefort_ratsauce.html
With this as the second line in your roquefort_ratsauce.dat
file, references to the counter in other pages on other sites
will return a "broken image", neither updating the counter nor
consuming resources on the Web server to create the counter image.
Messages in the server error log will notify you of the attack, but
the attacker won't receive any information which indicates what
went wrong. Restricting access to your counter based on the
HTTP_REFERER of the page which references it protects only against
malicious users embedding your counter in other pages or running it
up by repeatedly requesting its URL directly.
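Adding the binding line to an existing counter file can be sketched as follows. The function name bind_counter is hypothetical, and the sketch assumes the file holds nothing beyond the count and an optional HTTP_REFERER line; any existing second line is replaced. Writing back into the same file, rather than replacing it, leaves its ownership and permissions as they were.

```sh
#!/bin/sh
# bind_counter FILE URL: keep the count on line 1, set line 2 to HTTP_REFERER=URL.
# The function name is hypothetical; anything after line 1 is replaced.
bind_counter() {
    file=$1 url=$2
    # Read the current count before the file is rewritten.
    count=$(head -n 1 "$file") || return 1
    printf '%s\nHTTP_REFERER=%s\n' "$count" "$url" > "$file"
}

# Demonstration on a scratch file standing in for roquefort_ratsauce.dat.
counter=$(mktemp)
echo 0 > "$counter"
bind_counter "$counter" 'http://www.ratburger.net/recipes/roquefort_ratsauce.html'
cat "$counter"
```

Run as shown, the scratch file ends up with the count on its first line and the HTTP_REFERER= line from the example above on its second.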
Note that referer validation provides only a rudimentary degree of
security. An attacker who knows enough to forge requests from a
browser can circumvent verification of the referring page. But then
an attacker could run up your counter simply by requesting reloads
of the page which references it over and over in a script. To cope
with such attacks, you need comprehensive defence against denial of service
attacks like the Invisible Gardol Shield employed at this site. Perhaps
some day I'll find the time to adequately document that package, which
is dozens of times more difficult to configure and install than
PageVisits, but when you need it, it's worth it.
Using Supplied Fonts
Seven fonts are included with PageVisits; a specimen of each font is
given in the following table.
Font name | Height | Digits
times | 13 |
courier | 12 |
helvetica | 13 |
newsgothic | 26 |
brushscript | 23 |
curlz | 27 |
arabic | 36 |
By default, the "times" font is used. To specify a different font, add a
"font=" argument to the PageVisits image request URL. For example, to
use the courier font, you might specify the following:
Visits to this page:
<img src="/cgi-bin/PageVisits.pl?dir=PageVisits&pageid=rat_a_tailie&font=courier"
align="bottom" alt="Page Visit Counter" />
A font simply consists of ten files, one for each of the digits from
0 to 9, in Netpbm "PPM" format. The digits may be black and white
bitmaps, grey scale images, or full colour--regardless of the image type,
the file name for a given digit is always the font name followed by the digit
with an extension of ".ppm", for example "times4.ppm".
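Given that naming rule, a quick check that a font is complete in your counter directory might look like this. It is a sketch only: the function name check_font is invented here, and the demonstration uses empty placeholder files in a scratch directory rather than real digit images.

```sh
#!/bin/sh
# check_font DIR FONT: report any of the ten digit files missing for FONT in DIR.
# The function name is hypothetical; it succeeds only if all ten are readable.
check_font() {
    dir=$1 font=$2 missing=0
    for digit in 0 1 2 3 4 5 6 7 8 9; do
        file="$dir/$font$digit.ppm"
        if [ ! -r "$file" ]; then
            echo "missing or unreadable: $file" >&2
            missing=$((missing + 1))
        fi
    done
    [ "$missing" -eq 0 ]
}

# Demonstration with empty placeholder files in a scratch directory.
fonts=$(mktemp -d)
for digit in 0 1 2 3 4 5 6 7 8 9; do : > "$fonts/courier$digit.ppm"; done
check_font "$fonts" courier && echo "courier: all ten digit files present"
```

Pointing the function at your real counter directory before adding a "font=" argument to a page can save a trip to the error log.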
Adding Custom Fonts
If none of the fonts included with PageVisits appeals to you, it's easy to make
your own font. For example, here's a font I created just for
arachnophiles.
To create a font, simply create an image containing the digits from 0
through 9 in the font and size you prefer.
It's best to include two or three spaces between the digits; that makes
it easier to isolate them. Load the image into your favourite
image editing program and select each digit and paste it into
a blank file, then save each digit's image as a separate file
in Netpbm Portable Pixmap (.ppm) format, with the name
of the font followed by the digit number. All of the images should
have the same height and be clipped with the same baseline.
The digit images can have different widths--if the font isn't
monospace, you should crop the digits to have uniform space
to the left and right of the digit.
Although the digit images always have an extension of .ppm,
they may, in fact, be monochrome bitmaps (PBM), grey scale images
(PGM), or full-colour (PPM); the digits in a font don't even need
to all be the same image type. To save space, save the digit images
in the most compact form: if the images are grey-scale, saving them
as PGM reduces the digit file size by a factor of three; if they're
pure black and white bitmaps, saving them as PBM compresses them
24 times compared to PPM format.
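The factors of three and 24 follow from the bytes each raw format spends per pixel, as the arithmetic below illustrates. The digit dimensions are hypothetical, the small file headers are ignored, and samples are assumed to fit in one byte.

```sh
#!/bin/sh
# Approximate pixel-data sizes for one WIDTH x HEIGHT digit in raw Netpbm formats.
# Headers (a few bytes) are ignored; samples are assumed to fit in one byte.
width=16 height=26                       # hypothetical digit dimensions

ppm=$((width * height * 3))              # three bytes per pixel (red, green, blue)
pgm=$((width * height))                  # one byte per pixel
pbm=$(( (width + 7) / 8 * height ))      # one bit per pixel, rows padded to whole bytes

echo "PPM $ppm bytes, PGM $pgm bytes, PBM $pbm bytes"
echo "PPM/PGM = $((ppm / pgm)), PPM/PBM = $((ppm / pbm))"
```

For a 16 by 26 digit this gives 1248, 416, and 52 bytes. When the width is not a multiple of eight, the padding at the end of each PBM row makes the saving slightly less than 24 times.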
After you've made the bitmaps for your new font, it's a good
idea to test them to make sure the vertical alignment and spacing
between digits looks good. You can override the value of a counter
by specifying a "value=number" argument in the
request URL, for example:
<img src="/cgi-bin/PageVisits.pl?dir=PageVisits&pageid=test&font=myfont&value=1234567890"
align="bottom" alt="Page Visit Counter" />
Note that even when you're testing a counter with a "value="
specification, the "dir=" and "pageid=" specifications
must specify a valid directory and counter file. The value of the
counter will not, however, be incremented.
Okay, okay. You can use the following link to download the spider
font. Use it wisely.
This software is in the public domain. Permission to use, copy,
modify, and distribute this software and its documentation for
any purpose and without fee is hereby granted, without any
conditions or restrictions. This software is provided "as is"
without express or implied warranty.
by John Walker
13th October 2004