The World Wide Web Consortium (W3C) Markup Validation Service is an essential resource for Web developers who wish to create standards-compliant documents. This freely available service checks HTML and XHTML documents for compliance with a variety of versions of the relevant standards and produces a report which identifies any errors in the markup. The validator can check documents specified by a Web URL, files uploaded from the user's computer, or text pasted directly into a text box on the validator request page.
These options suffice when you're developing a new page, but if you're generating a sizable collection of documents automatically (for example, with a content management system for a Web log), or you have a complicated existing Web tree you wish to check for standards compliance, submitting each document individually for validation can become tedious. BulkValidator is a Perl program which automates the process of validating multiple documents. It submits either all of the HTML/XHTML documents in a directory or all documents in that directory and its subdirectories to the W3C validator and reports the results. For any documents which failed validation, the error reports are saved in a “discrepancies” directory whence they can be subsequently scrutinised.
BulkValidator-1.2.tar.gz: Gzipped TAR archive (12 Kb)
Included in the archive are the Perl program BulkValidator.pl and the manual page for the program extracted from the documentation embedded within it, as well as this document. You can use these files in the directory in which you extracted them or install them in your system's library directories to make them available to all users. You may wish to rename the Perl program as BulkValidator so it can be run as a regular command line program; if you do so, make sure the location of Perl in the first line of the program corresponds to where Perl is installed on your system.
This program requires the Perl modules Data::Dumper, Pod::Usage, LWP, and URI::Escape. If your Perl installation lacks one or more of these modules, you will have to install them (either system-wide or for your own user account) before you can use BulkValidator. In addition, validation of files in subdirectories requires the Unix find command. While most systems which support Perl provide this command, if it is not present (for example, on a minimalist Cygwin configuration), you will have to install it if you wish to use this feature.
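Before a first run, it may help to confirm that the prerequisites listed above are present. The following is a minimal sketch, not part of BulkValidator: the function name check_prereqs is made up for illustration, and it assumes perl is on your PATH. It tries to load each module and reports any which fail.

```shell
# Hypothetical helper: report whether each prerequisite is available.
check_prereqs() {
    for module in Data::Dumper Pod::Usage LWP URI::Escape; do
        # "perl -MModule -e 1" exits non-zero if the module cannot be loaded.
        if perl -M"$module" -e 1 2>/dev/null; then
            echo "ok       $module"
        else
            echo "MISSING  $module"
        fi
    done
    # Validation of subdirectories also needs the find command.
    command -v find >/dev/null 2>&1 || echo "MISSING  find"
}

check_prereqs
```

Any line marked MISSING names something you will need to install before using the corresponding feature.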
BulkValidator [--copyright] [--density num] [--discrepancy dir] [--firstfiles num] [--help] [--man] [--pause num] [--rpause factor] [--shuffle] [--skipfile num] [--tree] [--validator url] [--verbose] [--version] [directory]
BulkValidator submits all of the HTML/XHTML files either in a specified directory (the current directory is assumed if none is given) or in that directory and any subdirectories to the W3C HTML validator and reports the results. The validation reports for any files which failed validation are saved for review.
All options may be abbreviated to their shortest unambiguous prefix.
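To see which abbreviations are unambiguous, the rule can be sketched as a simple prefix match against the option names in the synopsis. This is an illustration only; the function expand_option is hypothetical and is not how BulkValidator itself parses its command line.

```shell
# Option names copied from the synopsis.
OPTIONS="copyright density discrepancy firstfiles help man pause rpause shuffle skipfile tree validator verbose version"

# Hypothetical helper: print the full option name for an abbreviation,
# or fail if the abbreviation matches no option or more than one.
expand_option() {
    abbrev=$1
    matches=""
    count=0
    for name in $OPTIONS; do
        if [ "$name" = "$abbrev" ]; then
            echo "$name"        # an exact match always wins
            return 0
        fi
        case $name in
            "$abbrev"*) matches="$matches $name"; count=$((count + 1)) ;;
        esac
    done
    if [ "$count" -eq 1 ]; then
        echo "${matches# }"
        return 0
    fi
    echo "ambiguous or unknown:$matches" >&2
    return 1
}
```

Under this rule `expand_option tr` yields tree and `expand_option vers` yields version, while `d` is rejected because it could mean either density or discrepancy.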
Validate all HTML files in the current directory, placing
discrepancy reports in a ValidationDiscrepancies
subdirectory of the current directory.
perl BulkValidator.pl
Validate the first 10 files in alphabetical order,
then 15% of the remaining files chosen at random from the
directory /var/www/html/recipes/ratburger and subdirectories,
placing discrepancy reports for any files which fail
validation in /home/chef/goofs.
perl BulkValidator.pl --tree --firstfiles 10 --density 15 \
    --discrepancy /home/chef/goofs \
    /var/www/html/recipes/ratburger
Validate files in /var/www/html/recipes/ratburger, saving the
pass/fail results in /home/chef/goofs/val.log. Then, after
editing, revalidate all the files which failed to validate
the first time.
perl BulkValidator.pl /var/www/html/recipes/ratburger \
    | tee /home/chef/goofs/val.log
    . . . Edit, edit, edit . . .
perl BulkValidator.pl --skipfile /home/chef/goofs/val.log /var/www/html/recipes/ratburger
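The combination of --firstfiles and --density in the second example can be pictured with the rough sketch below. It is not taken from the program: select_files is a hypothetical function, and it assumes each remaining file is kept independently with the given percentage probability, whereas BulkValidator may choose its sample differently.

```shell
# Hypothetical helper: select_files FIRST DENSITY
# Reads file names on standard input, one per line.
select_files() {
    sort | awk -v first="$1" -v density="$2" '
        BEGIN { srand() }                  # seed from the time of day
        NR <= first { print; next }        # always keep the first FIRST names
        rand() * 100 < density { print }   # keep about DENSITY% of the rest
    '
}

# Possible use (the *.html pattern is an assumption):
#   find /var/www/html/recipes/ratburger -name '*.html' | select_files 10 15
```

Because the remaining files are chosen at random, successive runs will generally select different files.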
If no directory is specified on the command line, the current directory is validated.
The validation summary is written to standard output. You can redirect this to a file or make a copy with tee if you wish to use it in subsequent runs to exclude already-validated files with the --skipfile option.
The validator reports for any files which failed validation are
stored in the --discrepancy directory, which defaults to
ValidationDiscrepancies
in the current directory. Files in
this directory are named with the path name of the validated file,
with all slashes replaced by underscores. Validation reports
for files which previously failed validation but passed this time
will be automatically deleted, and the --discrepancy directory
will be removed if, at the end of the run, no files remain within it.
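The naming rule for discrepancy reports can be sketched as follows, which may help when looking for the report belonging to a particular file. The function report_name is hypothetical; whether BulkValidator uses the absolute or relative path, or appends any extension, is not stated here and is left as an assumption.

```shell
# Hypothetical helper: map a validated file's path to a report name
# by replacing every slash with an underscore.
report_name() {
    printf '%s\n' "$1" | tr '/' '_'
}

report_name /var/www/html/recipes/ratburger/index.html
# prints _var_www_html_recipes_ratburger_index.html
```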
Please report bugs to bugs@fourmilab.ch, indicating the version numbers of BulkValidator, Perl, and the Perl LWP module installed on your system.
This software is in the public domain. Permission to use, copy, modify, and distribute this software and its documentation for any purpose and without fee is hereby granted, without any conditions or restrictions. This software is provided “as is” without express or implied warranty.
by John Walker, February 2007