EXCLUDE_ROBOTS(1)     User Contributed Perl Documentation    EXCLUDE_ROBOTS(1)

NAME
exclude_robot.pl - a filter script that removes robot entries from
httpd logfiles
SYNOPSIS
exclude_robot.pl
-url <robot exclusions URL>
[ -exclusions_file <exclusions file> ]
<httpd log file>
OR
cat <httpd log file> | exclude_robot.pl -url <robot exclusions URL>
DESCRIPTION
This script filters HTTP log files to exclude entries that correspond
to known webbots, spiders, and other undesirables. The script requires
a URL as a command line option, which should point to a text file
containing a linebreak-separated list of lowercase strings to match
robot user agents against. This is based on the format used by ABC
(<http://www.abc.org.uk/exclusionss/exclude.html>).
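A sample of the exclusion-list format the -url option expects, one
lowercase match string per line (the strings below are illustrative,
not taken from the ABC list):

```
googlebot
slurp
msnbot
webcrawler
```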
The script filters httpd logfile entries either from a filename
specified on the command line, or from STDIN. It outputs filtered
entries to STDOUT.
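The filtering behaviour described above can be approximated with
standard tools. A minimal sketch using grep, assuming exclusions.txt
holds one lowercase match string per line (all filenames and log
entries here are illustrative):

```shell
# Build a tiny exclusion list and a fake access log for illustration.
printf 'googlebot\nslurp\n' > exclusions.txt
printf '1.2.3.4 - - "GET /" "Googlebot/2.1"\n5.6.7.8 - - "GET /" "Mozilla/5.0"\n' > access.log

# Drop any entry matching an exclusion string, case-insensitively,
# and write the surviving entries to STDOUT (redirected to a file here).
grep -i -v -f exclusions.txt access.log > filtered.log
```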
OPTIONS
-url <robot exclusions URL>
Specify the URL of the file to fetch, which contains the list of
agents to exclude. This option is REQUIRED.
-exclusions_file <exclusions file>
Specify a file in which to save the logfile entries that were
excluded. This option is OPTIONAL.
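The effect of -exclusions_file can likewise be sketched with grep:
matching (robot) entries are saved to one file while the filtered
entries go elsewhere, as the script writes them to STDOUT (filenames
and log entries are illustrative):

```shell
# Illustrative data: one robot entry, one genuine entry.
printf 'googlebot\n' > exclusions.txt
printf '"Googlebot/2.1"\n"Mozilla/5.0"\n' > access.log

# Save the excluded (robot) entries, as -exclusions_file does ...
grep -i -f exclusions.txt access.log > excluded.log
# ... and emit the remaining entries, as the script does on STDOUT.
grep -i -v -f exclusions.txt access.log > filtered.log
```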
AUTHOR
Ave Wrigley <Ave.Wrigley@itn.co.uk>
COPYRIGHT
Copyright (c) 2001 Ave Wrigley. All rights reserved. This program is
free software; you can redistribute it and/or modify it under the same
terms as Perl itself.
perl v5.20.2 2001-05-25 EXCLUDE_ROBOTS(1)