grab::Grab_XML(3) User Contributed Perl Documentation grab::Grab_XML(3)
NAME
XMLTV::Grab_XML - Perl extension to fetch raw XMLTV data from a site
SYNOPSIS
package Grab_XML_rur;
use base 'XMLTV::Grab_XML';
sub urls_by_date( $ ) { my $pkg = shift; ... }
sub country( $ ) { my $pkg = shift; return 'Ruritania' }
# Maybe override a couple of other methods as described below...
Grab_XML_rur->go();
DESCRIPTION
This module helps to write grabbers which fetch pages in XMLTV format
from some website and output the data. It is not used for grabbers
which scrape human-readable sites.
It consists of several class methods (package methods). The way to use
it is to subclass it and override some of these.
METHODS
XMLTV::Grab_XML->date_init()
Called at the start of the program to set up Date::Manip. You
might want to override this with a method that sets the timezone.
XMLTV::Grab_XML->urls_by_date()
Returns a hash mapping YYYYMMDD dates to a URL where listings for
that date can be downloaded. This method is abstract; you must
override it.
Arguments: the command line options for --config-file and --quiet.
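To illustrate the shape of the result, an override might look like the sketch below. The host name listings.example.invalid, the one-file-per-day URL layout and the seven-day window are assumptions made for the example, not part of XMLTV. The text above does not say whether a hash reference or a flat list is expected; the sketch returns a reference, which is also an assumption. The "use base" line is commented out so the sketch runs without XMLTV installed.

```perl
package Grab_XML_rur;
use strict;
use warnings;
use POSIX qw(strftime);
# use base 'XMLTV::Grab_XML';   # needed in a real grabber

sub urls_by_date {
    my ($pkg, $config_file, $quiet) = @_;

    # Hypothetical site layout: one XMLTV file per day.
    my $base = 'http://listings.example.invalid/xmltv';

    my %urls;
    my $now = time;
    for my $offset (0 .. 6) {
        # gmtime keeps the day arithmetic clear of local DST changes.
        my $day = strftime('%Y%m%d', gmtime($now + $offset * 86400));
        $urls{$day} = "$base/$day.xml";
    }
    return \%urls;
}

package main;

my $urls = Grab_XML_rur->urls_by_date(undef, undef);
print "$_ => $urls->{$_}\n" for sort keys %$urls;
```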
XMLTV::Grab_XML->xml_from_data(data)
Given page data for a particular day, turn it into XML. The
default implementation just returns the data unchanged, but you
might override it if you need to decompress the data or patch it
up.
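A sketch of the decompression case follows, assuming the site serves gzip-compressed XMLTV files. It uses the core module IO::Uncompress::Gunzip; the package name Grab_XML_rur is taken from the SYNOPSIS, and the "use base" line is commented out so the sketch runs without XMLTV installed.

```perl
package Grab_XML_rur;
use strict;
use warnings;
use IO::Uncompress::Gunzip qw(gunzip $GunzipError);
# use base 'XMLTV::Grab_XML';   # needed in a real grabber

sub xml_from_data {
    my ($pkg, $data) = @_;

    # Gzip streams start with the bytes 0x1f 0x8b; pass anything else through.
    return $data unless $data =~ /\A\x1f\x8b/;

    my $xml;
    gunzip(\$data => \$xml)
        or die "could not decompress page data: $GunzipError\n";
    return $xml;
}

package main;
use IO::Compress::Gzip qw(gzip $GzipError);

# Round trip: compress a tiny document, then recover it.
my $plain = qq{<?xml version="1.0"?>\n<tv></tv>\n};
gzip(\$plain => \my $packed) or die "gzip failed: $GzipError\n";
print Grab_XML_rur->xml_from_data($packed);
```

Passing uncompressed data through unchanged keeps the override usable when the site sometimes serves plain XML.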
XMLTV::Grab_XML->configure()
Configure the grabber if needed. Arguments are the --config-file
option (or undef) and the --quiet flag (or undef).
This method is not provided in the base class; if you don't provide
it, attempts to run with --configure will give a message that
configuration is not necessary.
XMLTV::Grab_XML->nextday(day)
Bump a YYYYMMDD date by one. You probably shouldn't override this.
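The expected behaviour can be illustrated with core modules alone. The function nextday_sketch below is a hypothetical stand-in written for this example; it is not the module's own code, which works through Date::Manip as set up by "date_init()".

```perl
use strict;
use warnings;
use POSIX qw(strftime);
use Time::Local qw(timegm);

sub nextday_sketch {
    my ($day) = @_;
    my ($y, $m, $d) = $day =~ /\A(\d{4})(\d{2})(\d{2})\z/
        or die "expected YYYYMMDD, got '$day'\n";

    # Noon UTC on the given day, plus 24 hours, always falls on the next day.
    my $t = timegm(0, 0, 12, $d, $m - 1, $y) + 24 * 60 * 60;
    return strftime('%Y%m%d', gmtime $t);
}

print nextday_sketch('20060131'), "\n";   # 20060201
print nextday_sketch('20040228'), "\n";   # 20040229 (leap year)
```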
XMLTV::Grab_XML->country()
Return the name of the country you're grabbing for, used in usage
messages. This method is abstract; you must override it.
XMLTV::Grab_XML->usage_msg()
Return a command-line usage message. This calls "country()", so
you probably need to override only that method.
XMLTV::Grab_XML->get()
Given a URL, fetch the content at that URL. The default
implementation calls XMLTV::Get_nice::get_nice() but you might want
to override it if you need to do wacky things with HTTP requests,
like cookies.
Note that while this method fetches a page, "xml_from_data()" does
any further processing of the result to turn it into XML.
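A sketch of an override that sends a cookie with each request is given below. It uses HTTP::Tiny, which ships with Perl 5.14 and later, rather than XMLTV::Get_nice, so whatever get_nice() does beyond a plain fetch is lost; that trade-off is a choice made for the example. The cookie name and value are made up, and the "use base" line is commented out so the file compiles without XMLTV installed.

```perl
package Grab_XML_rur;
use strict;
use warnings;
use HTTP::Tiny;
# use base 'XMLTV::Grab_XML';   # needed in a real grabber

sub get {
    my ($pkg, $url) = @_;

    my $http = HTTP::Tiny->new(
        # Hypothetical cookie; a real site would dictate the name and value.
        default_headers => { Cookie => 'session=example' },
    );

    my $res = $http->get($url);
    die "could not fetch $url: $res->{status} $res->{reason}\n"
        unless $res->{success};

    # Return the raw page; xml_from_data() does any further processing.
    return $res->{content};
}

package main;

# Usage (needs network access, so it is left commented out):
# my $page = Grab_XML_rur->get('http://listings.example.invalid/xmltv/20060108.xml');
```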
XMLTV::Grab_XML->go()
The main program. Parse command line options, fetch and write
data.
Most of the options are fairly self-explanatory but this routine
also calls the XMLTV::Memoize module to look for a --cache
argument. The functions memoized are those given by the
"cachables()" method.
XMLTV::Grab_XML->cachables()
Returns a list of names of functions which could reasonably be
memoized between runs. This will normally be whatever function
fetches the web pages - you memoize that to save on repeated
downloads. A subclass might want to add things to this list if it
has its own way of fetching web pages.
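A subclass that adds its own fetching function to the list might look like the sketch below. Base_standin is a stand-in for XMLTV::Grab_XML so that the example runs on its own, and the function name it returns is a placeholder. Giving names in fully qualified form is an assumption, as is the helper fetch_with_cookies.

```perl
use strict;
use warnings;

# Stand-in for XMLTV::Grab_XML; the name it returns is a placeholder,
# not necessarily what the real module lists.
package Base_standin;
sub cachables { return ('Base_standin::fetch') }

package Grab_XML_rur;
our @ISA = ('Base_standin');   # a real grabber: use base 'XMLTV::Grab_XML';

# Hypothetical extra fetcher that this grabber would also like memoized.
sub fetch_with_cookies { my ($url) = @_; return '<tv/>' }

sub cachables {
    my $pkg = shift;
    # Keep whatever the parent lists and append our own function.
    return ($pkg->SUPER::cachables(), 'Grab_XML_rur::fetch_with_cookies');
}

package main;
print join(', ', Grab_XML_rur->cachables()), "\n";
```

Calling SUPER::cachables() rather than returning a fresh list means the subclass does not need to know which functions the parent already memoizes.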
XMLTV::Grab_XML->remove_early_stop_times()
Checks each stop time and removes it if it's before the start time.
Argument: the XML to correct.
Returns: the corrected XML.
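The check can be sketched as follows. The function below is written for this example and is not the module's code. It assumes that start and stop are attributes of each programme element, that both begin with 14 digits (YYYYMMDDhhmmss), and that both carry the same timezone offset, so a plain string comparison orders them correctly.

```perl
use strict;
use warnings;

# Drop the stop attribute from one <programme ...> start tag if it is early.
sub fix_programme_tag {
    my ($tag) = @_;
    my ($start) = $tag =~ /\bstart="(\d{14})/;
    my ($stop)  = $tag =~ /\bstop="(\d{14})/;
    if (defined $start && defined $stop && $stop lt $start) {
        $tag =~ s/\s+stop="[^"]*"//;
    }
    return $tag;
}

sub remove_early_stop_sketch {
    my ($xml) = @_;
    # Rewrite every programme start tag, leaving the rest of the XML alone.
    $xml =~ s{(<programme\b[^>]*>)}{ fix_programme_tag($1) }ge;
    return $xml;
}

my $xml = <<'XML';
<tv>
<programme start="20060108200000 +0000" stop="20060108190000 +0000" channel="one"><title>News</title></programme>
<programme start="20060108210000 +0000" stop="20060108220000 +0000" channel="one"><title>Film</title></programme>
</tv>
XML

print remove_early_stop_sketch($xml);
```

In the sample input only the first programme loses its stop attribute; the second, whose stop time follows its start time, is left unchanged.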
AUTHOR
Ed Avis, ed@membled.com
SEE ALSO
perl(1), XMLTV(3).
perl v5.10.1 2006-01-08 grab::Grab_XML(3)