XML::RSS(3) User Contributed Perl Documentation XML::RSS(3)
NAME
XML::RSS - creates and updates RSS files
SYNOPSIS
# create an RSS 1.0 file (http://purl.org/rss/1.0/)
use XML::RSS;
my $rss = XML::RSS->new(version => '1.0');
$rss->channel(
title => "freshmeat.net",
link => "http://freshmeat.net",
description => "the one-stop-shop for all your Linux software needs",
dc => {
date => '2000-08-23T07:00+00:00',
subject => "Linux Software",
creator => 'scoop@freshmeat.net',
publisher => 'scoop@freshmeat.net',
rights => 'Copyright 1999, Freshmeat.net',
language => 'en-us',
},
syn => {
updatePeriod => "hourly",
updateFrequency => "1",
updateBase => "1901-01-01T00:00+00:00",
},
taxo => [
'http://dmoz.org/Computers/Internet',
'http://dmoz.org/Computers/PC'
]
);
$rss->image(
title => "freshmeat.net",
url => "http://freshmeat.net/images/fm.mini.jpg",
link => "http://freshmeat.net",
dc => {
creator => "G. Raphics (graphics at freshmeat.net)",
},
);
$rss->add_item(
title => "GTKeyboard 0.85",
link => "http://freshmeat.net/news/1999/06/21/930003829.html",
description => "GTKeyboard is a graphical keyboard that ...",
dc => {
subject => "X11/Utilities",
creator => "David Allen (s2mdalle at titan.vcu.edu)",
},
taxo => [
'http://dmoz.org/Computers/Internet',
'http://dmoz.org/Computers/PC'
]
);
$rss->textinput(
title => "quick finder",
description => "Use the text input below to search freshmeat",
name => "query",
link => "http://core.freshmeat.net/search.php3",
);
# Optionally mixing in elements of a non-standard module/namespace
$rss->add_module(prefix=>'my', uri=>'http://purl.org/my/rss/module/');
$rss->add_item(
title => "xIrc 2.4pre2",
link => "http://freshmeat.net/projects/xirc/",
description => "xIrc is an X11-based IRC client which ...",
my => {
rating => "A+",
category => "X11/IRC",
},
);
$rss->add_item (title=>$title, link=>$link, slash=>{ topic=>$topic });
# create an RSS 2.0 file
use XML::RSS;
my $rss = XML::RSS->new (version => '2.0');
$rss->channel(title => 'freshmeat.net',
link => 'http://freshmeat.net',
language => 'en',
description => 'the one-stop-shop for all your Linux software needs',
rating => '(PICS-1.1 "http://www.classify.org/safesurf/" 1 r (SS~~000 1))',
copyright => 'Copyright 1999, Freshmeat.net',
pubDate => 'Thu, 23 Aug 1999 07:00:00 GMT',
lastBuildDate => 'Thu, 23 Aug 1999 16:20:26 GMT',
docs => 'http://www.blahblah.org/fm.cdf',
managingEditor => 'scoop@freshmeat.net',
webMaster => 'scoop@freshmeat.net'
);
$rss->image(title => 'freshmeat.net',
url => 'http://freshmeat.net/images/fm.mini.jpg',
link => 'http://freshmeat.net',
width => 88,
height => 31,
description => 'This is the Freshmeat image stupid'
);
$rss->add_item(title => "GTKeyboard 0.85",
# creates a guid field with permaLink=true
permaLink => "http://freshmeat.net/news/1999/06/21/930003829.html",
# alternately creates a guid field with permaLink=false
# guid => "gtkeyboard-0.85"
enclosure => { url=>$url, type=>"application/x-bittorrent" },
description => 'blah blah'
);
$rss->textinput(title => "quick finder",
description => "Use the text input below to search freshmeat",
name => "query",
link => "http://core.freshmeat.net/search.php3"
);
# create an RSS 0.9 file
use XML::RSS;
my $rss = XML::RSS->new( version => '0.9' );
$rss->channel(title => "freshmeat.net",
link => "http://freshmeat.net",
description => "the one-stop-shop for all your Linux software needs",
);
$rss->image(title => "freshmeat.net",
url => "http://freshmeat.net/images/fm.mini.jpg",
link => "http://freshmeat.net"
);
$rss->add_item(title => "GTKeyboard 0.85",
link => "http://freshmeat.net/news/1999/06/21/930003829.html"
);
$rss->textinput(title => "quick finder",
description => "Use the text input below to search freshmeat",
name => "query",
link => "http://core.freshmeat.net/search.php3"
);
# print the RSS as a string
print $rss->as_string;
# or save it to a file
$rss->save("fm.rdf");
# insert an item into an RSS file and remove the oldest ones if
# there are already 15 items or more
my $rss = XML::RSS->new;
$rss->parsefile("fm.rdf");
while (@{$rss->{'items'}} >= 15)
{
shift (@{ $rss->{'items'} });
}
$rss->add_item(title => "MpegTV Player (mtv) 1.0.9.7",
link => "http://freshmeat.net/news/1999/06/21/930003958.html",
mode => 'insert'
);
# parse a string instead of a file
$rss->parse($string);
# print the title and link of each RSS item
foreach my $item (@{$rss->{'items'}}) {
print "title: $item->{'title'}\n";
print "link: $item->{'link'}\n\n";
}
# output the RSS 0.9 or 0.91 file as RSS 1.0
$rss->{output} = '1.0';
print $rss->as_string;
DESCRIPTION
This module provides a basic framework for creating and maintaining RDF
Site Summary (RSS) files. This distribution also contains many examples
that allow you to generate HTML from an RSS file, convert between the
0.9, 0.91, and 1.0 versions, and do other nifty things. This might be
helpful if you want to include news feeds on your Web site from sources
like Slashdot and Freshmeat or if you want to syndicate your own content.
XML::RSS currently supports the 0.9, 0.91, 1.0, and 2.0 versions of RSS.
See http://backend.userland.com/rss091 for information on RSS 0.91. See
http://www.purplepages.ie/RSS/netscape/rss0.90.html for RSS 0.9. See
http://web.resource.org/rss/1.0/ for RSS 1.0.
RSS was originally developed by Netscape as the format for Netscape
Netcenter channels; however, many Web sites have since adopted it as a
simple syndication format. With the advent of RSS 1.0, users are now
able to syndicate many different kinds of content including news
headlines, threaded messages, product catalogs, etc.
Note: In order to parse and generate dates (such as "pubDate" and
"dc:date") it is recommended to use DateTime::Format::Mail and
DateTime::Format::W3CDTF, which are what XML::RSS uses internally and
requires.
METHODS
XML::RSS->new(version=>$version, encoding=>$encoding, output=>$output,
stylesheet=>$stylesheet_url, 'xml:base'=>$base)
Constructor for XML::RSS. It returns a reference to an XML::RSS
object. You may also pass the RSS version and the XML encoding to
use. The default version is 1.0. The default encoding is UTF-8. You
may also specify the output format regardless of the input version.
This comes in handy when you want to convert RSS between versions.
The XML::RSS module will convert between any of the formats. If
you set "encode_output", XML::RSS will make sure to encode any
entities in generated RSS. This is now on by default.
You can also pass an optional URL to an XSL stylesheet that can be
used to output an "<?xml-stylesheet ... ?>" processing instruction in
the header that will allow some browsers to render the RSS file as HTML.
You can also set "encode_cb" to a reference to a subroutine that
will encode the output in a custom way. This subroutine accepts two
parameters: a reference to the
"XML::RSS::Private::Output::Base"-derived object (which should
normally not concern you) and the text to encode. It should return
the encoded text. If not set, then the module will encode using
its custom encoding routine.
xml:base will set an "xml:base" property as per
http://www.w3.org/TR/xmlbase/
Note that a custom encoder must handle "CDATA" sections correctly.
Look at XML::RSS::Private::Output::Base's "_default_encode()" method
for an example of how to do so.
add_item (title=>$title, link=>$link, description=>$desc, mode=>$mode)
Adds an item to the XML::RSS object. mode and description are
optional. The default mode is append, which adds the item to the
end of the list. To insert an item at the beginning of the list
instead, set the mode to insert.
The items are stored in the array "@{$obj->{'items'}}" where $obj
is a reference to an XML::RSS object.
One can specify a category by using the 'category' key. 'category'
can point to an array reference of categories:
$rss->add_item(
title => "Foo&Bar",
link => "http://www.my.tld/",
category => ["OneCat", "TooCat", "3Kitties"],
);
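As a minimal sketch of the two modes, the following adds two items with
the default append mode and a third with insert, then lists the stored
titles. The titles and example.com links are placeholders chosen for
illustration only.

```perl
use strict;
use warnings;
use XML::RSS;

my $rss = XML::RSS->new(version => '1.0');

# Default mode (append): each item goes to the end of the list.
$rss->add_item(title => 'first',  link => 'http://example.com/1');
$rss->add_item(title => 'second', link => 'http://example.com/2');

# mode => 'insert': the item goes to the front of the list.
$rss->add_item(
    title => 'newest',
    link  => 'http://example.com/3',
    mode  => 'insert',
);

# Items are stored in the array referenced by $rss->{items}.
my @titles = map { $_->{title} } @{ $rss->{items} };
print join(', ', @titles), "\n";    # newest, first, second
```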
as_string;
Returns a string containing the RSS for the XML::RSS object. This
method will also encode special characters along the way.
channel (title=>$title, link=>$link, description=>$desc,
language=>$language, rating=>$rating, copyright=>$copyright,
pubDate=>$pubDate, lastBuildDate=>$lastBuild, docs=>$docs,
managingEditor=>$editor, webMaster=>$webMaster)
Channel information is required in RSS. When outputting RSS 0.9, the
title cannot be more than 40 characters, the link 500, and the
description 500. title, link, and description are required for
RSS 1.0. language is required for RSS 0.91. The other parameters
are optional for RSS 0.91 and 1.0.
To retrieve the values of the channel, pass the name of the value
(title, link, or description) as the first and only argument like
so:
$title = $rss->channel('title');
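The setter and getter forms of channel() can be sketched together as
follows. The channel values are placeholders invented for this example.

```perl
use strict;
use warnings;
use XML::RSS;

my $rss = XML::RSS->new(version => '1.0');

# Setter form: key/value pairs describing the channel.
$rss->channel(
    title       => 'Example channel',
    link        => 'http://example.com/',
    description => 'A placeholder channel used for illustration',
);

# Getter form: a single argument naming the value to retrieve.
my $title = $rss->channel('title');
my $link  = $rss->channel('link');
print "$title ($link)\n";
```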
image (title=>$title, url=>$url, link=>$link, width=>$width,
height=>$height, description=>$desc)
Adding an image is not required. url is the URL of the image, link
is the URL the image is linked to. title, url, and link parameters
are required if you are going to use an image in your RSS file. The
remaining image elements are used in RSS 0.91 or optionally
imported into RSS 1.0 via the rss091 namespace.
The method for retrieving the values for the image is the same as
it is for channel().
parse ($string, \%options)
Parses an RDF Site Summary which is passed into parse() as the
first parameter. Returns the instance of the object so one can say
"$rss->parse($string)->other_method()".
See the add_module() method for instructions on automatically
adding modules as a string is parsed.
%options is a list of options that specify how parsing is to be
done. The available options are:
· allow_multiple
Takes an array ref of names which indicates which elements
should be allowed to have multiple occurrences. So, for
example, to parse feeds with multiple enclosures
$rss->parse($xml, { allow_multiple => ['enclosure'] });
· hashrefs_instead_of_strings
If true, then some items (so far "description") will become
hash-references instead of strings (with a content key
containing their content), if they have XML attributes. Without
this option, the attributes will be ignored and there will only be
a string. Thus, specifying this option may break compatibility.
· modules_as_arrays
When true, this option will parse the modules' key-value pairs
as an arrayref of "{ el => $key_name, value => $value, }"
hash-refs to gracefully handle duplicate items (see below). It will
not affect the known modules such as dc ("Dublin Core").
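A minimal sketch of parsing from a string, without any of the options
above: the feed below is a made-up RSS 2.0 document, and the chained
call relies on parse() returning the object as described earlier.

```perl
use strict;
use warnings;
use XML::RSS;

# A small, hypothetical RSS 2.0 feed used only for illustration.
my $xml = <<'END_XML';
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <link>http://example.com/</link>
    <description>A made-up feed</description>
    <item>
      <title>First post</title>
      <link>http://example.com/first</link>
    </item>
    <item>
      <title>Second post</title>
      <link>http://example.com/second</link>
    </item>
  </channel>
</rss>
END_XML

my $rss = XML::RSS->new;

# parse() returns the object, so a getter can be chained onto it.
my $channel_title = $rss->parse($xml)->channel('title');
print "channel: $channel_title\n";

# The parsed items are available through $rss->{items}.
for my $item (@{ $rss->{items} }) {
    print "$item->{title} -> $item->{link}\n";
}
```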
parsefile ($file, \%options)
Same as parse() except it parses a file rather than a string.
See the add_module() method for instructions on automatically
adding modules as a file is parsed.
save ($file)
Saves the RSS to a specified file.
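A round trip through save() and parsefile() might look like the sketch
below. The temporary directory, the file name example.rdf, and the
channel and item values are all assumptions made for this example.

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempdir);
use XML::RSS;

# Write into a throwaway directory that is removed at exit.
my $dir  = tempdir(CLEANUP => 1);
my $file = File::Spec->catfile($dir, 'example.rdf');

my $rss = XML::RSS->new(version => '1.0');
$rss->channel(
    title       => 'Example channel',
    link        => 'http://example.com/',
    description => 'A placeholder channel used for illustration',
);
$rss->add_item(
    title => 'Saved item',
    link  => 'http://example.com/saved',
);
$rss->save($file);

# Read the file back into a fresh object.
my $copy = XML::RSS->new;
$copy->parsefile($file);

my $copy_title = $copy->channel('title');
my $item_title = $copy->{items}[0]{title};
print "$copy_title: $item_title\n";
```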
skipDays (day => $day)
Populates the skipDays element with the day $day.
skipHours (hour => $hour)
Populates the skipHours element with the hour $hour.
strict ($boolean)
If it's set to 1, it will adhere to the lengths as specified by
Netscape Netcenter requirements. It's set to 0 by default. Use it
if the RSS file you're generating is for Netcenter. strict will
only work for RSS 0.9 and 0.91. Do not use it for RSS 1.0.
textinput (title=>$title, description=>$desc, name=>$name,
link=>$link);
This RSS element is also optional. Using it allows users to submit
a query to a program on a Web server via an HTML form. name is the
HTML form name and link is the URL to the program. Content is
submitted using the GET method.
Access to the textinput values is the same as for channel() and
image().
add_module(prefix=>$prefix, uri=>$uri)
Adds a module namespace declaration to the XML::RSS object,
allowing you to add modularity outside of the standard RSS 1.0
modules. At present, the standard modules Dublin Core (dc) and
Syndication (syn) are predefined for your convenience. The Taxonomy
(taxo) module is also internally supported.
The modules are stored in the hash %{$obj->{'modules'}} where $obj
is a reference to an XML::RSS object.
If you want to automatically add modules that the parser finds in
namespaces, set the $XML::RSS::AUTO_ADD variable to a true value.
By default the value is false. (N.B. AUTO_ADD only updates the
%{$obj->{'modules'}} hash. It does not provide the other benefits
of using add_module.)
RSS 1.0 MODULES
XML-Namespace-based modularization affords RSS 1.0 compartmentalized
extensibility. The only modules that ship "in the box" with RSS 1.0
are Dublin Core (http://purl.org/rss/1.0/modules/dc/), Syndication
(http://purl.org/rss/1.0/modules/syndication/), and Taxonomy
(http://purl.org/rss/1.0/modules/taxonomy/). Consult the appropriate
module's documentation for further information.
Adding items from these modules in XML::RSS is as simple as adding
other attributes such as title, link, and description. The only
difference is the compartmentalization of their key/value pairs in a
second-level hash.
$rss->add_item (title=>$title, link=>$link, dc=>{ subject=>$subject, creator=>$creator, date=>$date });
For elements of the Dublin Core module, use the key 'dc'. For elements
of the Syndication module, 'syn'. For elements of the Taxonomy module,
'taxo'. These are the prefixes used in the RSS XML document itself.
They are associated with appropriate URI-based namespaces:
syn: http://purl.org/rss/1.0/modules/syndication/
dc: http://purl.org/dc/elements/1.1/
taxo: http://purl.org/rss/1.0/modules/taxonomy/
The Dublin Core ('dc') hash keys may point to an array reference,
which specifies multiple values for that key; the values are rendered
one after the other. For example:
$rss->add_item (
title => $title,
link => $link,
dc => {
subject=> ["Jungle", "Desert", "Swamp"],
creator=>$creator,
date=>$date
},
);
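To see how an array-valued key is rendered, the following sketch builds
a single item with three subjects and pulls the resulting dc:subject
elements back out of the generated XML with a regular expression. The
channel, item, and creator values are placeholders, and the regular
expression is only a quick inspection aid, not a substitute for parsing.

```perl
use strict;
use warnings;
use XML::RSS;

my $rss = XML::RSS->new(version => '1.0');
$rss->channel(
    title       => 'Example channel',
    link        => 'http://example.com/',
    description => 'A placeholder channel used for illustration',
);
$rss->add_item(
    title => 'Habitats',
    link  => 'http://example.com/habitats',
    dc    => {
        # An array reference yields one dc:subject element per value.
        subject => ['Jungle', 'Desert', 'Swamp'],
        creator => 'A. Writer',
    },
);

my $xml = $rss->as_string;

# Collect the text of every dc:subject element in the output.
my @subjects = $xml =~ m{<dc:subject>([^<]*)</dc:subject>}g;
print "$_\n" for @subjects;
```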
Dublin Core elements may occur in channel, image, item(s), and
textinput, although it is uncommon to find them under image and textinput.
Syndication elements are limited to the channel element. Taxonomy
elements can occur in the channel or item elements.
Access to module elements after parsing an RSS 1.0 document using
XML::RSS is via either the prefix or namespace URI for your
convenience.
print $rss->{items}->[0]->{dc}->{subject};
or
print $rss->{items}->[0]->{'http://purl.org/dc/elements/1.1/'}->{subject};
XML::RSS also has support for "non-standard" RSS 1.0 modularization at
the channel, image, item, and textinput levels. Parsing an RSS
document grabs any elements of other namespaces which might appear.
XML::RSS also allows the inclusion of arbitrary namespaces and
associated elements when building RSS documents.
For example, to add elements of a made-up "My" module, first declare
the namespace by associating a prefix with a URI:
$rss->add_module(prefix=>'my', uri=>'http://purl.org/my/rss/module/');
Then proceed as usual:
$rss->add_item (title=>$title, link=>$link, my=>{ rating=>$rating });
You can also set the value of the module's prefix to an array reference
of "{ el => , val => }" hash-references, in which case duplicate
elements are possible:
$rss->add_item(title=>$title, link=>$link, my=> [
{el => "rating", value => $rating1, },
{el => "rating", value => $rating2, },
]
);
Non-standard namespaces are not, however, currently accessible via a
simple prefix; access them via their namespace URI like so:
print $rss->{items}->[0]->{'http://purl.org/my/rss/module/'}->{rating};
XML::RSS will continue to provide built-in support for standard RSS 1.0
modules as they appear.
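The whole cycle for a non-standard module can be sketched as follows:
declare the namespace, add an item that uses it, render the feed, then
parse the result and read the value back through the namespace URI. The
channel values are placeholders; the "my" prefix and its URI are the
made-up module from the text above.

```perl
use strict;
use warnings;
use XML::RSS;

my $uri = 'http://purl.org/my/rss/module/';

my $rss = XML::RSS->new(version => '1.0');

# Associate the prefix with its namespace URI before using it.
$rss->add_module(prefix => 'my', uri => $uri);

$rss->channel(
    title       => 'Example channel',
    link        => 'http://example.com/',
    description => 'A placeholder channel used for illustration',
);
$rss->add_item(
    title => 'xIrc 2.4pre2',
    link  => 'http://freshmeat.net/projects/xirc/',
    my    => { rating => 'A+' },
);

my $xml = $rss->as_string;

# After parsing, the element is reached via the namespace URI.
my $parsed = XML::RSS->new;
$parsed->parse($xml);
my $rating = $parsed->{items}[0]{$uri}{rating};
print "rating: $rating\n";
```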
Non-API Methods
$rss->as_rss_0_9()
WARNING: this function is not an API function and should not be called
directly. It is kept as is for backwards compatibility with legacy
code. Use the following code instead:
$rss->{output} = "0.9";
my $text = $rss->as_string();
This function renders the data in the object as an RSS version 0.9
feed, and returns the resultant XML as text.
$rss->as_rss_0_9_1()
WARNING: this function is not an API function and should not be called
directly. It is kept as is for backwards compatibility with legacy
code. Use the following code instead:
$rss->{output} = "0.91";
my $text = $rss->as_string();
This function renders the data in the object as an RSS version 0.91
feed, and returns the resultant XML as text.
$rss->as_rss_1_0()
WARNING: this function is not an API function and should not be called
directly. It is kept as is for backwards compatibility with legacy
code. Use the following code instead:
$rss->{output} = "1.0";
my $text = $rss->as_string();
This function renders the data in the object as an RSS version 1.0
feed, and returns the resultant XML as text.
$rss->as_rss_2_0()
WARNING: this function is not an API function and should not be called
directly. It is kept as is for backwards compatibility with legacy
code. Use the following code instead:
$rss->{output} = "2.0";
my $text = $rss->as_string();
This function renders the data in the object as an RSS version 2.0
feed, and returns the resultant XML as text.
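The recommended pattern, setting the output key and calling
as_string(), can be applied in a loop to render one object in several
formats. This is a sketch with placeholder channel and item values; the
root-element check is just a quick way to see which format came out.

```perl
use strict;
use warnings;
use XML::RSS;

my $rss = XML::RSS->new(version => '1.0');
$rss->channel(
    title       => 'Example channel',
    link        => 'http://example.com/',
    description => 'A placeholder channel used for illustration',
);
$rss->add_item(
    title       => 'Example item',
    link        => 'http://example.com/item',
    description => 'A placeholder item',
);

my %root;
for my $version ('1.0', '2.0') {
    # Select the output format, then render through the public API.
    $rss->{output} = $version;
    my $xml = $rss->as_string;

    # Record which root element the rendered document uses.
    ($root{$version}) = $xml =~ /<(rdf:RDF|rss)[\s>]/;
    print "$version uses root element $root{$version}\n";
}
```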
$rss->handle_char()
Needed for XML::Parser. Don't use this directly.
$rss->handle_dec()
Needed for XML::Parser. Don't use this directly.
$rss->handle_start()
Needed for XML::Parser. Don't use this directly.
BUGS
Please use rt.cpan.org for tracking bugs. The list of current open
bugs is at
<http://rt.cpan.org/Dist/Display.html?Queue=XML-RSS>.
To report a new bug, go to
<http://rt.cpan.org/Ticket/Create.html?Queue=XML-RSS>
Please include a failing test in your bug report. I'd much rather have
a well written test with the bug report than a patch.
When you create diffs (for tests or patches), please use the "-u"
parameter to diff.
SOURCE AVAILABILITY
The source is available from the GitHub repository:
<https://github.com/shlomif/perl-XML-RSS>
AUTHOR
Original code: Jonathan Eisenzopf <eisen@pobox.com>
Further changes: Rael Dornfest <rael@oreilly.com>, Ask Bjoern Hansen
<ask@develooper.com>
Currently: Shlomi Fish <shlomif@cpan.org>
COPYRIGHT
Copyright (c) 2001 Jonathan Eisenzopf <eisen@pobox.com> and Rael
Dornfest <rael@oreilly.com>, Copyright (C) 2006-2007 Ask Bjoern Hansen
<ask@develooper.com>.
LICENSE
XML::RSS is free software. You can redistribute it and/or modify it
under the same terms as Perl itself.
CREDITS
Wojciech Zwiefka <wojtekz@cnt.pl>
Chris Nandor <pudge@pobox.com>
Jim Hebert <jim@cosource.com>
Randal Schwartz <merlyn@stonehenge.com>
rjp@browser.org
Kellan Elliott-McCrea <kellan@protest.net>
Rafe Colburn <rafe@rafe.us>
Adam Trickett <atrickett@cpan.org>
Aaron Straup Cope <asc@vineyard.net>
Ian Davis <iand@internetalchemy.org>
rayg@varchars.com
Shlomi Fish <shlomif@iglu.org.il>
SEE ALSO
perl(1), XML::Parser(3).
perl v5.18.2 2014-05-13 XML::RSS(3)