I like the layout of the rssreader Mozilla extension, but nothing else about it: it requires using the bookmarks (hatehate), it is written in JavaScript (hatehatehate) and it is superslow, with no caching whatsoever, etc.

Why not use any of the available tickers and RSS readers?

I liked rssreader's layout and integration with Mozilla. I don't like tickers: I need full articles or at least overview data to judge whether an article is worth my time, and headlines Just Don't Work for me. And the killer argument: all my bookmark info is kept in a topicmap file, so any RSS reading tool must get its info from there, too. None but my personal one would do that.

So I decided to make a slurping tool that slurps feed data onto the local box and massages things into the rssreader look. Easy-peasy, thought I, perl to the rescue, etc. etc.

Well, RSS sucks: gadzillions of slightly different versions, all incomplete and fugly. Atom sucks, too, just differently.

The one parser module present in Debian, libxml-rss-perl, doesn't understand newer RSS (i.e. 2.0) at all, nor Atom, so playing with that wasn't too successful. The other potential parser, XML::RSS::Parser, is not available as a Debian package, but it sucks less: with a bit of tweakery I got it to read all the RSS flavours and also Atom. Hmm, maybe I'll package it.
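To show why the flavours are such a pain and what "reading them all" boils down to, here is a minimal sketch. It is in Python with only the standard library; it is not the perl code in rsslurp and not how XML::RSS::Parser works. The function names `local` and `parse_feed` are made up for this illustration. The trick assumed here is to ignore XML namespaces and match on local element names only, so `item` (RSS) and `entry` (Atom) can be handled by the same loop.

```python
import xml.etree.ElementTree as ET


def local(tag):
    """Strip any '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def parse_feed(xml_text):
    """Return a list of (title, link) pairs from an RSS or Atom document.

    Hypothetical helper: it only looks at local element names, so it
    treats RSS <item> and Atom <entry> elements the same way.
    """
    root = ET.fromstring(xml_text)
    entries = []
    for elem in root.iter():
        if local(elem.tag) not in ("item", "entry"):
            continue
        title, link = "", ""
        for child in elem:
            name = local(child.tag)
            if name == "title":
                title = (child.text or "").strip()
            elif name == "link" and not link:
                # Atom keeps the URL in an href attribute,
                # RSS keeps it in the element text.
                link = child.get("href") or (child.text or "").strip()
        entries.append((title, link))
    return entries
```

This deliberately keeps only the first link of each entry and throws away everything else (dates, descriptions, content), which a real reader would of course want.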

Net result of a few hours of mucking around, skirting incomplete Unicode support in my perl (no, I don't want to update to 5.8 yet) etc. is this script called rsslurp. The link retrieval part won't be useful to anybody who's not into topicmaps (i.e. most of you out there), but the part for massaging things into rssreader-compliant CSSified HTML may be. The tool also caches the source XML and produces an overview HTML page with update times and feed names.
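For the curious, the caching and overview part can be sketched roughly like this. Again this is Python with only the standard library, not the actual rsslurp code. The names `cache_feed` and `overview_html`, the one-file-per-feed layout and the use of the file modification time as "update time" are all assumptions made for the example, and feed names are assumed to be safe to use as file names.

```python
import html
import os
import time


def cache_feed(cache_dir, feed_name, xml_text):
    """Store the raw feed XML as <cache_dir>/<feed_name>.xml and return the path."""
    os.makedirs(cache_dir, exist_ok=True)
    path = os.path.join(cache_dir, feed_name + ".xml")
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(xml_text)
    return path


def overview_html(cache_dir):
    """Build an HTML table of cached feeds and their last update times."""
    rows = []
    for filename in sorted(os.listdir(cache_dir)):
        if not filename.endswith(".xml"):
            continue
        name = filename[:-len(".xml")]
        # Assumption: the cache file's mtime stands in for the update time.
        mtime = os.path.getmtime(os.path.join(cache_dir, filename))
        stamp = time.strftime("%Y-%m-%d %H:%M", time.localtime(mtime))
        rows.append("<tr><td>%s</td><td>%s</td></tr>" % (html.escape(name), stamp))
    return ("<table>\n<tr><th>Feed</th><th>Last update</th></tr>\n"
            + "\n".join(rows) + "\n</table>")
```

Calling `cache_feed` once per fetched feed and then writing the string from `overview_html` to a file gives a bare-bones overview page; the CSS needed for the rssreader look is left out.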

[ published on Mon 09.08.2004 02:34 | filed in mystuff | ]
© Alexander Zangerl