Aquarionics

Sunday 1st May 2005

Aquaintances 2

Aquaintances 2 is an XML feed reader for Microsoft Windows, Linux and Apple OS X. It will support Bayesian filtering of posts – meaning posts you are more interested in will float to the top of your reading lists – as well as regex field matching. It is built in GTK/Python (on top of LibGlade) using the Mozilla Firefox GTK bindings. It parses feeds with the Ultra Liberal Feed Parser and stores them (and most other things) in an SQLite database. It will revolutionise the way you keep track of the world.

Naturally, it doesn’t work yet.

This morning I fired up Glade and put down the main interface for the system, which involved a certain amount of farting around with GTK’s box model, and all was fine until I had to attempt to tie the web browser to this.

My preference for this project was to use Mozilla’s “Gecko” rendering engine for the actual displaying of feeds, which was made more difficult by the fact that there are three sets of pages referring to the GTK bindings I was looking for. PyGTKMoz is an aborted attempt to get it working, Mozilla itself has a website on the subject, and PyGTKMoz refers me to PyGTK, which doesn’t mention the bindings at all. Eventually it transpired that the Mozilla bindings are now part of “python-gnome-extras”, which an apt-get installed for me.

That failed to work, because the package doesn’t depend on “mozilla-dev” or “mozilla-firefox-dev”, and even with those installed I had to manually add /usr/lib/mozilla-firefox and /usr/lib/mozilla to /etc/ld.so.conf so that Python could find the library.

Then I had to work out how to get a widget Glade doesn’t know about into my nice Gladey interface, for which this article was a handy guide, although it’s bitrotted a bit.

Okay, so we have an interface that happily displays a webpage. Time to put some actual feed data in it. Importing my OPML file from Bloglines, I wrote a thing to parse the XML. I started with minidom, which failed because I needed to keep track of recursively nested tags, so I rebuilt it in SAX (an acronym with too many abbreviations in it) with the standard push/pop method of keeping track of where the hell we are, a method I’ve always found exceedingly ugly, but at least it’s quick…
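For the curious, the push/pop approach looks roughly like the sketch below. This is not the code from Aquaintances: the sample OPML, the `OutlineHandler` class and the `parse_opml` helper are all made up for illustration, and it assumes that any outline carrying an `xmlUrl` attribute is a feed while any outline without one is a folder.

```python
import xml.sax

# Hypothetical sample data, not the real Bloglines export.
OPML = """<?xml version="1.0"?>
<opml version="1.1">
  <head><title>Subscriptions</title></head>
  <body>
    <outline text="News">
      <outline text="Example Feed" xmlUrl="http://example.com/rss"/>
      <outline text="Tech">
        <outline text="Another Feed" xmlUrl="http://example.org/atom"/>
      </outline>
    </outline>
    <outline text="Loose Feed" xmlUrl="http://example.net/rss"/>
  </body>
</opml>"""


class OutlineHandler(xml.sax.ContentHandler):
    """Collects (folder path, title, feed URL) for every feed outline."""

    def __init__(self):
        super().__init__()
        self.stack = []  # titles of the outlines we are currently inside
        self.feeds = []

    def startElement(self, name, attrs):
        if name != "outline":
            return
        title = attrs.get("text", "")
        url = attrs.get("xmlUrl")
        if url is not None:
            # The stack, read before pushing, is the path to this feed.
            self.feeds.append(("/".join(self.stack), title, url))
        # Push every outline, feed or folder, so endElement can always pop.
        self.stack.append(title)

    def endElement(self, name):
        if name == "outline":
            self.stack.pop()


def parse_opml(text):
    handler = OutlineHandler()
    xml.sax.parseString(text.encode("utf-8"), handler)
    return handler.feeds


for folder, title, url in parse_opml(OPML):
    print(folder or "(top level)", "|", title, "|", url)
```

The stack is the whole trick: SAX only tells you that an element started or ended, so the handler has to remember for itself how deep it is.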

The tree interface to the subscriptions list was next. The GtkTreeView element caused me a number of problems, partly because I was trying to use it before I really understood it. Also, all the tutorials for using it I found (iki fi, & moeraki were the most useful, as well as the Real Docs) assume you’re building the thing from scratch, rather than editing an already existing TreeView object. Also, there was an interchangeability between “TreeStore” and “TreeModel” that was starting to give me a headache. Then, in a blinding flash of light, my screensaver kicked in, enlightenment dawned, and I finally grokked it.

So, I have an interface with working buttons that displays the home page of every feed I subscribe to. Except three, which cause a segmentation fault for reasons I don’t understand, but are probably Not My Fault.

Aim is to get it reading and displaying feeds by tonight, with the Bayesian stuff happening tomorrow, at which point I release Real Code and start making it both cool and useable.

2005-05-02: Changed gtkmozbinding instructions, the firefox library appears to crash if you need a plugin

Those who spoke on this:


Stuart Langridge:

2005-05-03 14:19 2 days after the Original Article

What was wrong with minidom? Did it actually fail to parse the XML because of the nested tags? If it did then that’s a serious error in minidom, if the XML was valid…


Aquarion:

2005-05-03 14:30 11 mins after Stuart Langridge

It parsed it, but getElementsByTagName in the Body element returned a list of all the outline elements, and I completely lost the structure. I’d have ended up iterating through the list, grabbing the parent element, seeing where it fit and rebuilding some kind of associative array of the data structure. So I did it in SAX instead.


Stuart Langridge:

2005-05-03 14:35 5 mins after Aquarion

Ah, right, gotcha. That’s what getElementsByTagName() is supposed to do, of course. The solution here, I fear, is XPath. Well, your solution was SAX, but SAX gives me hives; it seems like the Wrong Way to parse XML, the same as regular expressions do. This is pure prejudice on my part, I admit.
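To make the exchange above concrete, here is a small sketch of the flattening, next to a direct-children query of the sort XPath allows. It is only an illustration: the sample OPML and the function names are invented, and it uses the limited path support in Python's `xml.etree.ElementTree` as a stand-in for a full XPath engine.

```python
from xml.dom import minidom
import xml.etree.ElementTree as ET

# Hypothetical sample data: one folder holding one feed, plus a loose feed.
OPML = """<opml version="1.1">
  <body>
    <outline text="News">
      <outline text="Example Feed" xmlUrl="http://example.com/rss"/>
    </outline>
    <outline text="Loose Feed" xmlUrl="http://example.net/rss"/>
  </body>
</opml>"""


def flat_titles(text):
    """getElementsByTagName returns every descendant, in document order."""
    body = minidom.parseString(text).getElementsByTagName("body")[0]
    return [node.getAttribute("text")
            for node in body.getElementsByTagName("outline")]


def nested_titles(element):
    """findall("outline") matches direct children only, so recursing keeps the tree."""
    return [(child.get("text"), nested_titles(child))
            for child in element.findall("outline")]


def tree_titles(text):
    return nested_titles(ET.fromstring(text).find("body"))


print(flat_titles(OPML))
print(tree_titles(OPML))
```

The first call gives one flat list with the folder and its feed side by side; the second keeps each feed nested under the folder it came from.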



Nicholas 'Aquarion' Avenell is a web developer in London, you can find out more about him or how to get in touch.

© 2000 to 2008 inclusive Nicholas Avenell
All comments are the property of their creators, published with permission
(Unless otherwise indicated, the opinions and sentiments expressed on this site are those of the author and not of any organisation of which he is an affiliate, including his employer. Caveat Lector, E&OE. sigh)