Hacker News RSS
Last Updated: 7/8/2009
Andrew Trusty has created a parameterized version
of my script. You can use it to submit your own RSS feed and have it
processed. My feed URL now redirects to Andrew's version, since it is being
improved and I only have so much bandwidth to give.
AppEngine quotas bit Andrew. I've relaunched his codebase on a separate
AppEngine app located here
Original Post
I have created a modified version of the Hacker News (HN) RSS feed which embeds the content of the linked article into the feed content. Instead of showing just a link to the article and discussion area on HN, the modified feed extracts the content of the article and displays that as well. This allows me to browse the article content from Google Reader instead of opening each article in its own tab. The technique for extracting the content is borrowed from Readability, which seems to work well for most pages.
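To give a feel for that kind of extraction, here is a much reduced sketch in Python 3 using only the standard library. It is not my actual code and not Readability's full heuristic: it simply credits every container with the length of the text sitting directly in its <p> children and picks the winner. The names ParagraphScorer and best_container are made up for this example, and containers are told apart only by tag name and id attribute, which is a simplification.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not go on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class ParagraphScorer(HTMLParser):
    """Credit each container with the text length of its direct <p> children."""

    def __init__(self):
        super().__init__()
        self.stack = []    # open elements as (tag, id attribute) pairs
        self.scores = {}   # (tag, id) of a container -> characters of <p> text

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.stack.append((tag, dict(attrs).get("id")))

    def handle_endtag(self, tag):
        # Pop back to the matching open tag; ignore stray closing tags.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i][0] == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        # Only text directly inside a <p> counts; it is credited to the parent.
        if len(self.stack) >= 2 and self.stack[-1][0] == "p":
            parent = self.stack[-2]
            self.scores[parent] = self.scores.get(parent, 0) + len(data.strip())


def best_container(html):
    """Return the (tag, id) pair with the most paragraph text, or None."""
    scorer = ParagraphScorer()
    scorer.feed(html)
    scorer.close()
    return max(scorer.scores, key=scorer.scores.get) if scorer.scores else None
```

On a page with a short navigation block and a longer article block, best_container should pick the article block. Text nested deeper inside a paragraph (inside a link, say) is ignored here, which a real extractor would not want.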
The feed is available here.
It is generated upon request, with caching to speed it up, and errors are reported in the server logs. (Previously it was refreshed every 15 minutes by a cronjob which sent me an
email reporting all articles expanded and any parsing errors.)
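Generating on request with a cache could look roughly like the Python 3 sketch below. This is an illustration under assumptions, not the code running on AppEngine: CachedFeed and its build callable are hypothetical names, the cache lives in process memory, the 900-second lifetime is an arbitrary choice, and serving a stale copy when a rebuild fails is my own design choice for the example.

```python
import logging
import time

log = logging.getLogger("hnrss")  # hypothetical logger name


class CachedFeed:
    """Rebuild the feed on request, but at most once per `ttl` seconds."""

    def __init__(self, build, ttl=900, clock=time.time):
        self.build = build      # hypothetical callable returning the feed XML
        self.ttl = ttl          # assumed cache lifetime in seconds
        self.clock = clock      # injectable so expiry can be tested
        self.value = None
        self.built_at = None

    def get(self):
        now = self.clock()
        if self.built_at is None or now - self.built_at >= self.ttl:
            try:
                self.value = self.build()
                self.built_at = now
            except Exception:
                # Report the failure in the logs; fall back to the stale
                # copy if there is one, otherwise let the error propagate.
                log.exception("feed rebuild failed")
                if self.built_at is None:
                    raise
        return self.value
```

A request handler would then just call get() on a shared instance, so only the first request after the cache expires pays for fetching and parsing the articles.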
The parsing is done by a bit of Python code which is available
here (I had previously considered it too ugly to
show). The HTML parsing is handled by
Beautiful Soup
which can handle some types of malformed HTML. Non-absolute links are
rewritten as absolute URLs so that images appear properly and links are not
resolved relative to the RSS URL.
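The link fixing boils down to joining every relative URL with the address of the article it came from. Here is a minimal Python 3 sketch of that idea. To keep it self-contained it uses a regular expression rather than Beautiful Soup, so it is a shortcut and not how my script does it; absolutize_links is a name invented for the example, and only quoted href and src attributes are handled.

```python
import re
from urllib.parse import urljoin

# Matches href="..." or src="..." with single or double quotes.
LINK_ATTR = re.compile(r'\b(href|src)=(["\'])(.*?)\2', re.IGNORECASE)


def absolutize_links(html, base_url):
    """Rewrite relative href/src values so they resolve against base_url."""
    def fix(match):
        attr, quote, url = match.groups()
        # urljoin leaves URLs that are already absolute untouched.
        return "%s=%s%s%s" % (attr, quote, urljoin(base_url, url), quote)
    return LINK_ATTR.sub(fix, html)
```

For example, with the article at http://example.com/blog/post.html, an image at pics/a.png becomes http://example.com/blog/pics/a.png, so it still loads when the content is shown inside a feed reader.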