Lowering traffic from offsite feeds, failed

2007-11-27

You know how it goes: you think you've got a good idea to prevent gigs of traffic and loads of parse time on your server, and then the idea turns out to be practically crap.

My current project uses several offsite RSS/Atom feeds inline in several pages. The most obvious (and easy) way is to let PHP do all the work: fetch the feed (file_get_contents() ftw!) and pass it on to an XML parser (I'm using preg_match_all() because my requirements are not that tight).

The whole thing costs quite some traffic and parse time. For example, the feed of my blog is 15k at the time of writing. Parsing it is not significant by itself, but it does add up in the whole picture. Let's say you have 5 of those feeds on a certain page and 5 page hits per minute. That's 300 hits an hour, 7,200 hits a day, 36,000 feed hits and 540,000,000 bytes (~515 MB) of traffic a day. Worst part is you're probably only showing 1k per feed (so roughly 36,000k, about 34 MB, of the total) and the whole thing is probably hardly used anyway (I mean, come on, when was the last time you actually clicked on an embedded news feed?). Kind of a waste, isn't it?

But it seems to be the way it has to be. Of course there are tricks to lower the projected traffic. You can ask when the document was last updated, or you can cache feeds locally and only request them at a certain interval (like every 5 minutes).

My solution for this problem was to have the visitors' computers load, process and display the feeds: request the feed in JavaScript (AJAX), run it through some XML parser and let the DOM display it.

Luckily I decided to test with an external XML file early on. That way I ran into odd behaviour pretty quickly, especially since the error was inconsistent (local files were no problem).

The cause was a security feature of the browser, which stops scripts from requesting documents on other domains. FF and Opera are even stricter than IE, by the way. Oh, you can do it, but it requires the user to manually add the current domain to their trusted zones, and for Mozilla even signing your scripts. Yeah, like... no, never mind.

So, it's a pity, but this seems to be a dead end.
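
For what it's worth, the "only request them at a certain interval" trick mentioned above could look roughly like the sketch below. It is not what my project runs (that side is PHP); it is just the idea written out in JavaScript. The names makeFeedCache, fetchFeed, ttlMs and now are made up for this example, and the fetcher and clock are fakes so the snippet runs without touching the network.

```javascript
// Hypothetical helper: wraps any fetch function in a time-based cache.
// fetchFeed, ttlMs and now are illustrative names, not an existing API.
function makeFeedCache(fetchFeed, ttlMs, now = Date.now) {
  const cache = new Map(); // url -> { fetchedAt, value }

  return function getFeed(url) {
    const hit = cache.get(url);
    if (hit && now() - hit.fetchedAt < ttlMs) {
      return hit.value; // still fresh, so no offsite request is made
    }
    // Whatever fetchFeed returns (a string, or a Promise) is cached as-is.
    const value = fetchFeed(url);
    cache.set(url, { fetchedAt: now(), value });
    return value;
  };
}

// Usage with a fake fetcher and a fake clock (both assumptions for the demo).
let requests = 0;
let clock = 0;
const getFeed = makeFeedCache(
  (url) => { requests += 1; return "<rss>" + url + "</rss>"; },
  5 * 60 * 1000, // refresh at most every 5 minutes
  () => clock
);

getFeed("http://example.com/feed"); // first hit: fetched
getFeed("http://example.com/feed"); // served from the cache
clock += 5 * 60 * 1000;             // five minutes later
getFeed("http://example.com/feed"); // stale: fetched again
console.log(requests);              // 2
```

With a 5 minute interval each feed is fetched at most 288 times a day, no matter how many visitors show up. For the 5 feeds of 15k in the example above that is at most 1,440 fetches, or 21,600,000 bytes, instead of 540,000,000. One caveat of this sketch: a failed fetch gets cached for the full interval too.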