How to detect the RSS feed for a blog

Ever wondered how to automatically figure out the RSS feed for a blog?

Generally speaking, it’s a simple task — just download the HTML for the given blog and use a fancy regular expression to find the associated RSS feed. In PHP, it looks something like this:

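$bloghtml = file_get_contents($blogurl);
// Capture the href of a <link> tag whose type is application/rss+xml.
preg_match('/<link.*type\s*=\s*["\']*application\/rss\+xml["\']*.*href\s*=\s*["\']?([^\'" >]+)[\'" >]/i', $bloghtml, $match);
$rssurl = isset($match[1]) ? $match[1] : null;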
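
The regular expression above only matches when the type attribute comes before href. As a rough alternative, here is a minimal sketch that walks the parsed document with PHP's DOM extension, so attribute order stops mattering. The function name find_rss_link is made up for illustration, and the sketch assumes the DOM extension is installed.

```php
<?php
// Hypothetical helper: return the first RSS feed URL advertised in $html, or null.
function find_rss_link(string $html): ?string
{
    if (trim($html) === '') {
        return null; // DOMDocument::loadHTML() rejects empty input
    }

    $doc = new DOMDocument();
    // Real-world HTML is rarely valid, so collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('link') as $link) {
        // Attributes are read by name, so their order in the tag is irrelevant.
        $type = strtolower(trim($link->getAttribute('type')));
        $href = trim($link->getAttribute('href'));
        if ($type === 'application/rss+xml' && $href !== '') {
            return $href;
        }
    }
    return null;
}
```

Calling find_rss_link($bloghtml) would stand in for the preg_match() line. The href is returned exactly as written in the page, so a relative URL would still need to be resolved against the blog's address.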

The main problem with this approach is that some blogs take a long time to load — and that often translates to your application being slow as well. On top of that, it’s frustrating to have to download and process an entire page of HTML just to extract one URL.
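
If you do stick with downloading the page yourself, one way to soften both problems is to cap how long you wait and how much you read. The sketch below is only an illustration: the name fetch_head, the 64 KB limit and the 5 second timeout are assumptions, and it relies on the feed link sitting near the top of the page, which is not guaranteed.

```php
<?php
// Hypothetical helper: read at most $maxBytes from $url, with a read timeout of $timeout seconds.
// Returns null when the request fails.
function fetch_head(string $url, int $maxBytes = 65536, float $timeout = 5.0): ?string
{
    $context = stream_context_create([
        'http' => [
            'timeout' => $timeout,   // read timeout, in seconds
            'follow_location' => 1,  // follow redirects
        ],
    ]);
    // The fifth argument limits how many bytes are read from the stream.
    $html = @file_get_contents($url, false, $context, 0, $maxBytes);
    return $html === false ? null : $html;
}
```

The result can be passed to the same feed-matching code in place of the full page. Note that the timeout applies to each read rather than to the request as a whole, so a server that trickles data slowly can still take longer than 5 seconds in total.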

Recently Google came out with a better solution in the form of their AJAX Feed API. Using their API, detecting feeds is now easier, faster and more reliable:

$lookup_url = "http://ajax.googleapis.com/ajax/services/feed/lookup?v=1.0&q=".urlencode($blogurl);
// Any HTTP GET helper works here, for example a small cURL wrapper.
$result = file_get_contents($lookup_url);
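
The response body still has to be decoded before you get a URL out of it. Here is a minimal sketch of that step. It assumes the lookup call returns JSON with the feed address under responseData.url, which is worth checking against the API documentation, and the function name extract_feed_url is made up for illustration.

```php
<?php
// Hypothetical helper: pull the feed URL out of a lookup response body.
// Assumes JSON shaped like {"responseData": {"url": "..."}, "responseStatus": 200}.
function extract_feed_url(string $json): ?string
{
    $data = json_decode($json, true);
    if (!is_array($data) || !isset($data['responseData']['url'])) {
        return null; // invalid JSON, or no feed was reported
    }
    $url = $data['responseData']['url'];
    return is_string($url) ? $url : null;
}
```

With this in place, $rssurl = extract_feed_url($result); gives either the detected feed URL or null, so the caller can fall back to another detection method when the lookup comes back empty.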

I’ve been using this API for about a month now and have really appreciated the improvements. If you need to detect feeds, give it a try. I think you’ll like it.

Comments

  1. Josh Fraser said at 10:15 pm on July 29th, 2009:

    Just noticed that my regexp is order-sensitive — it will fail if you put the href attribute in front of type. Make sure you fix that if you decide to copy and paste.


  2. Josh Fraser said at 10:16 pm on July 29th, 2009:

    Just noticed that my regexp is order-sensitive (it will break if you put the href attribute in front of the type attribute). Make sure you fix that if you decide to copy and paste.


  3. sandrar said at 7:03 am on September 10th, 2009:

    Hi! I was surfing and found your blog post… nice! I love your blog. 🙂 Cheers! Sandra. R.


  4. angelina jolie said at 9:18 am on September 10th, 2009:

    I love your site. 🙂 Love design!!! I just came across your blog and wanted to say that I


  5. @islandsmooth said at 10:29 pm on April 20th, 2010:

    Just what I needed!

    Many thanks!


  6. Bob Barcus said at 7:38 am on September 26th, 2010:

    I can't get it to work… any help?


  7. Frank said at 9:32 am on August 15th, 2014:

    The Google API is the fastest option; however, it sometimes fails to detect the feed URL, which makes it quite unreliable. In my tests, it failed to get the feed URL for one WordPress blog (though it found many others)!


  8. Bodhisattva Builder said at 2:06 pm on August 14th, 2015:

    The PHP way is better, as Google only reads the RSS meta hints. With PHP you could also conditionally query known feed URLs that the website may not provide meta hints for.

    if(@file_get_contents($url)){
        preg_match_all('//', file_get_contents($url), $matches);
        if(isset($matches[1][0])){
            $this->feed->url = $matches[1][0];
        }elseif(@file_get_contents($this->feed->url.'/feed')){
            $this->feed->url = $this->feed->url.'/feed';
        }

    $this->feed->save();