Archive for the ‘Syndication’ Category

Republish a feed (or other data) protected by HTTP basic access authentication

Sunday, January 20th, 2008

Some services let you access or download data at a URL that is protected by a username-and-password popup, known as “basic access authentication”. One example is your Twitter replies at http://twitter.com/statuses/replies.rss …:

[Image: basic authentication example, Twitter replies]

… or your Bloglines subscriptions in OPML format at http://rpc.bloglines.com/listsubs :

[Image: basic authentication example, Bloglines subscriptions]

PHP script to access authenticated URL

With the following simple PHP script you can access those files and make them available without authentication:

Download the authentication script, zipped together with the Snoopy class file.

Unzip it, then configure the URL of the file or page to fetch (e.g. http://twitter.com/statuses/replies.rss for Twitter) along with your username and password:

// Configuration! Change values here, 
// leave rest of script untouched.
$url_to_be_fetched           = "http://someserver/somepage/";
$username_to_be_sent        = "your_user_name";
$password_to_be_sent        = "your_password";

Upload the script to a directory on your PHP-capable web server, together with the Snoopy PHP class file that is included in the zip file.
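For readers curious about what happens under the hood, here is a minimal sketch of fetching a URL protected by basic access authentication. It is not the downloadable script: it swaps Snoopy for PHP's built-in HTTP stream wrapper (which needs `allow_url_fopen` enabled), and the function names `basic_auth_header` and `fetch_with_basic_auth` are made up for illustration.

```php
<?php
// Hypothetical helper: build the Authorization header used by
// HTTP basic access authentication ("username:password", base64-encoded).
function basic_auth_header(string $username, string $password): string
{
    return 'Authorization: Basic ' . base64_encode($username . ':' . $password);
}

// Hypothetical helper: fetch a URL while sending the credentials.
// Returns the response body, or false if the request failed.
function fetch_with_basic_auth(string $url, string $username, string $password)
{
    $context = stream_context_create([
        'http' => [
            'method'  => 'GET',
            'header'  => basic_auth_header($username, $password) . "\r\n",
            'timeout' => 10, // seconds; an arbitrary choice for this sketch
        ],
    ]);

    // '@' suppresses the warning on failure; the caller checks for false.
    return @file_get_contents($url, false, $context);
}

// Possible usage, with the placeholder values from the configuration above:
//   $body = fetch_with_basic_auth("http://someserver/somepage/",
//                                 "your_user_name", "your_password");
//   if ($body !== false) { echo $body; }
```

Anyone who can reach the script's URL can read the fetched data, so it is worth placing it at a location that is hard to guess.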

Use Cases

Keeping private feeds private

For personalised RSS feeds, most services do not protect the feed with authentication, but give the user a freely accessible feed at a “secret URL”. As soon as you use such a feed in a web-based feed reader (Bloglines, Google Reader…), chances are other users will stumble on it when searching.

This is why Bloglines introduced an <access:restriction> element:

 <access:restriction relationship="deny" />

If a feed contains this element, it won’t show up in searches within Bloglines. For that reason the script adds the restriction to the feed by default, although you can switch it off by setting $feed_preferably_private to false:

// in case you use this to be able to read an authenticated feed 
// via Bloglines, you probably do not want it to show up in search 
// results
$feed_preferably_private    = true;

// end configuration
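As a rough illustration of what adding the restriction involves, the sketch below inserts the element into an RSS 2.0 document held in a string. It is not the code from the downloadable script. The function name `add_access_restriction` is hypothetical, the namespace URI is the one I recall from the Bloglines feed access control specification and should be checked, and the string matching assumes a plain `<rss …><channel>` layout.

```php
<?php
// Hypothetical helper: mark an RSS 2.0 feed as private for readers that
// honour the <access:restriction> element.
function add_access_restriction(string $feed_xml): string
{
    // Assumption: namespace URI as recalled from the Bloglines spec.
    $namespace = 'http://www.bloglines.com/about/specs/fac-1.0';

    // Leave feeds that already carry the element untouched.
    if (strpos($feed_xml, '<access:restriction') !== false) {
        return $feed_xml;
    }

    // Declare the "access" prefix on the root <rss> element (first match only).
    $feed_xml = preg_replace(
        '/<rss\b/',
        '<rss xmlns:access="' . $namespace . '"',
        $feed_xml,
        1
    );

    // Insert the restriction right after the opening <channel> tag.
    return preg_replace(
        '/<channel>/',
        "<channel>\n<access:restriction relationship=\"deny\" />",
        $feed_xml,
        1
    );
}

// Possible usage, honouring the configuration flag:
//   if ($feed_preferably_private) { $feed = add_access_restriction($feed); }
```

A real feed may be Atom or RDF rather than RSS 2.0, in which case the root element and insertion point differ and an XML parser would be a safer tool than string replacement.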

Support for access:restriction element

I haven’t found any mention of Google Reader supporting it as well, but I have had a feed with this restriction in Google Reader for more than a month now, and it doesn’t show up in the results when I search for it (using a different account). Some people have pointed out, though, that Google does allow you to share these “protected feeds” via Google Shared Items or Feedburner.

Of the services with “personalised/secret feeds” that I use, Facebook supports the element; most others, such as Google Reader Shared Items, the del.icio.us “for” feed and Wakoopa alerts, don’t.

Feel free to leave your comments!

Gregarius plugin: do not aggregate yourself (when aggregating search feeds)

Thursday, May 10th, 2007

Yesterday evening I knocked together this “Paola246” aggregator site. Paola246, aka Paolavdb, is the Flemish version of “Lonelygirl15”: a fictitious online persona posing as real. “Paola” tried to tap into social media so systematically (starting Twitter, Flickr and YouTube accounts and a few blogs in one day) that no one seems to have ever believed she was genuine. Still, nothing beats a good Alternate Reality Game, and I guess quite a few people will have fun following the story as it unfolds.

Do not feed yourself

If you aggregate Technorati search feeds (here: blogs writing about Paola246/PaolaVdb) and you syndicate the resulting superfeed, then chances are that URLs of your own aggregator site will show up in those search feeds. That is because Technorati indexes pages and not only feeds. So I needed to filter out all URLs pointing to the originating site.

Gregarius plugins

Gregarius has a plugin architecture, just like WordPress and Vanilla. There’s an API you can use and a long list of “hooks” with which you can inject your own code into the script without touching the core, so you can upgrade very easily. You probably won’t even need to study the API, because there are already a lot of plugins available to modify.

Filtering incoming items

In my case, I started from the “PerFeedItemFilter” plugin, which lets you filter every incoming feed with a specific regular expression. I wanted to prevent both aggregating and displaying (in case the harm had already been done) URLs with the same hostname as the aggregator, so I stripped most of the code and inserted a snippet like this in two places:

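      #-- extract the hostname from the item URL (stays NULL if unparseable)
      $parsed_host = NULL;
      $parsed_feed_url = parse_url($item->url);
      if (is_array($parsed_feed_url) && isset($parsed_feed_url["host"])) $parsed_host = $parsed_feed_url["host"];

      #-- kill item if it points back to this aggregator's own host
      if ($parsed_host !== NULL && $parsed_host == $_SERVER['HTTP_HOST']) {
         $item = NULL;
      }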

Upload and activate the plugin in the Admin Panel… and done!  Download the plugin here.
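The host comparison in the snippet above can also be tried outside Gregarius. The following is a minimal standalone sketch, not the plugin itself: `points_to_own_host` and `filter_foreign_items` are hypothetical names, and the case-insensitive comparison and port stripping are my own additions rather than something the plugin does.

```php
<?php
// Hypothetical helper: true when $url points at the aggregator's own host.
function points_to_own_host(string $url, string $own_host): bool
{
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host) || $host === '') {
        return false; // unparseable or relative URL: keep the item
    }

    // HTTP_HOST may carry a port ("example.org:8080"); compare hostnames only.
    $own = strtolower(preg_replace('/:\d+$/', '', $own_host));

    return strtolower($host) === $own;
}

// Hypothetical helper: drop every URL that points back to $own_host.
function filter_foreign_items(array $urls, string $own_host): array
{
    return array_values(array_filter($urls, function ($url) use ($own_host) {
        return !points_to_own_host($url, $own_host);
    }));
}

// Possible usage inside a request:
//   $kept = filter_foreign_items($item_urls, $_SERVER['HTTP_HOST']);
```

Comparing hostnames rather than matching a regular expression against the whole URL avoids accidentally dropping items from other sites that merely mention the aggregator's address in their path or query string.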

About Gregarius

Gregarius is a PHP-based RSS aggregator: put it on a site, add feeds to the configuration, and it will create a so-called “planet site” with all the items of the added feeds in a “river of news” display. Typically you’ll follow a planet site on a specific topic or group of people when you want the maintainer of the planet to select the feeds for you.

The resulting superfeed has links that don’t point to the planet, but to the individual original items.
