Gregarius plugin: do not aggregate yourself (when aggregating search feeds)

Yesterday evening I knocked together this “Paola246” aggregator site.  Paola246, aka Paolavdb, is the Flemish version of “Lonelygirl15”: a fictitious online persona posing as a real one.  “Paola” tried to tap into social media so systematically (starting Twitter, Flickr and YouTube accounts plus a few blogs in one day) that no one seems to have ever believed she was genuine.  Still, nothing beats a good Alternate Reality Game, and I guess quite a few people will have fun following the story as it unfolds.

Do not feed yourself

If you aggregate Technorati search feeds (blogs writing about Paola246/PaolaVdb) and you syndicate the resulting superfeed, chances are that URLs of your own aggregator site will show up in those search feeds.  That is because Technorati indexes pages, not only feeds.  So I needed to filter out all URLs pointing to the originating site.

Gregarius plugins

Gregarius has a plugin architecture, just like WordPress (example) and Vanilla.  There’s an API you can use and a long list of “hooks”, with which you can inject your own code into the script without touching the core, so you can upgrade very easily.  You will probably not even need to study the API, because there are already a lot of plugins available to modify.
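
To illustrate the general idea of hooks, here is a minimal sketch of a hook registry.  It is not the Gregarius API: the class HookRegistry, the hook name 'item.title' and the function my_plugin_trim_title are all hypothetical names made up for this example.

```php
<?php
// Hypothetical hook registry, for illustration only (NOT Gregarius's API).
class HookRegistry
{
    // Maps a hook name to the list of callbacks registered for it.
    private static $hooks = array();

    // A plugin calls this to attach its own code to a named hook.
    public static function register($name, $callback)
    {
        self::$hooks[$name][] = $callback;
    }

    // The core calls this at a fixed point, passing a value through
    // every registered callback in registration order.
    public static function run($name, $value)
    {
        if (empty(self::$hooks[$name])) {
            return $value; // no plugin is interested: value is unchanged
        }
        foreach (self::$hooks[$name] as $callback) {
            $value = call_user_func($callback, $value);
        }
        return $value;
    }
}

// Hypothetical plugin code: trims whitespace from item titles.
function my_plugin_trim_title($title)
{
    return trim($title);
}

HookRegistry::register('item.title', 'my_plugin_trim_title');
```

The core only ever calls HookRegistry::run() at fixed points, so a plugin can change behaviour without any core file being edited, which is what keeps upgrades painless.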

Filtering incoming items

In my case, I started from the “PerFeedItemFilter” plugin, which lets you filter every incoming feed with a specific regular expression.  I wanted to prevent both aggregating and displaying (in case the harm had already been done) items whose URLs have the same hostname as the aggregator, so I stripped most of the code and inserted a snippet like this in two places:

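      #-- host name of the URL this item links to (NULL if it cannot be parsed)
      $parsed_feed_url = parse_url($item->url);
      $parsed_host = (is_array($parsed_feed_url) && isset($parsed_feed_url["host"])) ? $parsed_feed_url["host"] : NULL;
      #-- kill item if it points back to this aggregator's own host
      if ($parsed_host !== NULL && $parsed_host == $_SERVER['HTTP_HOST']) {
         $item = NULL;
      }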
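
The host comparison can also be tried outside Gregarius.  The following is only a sketch under assumptions: the function names is_own_host_url and drop_own_items are hypothetical, items are assumed to be objects with a url property as in the snippet above, and the lower-casing and port stripping are my additions rather than something the plugin does.

```php
<?php
// Hypothetical helper (not part of Gregarius): true when $itemUrl points
// at the same host as the aggregator itself ($ownHost).
function is_own_host_url($itemUrl, $ownHost)
{
    $host = parse_url($itemUrl, PHP_URL_HOST);
    if (!is_string($host) || $host === '') {
        return false; // relative or unparseable URL: keep the item
    }
    // Assumption: compare case-insensitively and ignore a ":port"
    // suffix, which $_SERVER['HTTP_HOST'] may carry.
    $own = strtolower(preg_replace('/:\d+$/', '', $ownHost));
    return strtolower($host) === $own;
}

// Hypothetical filter: returns only the items that do NOT link back
// to the aggregator, re-indexed from 0.
function drop_own_items(array $items, $ownHost)
{
    $kept = array_filter($items, function ($item) use ($ownHost) {
        return !is_own_host_url($item->url, $ownHost);
    });
    return array_values($kept);
}
```

In a plugin, $ownHost would be $_SERVER['HTTP_HOST'].  Passing it as a parameter simply makes the check testable from the command line, where that variable is not set.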

Upload and activate the plugin in the Admin Panel… and done!  Download the plugin here.

About Gregarius

Gregarius is a PHP-based RSS aggregator: put it on a site, add feeds to the configuration, and it will create a so-called “planet site” that shows all the items of the added feeds in a “river of news” display.  Typically you’ll follow a planet site on a specific topic or group of people when you want the maintainer of the planet to select the feeds for you.

The resulting superfeed has links that don’t point to the planet, but to the individual original items.
