reBlogger past, present and future

Yesterday I completed our first interview about reBlogger with Robin Good. It will be published next week on his excellent site.

I had great fun doing it – a lot more fun than I expected. I've realized that I (and our team) are very passionate about what we're doing.

In essence I sketched out the existing reBlogger 3.x, the forthcoming Next Gen version, and the corporate version after that. I'll share the extremely short version here. (I look forward to reading Robin's take on our discussion.)

Background to reBlogger: We began to build reBlogger almost 2 years ago for our own internal needs on TopXML. It was about a year ago that we made some copies of reBlogger available to other websites that we have close relationships with. By the end of last year reBlogger was in its 3rd version and we had moved it to .NET because of the massive increase in development productivity over Classic ASP. As I travelled in Europe at the end of last year I was in beautiful Venice (Italy) and I read Robin's website (also in Italy). Robin wrote passionately about newsmastering. Amongst other things he wrote:

We need something of an entirely new order of magnitude to manage all of this information.

Search engines, open directories, and millions of bloggers are not enough.

We need a multi-layered, self-organizing approach that allows the load to be highly distributed and the focus and depth to be guaranteed by the combined result of many highly focused individual efforts.

As I travelled around Venice along the grand canal (and went to a wonderful masked opera), I began to see a much larger picture of what our existing reBlogger can be used for. I saw that we could provide the answer that Robin had been looking for.

I bought for a small fortune and ever since we've been gradually building reBlogger into something that will help people and companies manage the torrent of data that is flowing past them – in October 2005 it was called a river of news but these days it's a tidal wave of news.

reBlogger 3.x is predominantly designed for SEO companies and websites that want to track blog content in tightly grouped themes. This product is described in this post covering the "reBlogger engine" and you can view many existing websites that are built out of reBlogger 3.x. One of the great things is that right out of the box reBlogger gives you excellent ranking in search engines because of its focus on creating themes. In this way reBlogger 3.x is similar to a content manager (CM).

reBlogger Next Gen is designed for building meme or web 2.0 websites. It's an engine with a far higher level of functionality than reBlogger 3.x. It's basically DIGG-in-a-box. With the mushrooming number of 2.0 sites out there (all containing voting and Ajax coolness) there is a big need for standardization and componentization. Atlas brings that at a technology level, but we're making DIGG sites (meme, web 2.0) into a commodity that anyone can buy. By using our reBlogger Next Gen you can easily have a DIGG site working on your website. It's got all the functionality of reBlogger 3.x plus all the Ajax goodies that most existing 2.0 sites have – and then some extra innovations that have not been seen yet on the web, for example Hover Comments.

reBlogger corporate version is designed for… corporates. When you have 1,000 bloggers in your company, you have a major headache looming. How do you track the bloggers? When you can get the blog posts of your competitors' employees, you have a major opportunity! What can you extract from their blogs? Sales departments want to track buzz about a product: is it increasing or declining? Marketing departments may want to generate buzz about upcoming products and compare that graphically to buzz about upcoming competitor products (think XBox 360 and PS3).

We think the enormous volume of blog content is a whole new addition to the lives of people who are connected via the internet. Everyone wants to track something of interest to them. Everyone naturally has a desire to play and explore. We have the long term vision to enable it.



Different kinds of social/meme/news sites

I have been thinking about what makes reBlogger different to the other sites. I've nosed around for an analysis of the various sites and the strengths and weaknesses of each offering.

Much research is focused on Ajax and the look and feel of the sites. This post is about broader strengths and weaknesses of the sites.

I think there are three ways people commonly read the internet:

  • Passive news – you visit the sites you like (Yahoo, CNN) and read them. They inform you of what you need to know.
  • Searching news – you go to Google and find the information you know you want.
  • Notification of news – you have a site that scans the enormous number of blogs out there and collects together the more interesting information for you.

Of the notification of news websites, I think there are four basic types:


  • Meme sites – tracking the hot (high profile, popular) conversations on the web
  • Social sites – user submitted content (sometimes highlighting esoteric past items)
  • Online news reader – read your web feeds online
  • News tracking sites – after submitting your keywords, view only the blog posts that interest you


Meme sites



They automatically scan the news and find the "cool" topics that are in vogue in blogs around the world.


They find and track "conversations" about the news and watch them for a limited time period as they develop and evolve. This is a great kind of site for tracking an evolving political story.

Weaknesses:
  • They track only a relatively small number of hot and active topics. They tend to ignore small threads, ideas or posts that are not popular enough to become memes.
  • They tend to only track blogs (or information available as a web feed).
  • By nature, they have a short attention span of a few days.
  • 99% of the posts that a visitor would also have found useful and interesting are completely missed because they were single posts (or short threads) and never attained "hot"ness – and therefore never surfaced and became a part of the herd consciousness.
  • All meme sites offer a search, but none ask me what keywords I want to see in every meme, so I am assaulted by many memes that are of no interest to me.
  • The memes that are displayed do not correlate to my personal interests – unless I specifically choose to visit a tech meme site (see: techmeme) or a sports meme site – otherwise I am served what the meme algorithm has determined interesting – without being aware of my interests.
  • Running a server farm to track all 40 million blogs is very expensive – even with OSS software.

Competitive advantage:

The algorithm defining "interesting" or "hot" is the competitive advantage between these sites – the better the algorithm, the more compelling the site is.

Example of this post:

This post – although it is useful to many people – is highly unlikely to become popular enough to become a "meme" and therefore won't enter into the herd's collective awareness as they forage for information.


Social sites



These sites rely on user submissions to identify stories and to vote on them. They also track coolness, but unlike meme sites, the visitors decide what is important, not a software algorithm.

Strengths:
  • Individual posts that might be missed by an algorithm looking for coolness are more likely to be highlighted by individual visitors.
  • The more users who interact with the site, the more useful the site is.
  • Easy to set up, lower hardware cost.

Weaknesses:
  • Too few visitors will result in too few submissions to the site. dotnetKicks has this problem: 1 submission today and 1 submission 6 days ago – nothing in between.
  • No focus on "hot conversations" – for example the top post on DIGG right now is called "What the font?" (Ever wanted to find a font just like the one used by certain websites or publications? Well now you can, using the WhatTheFont font recognition system.) with 62 diggs. This site is very likely being gamed for attracting visitors and making sales.
  • Social sites tend to encourage tagging by the submitter and searching for keywords, but again an enormous amount of information can be lost if the post is incorrectly tagged.

Competitive advantage:

Cheap startup costs, users submit the content (no need for an expensive server farm to automatically collect all 40 million blogs).

Example of this post:

This post, if read and valued by someone, may be highlighted as useful on such a site. But the readership of this blog is so small that it is unlikely that enough people will bookmark this post on any social site in order to raise its profile, so it will also fail to be highlighted to the herd as suitable grazing material.


Online news reader



An online feed reader. You upload an OPML file or a list of blog feeds and the service collects those feeds regularly and you read them online.

Strengths:
Very inexpensive to run because the user submits their web feeds.

Weaknesses:
  • They don't collect all the news and track it; they rely on users to list the blogs they want tracked. Therefore they only have a subset of the news.
  • Even with this subset of blogs they may show too many off-topic posts, because they do not appear to offer keyword filtering to track only the posts you want – and hide the posts that are off topic.


Competitive advantage:

This is probably the broadest active market, containing the most people (not passive readers of sites), and these users have the simplest needs. The software is understood by the largest number of people.

Example of this post:

This post is unlikely to show up on a news reader site because my RSS feed is unlikely to be in their list.


reBlogger sites (News tracking or news mastering)



reBlogger is a combination of a social site and an online news reader. By using keywords the site displays only the information you want to read, regardless of the source of the information. User voting also ensures that better content is more easily discovered.



Strengths:
  • A stronger emphasis on context (evolution over time exploring historical information)
  • The visitor can filter the news by specifying sets of keywords to track (such as climate change, PS3 or XBox) and the software watches for those keywords and notifies the visitor when the keywords are found, regardless of which blog it was found on
  • The sources can be more than blogs (such as newsgroups) and the methods of notification can be quite proactive (email etc.)
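The keyword-tracking idea above can be sketched in a few lines. This is purely a minimal illustration of the concept; the function and field names are my own inventions, not reBlogger's actual API.

```python
# Sketch of keyword-based post filtering, as described above.
# All names here are illustrative, not reBlogger's actual API.

def matches_keywords(post_text, keyword_sets):
    """Return the keyword sets that match a post.

    keyword_sets is a list of lowercase sets such as {"climate", "change"}
    or {"ps3"}; a set matches when every keyword in it appears in the post.
    """
    words = set(post_text.lower().split())
    return [ks for ks in keyword_sets if ks <= words]

# A visitor tracking "PS3" and "climate change":
tracked = [{"ps3"}, {"climate", "change"}]
post = "Sony announces a new PS3 bundle"
hits = matches_keywords(post, tracked)
# hits contains the {"ps3"} set, so this visitor would be notified
```

A real engine would match against feeds from many sources (blogs, newsgroups) and push notifications by email, but the core test is this subset check.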

Weaknesses:
  • With 50,000 posts per hour, even with keyword usage, there is a potential problem of having far too many posts showing up to be read. To deal with this problem, we encourage voting where users assist each other by voting content up and protect each other from bad content by voting content down.
  • The site is useful as an online news reader for individual users (and keywords make the reading even better), but for voting to be effective many users must use the site (the more the site is used, the better it is)
  • No focus on conversations and "hot" topics (unless voted for by users)
  • Because of the focus on historical data to provide context, the effectiveness is limited to the age and completeness of the archive of data

Competitive advantage:

A strong emphasis on historical context. A focus on encouraging the user to buy their own copy of the software, hopefully off-loading the demand to other people's servers.

Example of this post:

This post will automatically be collected by a reBlogger. If a user has indicated an interest in the keywords used in it (such as "social" or "meme"), then it will show up for that user.

Revenue streams?

In all cases the revenue stream is advertising, except for reBlogger, Chuquet and Megite:

License Megite Software: Email us for more info if you are interested in licensing Megite software to create Megite like web2.0 service.

The end

If you have read to the bottom, you're probably a very committed person… committed to building the perfect social/meme/news site. This post from a VC firm makes an interesting point that I hope will broaden your thinking beyond the very small number of people that you may now be targeting. Dave has a thought on it.

Get the 30 day demo of reBlogger (Windows only, requires .NET and SQL)



Digg-In-A-Box… the interview

In a previous post reBlogger past, present and future I mentioned that I had completed our first interview about reBlogger with Robin Good. I'm very pleased to point to the interview (available as audio, MP3 and text): Digg-In-A-Box? Automatic News Filtering And Aggregation? Newsmastering Engines Keep Growing: reBlogger Is Next

Here is a short quotation from the page containing the 40 minute interview. Robin says:

There is a huge, infinite market for quality, filtered information on specialized topics out there.


Because, everyone on the edge of using new media technologies today knows that the amount of information that is ALREADY coming our way now, it is just too much to handle. Tech Memeorandum, Digg, Personalized Google News, Start/Live are great, but they are only a small part of the solution.

We just need to scale up more. And that means doing ourselves the dirty work of filtering, selecting and aggregating the very best content out there on any specific imaginable topic. This is why, this is truly the job of the future, and watching only blogs, as they are today, maybe a too limiting view.

And when we say aggregate it should not mean just to aggregate blog posts, but also and evidently news, comments, video and audio clips, relevant products and services and relevant ads and commercial info on that very content theme.

And what tools do we have today to do this kind of work?

Few. In fact too few to really satisfy the soon to explode demand for these kind of publishing services that the content market will see.

And this is why I took the time to skype up Mark Wilson, CEO and founder of one of those very few, but also very promising companies already moving its early steps into this soon-to-be-blooming newsmastering industry.

reBlogger a server-based software that Mark and his team have released over a year and half ago, is a newsmastering engine that allows the creation of highly thematic and relevant newsfeeds on just about any selected topics of interest. You feed the engine with enough news sources in the format of RSS/Atom feeds and then you specify the "themes" or topics you want to be output. reBlogger does the rest.

Not only.

As you can learn by reading through, Mark and his team are working right now on the upcoming release of a full "Digg-in-a-box" type of tool, which allow any online publisher to recreate the popular and highly effective Digg-functionality on their sites to create their vertical information portals fueled by their readers.

Here, for example, is a good example of what reBlogger could do, if you wanted to build a site about the upcoming World Soccer Cup in Germany.

Read or listen to the mp3 online: Digg-In-A-Box? Automatic News Filtering And Aggregation? Newsmastering Engines Keep Growing: reBlogger Is Next


Creating the perfect Ajax UI

I am looking at this meme page about river of news.

Here are the things I like about this UI:

  1. The name of the page is SEO-friendly, which is why I could find it in Google
  2. It hides extra stuff until you need it – for example the voting icons only slide in if you hover over the post itself
  3. The voting explanation text is hidden – unless you hover over the voting area, that's the only time you need to see the details
  4. The overall usefulness of the post is indicated by an icon

Things I don't like about it:

  1. Posts that have been voted not useful are not hidden from sight but still available (condensed like this)
  2. Hovering your mouse doesn't pop up (using Ajax) the entire post so you can read it quickly; you have to go to a new page to read it
  3. You can vote without actually reading the whole post!
  4. Hitting "next page" loads a new page (not 2.0 Ajax-ified)

What are your likes and dislikes in an Ajax user interface?

Analysis of comments threshold and post voting

I have written quite a bit on meme sites and UGC (User Generated Content). You may also have noticed I am fascinated by the Slashdot vs DIGG comparison.

Below is a typical layout of reBlogger posts for a given day. It's not inspiring. But that's OK; we were targeting SEO companies, who are not focused on the user experience (Ajax, voting etc.)

[Screenshot: reBlogger page]

But as we prepare the user-theme-website reBlogger (a tool to build meme/DIGG websites), which is under construction here and here, I am exploring more and more in this blog the ideas surrounding relevancy, user-generated content, user interaction, community involvement, exploration, the semantic web, tags, research, user context and blog-post context (and trying to match them).

In this post I am exploring thresholds and voting. First let's look at some leading examples of each.

Haven't we figured out that the crowd is generally smarter than any one individual in the crowd? – Jeremy Zawodny

Here is a screenshot of Slashdot style "below threshold" ranking of items. The users get to vote on what is useful and what is not… but it's for comments on posts. For anyone who has tried to read the MASSES and MASSES of comments on a slashdot post, this threshold stuff is invaluable.

[Screenshot: Slashdot page]

DIGG implemented a "below threshold" concept in their system too… but it's also for comments only.

[Screenshot: DIGG comments]

So thresholds work well for a big flow of comments. Yay. But when you have an enormous flow of posts, could we also use this threshold concept?

Of course, in terms of matching my user context with the context of the voter: if a visitor votes down any post covering a team they don't like, that should not affect my view if I like that team! Sigh. So I really should only see the effect of votes from people who have similar keywords to mine. They like the same things as me, and therefore their votes are far more likely to be relevant.
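One way to make votes context-sensitive, as suggested above, is to weight each vote by the keyword overlap between the voter and the reader. Here is a rough sketch under my own assumptions (Jaccard similarity as the weight; all names invented), not how any shipping site actually works:

```python
# Sketch: weight another user's vote by how many tracked keywords
# they share with me. Purely illustrative names and numbers.

def vote_weight(my_keywords, voter_keywords):
    """Jaccard similarity between two users' keyword sets (0.0 to 1.0)."""
    if not my_keywords or not voter_keywords:
        return 0.0
    shared = my_keywords & voter_keywords
    combined = my_keywords | voter_keywords
    return len(shared) / len(combined)

def personalized_score(my_keywords, votes):
    """votes is a list of (voter_keywords, +1 or -1) pairs."""
    return sum(vote_weight(my_keywords, kw) * v for kw, v in votes)

me = {"soccer", "australia", "south-africa"}
votes = [({"soccer", "australia"}, +1),   # similar fan: vote counts strongly
         ({"baseball", "usa"}, -1)]       # no overlap: downvote has zero weight
score = personalized_score(me, votes)
```

With this scheme a downvote from someone with no shared interests simply has no effect on my view of the post.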

Below is my very unattractive rendition of what user-voting combined with thresholds could do for blog posts. If you add in changing background colors to highlight the items that match their selected keywords – then a visitor can FAR more easily scan a page to view items that match their keywords and are voted hot by visitors who perhaps have more time on their hands to read and vote on everything! 🙂

[Screenshot: reBlogger with Slashdot-style thresholds]

In my imagination I could visit the World Cup 2006 site and quickly scan for hot stories that match my interests (South African or Australian teams).

It would be great if unread items were bolded and read items weren't. Heh. CSS already does that. Woohoo.

OK, I user-tested this with Joel. He didn't get it. Let me try again. 🙂

Below is a typical reBlogger page on TopXML – in amongst all those posts, some are clearly better than others. How do I (a reader who is interested in the topic) determine which item to read and which to skip?

Enter stage from left: thresholding.

On this page we have hidden the items which other users have voted down, or (inversely) which have not been voted up. Now only the really good stuff is displayed, and I can still get to the other items if I want to.

I added three extra goodies in the picture above:

  1. faded background highlighting to draw the eye to the hidden info
  2. if some of the voted-down posts contain keywords I have specified an interest in, I am notified
  3. I inserted a star in the topmost post, to somehow indicate that this post is truly a winner. We have all seen these kinds of posts; they are just head and shoulders above the rest. They should get a star, so when I view the page I can immediately click on that post with the certainty that I will see a cracking-hot post.

One concern is: because this is ordered by date (not by vote) the newest posts will always have a vote of 0, and I guess 0 should be above the threshold? But that kinda defeats the idea of hiding the complexity. Sigh. Hmmm… some users will want a threshold that includes 0 (view all new items) and some will want 1, 2 or 3 or whatever. Some may even want to view all – including viewing negatively rated items.
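The per-user threshold question above (should brand-new posts at 0 be visible?) reduces to a simple comparison. A minimal sketch, with made-up names and a configurable minimum per user:

```python
# Sketch of per-user threshold filtering for posts (names are mine).
# A threshold of 0 shows all new (unvoted) posts; 1 or more hides them;
# a very low threshold (e.g. -99) shows everything, even downvoted items.

def visible_posts(posts, threshold=0):
    """posts is a list of (title, score) pairs, already sorted by date."""
    return [(title, score) for title, score in posts if score >= threshold]

posts = [("Brand new post", 0),
         ("Well liked post", 5),
         ("Voted-down post", -3)]

visible_posts(posts, threshold=0)    # hides only the downvoted post
visible_posts(posts, threshold=1)    # hides new *and* downvoted posts
visible_posts(posts, threshold=-99)  # shows everything
```

Because the list stays sorted by date, new posts only survive the cut when the user's threshold includes 0 – which is exactly the trade-off described above.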

Some sites also use grouping of similar topics… but we aren't doing that yet. Here are some examples of that.

Cloudee (below) groups similar posts, but doesn't have voting + thresholds


The ever-wonderful Chuquet (below) also groups items and handles information overload by hiding it all and saying "(45 linking articles in the last week…)" and giving a link to all of the items. Quite effective! They don't provide voting.


I see that everyone thinks thresholds are for comments and voting is the way to solve information overload for posts – but I think that voting can produce thresholds for posts too – it's the ideal way to sort the wheat from the chaff in blog posts.


RawSugar, whonu, TalkDigger

RawSugar. I can't figure out why, but this service is not engaging. It may be brilliant, and it certainly is someone's love child that they have worked hard over… but when I view this page I find myself waiting for it to do something. It's waiting for me. Nothing happens. Why? What is missing?

Who Nu? (whonu) (blog). Clever 2.0 name! But the interface is so full of choices that I spent ages trying to figure out where to start. Then the "lightning searches" horizontal band started flashing and I couldn't make it stop. Shows great potential – if only I could make it work. Great looking site though.

These two sites might be struggling with the UI challenge I am referring to in this post: Battle of the UI’s (search *or* meme/social)

Squidoo and MySyndicaat

I am astounded at how fast this space is maturing. I found 6 potential competitors yesterday alone!

Back in November 2005 Robin Good did an interview with Giovanni Guardalben of MySyndicaat. Robin said over and over that MySyndicaat had the first usable newsmastering solution.

Now I am finding so many newcomers to this space it's astounding. For example Squidoo (blog) (article). Their primary goal is to help users create lenses. Here is a kayak fishing lens. The community clearly has a huge need to learn how to make lenses, so they have a uni and a MIRC chatroom.