Pheed Read #3

Pheed Read #3 – RSS Feeds Provide Untapped Advertising Audience

"Pheed Read" is released quarterly by Pheedo and details trends in RSS usage. This new one (only available in PDF – doh!) is not terribly good.

But don't miss the previous two Pheed Reads which are superb and useful:


Blog search engines compared

This is a good post by Library Clips. Read more here: Bloglines blog search engine

The relevancy is based on subscriber numbers (so anything you write may be relevant to the search term by default, as you have lots of subscribers)…Technorati bases it on incoming links (again what about the long tail, and also this is just popularity)…whereas Sphere bases it on a number of things (incoming links, subscribers, content analysis, comments, etc…), I think these results will be more relevant and also reveal posts from blogs you don’t usually see.

As usual John does a fab job of covering the options, choices and competitors. Library Clips rocks! More details about the new Bloglines blog search can be found at TechCrunch (Finally! Bloglines Blog Search)

Google co-op

I'm still investigating the social stuff.

Google Co-op is a platform which enables you to use your expertise to help other users find information

I read some interesting comments about Co-op in Google Co-Op – Google Embracing Social Search?

There are obvious comparisons to Rollyo, Filangy, Prefound, Wink and other social search plays, but frankly Co-op just doesn’t cut it. It feels like another Google Base to me – ambitious in its scope, but utterly bamboozling to the user. Frankly, I’m not sure that Google will ever get social search right – community-building just isn’t in their DNA.

Scrolling down to the comments for that blog post… Google gets hammered for not making the application clearer and easier to use and this comment was particularly interesting:

One thing that bothers me at times about Google is how this massive corporation sometimes appears to act like an opensource, nonprofit project.

Froogle, Base, Co-op, and others all seem to depend on other people doing most of the work, and then Google ultimately owns the data.

I did some searches in the SEOData reBlogger to see what other posts on Google Co-op I could find:

The thing is… companies like Google keep on trying until they get it right. Windows 1.0 was nothing, 2.0 was nothing, 3.0 was good, 3.1 was big, 3.11 (with networking) was massive. reBlogger is the same, we'll just keep on going through the versions until we crack it. No one currently knows how to do "social" exactly right, so it's open slather for anyone. Google will eventually get this right and I suspect it will revolve around voting.

Here is the Wink collection for Google Co-op.


Tracking future Windows releases by using reBlogger

Have you seen Google Trends yet? It tracks search activity and relates it to the news events that preceded it.

I spoke about the inverse concept in the interview I did with Robin Good, reBlogger: Digg-In-A-Box. The key difference is that I would track blog posts and not searches. Why? Searches are consumer oriented, but posts come directly from the source! It's obvious that consumers don't know the actual release date, but the bloggers inside the company's software team do. Their blog activity can give hints of what is happening in the team. Even if the content they post doesn't specify the date, their activity could indicate something.

So Google tracks the number of searches and maps the news event that caused the surge in interest, but I'm suggesting mapping the blog activity and projecting it onto a future event. We sometimes see predictive activity in searching, but it's only for a very widely known upcoming event: for example Christmas.

You can use reBlogger to track the activity of a particular group within a competing company. The value is huge for a business which tracks its competition. Most company programmers blog (and have an OPML file), so I'm thinking that all we need is to find all their bloggers, group them, count their post activity and then generate a graph. Watch for any irregular change (a drop or a spike) and you know something is happening. It's so easy!
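To make the counting idea concrete, here is a rough sketch in Python. It is not how reBlogger does it: the function names, the four-week window and the "two standard deviations" rule are all my own assumptions, and it expects you to have already collected the post dates for the group of bloggers you care about.

```python
from datetime import date
from statistics import mean, pstdev


def weekly_counts(post_dates, start, weeks):
    """Bin post dates into consecutive 7-day buckets beginning at `start`.

    Weeks with no posts stay at zero, so a sudden silence is visible too.
    """
    counts = [0] * weeks
    for d in post_dates:
        index = (d - start).days // 7
        if 0 <= index < weeks:
            counts[index] += 1
    return counts


def flag_irregular_weeks(counts, window=4, threshold=2.0):
    """Flag weeks whose count is far from the average of the previous `window` weeks.

    `window` and `threshold` are illustrative defaults, not tuned values.
    """
    flagged = []
    for i in range(window, len(counts)):
        history = counts[i - window:i]
        average = mean(history)
        # Use a floor of 1.0 so a perfectly flat history does not flag tiny changes.
        spread = max(pstdev(history), 1.0)
        if abs(counts[i] - average) > threshold * spread:
            kind = "spike" if counts[i] > average else "drop"
            flagged.append((i, counts[i], kind))
    return flagged
```

Feed `weekly_counts` the combined post dates of everyone in the group, then pass the result to `flag_irregular_weeks`. Each flagged entry gives the week index, the post count for that week and whether it was a spike or a drop. A flag only says that something changed; working out what changed is still up to you.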

You can get the reBlogger 30 day free demo and install it (requires Microsoft SQL Server or SQL Server Express).

25 Things I Learned on Google Trends (humor)

Steve Rubel does some fabulous investigations. I particularly like:

15) Blogs have caught up to newspapers

18) Digg caught up to Slashdot.

19) Interest in blogs and RSS is much higher than in podcasting and wikis



Blog Search Engine Sphere Launches

I previously posted about Sphere (blog) in Sphere… of influence but their site wasn't live. It is now!

I do like the way they have different pages with different ways of looking at a query. They’ve got:

The coolest thing has to be the custom range slider. When viewing results by relevance, you can choose a date range (so they are relevant, but not ancient), and you can choose predetermined ranges, or use a slider to make your own on the fly.

UPDATE: They have managed to get quite a bit of buzzzzz on the blogs about their launch. Check out their BlogPulse.

(I found their announcement on reBlogger)

Next generation search algo

I was dreaming over the weekend (again) and I wondered… what if… what if datacenters were neural networks. Before you laugh and go somewhere else, let me explain why this idea would return the very best search results.

A neural network "learns" by being trained. An operator makes statements like: this is a human head. The neural network learns to recognise that image as a human head. The operator shows it thousands of different kinds of heads (and things that are not heads are shown as NOT a human head). Eventually the neural network begins to ask questions (is THIS a human head?) and the operator says yes or no. Over time the neural network (NN) can correctly identify a human head apart from a basketball or fruit.

What if Google trained their datacenters to recognise good pages? Right now they are using inbound links to value a page, but that idea is time-limited. Google could use the user's clicks as training. The NN puts up a variety of pages with varying amounts of information and watches what people click on. With a cookie it can figure out which page I stayed on and which page I didn't. With this information (which Google already has) it can correlate the query (the search statement) to high performing pages.
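Here is a toy version of the click-training idea in Python. To be clear, this is my own sketch and says nothing about what Google actually runs. I've swapped the full neural network for a single neuron (logistic regression), and the page features (a bias term, how well the page matches the query, how ad-heavy it is) are made up for illustration. The label is simply whether the visitor stayed on the page.

```python
import math


def predict(weights, features):
    """Estimated probability that a visitor stays on a page with these features."""
    z = sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def train(examples, epochs=200, learning_rate=0.5):
    """Train one neuron on (features, stayed) pairs with plain gradient steps.

    `epochs` and `learning_rate` are arbitrary illustrative values.
    """
    weights = [0.0] * len(examples[0][0])
    for _ in range(epochs):
        for features, stayed in examples:
            # The operator's "yes or no" is replaced by what the visitor did.
            error = stayed - predict(weights, features)
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, features)]
    return weights


# Hypothetical click log: [bias, query_match, ad_density] -> stayed (1) or left (0)
click_log = [
    ([1.0, 0.9, 0.1], 1),
    ([1.0, 0.8, 0.2], 1),
    ([1.0, 0.2, 0.9], 0),
    ([1.0, 0.1, 0.8], 0),
]

weights = train(click_log)
good_page = predict(weights, [1.0, 0.85, 0.15])
bad_page = predict(weights, [1.0, 0.15, 0.85])
```

After training, `good_page` comes out above 0.5 and `bad_page` below it, so sorting candidate pages by `predict` would push the page people tend to stay on towards the top. A real system would need far more features and far more clicks, but the training loop has the same shape.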

Sure this would take time and money – but if the NN is able to correctly learn about what I am looking for and identify what page best meets my needs… then it's a killer search algo. The results page would be 100% accurate all of the time. It will always be learning how to serve up the best search results.

Sphere… of influence

Heads up! So what is Sphere? And what is so much better about it?

John Battelle's Searchblog

Sphere works better than other blog search I've seen, plain and simple…. when a Searchblog author goes off topic and rants about, say, Jet Blue, that that author's rant will probably not rank as high for "Jet Blue" as would a reputable blogger who regularly writes about travel, even if that Searchblog author has a lot of high-PageRank links into his site.

Om Malik's Broadband Blog

Think Blog Rank, Instead of Google’s Page Rank. The company has also taken a few steps to out-smart the spammers, and tend to push what seems like spam-blog way down the page. Not censuring but bringing up relevant content first. They have pronoun checker. Too many I’s could mean a personal blog, with less focused information.



Sphere is a new blog search engine that quite frankly blows everything, and I mean everything, I’ve seen out of the water in terms of relevance. Until now, no one has come up with a way to properly sort blog posts by relevance, and the general default way of showing results is “reverse-chrono”, which simply puts the newest stuff at the top. Sphere appears to have solved the problem, or at least taken big steps in the right direction. Their approach involves three key algorithms – an analysis of links into and out of a blog, an analysis of metadata around a post (links, post frequency, length of posts, etc.), and something Tony calls their “secret sauce”, which is content semantic analysis to filter out spam and to understand what a blog post is talking about. Result sets show only two posts per blog on the first page, so no one blog can dominate a category…
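The "two posts per blog" rule in that quote is easy to picture in code. This is only my guess at the general shape, not Sphere's implementation: the function name, the tuple layout and the page size of ten are assumptions, and the relevance scores are taken as given.

```python
def first_page(results, per_blog=2, page_size=10):
    """Build a results page from (blog, title, score) tuples.

    Results are taken in descending score order, skipping any blog that
    already has `per_blog` posts on the page.
    """
    shown = {}
    page = []
    for blog, title, _score in sorted(results, key=lambda r: r[2], reverse=True):
        if shown.get(blog, 0) >= per_blog:
            continue
        shown[blog] = shown.get(blog, 0) + 1
        page.append((blog, title))
        if len(page) == page_size:
            break
    return page
```

With three high-scoring posts from one blog and a single lower-scoring post from another, the third post from the first blog is dropped and the other blog still makes the page.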

Jeremy Zawodny


Their technology seems far more splog (spam blog) resistant than many of the other engines. They don't actively filter it out, but the spam blogs end up being ranked so low that you rarely encounter them. That sounds like the right approach to me.

BusinessWeek's Stephen Baker interviews Tony Conrad and Mary Hodder (mp3 audio / podcast)

Looks interesting. 🙂 FWIW: I found this through SEOData Blogosphere keyword.


TypePad Sphere blog search widget (TechCrunch)