What if a website doesn't want Google to index it and redisplay its content in the index? What does it do? It sets up a robots.txt, of course.
RSS/ATOM/RDF feeds were initially set up so people could use them for free. It was free beer (meaning: at no cost) and free speech (meaning: freedom) all rolled into one. But now all this free content is being used cleverly by companies to earn money. So the free beer in my RSS feed is becoming beer for sale on your website. No wonder people cry 'no fair'.
What's the solution? I think there are three ways to go about this:
- Bloggers include YPN/Google (or AmazonSense) ads in their posts and then set the posts free to show up everywhere, on any website, as long as each post continues to contain the blogger's publishing ad (the tax for displaying the content)
- Bloggers force people to click through to their own website by stating somewhere that commercial publishers can only use the first X chars of the post
- Bloggers state somewhere that no one (not reBlogger, not Technorati, not Bloglines) can reprint their content.
OK, so I've said this before, perhaps not as clearly. I italicized "stating somewhere" because that is the key thing.
Setting boundaries:
- If you have a website, you set boundaries (go/no-go areas) for a bot on your website by using a robots.txt
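For comparison, here is roughly what that boundary looks like from the bot's side. This is only a sketch using Python's standard `urllib.robotparser`; the rules, the bot name `ExampleBot`, and the example.com URLs are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# A made-up robots.txt: every bot is asked to stay out of /private/.
rules = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# A well-behaved bot asks before fetching each URL.
blocked = parser.can_fetch("ExampleBot", "http://example.com/private/page.html")
allowed = parser.can_fetch("ExampleBot", "http://example.com/index.html")
print(blocked, allowed)
```

The point is that the site owner states the rule once, in a known place, and the bot is expected to check it on every visit.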
But how do you set boundaries and publishing rights for your blog or news feed?
On this page, which is a list of posts by the blogger called "Search Views", we must develop something like a 'this is my feed, I want to remove it' link, which leads to a page that says something like:
Remove my feed
We recognize that the author owns their content and we are taking steps to enable the author to protect their content. If this is your blog feed and you do not want it to appear here or on any other website which aggregates content, you need to insert xyz into your RSS feed; on the next fetch by our software, we will automatically remove your feed and all pages related to it. Inserting xyz into your RSS feed is the same as placing a robots.txt on your site to tell robots you do not want to be crawled. We apologize for the inconvenience caused to you.
There is perhaps an attribute which you can set in the element XYZ in your feed:
- XYZ 1 – full feed syndication (it is assumed that the displaying website does not remove your adverts from Google or YPN or other, and a link back to your site is provided)
- XYZ 2 – Partial syndication (In RSS 2.0 the website must use the shortened description, not the full feed, in other feeds the website only displays the first X chars and the reader is forced to click through to the originating website)
- XYZ 3 – No syndication (this removes the feed and all posts from reBlogger, Bloglines, Technorati etc.)
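To make the three levels concrete, here is a rough sketch of how an aggregator might honor them on each fetch. Everything specific in it is my assumption and not part of any standard: the element name `xyz`, the attribute name `level`, the 200-character cutoff standing in for "X chars", and the choice to treat a feed with no marker as fully syndicated.

```python
import xml.etree.ElementTree as ET

FULL, PARTIAL, NONE = 1, 2, 3
MAX_CHARS = 200  # hypothetical stand-in for the "first X chars" limit


def syndication_level(channel):
    """Read the hypothetical <xyz level="..."> marker from an RSS channel."""
    marker = channel.find("xyz")
    if marker is None:
        return FULL  # assumption: no marker means the author has not opted out
    try:
        # A bare <xyz/> with no level is treated as "no syndication".
        level = int(marker.get("level", NONE))
    except ValueError:
        return NONE  # unreadable value: fall back to the most restrictive level
    return level if level in (FULL, PARTIAL, NONE) else NONE


def items_to_display(feed_xml):
    """Return the posts an aggregator may show, trimmed to the author's level."""
    channel = ET.fromstring(feed_xml).find("channel")
    if channel is None:
        return []  # not an RSS 2.0 style document
    level = syndication_level(channel)
    if level == NONE:
        return []  # the author asked to be removed entirely
    shown = []
    for item in channel.findall("item"):
        body = item.findtext("description", default="")
        if level == PARTIAL:
            body = body[:MAX_CHARS]  # reader must click through for the rest
        shown.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),  # link back to the author
            "body": body,
        })
    return shown
```

Falling back to "no syndication" whenever the marker is present but unreadable errs on the side of the author, which seems the safer default for a rights marker.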
The XYZ element is a tricky thing in XML. It is straightforward to do, but getting it right is important.
These are just my ideas. I don't know what it should be, but it should be extensible. Clearly it is something that will eventually need to be submitted to a standards body. But this solution will right away get authors' content (which they do own and should control) off reBlogger (and other aggregation) websites.
Aggregation is the way of the future, because society always mashes things together to form new super-structures, but only when the rights of the person/company providing the various contributing aspects are protected.
Syndication, mashing, aggregation will become a tidal wave… so we have to find ways to protect the authors. If only we had done this for painters and sculptors – they sell the content once and can never collect a usage tax ever again.