Over the weekend, Yahoo’s Delicious (del.icio.us) social bookmarking property began blocking the spiders and bots of non-Yahoo search engines from crawling the site and discovering new web pages, sites and bookmarks.
Colin Cochrane found this out the other day, saying that ‘This isn’t a simple robots.txt exclusion, but rather a 404 response that is now being served based on the requesting User-Agent.’
I took a look at del.icio.us’ robots.txt and found that it was disallowing Googlebot, Slurp, Teoma, and msnbot for the following:
Disallow: /inbox
Disallow: /subscriptions
Disallow: /network
Disallow: /search
Disallow: /post
Disallow: /login
Disallow: /rss
Seeing that the robots.txt was blocking these search engine spiders, I tried accessing del.icio.us with my User-Agent switcher set to each of the disallowed User-Agents and received the same 404 response for each one.
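For anyone who wants to repeat this kind of check, here is a minimal Python sketch using only the standard library. It is an illustration, not the method used above: the helper names (`build_robots_txt`, `robots_allows`, `status_for_user_agent`) are hypothetical, and the robots.txt layout is an assumption rebuilt from the bot names and paths listed above, since the file itself is not reproduced here. The first part shows what a plain robots.txt exclusion would cover; the second fetches a URL with a chosen User-Agent header and reports the status code.

```python
import urllib.error
import urllib.request
import urllib.robotparser

# Bot names and paths as listed in the post. The grouping into one
# "User-agent" block per bot is an assumption about the file's layout.
BOTS = ["Googlebot", "Slurp", "Teoma", "msnbot"]
PATHS = ["/inbox", "/subscriptions", "/network", "/search",
         "/post", "/login", "/rss"]


def build_robots_txt(bots, paths):
    """Return robots.txt text with one group per bot, each disallowing every path."""
    lines = []
    for bot in bots:
        lines.append(f"User-agent: {bot}")
        lines.extend(f"Disallow: {path}" for path in paths)
        lines.append("")  # a blank line closes the group
    return "\n".join(lines)


def robots_allows(robots_txt, user_agent, url):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = urllib.robotparser.RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


def status_for_user_agent(url, user_agent, timeout=10):
    """Fetch url with the given User-Agent header and return the HTTP status code."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urlopen raises HTTPError for 4xx/5xx responses; the code is what we want.
        return error.code


if __name__ == "__main__":
    robots = build_robots_txt(BOTS, PATHS)
    for bot in BOTS:
        # Short bot tokens are used here, not the full User-Agent strings
        # that the real crawlers send.
        print(bot,
              "home page allowed:", robots_allows(robots, bot, "http://del.icio.us/"),
              "| /rss allowed:", robots_allows(robots, bot, "http://del.icio.us/rss"))
```

Under these assumed rules the home page is still allowed for every listed bot, which is the point Colin makes: a robots.txt entry like this would not by itself explain a 404 on ordinary pages. Calling `status_for_user_agent` with a browser User-Agent and then with a bot User-Agent against the same URL is the scripted equivalent of the User-Agent switcher test; what it returns depends on how the site responds at the time it is run.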
Colin also found that Delicious pages listed in Google are lacking a cache, title, description and other information.
Why would Yahoo do this?
Yahoo has a competitive advantage over Google, MSN and Ask.com: through human bookmarking on Delicious, it can identify web pages and other content before search engine bots do. Yahoo can also classify web documents using human descriptions and tags, which lend external metadata to those documents and can result in more relevant web results and intent-targeted rankings.
Yahoo has integrated Delicious into its search results, and Delicious evidently plays a very important role in Yahoo Search, so Yahoo is taking full advantage of its property by blocking its competition from crawling that information.
It’s a bold move by Yahoo, if only because Delicious is user-powered and dependent on a community of users. On the other hand, keeping your internal secrets from your competition is a basic business practice, and Yahoo has essentially set up a security fence to keep Google, Ask.com and MSN from snooping around its back yard.
None of Yahoo’s competitors has anything that can compare to Delicious. Google made a very large mistake by not buying StumbleUpon for this very same reason.
Bold move by Yahoo, but competitively the correct move. Your thoughts?
[Additional discussion on Sphinn]