Friday, June 13, 2008

Robots Exclusion Protocol

Google, Yahoo! & Microsoft Talk About 'Robots Exclusion Protocol'!

The Google Webmaster Central Blog, the Yahoo! Search Blog and the Microsoft Live Search Webmaster Center Blog have each published informative documentation about the Robots Exclusion Protocol (REP). In February last year, I put up a post informing our readers about Google's thoughts on the Robots Exclusion Protocol. All three have now released their REP feature documentation at the same time, which makes it much easier for users to see how each search engine implements the protocol.

Here is what all three blogs are saying in unison:

Google Webmaster Blog:

For the last couple of years Google, Yahoo! and Microsoft have been collaborating to provide essential tools for webmasters. The REP features supported by all three search engines can be applied to all crawlers, or targeted at specific crawlers through their user-agent, which is how a crawler identifies itself. The following are the major REP features currently in use by all three search engines.

For Robots.txt Directives

Disallow.

Allow.

Wildcard Support.

Sitemaps Location.
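To make the robots.txt directives above a little more concrete, here is a minimal Python sketch. The robots.txt content, the www.example.com host and the "ExampleBot" user-agent are all made up for illustration and do not come from any of the three posts. Python's standard urllib.robotparser applies the first rule that matches, so the Allow line is placed before the broader Disallow line. As far as I know, that module does not interpret wildcards, so wildcard_match is a hypothetical helper that only illustrates the idea ('*' matches any run of characters, a trailing '$' anchors the end of the URL path).

```python
import re
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for www.example.com (illustrative only).
ROBOTS_TXT = """\
User-agent: *
Allow: /private/public-report.html
Disallow: /private/
Sitemap: http://www.example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_allowed(path, agent="ExampleBot"):
    """Return True if the (hypothetical) crawler may fetch the given path."""
    return parser.can_fetch(agent, "http://www.example.com" + path)


def wildcard_match(pattern, path):
    """Illustrative wildcard matcher, not any search engine's actual code.

    '*' matches any run of characters; a trailing '$' anchors the end.
    Without '$' the pattern only has to match a prefix of the path.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"
    return re.match(regex, path) is not None


if __name__ == "__main__":
    print(is_allowed("/private/secret.html"))
    print(is_allowed("/private/public-report.html"))
    print(parser.site_maps())  # needs Python 3.8 or later
    print(wildcard_match("/*.pdf$", "/docs/file.pdf"))
```

Each engine documents its own precedence rules for overlapping Allow and Disallow lines, so treat the ordering here as a property of the Python parser rather than of the protocol.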



For HTML META Directives

NOINDEX META Tag.

NOFOLLOW META Tag.

NOSNIPPET META Tag.

NOARCHIVE META Tag.

NOODP META Tag.
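As a rough illustration of how the META directives listed above might be read from a page, here is a small Python sketch using the standard html.parser module. The sample page and the robots_directives helper are my own assumptions for illustration; this is not how any of the three crawlers is actually implemented.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        content = attr.get("content") or ""
        # Directives are comma separated and treated case-insensitively here.
        for token in content.split(","):
            token = token.strip().lower()
            if token:
                self.directives.add(token)


def robots_directives(html):
    """Return the set of robots META directives found in an HTML string."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical page, for illustration only.
PAGE = (
    '<html><head><meta name="ROBOTS" content="NOINDEX, NOFOLLOW, NOODP">'
    "</head><body>Hello</body></html>"
)

if __name__ == "__main__":
    found = robots_directives(PAGE)
    print("index this page?", "noindex" not in found)
    print("follow its links?", "nofollow" not in found)
```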



Yahoo! Search Blog:


Yahoo! follows the same REP features as used by Google and mentioned above. However, Yahoo! also supports some additional REP directives that Google does not support. Of these, only Crawl-Delay is also supported by Microsoft. These features are:

Crawl-Delay: Allows a site to lower the frequency with which a crawler checks for new content.

NOYDIR META Tag: This is similar to the NOODP META Tag above but applies to the Yahoo! Directory, instead of the Open Directory Project.

Robots-nocontent Tag: Lets you mark the non-content parts of your page, so that the Yahoo! crawler can identify the main content and target the right pages on your site for specific search queries.
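Here is a minimal Python sketch of the idea behind robots-nocontent. It assumes the marker is applied as a class attribute value (class="robots-nocontent") on the elements to be ignored; the sample markup and the ContentExtractor class are hypothetical, and this only approximates what a crawler might do.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"br", "img", "hr", "meta", "link", "input"}


class ContentExtractor(HTMLParser):
    """Collects page text, skipping anything inside a robots-nocontent element."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0  # > 0 while inside a marked element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.skip_depth or "robots-nocontent" in classes:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self.skip_depth:
            self.chunks.append(text)


def main_content(html):
    """Return the text chunks that are not marked as non-content."""
    extractor = ContentExtractor()
    extractor.feed(html)
    return extractor.chunks


# Hypothetical page fragment, for illustration only.
SAMPLE = (
    "<div><p>Main article text.</p>"
    '<div class="robots-nocontent"><p>Navigation and ads</p></div>'
    "<p>More content.</p></div>"
)

if __name__ == "__main__":
    print(main_content(SAMPLE))
```

The sketch assumes reasonably well-formed markup; unbalanced tags inside a marked element would throw off the depth counter.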



Microsoft Live Search Webmaster Center Blog:


Microsoft also follows the same REP directives as Yahoo! and Google. However, like Yahoo!, Microsoft supports one additional REP feature that works with Microsoft and Yahoo!, but not with Google:

Crawl-Delay: Allows a site to lower the frequency with which a crawler checks for new content.
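For a feel of how Crawl-Delay looks in practice, here is a short Python sketch using the standard urllib.robotparser module (crawl_delay needs Python 3.6 or later). The robots.txt content is made up, the user-agent names "Slurp" and "msnbot" are my assumption for the Yahoo! and Microsoft crawlers, and the delay values are arbitrary; the number is commonly read as seconds between requests.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: per-crawler delays, none set for everyone else.
ROBOTS_TXT = """\
User-agent: Slurp
Crawl-delay: 5

User-agent: msnbot
Crawl-delay: 10

User-agent: *
Disallow: /tmp/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def delay_for(agent):
    """Return the Crawl-delay value for a user-agent, or None if unset."""
    return parser.crawl_delay(agent)


if __name__ == "__main__":
    for agent in ("Slurp", "msnbot", "Googlebot"):
        print(agent, delay_for(agent))
```

A crawler that does not recognise the directive would simply ignore those lines, which matches the point above that Google does not support it.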



Over at his blog, Matt Cutts also covers the common REP directives used by Google, Microsoft and Yahoo!. He also points to some other informative documents that Google has published online over the past few weeks. Some of the most interesting posts by Google so far have been:

IP delivery/geolocation/cloaking: In this post, Google explains, with the help of a video, its web-serving techniques related to Googlebot, covering IP delivery, geolocation and cloaking.

Doorway Pages: Google has recently changed the definition of Doorway Pages in the Google Webmaster Help Center. This post provides the old and the new definitions so that readers can compare them and understand the difference between the two.



This joint announcement is all about giving webmasters a clear picture of the actual REP functionality. Keeping track of the techniques used by different search engines is a very arduous task, and hence Yahoo!, Microsoft and Google have provided a consolidated overview of the similarities and differences between their implementations of REP features.
