Tuesday, April 22, 2008


While Google experiments with crawling hidden web pages by indexing content behind HTML forms, Yahoo has updated its search crawler to Slurp 3.0. The update should have little impact on most webmasters, but here are the changes Slurp 3.0 brings to the way it crawls websites.

First, Slurp 3.0 will crawl from a smaller set of IP addresses, although still within the crawl.yahoo.net domain. Reverse DNS checks will continue to work. For webmasters who identify Yahoo crawlers by IP address, Yahoo advises moving to reverse DNS-based identification of Yahoo! Slurp to avoid being dropped from Slurp 3.0 crawls.
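Reverse DNS-based identification could look roughly like the sketch below. It is only an illustration, not Yahoo's published procedure: the function name is_yahoo_slurp and its injectable lookup parameters are made up for this example, and it assumes genuine crawler hosts resolve to names ending in .crawl.yahoo.net. The forward lookup confirms that the hostname really maps back to the same IP, since a reverse DNS record alone can be forged.

```python
import socket

# Assumption: genuine Slurp hosts have names under this domain.
YAHOO_CRAWL_SUFFIX = ".crawl.yahoo.net"


def is_yahoo_slurp(ip, reverse_lookup=None, forward_lookup=None):
    """Return True if `ip` passes a reverse + forward DNS check.

    The two lookup parameters are hypothetical hooks so the check can be
    exercised without network access; by default they use the system resolver.
    """
    if reverse_lookup is None:
        reverse_lookup = lambda addr: socket.gethostbyaddr(addr)[0]
    if forward_lookup is None:
        forward_lookup = lambda host: socket.gethostbyname_ex(host)[2]

    # Step 1: reverse lookup, IP address -> hostname.
    try:
        hostname = reverse_lookup(ip)
    except OSError:  # covers socket.herror and socket.gaierror
        return False

    # Step 2: the hostname must sit inside the crawler domain.
    if not hostname.lower().rstrip(".").endswith(YAHOO_CRAWL_SUFFIX):
        return False

    # Step 3: forward lookup, hostname -> IPs, must include the original IP.
    try:
        return ip in forward_lookup(hostname)
    except OSError:
        return False
```

In a request handler, this would be called with the visitor's remote address whenever the User-agent claims to be Slurp, ideally with the result cached so each request does not trigger two DNS queries.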

Second, Yahoo! Slurp 3.0 will announce itself with a new user-agent: “Yahoo! Slurp 3.0”. Existing robots.txt directives for “Slurp” or “Yahoo! Slurp” will continue to work, but directives for “Slurp 2.0” will no longer apply. Yahoo therefore suggests that webmasters use the shorter form of the User-agent, which is simply “Slurp”.
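The effect of the shorter name can be illustrated with Python's standard urllib.robotparser module. This is only a sketch: the parser's substring matching is Python's own behaviour and merely approximates how a crawler might match names, the /private/ path is made up, and the helper name blocks_slurp_3 is hypothetical. It shows why a rule keyed to a version number silently stops applying once the version changes.

```python
from urllib.robotparser import RobotFileParser


def blocks_slurp_3(robots_txt):
    """Return True if the given robots.txt text keeps a crawler announcing
    itself as "Yahoo! Slurp 3.0" out of the (made-up) /private/ path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch("Yahoo! Slurp 3.0", "/private/page.html")


# Version-specific rule: written for the old crawler name.
old_rules = "User-agent: Slurp 2.0\nDisallow: /private/\n"

# Short form suggested by Yahoo: not tied to any version.
new_rules = "User-agent: Slurp\nDisallow: /private/\n"

print(blocks_slurp_3(old_rules))  # the versioned rule no longer matches
print(blocks_slurp_3(new_rules))  # the short name still matches
```

A robots.txt file that already uses “Slurp” needs no change for the new crawler.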
