Tuesday, April 15, 2008

Yahoo! Slurp 3.0: Revised Crawler Being Rolled Out

Yahoo! has begun rolling out its revised crawler, Yahoo! Slurp 3.0, which will be visiting your websites. Yahoo! had been preparing this latest version of the Yahoo! Search crawler for quite some time.

Some infrastructure updates had caused variance in crawl behaviour.

The new Yahoo! Slurp 3.0 recognizes the same user-agent and all robots.txt directives for ‘Yahoo! Slurp,’ though it’ll identify itself as Slurp 3.0 in web logs.

A handful of webmasters may notice a couple of changes.

The Yahoo! Search Blog has posted the following notes:

As the new software undergoes a phased rollout to our production crawlers over the next several weeks, you’ll see the following changes:

a) The crawlers will start crawling from a different and much smaller set of IP addresses, but it’ll still be from the crawl.yahoo.net domain. Any reverse DNS checks to identify our crawler will continue to work. Please note that if you’re using IP-based recognition of our crawlers, you might see a drop in crawl/coverage from Yahoo! We strongly recommend that you move to reverse DNS-based identification of Yahoo! Slurp if you’re using any other method to avoid this problem. The current set of IPs will disappear from your web logs in the next several weeks.

b) The crawlers will also publish a new user-agent, ‘Yahoo! Slurp/3.0.’ Existing robots.txt directives for ‘Slurp’ or ‘Yahoo! Slurp’ will continue to work, but if you have directives specific to ‘Slurp/2.0,’ they won’t be recognized by the new crawler (though usage of the ‘Slurp/2.0’ user-agent is very rare on the web, so you won’t likely be affected). We recommend specifying the shorter version of: User-agent: Slurp. Check out “How do I prevent my site or certain subdirectories from being crawled?” on our Help page for more details.
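For point (a), here is a minimal Python sketch of what reverse DNS-based identification might look like. The function name is_yahoo_crawler and its two lookup parameters are hypothetical, made up for illustration. Only the crawl.yahoo.net domain comes from the notes above. The forward lookup that confirms the hostname maps back to the same IP is a common extra precaution, not something the Yahoo! notes ask for.

```python
import socket

# Taken from the quoted notes: crawler hosts live under crawl.yahoo.net.
CRAWLER_DOMAIN_SUFFIX = ".crawl.yahoo.net"


def is_yahoo_crawler(ip, reverse_lookup=None, forward_lookup=None):
    """Return True if `ip` reverse-resolves to a crawl.yahoo.net host
    and that host resolves forward to the same IP.

    The two lookup parameters are hypothetical hooks so the check can be
    tried without touching real DNS; by default the socket module is used.
    """
    if reverse_lookup is None:
        reverse_lookup = lambda addr: socket.gethostbyaddr(addr)[0]
    if forward_lookup is None:
        forward_lookup = lambda host: socket.gethostbyname_ex(host)[2]

    try:
        hostname = reverse_lookup(ip)
    except OSError:  # no PTR record or lookup failure
        return False

    # Require the hostname to END with the crawler domain, so that
    # something like "crawl.yahoo.net.example.com" does not pass.
    if not hostname.lower().rstrip(".").endswith(CRAWLER_DOMAIN_SUFFIX):
        return False

    try:
        addresses = forward_lookup(hostname)
    except OSError:
        return False

    # Assumed extra step: the hostname must map back to the visiting IP.
    return ip in addresses
```

In a log-processing script this could be called once per unfamiliar visitor IP whose user-agent claims to be Slurp, with the result cached, since DNS lookups are slow.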

Note that these changes affect only the main Yahoo! Web Search crawlers. Other, more specialized crawlers that behave similarly will not be affected.
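Point (b) in the quoted notes can be tried out locally. The sketch below uses Python's standard urllib.robotparser module, which only approximates how a crawler matches user-agent tokens and is not Yahoo!'s own matching code. The helper name allowed, the example.com URL, and the /private/ path are made up for illustration.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt, user_agent, url):
    """Parse robots.txt text and report whether user_agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Recommended short form: addresses any Slurp version.
SHORT_FORM = """\
User-agent: Slurp
Disallow: /private/
"""

# Version-specific form that the notes say the new crawler will not recognize.
VERSIONED_FORM = """\
User-agent: Slurp/2.0
Disallow: /private/
"""

NEW_AGENT = "Yahoo! Slurp/3.0"  # user-agent token named in the notes
URL = "http://www.example.com/private/report.html"  # hypothetical URL

print(allowed(SHORT_FORM, NEW_AGENT, URL))
print(allowed(VERSIONED_FORM, NEW_AGENT, URL))
```

With Python's parser, the short form should block the new agent from /private/ while the Slurp/2.0 block is ignored, which mirrors the behaviour described in the notes and shows why the shorter User-agent: Slurp line is the safer choice.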


homeforprofits.com
