Site-based crawl
Look up DNS only once per server
Retrieve robots.txt once per server
Exploit the locality of link structure
Parallelizable
Easy to enforce site-based policy
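
The points above can be illustrated with a minimal sketch of a site-based frontier, assuming URLs are grouped by hostname. All names here (`SiteQueue`, `SiteBasedFrontier`, `resolve`, `fetch_robots`, the `example-crawler` user agent) are hypothetical and not taken from the slides; `resolve` and `fetch_robots` are caller-supplied functions standing in for real DNS and HTTP code.

```python
from collections import deque
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser


class SiteQueue:
    """Pending URLs for one site, plus state obtained once per server."""

    def __init__(self, host, address, robots):
        self.host = host
        self.address = address  # DNS result, reused for every URL on this host
        self.robots = robots    # parsed robots.txt, reused likewise
        self.urls = deque()


class SiteBasedFrontier:
    """Groups URLs by host so per-server work is done only once (a sketch)."""

    def __init__(self, resolve, fetch_robots, user_agent="example-crawler"):
        self.resolve = resolve            # assumed: host -> IP address string
        self.fetch_robots = fetch_robots  # assumed: host -> robots.txt text
        self.user_agent = user_agent
        self.sites = {}                   # host -> SiteQueue

    def add(self, url):
        """Queue url under its site; return False if it is rejected."""
        host = urlsplit(url).hostname
        if host is None:
            return False
        site = self.sites.get(host)
        if site is None:
            # First URL seen for this server: do the DNS lookup and the
            # robots.txt retrieval now, then cache both on the SiteQueue.
            robots = RobotFileParser()
            robots.parse(self.fetch_robots(host).splitlines())
            site = SiteQueue(host, self.resolve(host), robots)
            self.sites[host] = site
        # Site-based policy is enforced in one place, per site.
        if not site.robots.can_fetch(self.user_agent, url):
            return False
        site.urls.append(url)
        return True
```

Because each `SiteQueue` holds everything needed for its host, separate queues could be handed to separate workers, and links discovered on a page tend to land back in the same queue.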