Junghoo Cho, Hector Garcia-Molina
Abstract: Crawlers vary dramatically because the underlying crawling applications have different requirements. For instance, personal crawlers expect fewer bandwidth resources than AltaVista's crawler, while image crawlers open far fewer socket connections per second than text crawlers because image objects are significantly larger than a typical HTML page. These different crawlers can be classified into a taxonomy.
Author's Comments: Rough thoughts on different kinds of crawlers.
Status: PRIVATE
Full text: SIDL-WP-1999-0113 (PDF)
Version | Format | Date | Comments |
---|---|---|---|
1 | PS | 9/22/1999 | Rough thoughts on different kinds of crawlers. |