
Category Value
Available via http://dbpubs.stanford.edu/pub/2000-37
Submitted on 15th of December 2000
Author Arasu, Arvind; Cho, Junghoo; Garcia-Molina, Hector; Paepcke, Andreas; Raghavan, Sriram
Title Searching the Web
Date of publication 2000
Number of pages 42
Language English
Project Digital Libraries; Miscellaneous
Type Technical Report
Subject group Databases and the Web
Abstract We offer an overview of current Web search engine design. After introducing a generic search engine architecture, we examine each engine component in turn. We cover crawling, local Web page storage, indexing, and the use of link analysis for boosting search performance. The most common design and implementation techniques for each of these components are presented. We draw for this presentation from the literature, and from our own experimental search engine testbed. Emphasis is on introducing the fundamental concepts, and the results of several performance analyses we conducted to compare different designs.
Keywords Search engine, crawling, indexing, link analysis, PageRank, HITS, hubs, authorities, information retrieval
Fulltext source
  • Postscript (ps, ps.gz, ps.zip)
  • PDF (pdf, pdf.gz, pdf.zip)
  • Plain text (text, text.gz, text.zip)
  • Management of the document by pubs@db.stanford.edu



    Stanford InfoLab Publication Server