Category Value
Available via http://dbpubs.stanford.edu/pub/2000-29
Next version(s) 2000-55
Submitted on 31st of July 2000
Author Melnik, Sergey; Raghavan, Sriram; Yang, Beverly; Garcia-Molina, Hector
Title Building a Distributed Full-Text Index for the Web
Date of publication July 2000
Citation Melnik, Sergey; Raghavan, Sriram; Yang, Beverly; Garcia-Molina, Hector. Building a Distributed Full-Text Index for the Web.
Number of pages 23
Language English
Project Digital Libraries
Type Technical Report
Subject group Databases and the Web; Digital Libraries
Abstract We identify crucial design issues in building a distributed inverted index for a large collection of web pages. We introduce a novel pipelining technique for structuring the core index-building system that substantially reduces the index construction time. We also propose a storage scheme for creating and managing inverted files using an embedded database system. We propose and compare different strategies for addressing various issues relevant to distributed index construction. Finally, we present performance results from experiments on a testbed distributed indexing system that we have implemented.
Keywords Full-text index, Web, WebBase, Text retrieval
Notes Extended version of paper submitted to ICDE 2001
Fulltext source
  • Postscript (ps, ps.gz, ps.zip)
  • PDF (pdf, pdf.gz, pdf.zip)
  • Plain text (text, text.gz, text.zip)
  • Management of the document by pubs@db.stanford.edu

    Stanford InfoLab Publication Server