| Category | Value |
| Available via | http://dbpubs.stanford.edu/pub/2000-29 |
| Next version(s) | 2000-55 |
| Submitted on | 31st of July 2000 |
| Author | Melnik, Sergey; Raghavan, Sriram; Yang, Beverly; Garcia-Molina, Hector |
| Title | Building a Distributed Full-Text Index for the Web |
| Date of publication | July 2000 |
| Citation | Melnik, Sergey; Raghavan, Sriram; Yang, Beverly; Garcia-Molina, Hector. Building a Distributed Full-Text Index for the Web, |
| Number of pages | 23 |
| Language | English |
| Project | Digital Libraries |
| Type | Technical Report |
| Subject group | Databases and the Web; Digital Libraries |
| Abstract | We identify crucial design issues in building a distributed inverted index for a large collection of web pages. We introduce a novel pipelining technique for structuring the core index-building system that substantially reduces the index construction time. We also propose a storage scheme for creating and managing inverted files using an embedded database system. We propose and compare different strategies for addressing various issues relevant to distributed index construction. Finally, we present performance results from experiments on a testbed distributed indexing system that we have implemented. |
| Keywords | Full-text index, Web, WebBase, Text retrieval |
| Notes | Extended version of paper submitted to ICDE 2001 |
| Management of the document by | pubs@db.stanford.edu |