Pagewise preview ]

CategoryValue
Available viahttp://dbpubs.stanford.edu/pub/2002-30
Submitted on 12th of June 2002
Author Sriram Raghavan and Hector Garcia-Molina
Title Representing Web Graphs
Date of publication June 2002
Citation Sriram Raghavan and Hector Garcia-Molina. Representing Web Graphs,
Number of pages 27
Language English
Project Digital Libraries
Type Technical Report
Subject group Databases and the Web
Abstract A Web repository is a large special-purpose collection of Web pages and associated indexes. Many useful queries and computations over such repositories involve traversal and navigation of the Web graph. However, efficient traversal of huge Web graphs containing several hundred million vertices and a few billion edges is a challenging problem. This is further complicated by the lack of a schema to describe the structure of Web graphs. As a result, naive graph representation schemes can significantly increase query execution time and limit the usefulness of Web repositories. In this paper, we propose a novel representation for Web graphs, called an \textit{S-Node representation}. We demonstrate that S-Node representations are highly space-efficient, enabling in-memory processing of very large Web graphs. In addition, we present detailed experiments that show that by exploiting some empirically observed properties of Web graphs, S-Node representations significantly reduce query execution times when compared with other schemes for representing Web graphs.
Keywords S-Node representation, Web graphs, Web repositories, graph traversal, WebBase, navigation queries
Fulltext source
  • Postscript (ps, ps.gz, ps.zip)
  • PDF (pdf, pdf.gz, pdf.zip)
  • Plain text (text, text.gz, text.zip)
  • Management of the document bypubs@db.stanford.edu

    Pagewise preview ]


    Stanford InfoLab Publication Server