
Category Value
Available via http://dbpubs.stanford.edu/pub/1999-67
Submitted on 31st of October 2001
Author Fang, Min; Shivakumar, Narayanan; Garcia-Molina, Hector; Motwani, Rajeev; Ullman, Jeffrey D.
Title Computing Iceberg Queries Efficiently.
Date of publication 11th of November 1999
Published in International Conference on Very Large Databases (VLDB'98), New York, August 1998
Citation Fang, Min; Shivakumar, Narayanan; Garcia-Molina, Hector; Motwani, Rajeev; Ullman, Jeffrey D. Computing Iceberg Queries Efficiently. International Conference on Very Large Databases (VLDB'98), New York, August 1998
Number of pages 25
Language English
Project Digital Libraries
Type Conference or Journal Paper
Subject group Digital Libraries
Abstract Many applications compute aggregate functions (such as COUNT, SUM) over an attribute (or set of attributes) to find aggregate values above some specified threshold. We call such queries iceberg queries because the number of above-threshold results is often very small (the tip of an iceberg), relative to the large amount of input data (the iceberg). Such iceberg queries are common in many applications, including data warehousing, information-retrieval, market basket analysis in data mining, clustering and copy detection. We propose efficient algorithms to evaluate iceberg queries using very little memory and significantly fewer passes over data, as compared to current techniques that use sorting or hashing. We present an experimental case study using over three gigabytes of Web data to illustrate the savings obtained by our algorithms.
Notes Previous number = SIDL-WP-1999-0121
Fulltext source
  • Postscript (ps, ps.gz, ps.zip)
  • PDF (pdf, pdf.gz, pdf.zip)
  • Plain text (text, text.gz, text.zip)
  • Management of the document by pubs@db.stanford.edu
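
To make the abstract's definition concrete, the following is a minimal sketch of an iceberg query over COUNT: return only the keys whose count meets a threshold. It is an illustration, not the authors' algorithms. The function names, the `num_buckets` parameter, and the sample data are assumptions introduced here. The first function is the naive hashing baseline, which keeps one exact counter per distinct key. The second is a generic two-pass coarse-counting filter that spends a fixed amount of memory on hash buckets in the first pass and counts exactly only the keys that fall in heavy buckets.

```python
from collections import Counter


def iceberg_count(rows, threshold):
    """Naive baseline: one exact counter per distinct key."""
    counts = Counter(rows)
    return {key: n for key, n in counts.items() if n >= threshold}


def iceberg_count_two_pass(rows, threshold, num_buckets=1024):
    """Generic two-pass sketch (hypothetical, not the paper's method).

    `rows` must be re-iterable (for example a list), because it is
    scanned twice.
    """
    # Pass 1: coarse counts, one integer per hash bucket.
    buckets = [0] * num_buckets
    for key in rows:
        buckets[hash(key) % num_buckets] += 1

    # Pass 2: exact counts only for keys that hash to a heavy bucket.
    # A key with count >= threshold always lands in a heavy bucket,
    # so this filter produces no false negatives.
    exact = Counter(
        key for key in rows
        if buckets[hash(key) % num_buckets] >= threshold
    )
    # Light keys that shared a heavy bucket are dropped here.
    return {key: n for key, n in exact.items() if n >= threshold}


if __name__ == "__main__":
    sample = ["a"] * 5 + ["b"] * 2 + ["c"] * 7 + ["d"]
    print(iceberg_count(sample, threshold=5))
    print(iceberg_count_two_pass(sample, threshold=5, num_buckets=4))
```

Both functions return the same result for the same input. The second trades an extra pass over the data for a first-pass memory footprint that does not depend on the number of distinct keys.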

    Stanford InfoLab Publication Server