CategoryValue
Available viahttp://dbpubs.stanford.edu/pub/2004-13
Previous version2004-12
Submitted on 9th of March 2004
Author Srivastava, Utkarsh; Widom, Jennifer
Title Memory-Limited Execution of Windowed Stream Joins
Date of publication 5th of March 2004
Published in Technical Report
Citation Srivastava, Utkarsh; Widom, Jennifer. Memory-Limited Execution of Windowed Stream Joins, Technical Report
Number of pages 12
Language English
Project STREAM
Type Other
Subject group Data Streams
Abstract We address the problem of computing approximate answers to continuous sliding-window joins over data streams when the available memory may be insufficient to keep the entire join state. One approximation scenario is to provide a maximum subset of the result, with the objective of losing as few result tuples as possible. An alternative scenario is to provide a random sample of the join result, e.g., if the output of the join is being aggregated. We show formally that neither approximation can be addressed effectively for a sliding-window join of arbitrary input streams. Previous work has addressed only the maximum-subset problem, and has implicitly used a frequency-based model of stream arrival. We address the sampling problem for this model. More importantly, we point out a broad class of applications for which an age-based model of stream arrival is more appropriate, and we address both approximation scenarios under this new model. Finally, for the case of multiple joins being executed with an overall memory constraint, we provide an algorithm for memory allocation across the joins that optimizes a combined measure of approximation in all scenarios considered. All of our algorithms are implemented and experimental results demonstrate their effectiveness.
Keywords streams, joins, approximation, memory-limited, load-shedding
Sponsored by NSF grants IIS-0118173 and IIS-9817799 and Sequoia Capital Stanford Graduate Fellowship
Fulltext source
  • Postscript (ps, ps.gz, ps.zip)
  • PDF (pdf, pdf.gz, pdf.zip)
  • Management of the document bysiroker@db.stanford.edu


    Stanford InfoLab Publication Server