Pagewise preview ]

CategoryValue
Available viahttp://dbpubs.stanford.edu/pub/2005-38
Submitted on 3rd of December 2005
Author Das Sarma, Anish; U. Nabar, Shubha; Widom, Jennifer
Title Representing Uncertain Data: Uniqueness, Equivalence, Minimization, and Approximation
Date of publication 2005
Published in Technical Report
Citation Das Sarma, Anish; U. Nabar, Shubha; Widom, Jennifer. Representing Uncertain Data: Uniqueness, Equivalence, Minimization, and Approximation, Technical Report
Number of pages 15
Language English
Project Stanford InfoLab
Type Technical Report
Subject group Miscellaneous
Abstract Many schemes have been proposed for representing uncertain data, where an uncertain database is defined as a set of {\em possible instances} for the database. Some important properties of representation schemes, such as {\em completeness} and {\em closure}, have already been considered. In this paper we identify several other interesting properties and obtain results for them with respect to a variety of different representation schemes. We first consider the {\em uniqueness} of representations, i.e., whether there is at most one representation for a set of possible instances in a particular representation scheme. For schemes that do not guarantee unique representations, we provide algorithms and complexity results for testing {\em equivalence} of representations. We also consider the problem of {\em minimizing} the size of the representation for a set of possible instances, obtaining a strong result for uncertainty schemes comprised of tuples and constraints: minimizing the number of tuples also minimizes the size of constraints. We show the NP-hardness of minimizing a representation, and study the more restricted problem of maintaining minimality when performing operations on the data. Next, we present several results on the problem of {\em approximating} an uncertain database when we wish to use a simple (incomplete) representation scheme. Finally, we give an interesting result relating closure and completeness for uncertain databases.
Keywords Uncertainty in databases, data modelling, representation schemes
Contact address anish@cs.stanford.edu
Fulltext source
  • Postscript (ps, ps.gz, ps.zip)
  • PDF (pdf, pdf.gz, pdf.zip)
  • Plain text (text, text.gz, text.zip)
  • Management of the document bysiroker@db.stanford.edu

    Pagewise preview ]


    Stanford InfoLab Publication Server