
Category Value
Available via http://dbpubs.stanford.edu/pub/2002-10
Submitted on 20th of February 2002
Author Klein, Dan; Kamvar, Sepandar D.; Manning, Christopher D.
Title From Instance-level Constraints to Space-Level Constraints: Making the Most of Prior Knowledge in Data Clustering
Date of publication February 2002
Citation Dan Klein, Sepandar D. Kamvar, and Christopher D. Manning. From Instance-level Constraints to Space-Level Constraints: Making the Most of Prior Knowledge in Data Clustering, Stanford Technical Report
Number of pages 8
Language English
Project Natural Language Processing Group
Type Technical Report
Subject group Computer Science; Data Mining; Miscellaneous
Abstract We present an improved method for clustering in the presence of very limited supervisory information, given as pairwise instance constraints. By allowing instance-level constraints to have space-level inductive implications, we are able to successfully incorporate constraints for a wide range of data set types. Our method greatly improves on the previously studied constrained k-means algorithm, generally requiring less than half as many constraints to achieve a given accuracy on a range of real-world data, while also being more robust when over-constrained. We additionally discuss an active learning algorithm which increases the value of constraints even further.
Keywords clustering, constrained clustering, prior knowledge
Contact address klein@cs.stanford.edu
Fulltext source
  • Postscript (ps, ps.gz, ps.zip)
  • PDF (pdf, pdf.gz, pdf.zip)
  • Plain text (text, text.gz, text.zip)
  • Management of the document by siroker@db.stanford.edu



    Stanford InfoLab Publication Server