| Category | Value |
| Available via | http://dbpubs.stanford.edu/pub/2002-10 |
| Submitted on | 20th of February 2002 |
| Author | Klein, Dan; Kamvar, Sepandar D.; Manning, Christopher D. |
| Title | From Instance-level Constraints to Space-Level Constraints: Making the Most of Prior Knowledge in Data Clustering |
| Date of publication | February 2002 |
| Citation | Dan Klein, Sepandar D. Kamvar, and Christopher D. Manning. From Instance-level Constraints to Space-Level Constraints: Making the Most of Prior Knowledge in Data Clustering, Stanford Technical Report |
| Number of pages | 8 |
| Language | English |
| Project | Natural Language Processing Group |
| Type | Technical Report |
| Subject group | Computer Science; Data Mining; Miscellaneous |
| Abstract | We present an improved method for clustering in the presence of very limited supervisory information, given as pairwise instance constraints. By allowing instance-level constraints to have space-level inductive implications, we are able to successfully incorporate constraints for a wide range of data set types. Our method greatly improves on the previously studied constrained k-means algorithm, generally requiring less than half as many constraints to achieve a given accuracy on a range of real-world data, while also being more robust when over-constrained. We additionally discuss an active learning algorithm which increases the value of constraints even further. |
| Keywords | clustering, constrained clustering, prior knowledge |
| Contact address | klein@cs.stanford.edu |
| Fulltext source | |
| Management of the document by | siroker@db.stanford.edu |