| Category | Value | ||
| Available via | http://dbpubs.stanford.edu/pub/2002-11 | ||
| Submitted on | 20th of February 2002 | ||
| Author | Kamvar, Sepandar D.; Klein, Dan; Manning, Christopher D. | ||
| Title | Interpreting and Extending Classical Agglomerative Clustering Algorithms using a Model-Based Approach | ||
| Date of publication | February 2002 | ||
| Citation | Kamvar, Sepandar D.; Klein, Dan; Manning, Christopher D. Interpreting and Extending Classical Agglomerative Clustering Algorithms using a Model-Based Approach, Stanford Technical Report, 2002. | ||
| Number of pages | 8 | ||
| Language | English | ||
| Project | Natural Language Processing Group | ||
| Type | Technical Report | ||
| Subject group | Computer Science; Data Mining; Miscellaneous | ||
| Abstract | We present two results which arise from a model-based approach to hierarchical agglomerative clustering. First, we show formally that the common heuristic agglomerative clustering algorithms -- single-link, complete-link, group-average, and Ward's method -- are each equivalent to a hierarchical model-based method. This interpretation gives a theoretical explanation of the empirical behavior of these algorithms, as well as a principled approach to resolving practical issues, such as number of clusters or the choice of method. Second, we show how a model-based approach can be used to extend these basic agglomerative algorithms. We introduce adjusted complete-link, Mahalanobis-link, and line-link as variants of the classical agglomerative methods, and demonstrate their utility. | ||
| Keywords | clustering, probabilistic models, model-based clustering, hierarchical clustering | ||
| Contact address | klein@cs.stanford.edu | ||
| Fulltext source |
| Management of the document by | siroker@db.stanford.edu