edu.stanford.nlp.cluster
Class HACM

java.lang.Object
  |
  +--edu.stanford.nlp.cluster.AbstractClusteringMethod
        |
        +--edu.stanford.nlp.cluster.HACM
All Implemented Interfaces:
ClusteringMethod

public class HACM
extends AbstractClusteringMethod

Hierarchical Agglomerative Clustering Method


Field Summary
 
Fields inherited from class edu.stanford.nlp.cluster.AbstractClusteringMethod
clusters, db, method, nc, nd, nt
 
Constructor Summary
HACM(Double4Function type)
          Sets values for db, nt, nd, csf.
HACM(cern.colt.function.DoubleDoubleFunction type)
          Sets values for db, nt, nd, csf.
 
Method Summary
 Entry closest(Matrix m)
          Iterates through the similarity matrix to find the clusters with highest similarity.
 SimpleClusters cluster(DataCollection data, int num_clusters)
          Clusters documents into the desired number of clusters by Hierarchical Agglomerative Clustering.
 Clusters cluster(int num_clusters, Double4Function complicatedcsf)
          Clusters using a complicated cluster similarity function, such as group-average.
 Clusters cluster(int num_clusters, cern.colt.function.DoubleDoubleFunction simplecsf)
          Clusters using a simple cluster similarity function, such as single-link or complete-link.
 void finish()
          Finalizes clusters by setting P(w|z) to be the weighted mean of the data vectors in the cluster.
 
Methods inherited from class edu.stanford.nlp.cluster.AbstractClusteringMethod
cluster, evaluate, evaluate, initialize, toString, toXMLString
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

HACM

public HACM(cern.colt.function.DoubleDoubleFunction type)
Sets values for db, nt, nd, csf.

Parameters:
type - Cluster Similarity Function (e.g. single-link, complete-link, group-average)

HACM

public HACM(Double4Function type)
Sets values for db, nt, nd, csf.

Parameters:
type - Cluster Similarity Function (e.g. single-link, complete-link, group-average)

Method Detail

closest

public Entry closest(Matrix m)
Iterates through the similarity matrix to find the pair of clusters with the highest similarity. Returns an Entry (i, j, value), where i and j are the two closest clusters and value is the similarity assigned to that pair by the similarity metric.
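
The scan described above can be sketched as follows. This is a minimal illustration, not the class's actual implementation: it assumes a symmetric similarity matrix stored as a plain double[][] rather than the Matrix type, and the Entry class shown here is a hypothetical stand-in for the (i, j, value) triple.

```java
public class ClosestPairSketch {

    /** Hypothetical stand-in for the (i, j, value) triple described above. */
    public static final class Entry {
        public final int i;
        public final int j;
        public final double value;

        public Entry(int i, int j, double value) {
            this.i = i;
            this.j = j;
            this.value = value;
        }
    }

    /**
     * Scans the upper triangle of a symmetric similarity matrix and returns
     * the pair with the highest similarity. The diagonal is skipped because
     * a cluster is never merged with itself.
     */
    public static Entry closest(double[][] sim) {
        int bestI = -1;
        int bestJ = -1;
        double best = Double.NEGATIVE_INFINITY;
        for (int i = 0; i < sim.length; i++) {
            for (int j = i + 1; j < sim.length; j++) {
                if (sim[i][j] > best) {
                    best = sim[i][j];
                    bestI = i;
                    bestJ = j;
                }
            }
        }
        return new Entry(bestI, bestJ, best);
    }
}
```

With fewer than two clusters no pair exists, and this sketch returns indices of -1.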


cluster

public SimpleClusters cluster(DataCollection data,
                              int num_clusters)
Clusters documents into the desired number of clusters by Hierarchical Agglomerative Clustering.

Parameters:
data - the collection of documents to cluster
num_clusters - number of final desired clusters
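
For orientation, the general agglomerative procedure can be sketched as below. This is an illustration under stated assumptions, not this class's code: it works on a precomputed, finite, symmetric similarity matrix (double[][]) instead of a DataCollection, returns clusters as lists of document indices instead of SimpleClusters, and uses java.util.function.DoubleBinaryOperator as a stand-in for the cluster similarity function. The names HacSketch, members, and active are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.DoubleBinaryOperator;

public class HacSketch {

    /**
     * Starts with one cluster per document and repeatedly merges the most
     * similar pair until only numClusters clusters remain. The csf argument
     * combines sim(A, K) and sim(B, K) into sim(A union B, K).
     */
    public static List<List<Integer>> cluster(double[][] similarity, int numClusters,
                                              DoubleBinaryOperator csf) {
        int n = similarity.length;
        int target = Math.max(1, numClusters);
        double[][] sim = new double[n][];
        List<List<Integer>> members = new ArrayList<>();
        boolean[] active = new boolean[n];
        for (int i = 0; i < n; i++) {
            sim[i] = similarity[i].clone(); // work on a copy of the input
            List<Integer> singleton = new ArrayList<>();
            singleton.add(i);
            members.add(singleton);
            active[i] = true;
        }

        int remaining = n;
        while (remaining > target) {
            // Find the most similar pair among the clusters still active.
            int bestI = -1;
            int bestJ = -1;
            double best = Double.NEGATIVE_INFINITY;
            for (int i = 0; i < n; i++) {
                if (!active[i]) continue;
                for (int j = i + 1; j < n; j++) {
                    if (active[j] && sim[i][j] > best) {
                        best = sim[i][j];
                        bestI = i;
                        bestJ = j;
                    }
                }
            }
            // Merge cluster bestJ into bestI and retire bestJ.
            members.get(bestI).addAll(members.get(bestJ));
            active[bestJ] = false;
            // Recompute the merged cluster's similarity to every other cluster.
            for (int k = 0; k < n; k++) {
                if (active[k] && k != bestI) {
                    double merged = csf.applyAsDouble(sim[bestI][k], sim[bestJ][k]);
                    sim[bestI][k] = merged;
                    sim[k][bestI] = merged;
                }
            }
            remaining--;
        }

        List<List<Integer>> result = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            if (active[i]) result.add(members.get(i));
        }
        return result;
    }
}
```

Calling HacSketch.cluster(sim, 2, Math::max) would run the loop with a single-link style update. The pairwise scan inside the loop makes this sketch cubic in the number of documents, which is acceptable for illustration but not for large collections.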

cluster

public Clusters cluster(int num_clusters,
                        cern.colt.function.DoubleDoubleFunction simplecsf)
Clusters using a simple cluster similarity function, such as single-link or complete-link.
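
As a sketch of what a two-argument cluster similarity function might look like: when two clusters A and B are merged, the similarity of the merged cluster to a third cluster K can be derived from sim(A, K) and sim(B, K) alone. Because this class works with similarities rather than distances, single-link corresponds to taking the maximum and complete-link to taking the minimum. The example uses java.util.function.DoubleBinaryOperator as a stand-in for cern.colt.function.DoubleDoubleFunction; the names SimpleCsfSketch, SINGLE_LINK, and COMPLETE_LINK are hypothetical.

```java
import java.util.function.DoubleBinaryOperator;

public class SimpleCsfSketch {

    /** Single-link: the merged cluster is as similar to K as its closest part. */
    public static final DoubleBinaryOperator SINGLE_LINK = Math::max;

    /** Complete-link: the merged cluster is only as similar to K as its farthest part. */
    public static final DoubleBinaryOperator COMPLETE_LINK = Math::min;

    /** Combines sim(A, K) and sim(B, K) into sim(A union B, K). */
    public static double mergedSimilarity(DoubleBinaryOperator csf, double simAK, double simBK) {
        return csf.applyAsDouble(simAK, simBK);
    }
}
```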


cluster

public Clusters cluster(int num_clusters,
                        Double4Function complicatedcsf)
Clusters using a complicated cluster similarity function, such as group-average.
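
A group-average style update needs more than the two pairwise similarities, because each merged part has to be weighted by its size. The sketch below shows one common form, the size-weighted average used in UPGMA-style clustering. The signature of Double4Function is not shown on this page, so the four-argument interface FourDoubleFunction and its argument order (sim(A, K), sim(B, K), size of A, size of B) are assumptions made purely for illustration.

```java
public class GroupAverageSketch {

    /** Hypothetical four-argument function; not the library's Double4Function. */
    @FunctionalInterface
    public interface FourDoubleFunction {
        double apply(double simAK, double simBK, double sizeA, double sizeB);
    }

    /**
     * Size-weighted average: sim(A union B, K) is the mean of sim(A, K) and
     * sim(B, K), weighted by the number of members in A and in B.
     */
    public static final FourDoubleFunction GROUP_AVERAGE =
        (simAK, simBK, sizeA, sizeB) -> (sizeA * simAK + sizeB * simBK) / (sizeA + sizeB);
}
```

When both clusters have the same size, the update reduces to the plain mean of the two similarities.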


finish

public void finish()
Finalizes clusters by setting P(w|z) to be the weighted mean of the data vectors in the cluster.
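
The weighted mean itself can be sketched as below. This page does not say which weights the class uses or how the data vectors are stored, so the double[][] layout (one row per data vector, one column per term) and the weights array are assumptions for illustration only, as is the class name ClusterMeanSketch.

```java
public class ClusterMeanSketch {

    /**
     * Returns the weighted mean of the given vectors: each component is
     * sum(weights[d] * vectors[d][w]) divided by the sum of the weights.
     * Assumes at least one vector and a positive total weight.
     */
    public static double[] weightedMean(double[][] vectors, double[] weights) {
        int dims = vectors[0].length;
        double[] mean = new double[dims];
        double totalWeight = 0.0;
        for (int d = 0; d < vectors.length; d++) {
            totalWeight += weights[d];
            for (int w = 0; w < dims; w++) {
                mean[w] += weights[d] * vectors[d][w];
            }
        }
        for (int w = 0; w < dims; w++) {
            mean[w] /= totalWeight;
        }
        return mean;
    }
}
```

If every input vector is itself a probability distribution over terms, the weighted mean is one as well, since its components are non-negative and sum to 1.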



Stanford NLP Group