edu.stanford.nlp.cluster
Interface Cluster

All Superinterfaces:
Cloneable
All Known Implementing Classes:
SimpleCluster

public interface Cluster
extends Cloneable

Data Structure for Cluster.


Method Summary
 void clearData()
          Sets the value for all data in cluster to 0
 Object clone()
          Returns deep copy of Cluster
 boolean equals(Object o)
          Implements equality test.
 double evaluateIntraSimilarity(Matrix m, Array mean)
          Evaluates the cohesiveness of Cluster. A low value reflects a cohesive cluster, and a high value reflects a scattered cluster.
 Array evaluateWeightedMean(Matrix m)
          Evaluates weighted mean of columns in Matrix m.
 Array get_pr_d_z()
          Returns entire probability distribution P(d|z)
 double get_pr_d_z(int datum_index)
          Returns P(d|z) for d=datum_index
 Array get_pr_w_z()
          Returns entire probability distribution P(w|z)
 double get_pr_w_z(int feature_index)
          Returns P(w|z) for w=feature_index
 double get_pr_z()
          Returns P(z)
 int getIndex()
          Returns cluster's index
 double getIntraSimilarity()
          Returns scatter value of Cluster, if it has already been computed
 Array getMean()
          Returns weighted mean of Cluster, if it has already been computed
 void set_pr_d_z(Array prdz)
          Sets entire P(d|z) array to prdz
 void set_pr_d_z(int datum_index, double value)
          Sets P(d|z) for d=datum_index to value. P(d|z) is the probability of datum d, given this cluster.
 void set_pr_w_z(Array prwz)
          Sets entire P(w|z) array to prwz
 void set_pr_w_z(int feature_index, double value)
          Sets P(w|z) for w=feature_index to value. P(w|z) is the probability of feature w, given this cluster.
 void set_pr_z(double value)
          Sets P(z) to value. P(z) is the cluster probability, or cluster weight.
 void setIndex(int i)
          Sets cluster index to i
 String toString(DataCollection dbm)
          Returns String representation of Cluster. Prints the default number of features and data.
 String toString(int tnf, int tnd, DataCollection dbm)
          Returns String representation of Cluster
 String toXMLString(DataCollection dbm)
          Returns XML String representation of Cluster. Prints the default number of features and data.
 String toXMLString(int tnf, int tnd, DataCollection dbm)
          Returns XML String representation of Cluster
 

Method Detail

set_pr_z

public void set_pr_z(double value)
Sets P(z) to value. P(z) is the cluster probability, or cluster weight.


set_pr_w_z

public void set_pr_w_z(int feature_index,
                       double value)
Sets P(w|z) for w=feature_index to value. P(w|z) is the probability of feature w, given this cluster.


set_pr_w_z

public void set_pr_w_z(Array prwz)
Sets entire P(w|z) array to prwz


set_pr_d_z

public void set_pr_d_z(int datum_index,
                       double value)
Sets P(d|z) for d=datum_index to value. P(d|z) is the probability of datum d, given this cluster.


set_pr_d_z

public void set_pr_d_z(Array prdz)
Sets entire P(d|z) array to prdz


setIndex

public void setIndex(int i)
Sets cluster index to i


get_pr_z

public double get_pr_z()
Returns P(z)


get_pr_w_z

public double get_pr_w_z(int feature_index)
Returns P(w|z) for w=feature_index


get_pr_w_z

public Array get_pr_w_z()
Returns entire probability distribution P(w|z)


get_pr_d_z

public double get_pr_d_z(int datum_index)
Returns P(d|z) for d=datum_index


get_pr_d_z

public Array get_pr_d_z()
Returns entire probability distribution P(d|z)


getIndex

public int getIndex()
Returns cluster's index


toXMLString

public String toXMLString(int tnf,
                          int tnd,
                          DataCollection dbm)
Returns XML String representation of Cluster

Parameters:
tnf - top n features; prints the tnf features with the largest probabilities
tnd - top n data; prints the tnd data items with the largest probabilities
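
The selection implied by tnf and tnd can be illustrated with a minimal sketch that picks the indices of the n largest entries of a probability array. This is an assumption about the behavior, not the library's implementation: the class TopNSketch and the method topIndices are hypothetical names, and a plain double[] stands in for the Array type, whose API is not shown on this page.

```java
import java.util.Arrays;

final class TopNSketch {
    /**
     * Returns the indices of the n largest values in probs, largest first.
     * Ties keep the lower index first because Arrays.sort on objects is stable.
     * If n exceeds probs.length, all indices are returned.
     */
    static int[] topIndices(double[] probs, int n) {
        Integer[] order = new Integer[probs.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        // Sort indices by descending probability.
        Arrays.sort(order, (a, b) -> Double.compare(probs[b], probs[a]));

        int k = Math.min(Math.max(n, 0), order.length);
        int[] top = new int[k];
        for (int i = 0; i < k; i++) {
            top[i] = order[i];
        }
        return top;
    }
}
```

Calling topIndices with the P(w|z) values and tnf would give the feature indices to print; the same call with the P(d|z) values and tnd would give the datum indices.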

toXMLString

public String toXMLString(DataCollection dbm)
Returns XML String representation of Cluster. Prints the default number of features and data (normally 20).


toString

public String toString(int tnf,
                       int tnd,
                       DataCollection dbm)
Returns String representation of Cluster

Parameters:
tnf - top n features; prints the tnf features with the largest probabilities
tnd - top n data; prints the tnd data items with the largest probabilities

toString

public String toString(DataCollection dbm)
Returns String representation of Cluster. Prints the default number of features and data (normally 20).


clone

public Object clone()
Returns deep copy of Cluster

Overrides:
clone in class Object
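
What "deep copy" means for a class holding probability arrays can be shown with a minimal sketch. DeepCopySketch and its fields are hypothetical stand-ins, not the SimpleCluster implementation; the point is only that the copy gets its own array, so later changes to the copy do not affect the original.

```java
final class DeepCopySketch implements Cloneable {
    private double prZ;     // stands in for P(z)
    private double[] prWZ;  // stands in for P(w|z)

    DeepCopySketch(double prZ, double[] prWZ) {
        this.prZ = prZ;
        this.prWZ = prWZ.clone();
    }

    double getPrWZ(int featureIndex) {
        return prWZ[featureIndex];
    }

    void setPrWZ(int featureIndex, double value) {
        prWZ[featureIndex] = value;
    }

    @Override
    public DeepCopySketch clone() {
        try {
            DeepCopySketch copy = (DeepCopySketch) super.clone();
            // super.clone() copies only the array reference; duplicate the array itself.
            copy.prWZ = prWZ.clone();
            return copy;
        } catch (CloneNotSupportedException e) {
            // Cannot happen: this class implements Cloneable.
            throw new AssertionError(e);
        }
    }
}
```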

equals

public boolean equals(Object o)
Implements equality test. Returns true if o==this, and false if o!=this.

Overrides:
equals in class Object
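
Because equality is defined as identity (o==this), two clusters with identical contents, including a cluster and its clone, do not compare equal. A minimal sketch of that contract, using a hypothetical class IdentityEqualsSketch rather than any class from this package:

```java
final class IdentityEqualsSketch {
    private final int index;

    IdentityEqualsSketch(int index) {
        this.index = index;
    }

    int getIndex() {
        return index;
    }

    @Override
    public boolean equals(Object o) {
        // Identity test: true only for the very same object.
        return o == this;
    }

    @Override
    public int hashCode() {
        // Keep hashCode consistent with identity-based equals.
        return System.identityHashCode(this);
    }
}
```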

evaluateWeightedMean

public Array evaluateWeightedMean(Matrix m)
Evaluates weighted mean of columns in Matrix m. Weights columns by P(d|z)
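
The computation described above can be sketched as follows. This is an illustration under stated assumptions, not the library's code: plain arrays stand in for Matrix and Array (their APIs are not shown on this page), each column of m is taken to be one datum, and the result is divided by the sum of the weights, which changes nothing when P(d|z) already sums to 1. The names WeightedMeanSketch and weightedMean are hypothetical.

```java
final class WeightedMeanSketch {
    /**
     * Weighted mean of the columns of m, where column d is weighted by prDZ[d].
     * m is indexed as m[row][column]; the result has one entry per row.
     */
    static double[] weightedMean(double[][] m, double[] prDZ) {
        double[] mean = new double[m.length];

        double weightSum = 0.0;
        for (double w : prDZ) {
            weightSum += w;
        }
        if (weightSum == 0.0) {
            return mean; // all weights are zero: return the zero vector
        }

        for (int r = 0; r < m.length; r++) {
            double acc = 0.0;
            for (int d = 0; d < prDZ.length; d++) {
                acc += prDZ[d] * m[r][d];
            }
            mean[r] = acc / weightSum;
        }
        return mean;
    }
}
```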


evaluateIntraSimilarity

public double evaluateIntraSimilarity(Matrix m,
                                      Array mean)
Evaluates the cohesiveness of Cluster. A low value reflects a cohesive cluster, and a high value reflects a scattered cluster.
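
This page does not state the formula used. One measure with the described property (low for a cohesive cluster, high for a scattered one) is the P(d|z)-weighted sum of squared distances from each datum to the mean; the sketch below uses that measure purely as an assumption. Plain arrays stand in for Matrix and Array, columns of m are taken to be data items, and the names IntraSimilaritySketch and weightedScatter are hypothetical.

```java
final class IntraSimilaritySketch {
    /**
     * Sum over data items d of prDZ[d] * squared Euclidean distance
     * between column d of m and the given mean.
     * m is indexed as m[row][column]; mean has one entry per row.
     */
    static double weightedScatter(double[][] m, double[] mean, double[] prDZ) {
        double scatter = 0.0;
        for (int d = 0; d < prDZ.length; d++) {
            double squaredDistance = 0.0;
            for (int r = 0; r < mean.length; r++) {
                double diff = m[r][d] - mean[r];
                squaredDistance += diff * diff;
            }
            scatter += prDZ[d] * squaredDistance;
        }
        return scatter;
    }
}
```

Under this measure, a cluster whose data all coincide with the mean scores 0, and the score grows as data with non-negligible P(d|z) move away from the mean.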


getIntraSimilarity

public double getIntraSimilarity()
Returns scatter value of Cluster, if it has already been computed


getMean

public Array getMean()
Returns weighted mean of Cluster, if it has already been computed


clearData

public void clearData()
Sets the value for all data in cluster to 0



Stanford NLP Group