edu.stanford.nlp.cluster
Class SimpleCluster

java.lang.Object
  |
  +--edu.stanford.nlp.cluster.SimpleCluster
All Implemented Interfaces:
Cloneable, Cluster
Direct Known Subclasses:
HardCluster, HiddenState, SoftCluster

public class SimpleCluster
extends Object
implements Cluster

A simple implementation of Cluster. Warning: the constructors do not initialize the Arrays pr_w_z and pr_d_z. Subclasses should initialize pr_w_z and pr_d_z in their own constructors.
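
The warning above can be illustrated with a minimal, self-contained sketch. SketchCluster and SketchHardCluster are hypothetical stand-ins, not the real edu.stanford.nlp.cluster classes, and double[] stands in for the library's Array type, whose API is not shown on this page. The uniform initial values are an arbitrary choice made only for illustration.

```java
// Hypothetical stand-in for SimpleCluster: records the sizes but, like the
// class documented here, leaves both probability arrays unallocated.
class SketchCluster {
    protected int nt;           // number of features
    protected int nd;           // number of data
    protected double[] pr_w_z;  // P(w|z); stays null after construction
    protected double[] pr_d_z;  // P(d|z); stays null after construction

    SketchCluster(int numFeatures, int numData) {
        this.nt = numFeatures;
        this.nd = numData;
    }
}

// Hypothetical subclass: its constructor allocates and fills both arrays.
class SketchHardCluster extends SketchCluster {
    SketchHardCluster(int numFeatures, int numData) {
        super(numFeatures, numData);
        pr_w_z = uniform(numFeatures);
        pr_d_z = uniform(numData);
    }

    // Uniform distribution of length n (an assumption of this sketch).
    static double[] uniform(int n) {
        double[] p = new double[n];
        for (int i = 0; i < n; i++) {
            p[i] = 1.0 / n;
        }
        return p;
    }

    double get_pr_w_z(int featureIndex) { return pr_w_z[featureIndex]; }
    double get_pr_d_z(int datumIndex) { return pr_d_z[datumIndex]; }
}
```

With this layout, calling a getter on the base class before a subclass has filled the arrays would fail with a NullPointerException, which is the situation the warning is meant to prevent.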


Field Summary
protected  int index
          cluster index
protected  Array mean
          weighted mean of the data in the cluster
protected  int nd
          number of data
protected  int nt
          number of features
protected  Array pr_d_z
          P(d|z) for a set of d: Probability distribution of data over cluster
protected  Array pr_w_z
          P(w|z) for a set of w: Probability distribution of features over cluster
protected  double pr_z
          P(z): Probability of given cluster
protected  double scatter
          scatter value of data in cluster.
 
Constructor Summary
SimpleCluster(double prz, Array prwz, Array prdz, int i)
           
SimpleCluster(int num_features, int num_data)
           
SimpleCluster(int num_features, int num_data, int i)
           
 
Method Summary
 void clearData()
          Sets the value for all data in the cluster to 0
 Object clone()
          Returns deep copy of Cluster
 boolean equals(Object o)
          Implements equality test.
 double evaluateIntraSimilarity(Matrix m, Array mean)
          Evaluates the cohesiveness of the Cluster. A low value reflects a cohesive cluster, and a high value reflects a scattered cluster.
 Array evaluateWeightedMean(Matrix m)
          Evaluates weighted mean of columns in Matrix m.
 Array get_pr_d_z()
          Returns entire probability distribution P(d|z)
 double get_pr_d_z(int datum_index)
          Returns P(d|z) for d=datum_index
 Array get_pr_w_z()
          Returns entire probability distribution P(w|z)
 double get_pr_w_z(int feature_index)
          Returns P(w|z) for w=feature_index
 double get_pr_z()
          Returns P(z)
 int getIndex()
          Returns cluster's index
 double getIntraSimilarity()
          Returns scatter value of Cluster, if it has already been computed
 Array getMean()
          Returns weighted mean of Cluster, if it has already been computed
 void set_pr_d_z(Array prdz)
          Sets entire P(d|z) array to prdz
 void set_pr_d_z(int datum_index, double value)
          Sets P(d|z) for d=datum_index to value. P(d|z) is the probability of datum d, given this cluster.
 void set_pr_w_z(Array prwz)
          Sets entire P(w|z) array to prwz
 void set_pr_w_z(int feature_index, double value)
          Sets P(w|z) for w=feature_index to value. P(w|z) is the probability of feature w, given this cluster.
 void set_pr_z(double value)
          Sets P(z) to value. P(z) is the cluster probability, or cluster weight.
 void setIndex(int i)
          Sets cluster index to i
 String toString(DataCollection dbm)
          Returns String representation of Cluster. Prints the default number of features and data.
 String toString(int tnf, int tnd, DataCollection dbm)
          Returns String representation of Cluster
 String toXMLString(DataCollection dbm)
          Returns XML String representation of Cluster. Prints the default number of features and data.
 String toXMLString(int tnf, int tnd, DataCollection dbm)
          Returns XML String representation of Cluster
 
Methods inherited from class java.lang.Object
finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

index

protected int index
cluster index


scatter

protected double scatter
scatter value of data in cluster. If the data in the cluster are very scattered, scatter will be high.


mean

protected Array mean
weighted mean of the data in the cluster


pr_z

protected double pr_z
P(z): Probability of given cluster


pr_w_z

protected Array pr_w_z
P(w|z) for a set of w: Probability distribution of features over cluster


pr_d_z

protected Array pr_d_z
P(d|z) for a set of d: Probability distribution of data over cluster


nt

protected int nt
number of features


nd

protected int nd
number of data

Constructor Detail

SimpleCluster

public SimpleCluster(int num_features,
                     int num_data)

SimpleCluster

public SimpleCluster(int num_features,
                     int num_data,
                     int i)

SimpleCluster

public SimpleCluster(double prz,
                     Array prwz,
                     Array prdz,
                     int i)
Method Detail

set_pr_z

public void set_pr_z(double value)
Description copied from interface: Cluster
Sets P(z) to value. P(z) is the cluster probability, or cluster weight.

Specified by:
set_pr_z in interface Cluster

set_pr_w_z

public void set_pr_w_z(int feature_index,
                       double value)
Description copied from interface: Cluster
Sets P(w|z) for w=feature_index to value. P(w|z) is the probability of feature w, given this cluster.

Specified by:
set_pr_w_z in interface Cluster

set_pr_w_z

public void set_pr_w_z(Array prwz)
Description copied from interface: Cluster
Sets entire P(w|z) array to prwz

Specified by:
set_pr_w_z in interface Cluster

set_pr_d_z

public void set_pr_d_z(int datum_index,
                       double value)
Description copied from interface: Cluster
Sets P(d|z) for d=datum_index to value. P(d|z) is the probability of datum d, given this cluster.

Specified by:
set_pr_d_z in interface Cluster

set_pr_d_z

public void set_pr_d_z(Array prdz)
Description copied from interface: Cluster
Sets entire P(d|z) array to prdz

Specified by:
set_pr_d_z in interface Cluster

setIndex

public void setIndex(int i)
Description copied from interface: Cluster
Sets cluster index to i

Specified by:
setIndex in interface Cluster

get_pr_z

public double get_pr_z()
Description copied from interface: Cluster
Returns P(z)

Specified by:
get_pr_z in interface Cluster

get_pr_w_z

public double get_pr_w_z(int feature_index)
Description copied from interface: Cluster
Returns P(w|z) for w=feature_index

Specified by:
get_pr_w_z in interface Cluster

get_pr_w_z

public Array get_pr_w_z()
Description copied from interface: Cluster
Returns entire probability distribution P(w|z)

Specified by:
get_pr_w_z in interface Cluster

get_pr_d_z

public double get_pr_d_z(int datum_index)
Description copied from interface: Cluster
Returns P(d|z) for d=datum_index

Specified by:
get_pr_d_z in interface Cluster

get_pr_d_z

public Array get_pr_d_z()
Description copied from interface: Cluster
Returns entire probability distribution P(d|z)

Specified by:
get_pr_d_z in interface Cluster

getIndex

public int getIndex()
Description copied from interface: Cluster
Returns cluster's index

Specified by:
getIndex in interface Cluster

toXMLString

public String toXMLString(int tnf,
                          int tnd,
                          DataCollection dbm)
Description copied from interface: Cluster
Returns XML String representation of Cluster

Specified by:
toXMLString in interface Cluster
Parameters:
tnf - top n features. Prints the tnf features with the largest probabilities
tnd - top n data. Prints the tnd data with the largest probabilities
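
The "top n" selection described by tnf and tnd can be sketched as follows. This is only an illustration under assumptions: TopNSketch and topN are hypothetical names, double[] stands in for the library's Array type, and the page does not say how the real method breaks ties or handles n larger than the array (this sketch keeps the lower index on ties and clamps n to the array length).

```java
class TopNSketch {
    // Returns the indices of the n largest entries of probs, largest first.
    // n is clamped to the range [0, probs.length].
    static int[] topN(double[] probs, int n) {
        int k = Math.max(0, Math.min(n, probs.length));
        boolean[] used = new boolean[probs.length];
        int[] result = new int[k];
        for (int r = 0; r < k; r++) {
            int best = -1;
            // Scan for the largest entry not yet selected.
            for (int i = 0; i < probs.length; i++) {
                if (!used[i] && (best < 0 || probs[i] > probs[best])) {
                    best = i;
                }
            }
            used[best] = true;
            result[r] = best;
        }
        return result;
    }
}
```

A repeated linear scan is quadratic in the worst case, which is acceptable for small n such as the default of 20 mentioned on this page; a sort or a heap would suit larger n better.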

toXMLString

public String toXMLString(DataCollection dbm)
Description copied from interface: Cluster
Returns XML String representation of Cluster. Prints the default number of features and data (normally 20).

Specified by:
toXMLString in interface Cluster

toString

public String toString(int tnf,
                       int tnd,
                       DataCollection dbm)
Description copied from interface: Cluster
Returns String representation of Cluster

Specified by:
toString in interface Cluster
Parameters:
tnf - top n features. Prints the tnf features with the largest probabilities
tnd - top n data. Prints the tnd data with the largest probabilities

toString

public String toString(DataCollection dbm)
Description copied from interface: Cluster
Returns String representation of Cluster. Prints the default number of features and data (normally 20).

Specified by:
toString in interface Cluster

clone

public Object clone()
Description copied from interface: Cluster
Returns deep copy of Cluster

Specified by:
clone in interface Cluster
Overrides:
clone in class Object

equals

public boolean equals(Object o)
Description copied from interface: Cluster
Implements equality test. Returns true if o==this, and false if o!=this.

Specified by:
equals in interface Cluster
Overrides:
equals in class Object
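
Taken together, the descriptions of clone (deep copy) and equals (true only if o==this) suggest that a clone is never equal to its original. The sketch below illustrates that reading with a hypothetical stand-in class, IdentityCluster, that uses double[] in place of the library's Array type; it is not the real SimpleCluster code.

```java
class IdentityCluster implements Cloneable {
    double[] pr_w_z; // stand-in for P(w|z)

    IdentityCluster(double[] prwz) {
        this.pr_w_z = prwz;
    }

    // Deep copy: the clone receives its own copy of the array.
    @Override
    public Object clone() {
        return new IdentityCluster(pr_w_z.clone());
    }

    // Identity-based equality, matching "true if o==this".
    @Override
    public boolean equals(Object o) {
        return o == this;
    }

    // Kept consistent with the identity-based equals.
    @Override
    public int hashCode() {
        return System.identityHashCode(this);
    }
}
```

Under these assumptions, a.equals(a.clone()) is false even though both objects hold the same values, and changing the clone's array leaves the original untouched.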

evaluateWeightedMean

public Array evaluateWeightedMean(Matrix m)
Description copied from interface: Cluster
Evaluates weighted mean of columns in Matrix m. Weights columns by P(d|z)

Specified by:
evaluateWeightedMean in interface Cluster
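
A weighted mean of columns, with weights P(d|z), can be sketched as below. The names WeightedMeanSketch and weightedMean are hypothetical, double[][] and double[] stand in for the library's Matrix and Array types, and the layout (rows are features, columns are data) and the division by the total weight are assumptions; this page does not state whether the real method normalizes.

```java
class WeightedMeanSketch {
    // m[w][d]: rows are features, columns are data (an assumed layout).
    // prDZ[d]: weight of column d, standing in for P(d|z).
    static double[] weightedMean(double[][] m, double[] prDZ) {
        int numFeatures = m.length;
        double[] mean = new double[numFeatures];
        double totalWeight = 0.0;
        for (double p : prDZ) {
            totalWeight += p;
        }
        if (totalWeight == 0.0) {
            return mean; // no weighted data: return the zero vector
        }
        for (int w = 0; w < numFeatures; w++) {
            double sum = 0.0;
            for (int d = 0; d < prDZ.length; d++) {
                sum += prDZ[d] * m[w][d];
            }
            mean[w] = sum / totalWeight;
        }
        return mean;
    }
}
```

If the weights already sum to 1, as a probability distribution P(d|z) would, the division has no effect.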

evaluateIntraSimilarity

public double evaluateIntraSimilarity(Matrix m,
                                      Array mean)
Description copied from interface: Cluster
Evaluates the cohesiveness of the Cluster. A low value reflects a cohesive cluster, and a high value reflects a scattered cluster.

Specified by:
evaluateIntraSimilarity in interface Cluster
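
This page does not give the formula behind the scatter value. One measure with the stated behavior (low for cohesive clusters, high for scattered ones) is the weighted sum of squared Euclidean distances from each datum to the mean. The sketch below uses that measure purely as an assumption; ScatterSketch and scatter are hypothetical names, arrays stand in for Matrix and Array, and the weights are passed explicitly rather than read from the cluster.

```java
class ScatterSketch {
    // m[w][d]: rows are features, columns are data (an assumed layout).
    // mean[w]: cluster mean per feature.
    // prDZ[d]: weight of datum d, standing in for P(d|z).
    static double scatter(double[][] m, double[] mean, double[] prDZ) {
        double total = 0.0;
        for (int d = 0; d < prDZ.length; d++) {
            double squaredDistance = 0.0;
            for (int w = 0; w < mean.length; w++) {
                double diff = m[w][d] - mean[w];
                squaredDistance += diff * diff;
            }
            total += prDZ[d] * squaredDistance;
        }
        return total;
    }
}
```

When every datum coincides with the mean the result is 0, and it grows as the data move away from the mean.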

getIntraSimilarity

public double getIntraSimilarity()
Description copied from interface: Cluster
Returns scatter value of Cluster, if it has already been computed

Specified by:
getIntraSimilarity in interface Cluster

getMean

public Array getMean()
Description copied from interface: Cluster
Returns weighted mean of Cluster, if it has already been computed

Specified by:
getMean in interface Cluster

clearData

public void clearData()
Description copied from interface: Cluster
Sets the value for all data in the cluster to 0

Specified by:
clearData in interface Cluster


Stanford NLP Group