Package edu.stanford.nlp.classify

Data Structures and Algorithms for Classification.


Interface Summary
Classifier  
ClassifierB An interface for a classifier.
Filter A Filter takes a String object and returns the String after having applied the proper filter to it.
Smoother Interface for classes which smooth a matrix
SmootherB The Smoother should return a smoothed ProbabilitySet.
TokenReader An interface for converting something into a continuous string.
 

Class Summary
AbstractClassifier  
AddEpsilonSmoother Smooths a ProbabilitySet using a HashMap vocabulary.
AddOneSmoother  
ButFilter This (very primitive) filter removes everything from the beginning of a sentence until the word "but".
CategoryMatrix Wrapper for a Matrix whose columns are P(w|c).
Classification This class holds the classification assigned to a FeatureSet
Classify Takes two arguments: the first is a directory where the training sets are held.
ClassProbability This class holds the counts and probabilities for the features of a single class (classification category).
ClassProbBuilder Accepts a feature vector and a class id, and returns a ClassProbability
CrossValidator This class takes a probability set and does cross-validation for as many folds as desired.
Feature A wrapper for a String which allows you to add other attributes, such as weights, to a Feature.
FeatureAdder This class takes a ProbabilitySet and the number of classes, and returns a new ProbabilitySet where each ClassProbability represents the probabilities for the entire class.
FeatureMaker This class accepts a string and returns a HashMap that holds the counts for each token in the string.
FeatureSet This class stores pairs of features and counts.
FeatureSetBuilder Given a HashMap of Features, and an ID (true classification), this class can create a FeatureSet.
FileSorter Takes a file that contains a set of lists of double/key pairs, one to a line, and sorts them.
FirstNFilter This filter is used if you only want to use the first n sentences in a document.
GTPreparer This class prepares the ProbabilitySet to be smoothed and calls the Good-Turing smoother.
ID  
InfomapSmoother  
KNN K nearest-neighbors classifier.
LinkedList A linked list class.
MassVerifier Tracks the statistics for a set of test documents.
MedlineHandler  
NaiveBayesClassifier Naive Bayes Classifier.
NBClassifier The classifier takes a set of (smoothed) probability vectors and a test vector, and classifies the test vector according to the probability vectors
NegativeFilter This filter does some very primitive negative scoping, using the same method as the Almaden code.
Node For the LinkedList?
POSFilter A filter that takes a string of tagged text and a list of parts of speech to be kept, and returns a string that includes only those parts of speech.
POSPrepFilter  
POSTokenReader This class reads in Epinions data that has been POS tagged using the Qtag system.
ProbabilitySet This class manipulates an array of ClassProbabilities.
RemoveSpacesFilter This filter removes the spaces in the middle of the POS tags, so that tokenization will happen properly.
ScoreHandler  
SimpleGoodTuring  
SortedArray Maintains a sorted array of ints.
Tester Builds a FeatureSet for the files in the directory "parent_dir" and attempts to classify them using the ProbabilitySet my_ps that was created during training.
TokenBreaker Puts a space between all tokens.
Trainer Builds a ProbabilitySet for the classes in String[] my_dirnames and smooths that ProbabilitySet.
Verifier Compares the classification of a FeatureSet (true value) to the classification of a Classification (predicted value)
VocabBuilder Takes a ProbabilitySet and builds a hash table that contains the entire vocabulary.
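
The Filter contract described above (a String goes in, the filtered String comes out) can be illustrated with a small sketch modeled on the ButFilter description. The names FilterSketch and ButFilterSketch are hypothetical, and the method name filter is an assumption, since this page does not show the real signatures. It is also an assumption that the word "but" itself is dropped and that input without "but" is returned unchanged.

```java
import java.util.Arrays;
import java.util.Locale;

// Hypothetical stand-in for the Filter interface: String in, filtered String out.
interface FilterSketch {
    String filter(String input);
}

// Illustrative analogue of ButFilter; not the package's actual implementation.
class ButFilterSketch implements FilterSketch {
    @Override
    public String filter(String input) {
        String[] tokens = input.trim().split("\\s+");
        for (int i = 0; i < tokens.length; i++) {
            // Match only a standalone, whitespace-delimited "but" (case-insensitive).
            if (tokens[i].toLowerCase(Locale.ROOT).equals("but")) {
                // Assumption: keep only the tokens after "but".
                return String.join(" ", Arrays.copyOfRange(tokens, i + 1, tokens.length));
            }
        }
        // Assumption: with no "but" present, the input passes through unchanged.
        return input;
    }
}
```

Because every filter in this sketch shares one method, filters could be chained by feeding the output of one into the next, which is presumably how classes such as RemoveSpacesFilter and POSFilter are meant to be combined.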
 

Package edu.stanford.nlp.classify Description

Data Structures and Algorithms for Classification. This package is under active modification. It currently implements KNN and a passable, but not good, version of Naive Bayes. It also contains a number of classes that do not belong here (POSFilter, the Smoothers, LinkedList, NegativeFilter, etc.).
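
As a rough illustration of how the roles named in the class summary fit together (token counting as in FeatureMaker, add-one smoothing as in AddOneSmoother, and Naive Bayes scoring as in NBClassifier), here is a minimal self-contained sketch. It does not use this package's API: the class NaiveBayesSketch and all of its methods are hypothetical, and ignoring test tokens never seen in training is a design assumption made here, not documented behavior.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; none of these names come from edu.stanford.nlp.classify.
class NaiveBayesSketch {
    private final Map<String, Map<String, Integer>> countsByClass = new HashMap<>();
    private final Map<String, Integer> totalTokensByClass = new HashMap<>();
    private final Map<String, Integer> docsByClass = new HashMap<>();
    private final Set<String> vocabulary = new HashSet<>();
    private int totalDocs = 0;

    // Count whitespace-delimited tokens in a string.
    static Map<String, Integer> tokenCounts(String text) {
        Map<String, Integer> counts = new HashMap<>();
        for (String token : text.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Add one labeled training document to the per-class counts.
    void train(String label, String text) {
        Map<String, Integer> classCounts =
                countsByClass.computeIfAbsent(label, k -> new HashMap<>());
        for (Map.Entry<String, Integer> e : tokenCounts(text).entrySet()) {
            classCounts.merge(e.getKey(), e.getValue(), Integer::sum);
            totalTokensByClass.merge(label, e.getValue(), Integer::sum);
            vocabulary.add(e.getKey());
        }
        docsByClass.merge(label, 1, Integer::sum);
        totalDocs++;
    }

    // log P(c) + sum over tokens of count(w) * log P(w|c), with add-one smoothing:
    // P(w|c) = (count(w, c) + 1) / (tokens in c + vocabulary size).
    double logScore(String label, String text) {
        Map<String, Integer> classCounts = countsByClass.get(label);
        int total = totalTokensByClass.getOrDefault(label, 0);
        double score = Math.log(docsByClass.get(label).doubleValue() / totalDocs);
        for (Map.Entry<String, Integer> e : tokenCounts(text).entrySet()) {
            if (!vocabulary.contains(e.getKey())) {
                continue; // assumption: skip tokens never seen in training
            }
            int count = classCounts.getOrDefault(e.getKey(), 0);
            double p = (count + 1.0) / (total + vocabulary.size());
            score += e.getValue() * Math.log(p);
        }
        return score;
    }

    // Return the trained label with the highest log score, or null if untrained.
    String classify(String text) {
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (String label : countsByClass.keySet()) {
            double s = logScore(label, text);
            if (s > bestScore) {
                bestScore = s;
                best = label;
            }
        }
        return best;
    }
}
```

Working in log space avoids numeric underflow when many small probabilities are multiplied, and add-one smoothing keeps a token unseen in one class from driving that class's probability to zero.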


Sepandar David Kamvar
Last modified: Thu Oct 31 10:56:14 PST 2002



Stanford NLP Group