edu.stanford.nlp.ie.hmm
Class TypedTaggedDocument

java.lang.Object
  |
  +--java.util.AbstractCollection
        |
        +--java.util.AbstractList
              |
              +--java.util.ArrayList
                    |
                    +--edu.stanford.nlp.dbm.BasicDocument
                          |
                          +--edu.stanford.nlp.ie.hmm.TypedTaggedDocument
All Implemented Interfaces:
Cloneable, Collection, Datum, Document, Featurizable, Labeled, List, RandomAccess, Serializable

public class TypedTaggedDocument
extends BasicDocument

Document whose words are TypedTaggedWord objects. When text is read in, every word is assigned type 0 (i.e. the background state).

See Also:
getTypeSequence(), Serialized Form

Field Summary
 
Fields inherited from class edu.stanford.nlp.dbm.BasicDocument
labels, originalText, title
 
Fields inherited from class java.util.AbstractList
modCount
 
Constructor Summary
TypedTaggedDocument()
           
 
Method Summary
 int[] getTypeSequence()
          Returns an array representing the type of each word in this Document.
protected  void parse(String s)
          Tokenizes the given text to populate the list of Words this Document represents.
 
Methods inherited from class edu.stanford.nlp.dbm.BasicDocument
addLabel, asFeatures, init, init, init, init, init, init, init, init, init, init, init, init, init, init, init, init, init, init, init, label, labels, main, originalText, presentableText, setLabel, setLabels, setTitle, title
 
Methods inherited from class java.util.ArrayList
add, add, addAll, addAll, clear, clone, contains, ensureCapacity, get, indexOf, isEmpty, lastIndexOf, remove, removeRange, set, size, toArray, toArray, trimToSize
 
Methods inherited from class java.util.AbstractList
equals, hashCode, iterator, listIterator, listIterator, subList
 
Methods inherited from class java.util.AbstractCollection
containsAll, remove, removeAll, retainAll, toString
 
Methods inherited from class java.lang.Object
finalize, getClass, notify, notifyAll, wait, wait, wait
 
Methods inherited from interface java.util.List
add, add, addAll, addAll, clear, contains, containsAll, equals, get, hashCode, indexOf, isEmpty, iterator, lastIndexOf, listIterator, listIterator, remove, remove, removeAll, retainAll, set, size, subList, toArray, toArray
 

Constructor Detail

TypedTaggedDocument

public TypedTaggedDocument()

Method Detail

parse

protected void parse(String s)
Description copied from class: BasicDocument
Tokenizes the given text to populate the list of Words this Document represents. The default implementation uses a SimpleTokenizer and tokenizes the entirety of the text into words. Subclasses should override this method to parse documents in non-standard formats, and/or to pull the title of the document from the text. The given text may be empty ("") but will never be null.

Overrides:
parse in class BasicDocument
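
Example (illustrative sketch, not the library's implementation): the contract above, combined with the class description, suggests that parsing plain text yields one word per token, each with type 0. The ParseSketch and Token names below are hypothetical stand-ins, and whitespace splitting is an assumed substitute for SimpleTokenizer, whose actual rules are not documented on this page.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in; none of these names come from the Stanford library.
class ParseSketch {

    // Type assigned to every word read from plain text (the background state).
    static final int BACKGROUND = 0;

    // Minimal typed token: the word text plus an integer type.
    static final class Token {
        final String word;
        final int type;

        Token(String word, int type) {
            this.word = word;
            this.type = type;
        }
    }

    // Splits the text on whitespace (an assumption; the real tokenizer may differ)
    // and gives every resulting token the background type.
    // The text may be empty but is assumed never to be null, as the contract states.
    static List<Token> parse(String s) {
        List<Token> words = new ArrayList<>();
        for (String piece : s.trim().split("\\s+")) {
            if (!piece.isEmpty()) {   // an empty input produces no tokens
                words.add(new Token(piece, BACKGROUND));
            }
        }
        return words;
    }
}
```

A subclass handling a non-standard format would replace the splitting step with its own logic and could also extract a title there.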

getTypeSequence

public int[] getTypeSequence()
Returns an array representing the type of each word in this Document. The ith element of the returned array is the type of the ith word if that word implements HasType (as it should when the document has been constructed normally), or 0 (i.e. the background state) otherwise.
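
Example (illustrative sketch under stated assumptions): the fallback rule described above can be written as a single pass over the word list. The HasType interface declared here is a hypothetical stand-in with an assumed type() accessor; the library's real interface may declare different methods. SketchTypedWord and TypeSequenceSketch are likewise invented for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the library's HasType interface (method name assumed).
interface HasType {
    int type();
}

// Hypothetical typed word: a token plus an integer type.
class SketchTypedWord implements HasType {
    private final String word;
    private final int type;

    SketchTypedWord(String word, int type) {
        this.word = word;
        this.type = type;
    }

    @Override
    public int type() {
        return type;
    }

    @Override
    public String toString() {
        return word + "/" + type;
    }
}

class TypeSequenceSketch {

    // Type reported for any element that does not implement HasType.
    static final int BACKGROUND = 0;

    // Element i of the result is the type of word i, or BACKGROUND if that
    // word does not implement HasType.
    static int[] typeSequence(List<?> words) {
        int[] types = new int[words.size()];
        for (int i = 0; i < words.size(); i++) {
            Object w = words.get(i);
            types[i] = (w instanceof HasType) ? ((HasType) w).type() : BACKGROUND;
        }
        return types;
    }

    public static void main(String[] args) {
        List<Object> doc = new ArrayList<>();
        doc.add(new SketchTypedWord("Stanford", 2));
        doc.add("untyped");                          // falls back to BACKGROUND
        doc.add(new SketchTypedWord("the", 0));
        System.out.println(Arrays.toString(typeSequence(doc)));
    }
}
```

The returned array always has the same length as the word list, so callers can index it in parallel with the document's words.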



Stanford NLP Group