edu.stanford.nlp.dbm
Class OhsumedDocument

java.lang.Object
  |
  +--java.util.AbstractCollection
        |
        +--java.util.AbstractList
              |
              +--java.util.ArrayList
                    |
                    +--edu.stanford.nlp.dbm.BasicDocument
                          |
                          +--edu.stanford.nlp.dbm.OhsumedDocument
All Implemented Interfaces:
Cloneable, Collection, Datum, Document, Featurizable, Labeled, List, RandomAccess, Serializable

public class OhsumedDocument
extends BasicDocument

Stores, processes, and allows access to a Document in the format specified by the Ohsumed document collection.

NOTE: This class needs to be converted to work with BasicDocument the way CranDocument does.

See Also:
Serialized Form

Field Summary
 
Fields inherited from class edu.stanford.nlp.dbm.BasicDocument
labels, originalText, title
 
Fields inherited from class java.util.AbstractList
modCount
 
Constructor Summary
OhsumedDocument()
           
 
Method Summary
 String abstractText()
          Returns the abstract text for this document.
 boolean hasAbstract()
          Returns true if this Document has an abstract.
 boolean in(Set UIDs)
          Returns true if this Document is in UIDs.
protected  void parse(String s)
          Tokenizes the given text to populate the list of Words this Document represents.
 
Methods inherited from class edu.stanford.nlp.dbm.BasicDocument
addLabel, asFeatures, init, init, init, init, init, init, init, init, init, init, init, init, init, init, init, init, init, init, init, label, labels, main, originalText, presentableText, setLabel, setLabels, setTitle, title
 
Methods inherited from class java.util.ArrayList
add, add, addAll, addAll, clear, clone, contains, ensureCapacity, get, indexOf, isEmpty, lastIndexOf, remove, removeRange, set, size, toArray, toArray, trimToSize
 
Methods inherited from class java.util.AbstractList
equals, hashCode, iterator, listIterator, listIterator, subList
 
Methods inherited from class java.util.AbstractCollection
containsAll, remove, removeAll, retainAll, toString
 
Methods inherited from class java.lang.Object
finalize, getClass, notify, notifyAll, wait, wait, wait
 
Methods inherited from interface java.util.List
add, add, addAll, addAll, clear, contains, containsAll, equals, get, hashCode, indexOf, isEmpty, iterator, lastIndexOf, listIterator, listIterator, remove, remove, removeAll, retainAll, set, size, subList, toArray, toArray
 

Constructor Detail

OhsumedDocument

public OhsumedDocument()
Method Detail

parse

protected void parse(String s)
Description copied from class: BasicDocument
Tokenizes the given text to populate the list of Words this Document represents. The default implementation uses a SimpleTokenizer and tokenizes the entirety of the text into words. Subclasses should override this method to parse documents in non-standard formats, and/or to pull the title of the document from the text. The given text may be empty ("") but will never be null.

Overrides:
parse in class BasicDocument
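
The override pattern described above can be sketched as follows. This is a minimal illustration, not the actual Stanford code: SketchDocument is a hypothetical stand-in for BasicDocument that models only the parse hook, it splits on whitespace rather than using SimpleTokenizer, and TitledSketchDocument assumes a made-up format in which the first line of the text is the title.

```java
import java.util.ArrayList;

// Hypothetical stand-in for BasicDocument: only the parse hook is modeled.
class SketchDocument extends ArrayList<String> {
    protected String title = "";

    // Default behavior: tokenize the entire text on whitespace.
    protected void parse(String s) {
        for (String tok : s.trim().split("\\s+")) {
            if (!tok.isEmpty()) {
                add(tok);
            }
        }
    }

    // Resets the document, then delegates to parse (text may be "" but not null).
    public SketchDocument init(String text) {
        clear();
        title = "";
        parse(text);
        return this;
    }
}

// Hypothetical subclass for a non-standard format (assumed here):
// the first line is the title, everything after it is the body.
class TitledSketchDocument extends SketchDocument {
    @Override
    protected void parse(String s) {
        int nl = s.indexOf('\n');
        if (nl >= 0) {
            title = s.substring(0, nl).trim();
            s = s.substring(nl + 1);
        }
        super.parse(s); // tokenize only the body
    }
}
```

Because init calls parse, a subclass only needs to override parse to change how the title and words are extracted; callers keep using init unchanged.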

hasAbstract

public boolean hasAbstract()
Returns true if this Document has an abstract.


abstractText

public String abstractText()
Returns the abstract text for this document.


in

public boolean in(Set UIDs)
Returns true if this Document is in UIDs.
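
One plausible reading of hasAbstract, abstractText, and in is sketched below. This page does not say how the class stores its identifier or abstract, nor what type the elements of UIDs have, so everything here is an assumption: OhsumedRecordSketch and its fields are hypothetical, identifiers are assumed to be strings, and a missing abstract is assumed to be reported as an empty string.

```java
import java.util.Set;

// Hypothetical record type; the fields are illustrative, not the real class layout.
class OhsumedRecordSketch {
    private final String uid;          // assumed: the record's unique identifier
    private final String abstractText; // assumed: null when there is no abstract

    OhsumedRecordSketch(String uid, String abstractText) {
        this.uid = uid;
        this.abstractText = abstractText;
    }

    // True only when a non-empty abstract is stored.
    boolean hasAbstract() {
        return abstractText != null && !abstractText.isEmpty();
    }

    // Returns the abstract, or "" when none is stored (an assumed convention).
    String abstractText() {
        return hasAbstract() ? abstractText : "";
    }

    // True when this record's identifier is a member of the given set.
    boolean in(Set<String> uids) {
        return uids.contains(uid);
    }
}
```

Under these assumptions, a caller could filter a collection down to the records named in a relevance-judgment set by testing in(uids), and check hasAbstract() before reading abstractText().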



Stanford NLP Group