OhsumedDocument (Stanford JavaNLP API)

edu.stanford.nlp.dbm
Class OhsumedDocument

java.lang.Object
  |
  +--java.util.AbstractCollection
        |
        +--java.util.AbstractList
              |
              +--java.util.ArrayList
                    |
                    +--edu.stanford.nlp.dbm.BasicDocument
                          |
                          +--edu.stanford.nlp.dbm.OhsumedDocument

All Implemented Interfaces: Cloneable, Collection, Datum, Document, Featurizable, Labeled, List, RandomAccess, Serializable

public class OhsumedDocument
extends BasicDocument

Stores, processes, and allows access to a Document in the format specified by the Ohsumed document collection. Note: this class needs to be converted to work with BasicDocument the way CranDocument does.

See Also: Serialized Form

Field Summary

Fields inherited from class edu.stanford.nlp.dbm.BasicDocument

labels, originalText, title

Fields inherited from class java.util.AbstractList

modCount

Constructor Summary

OhsumedDocument()


Method Summary

String abstractText()
          Returns the abstract text for this document.

boolean hasAbstract()
          Returns true if this Document has an abstract.

boolean in(Set UIDs)
          Returns true if this Document is in UIDs.

protected void parse(String s)
          Tokenizes the given text to populate the list of Words this Document represents.

Methods inherited from class edu.stanford.nlp.dbm.BasicDocument

addLabel, asFeatures, init, init, init, init, init, init, init, init, init, init, init, init, init, init, init, init, init, init, init, label, labels, main, originalText, presentableText, setLabel, setLabels, setTitle, title

Methods inherited from class java.util.ArrayList

add, add, addAll, addAll, clear, clone, contains, ensureCapacity, get, indexOf, isEmpty, lastIndexOf, remove, removeRange, set, size, toArray, toArray, trimToSize

Methods inherited from class java.util.AbstractList

equals, hashCode, iterator, listIterator, listIterator, subList

Methods inherited from class java.util.AbstractCollection

containsAll, remove, removeAll, retainAll, toString

Methods inherited from class java.lang.Object

finalize, getClass, notify, notifyAll, wait, wait, wait

Methods inherited from interface java.util.List

add, add, addAll, addAll, clear, contains, containsAll, equals, get, hashCode, indexOf, isEmpty, iterator, lastIndexOf, listIterator, listIterator, remove, remove, removeAll, retainAll, set, size, subList, toArray, toArray

Constructor Detail

OhsumedDocument

public OhsumedDocument()

Method Detail

parse

protected void parse(String s)

Description copied from class: BasicDocument

Tokenizes the given text to populate the list of Words this Document represents. The default implementation uses a SimpleTokenizer and tokenizes the entirety of the text into words. Subclasses should override this method to parse documents in non-standard formats, and/or to pull the title of the document from the text. The given text may be empty ("") but will never be null.

Overrides: parse in class BasicDocument
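
The override pattern described above can be sketched without the library on the classpath. Everything here is a hypothetical stand-in: SketchDocument is not BasicDocument, whitespace splitting replaces SimpleTokenizer, and the "first line is the title" rule is an assumption chosen only to illustrate pulling a title out of the text.

```java
import java.util.ArrayList;

/** Hypothetical stand-in for a document that is a list of word tokens (not the real BasicDocument). */
class SketchDocument extends ArrayList<String> {
    protected String title = "";

    /** Default behaviour: tokenize the whole text on whitespace (stand-in for SimpleTokenizer). */
    protected void parse(String text) {
        for (String token : text.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                add(token);
            }
        }
    }
}

/** Subclass for a non-standard format: the first line is the title, the rest is the body. */
class TitledSketchDocument extends SketchDocument {
    @Override
    protected void parse(String text) {
        int newline = text.indexOf('\n');
        if (newline < 0) {
            // No line break: treat everything as the title; there is no body to tokenize.
            title = text.trim();
            return;
        }
        title = text.substring(0, newline).trim();
        // Reuse the default tokenization for the body.
        super.parse(text.substring(newline + 1));
    }
}
```

As in the contract above, the sketch accepts an empty string and does not guard against null.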

hasAbstract

public boolean hasAbstract()

Returns true if this Document has an abstract.

abstractText

public String abstractText()

Returns the abstract text for this document.

in

public boolean in(Set UIDs)

Returns true if this Document is in UIDs.
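
This page does not show how the three accessors are implemented, so the following is only a guess at their behaviour, written as a self-contained stand-in class. It assumes the field tags of the commonly distributed Ohsumed files (.U for the document UID, .W for the abstract), assumes that in(Set) tests whether the document's UID is a member of the set, and uses Set<String> where the real signature takes a raw Set. The class name OhsumedRecordSketch is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-in, not the real OhsumedDocument. */
class OhsumedRecordSketch {
    // Field tag (for example ".U") mapped to the text that follows it.
    private final Map<String, String> fields = new HashMap<>();

    OhsumedRecordSketch(String record) {
        String tag = null;
        StringBuilder value = new StringBuilder();
        for (String line : record.split("\n")) {
            if (line.matches("\\.[A-Z]( .*)?")) {
                // A new tag starts: store the field collected so far.
                if (tag != null) {
                    fields.put(tag, value.toString().trim());
                }
                tag = line.substring(0, 2);
                value = new StringBuilder(line.substring(2));
            } else {
                value.append('\n').append(line);
            }
        }
        if (tag != null) {
            fields.put(tag, value.toString().trim());
        }
    }

    /** True if the record carries a non-empty abstract field (assumed tag: .W). */
    boolean hasAbstract() {
        String text = fields.get(".W");
        return text != null && !text.isEmpty();
    }

    /** The abstract text, or the empty string if there is none. */
    String abstractText() {
        return fields.getOrDefault(".W", "");
    }

    /** True if this record's UID (assumed tag: .U) is a member of the given set. */
    boolean in(Set<String> uids) {
        String uid = fields.get(".U");
        return uid != null && uids.contains(uid);
    }
}
```

With a made-up record such as ".I 1\n.U\n87049087\n.W\nSample abstract text.\n", hasAbstract() is true and in(...) is true only for a set containing "87049087".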