edu.stanford.nlp.dbm
Class CranDocument

java.lang.Object
  |
  +--java.util.AbstractCollection
        |
        +--java.util.AbstractList
              |
              +--java.util.ArrayList
                    |
                    +--edu.stanford.nlp.dbm.BasicDocument
                          |
                          +--edu.stanford.nlp.dbm.CranDocument
All Implemented Interfaces:
Cloneable, Collection, Datum, Document, Featurizable, Labeled, List, RandomAccess, Serializable

public class CranDocument
extends BasicDocument

Stores, processes, and provides access to a document in the format used by the Cranfield document collection.

See Also:
Serialized Form

Field Summary
 
Fields inherited from class edu.stanford.nlp.dbm.BasicDocument
labels, originalText, title
 
Fields inherited from class java.util.AbstractList
modCount
 
Constructor Summary
CranDocument()
           
 
Method Summary
protected  void parse(String text)
          Parses the given text as a Cranfield document to extract the title and text.
 
Methods inherited from class edu.stanford.nlp.dbm.BasicDocument
addLabel, asFeatures, init, init, init, init, init, init, init, init, init, init, init, init, init, init, init, init, init, init, init, label, labels, main, originalText, presentableText, setLabel, setLabels, setTitle, title
 
Methods inherited from class java.util.ArrayList
add, add, addAll, addAll, clear, clone, contains, ensureCapacity, get, indexOf, isEmpty, lastIndexOf, remove, removeRange, set, size, toArray, toArray, trimToSize
 
Methods inherited from class java.util.AbstractList
equals, hashCode, iterator, listIterator, listIterator, subList
 
Methods inherited from class java.util.AbstractCollection
containsAll, remove, removeAll, retainAll, toString
 
Methods inherited from class java.lang.Object
finalize, getClass, notify, notifyAll, wait, wait, wait
 
Methods inherited from interface java.util.List
add, add, addAll, addAll, clear, contains, containsAll, equals, get, hashCode, indexOf, isEmpty, iterator, lastIndexOf, listIterator, listIterator, remove, remove, removeAll, retainAll, set, size, subList, toArray, toArray
 

Constructor Detail

CranDocument

public CranDocument()
Method Detail

parse

protected void parse(String text)
Parses the given text as a Cranfield document to extract the title and text.

The second line of every CRAN document begins a block of the form ".T experimental investigation of the aerodynamics of a wing in a slipstream . .A etc.", where .T marks the start of the title and the lines that follow, up to .A, make up the title.

CRAN documents mark the abstract of each document with .W. The lines following .W, through the end of the document, are all part of the abstract text; this method parses those lines to obtain the text.

Overrides:
parse in class BasicDocument
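
The marker handling described for parse can be illustrated with a small standalone sketch. This is not the CranDocument implementation: the class name CranParseSketch, its static parse method, the String[] return value, and the sample record in main are all hypothetical and chosen only for illustration. The sketch assumes a marker is a period followed by one capital letter at the start of a line, collects the lines between .T and the next marker as the title, and treats everything after .W as abstract text.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of edu.stanford.nlp.dbm.
final class CranParseSketch {

    // Assumed marker shape: ".X" at line start, optionally followed by text.
    private static final Pattern MARKER =
            Pattern.compile("^\\.([A-Z])(?:\\s+(.*))?$");

    /** Returns {title, abstractText} extracted from one Cranfield-style record. */
    static String[] parse(String text) {
        StringBuilder title = new StringBuilder();
        StringBuilder body = new StringBuilder();
        StringBuilder current = null; // section being filled; null means skip

        for (String raw : text.split("\\r?\\n")) {
            String line = raw.trim();
            Matcher m = MARKER.matcher(line);
            // Once .W has been seen, every remaining line is abstract text,
            // so markers are only interpreted before that point.
            if (current != body && m.matches()) {
                String tag = m.group(1);
                current = tag.equals("T") ? title
                        : tag.equals("W") ? body
                        : null;
                // Keep any text that shares the line with the marker.
                line = m.group(2) == null ? "" : m.group(2);
            }
            if (current != null && !line.isEmpty()) {
                if (current.length() > 0) {
                    current.append(' ');
                }
                current.append(line);
            }
        }
        return new String[] { title.toString(), body.toString() };
    }

    public static void main(String[] args) {
        // Made-up sample record; only the title text comes from the description above.
        String doc = ".I 1\n"
                + ".T\n"
                + "experimental investigation of the aerodynamics of a\n"
                + "wing in a slipstream .\n"
                + ".A\n"
                + "some author\n"
                + ".W\n"
                + "first abstract line\n"
                + "second abstract line\n";
        String[] parts = parse(doc);
        System.out.println("title: " + parts[0]);
        System.out.println("text:  " + parts[1]);
    }
}
```

Tracking a single "current section" buffer keeps the loop to one pass over the lines. A real subclass would presumably store the results through the inherited setTitle method and the list itself rather than return an array, but that wiring is not shown in this documentation and is left out here.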


Stanford NLP Group