| Interface Summary | |
| DataCollection | Interface for data collections. |
| Datum | Interface for Objects which can be described by their features. |
| Document | Represents a text document as a list of Words with a title. |
| Featurizable | Interface for Objects that can be described by their features. |
| IndexedSet | List in which no duplicate Objects may be stored |
| LabeledDataCollection | Interface for hand-classified data collections. |
| MatrixWrapper | Interface for a class of objects which construct and store a Matrix. |
| Class Summary | |
| AbstractDataCollection | Abstract Data Collection. |
| BasicDatum | Basic implementation of Datum interface that can be constructed with a Collection of features and one or more labels. |
| BasicDocument | Basic implementation of Document that should be suitable for most needs. |
| BOFDataMatrix | "Bag of Features" Feature Matrix. |
| Context | One line, with the word as the first element and its context following it on the same line. |
| Contexts | Contains methods dealing with populating a DBM given a file containing one Context per line, where the word to be disambiguated is at the top of the file |
| ContextSet | A collection of word contexts that does not allow duplicate words. |
| CranDocument | Stores, processes, and allows access to a Document of the format specified in the Cranfield document collection |
| Cranfield | Contains methods dealing with populating a DBM given a file containing all Cranfield documents |
| DataMatrix | Class with methods to construct a DataMatrix (such as a term-document matrix), from the initial Objects (such as Documents). |
| DataSet | A Data Collection that does not allow duplicate data. |
| DBIndexedSet | Implementation of IndexedSet which uses a List and a SimpleDatabase: the List to store index-object pairs, and the SimpleDatabase to store object-index pairs |
| DSDataMatrix | DataMatrix for a DataSet, where duplicate Data are not allowed. |
| FileDataCollection | DataCollection in which the Data and Features are stored in a File or Directory. |
| HTMLDocument | The HTMLDocument class implements Document methods for an HTML encoded document. |
| LabelMatrix | Wrapper for a Matrix whose columns are Label Vectors of Labeled Objects |
| LocusLink | A DataCollection where each Data Item is a LocusLink document about a gene. |
| LocusLinkDocument | A LocusLink document about a gene with LocusLink ID locusID. |
| Medline | Contains methods dealing with populating a DBM given a file containing all Medline documents |
| MedlineDocument | A Medline Document in Medline XML Format. |
| Ohsumed | Contains methods dealing with populating a DBM given a file containing all Ohsumed documents. |
| OhsumedDocument | Stores, processes, and allows access to a Document of the format specified in the Ohsumed document collection. NOTE: This needs to be converted to work with BasicDocument the way CranDocument does. |
| PersistentHashList | Persistent List backed by a SimpleDatabase |
| RestrictedDataMatrix | "Restricted Bag of Features" Feature Matrix. |
| SimpleCollection | Just like AbstractCollection, but it implements the size() method for you in the obvious way by using the Iterator. |
| USPDI | A DataCollection where each Data Item is a LocusLink document about a gene. |
| USPDIDocument | A USPDIDocument holds the relevant contents of a query on the USP DI Drug Database for the drug given by drugName. |
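The SimpleCollection entry above describes deriving size() from the Iterator. The following is a minimal sketch of that idea, built only on java.util.AbstractCollection. The names SimpleCollectionSketch, IteratorSizedCollection, and WordBag are hypothetical, and this is not the package's actual source.

```java
import java.util.AbstractCollection;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class SimpleCollectionSketch {

    /** Hypothetical stand-in: size() is derived from iterator(), so subclasses supply only iterator(). */
    static abstract class IteratorSizedCollection<E> extends AbstractCollection<E> {
        @Override
        public int size() {
            int count = 0;
            // Walk the whole iterator; this costs O(n) on every call.
            for (Iterator<E> it = iterator(); it.hasNext(); it.next()) {
                count++;
            }
            return count;
        }
    }

    /** Example subclass (hypothetical): a fixed bag of words. */
    static class WordBag extends IteratorSizedCollection<String> {
        private final List<String> words;

        WordBag(String... words) {
            this.words = Arrays.asList(words);
        }

        @Override
        public Iterator<String> iterator() {
            return words.iterator();
        }
    }

    public static void main(String[] args) {
        WordBag bag = new WordBag("term", "document", "matrix");
        System.out.println(bag.size());            // prints 3
        System.out.println(bag.contains("term"));  // prints true (inherited from AbstractCollection)
    }
}
```

The trade-off is convenience against cost: every call to size() traverses the collection, so a subclass that can track its size cheaply may still prefer to override it.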
Classes for building and operating on documents and data collections. The two basic
interfaces are Document, which represents a document as a list of
words with meta-data, and DataCollection, which represents a
collection of documents. The document class you will most likely use is
BasicDocument, which provides support for constructing documents
from a variety of input sources. Several subclasses of BasicDocument handle
special file formats and additional meta-data. The DataCollection class you will
most likely use is FileDataCollection, which manages a group of
files and allows you to iterate through them or work with them in aggregate.
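To make the two ideas in this description concrete (a document as a titled list of words, and a collection backed by a group of files), here is a self-contained sketch that uses only the JDK. It does not use or reproduce the actual Document, BasicDocument, or FileDataCollection APIs. The names DocumentCollectionSketch, SimpleDoc, fromText, readDirectory, and totalWords are hypothetical, and splitting on whitespace is an assumed tokenization.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class DocumentCollectionSketch {

    /** Hypothetical stand-in for a document: a title plus its words in order. */
    static final class SimpleDoc {
        final String title;
        final List<String> words;

        SimpleDoc(String title, List<String> words) {
            this.title = title;
            this.words = words;
        }
    }

    /** Builds a document from raw text by splitting on whitespace (assumed tokenization). */
    static SimpleDoc fromText(String title, String text) {
        String trimmed = text.trim();
        List<String> words = trimmed.isEmpty()
                ? Collections.<String>emptyList()
                : Arrays.asList(trimmed.split("\\s+"));
        return new SimpleDoc(title, words);
    }

    /** Reads each regular file in dir as one document, ordered by path, titled by file name. */
    static List<SimpleDoc> readDirectory(Path dir) throws IOException {
        List<SimpleDoc> docs = new ArrayList<>();
        try (Stream<Path> files = Files.list(dir)) {
            List<Path> sorted = files.filter(Files::isRegularFile)
                                     .sorted()
                                     .collect(Collectors.toList());
            for (Path p : sorted) {
                String text = new String(Files.readAllBytes(p), StandardCharsets.UTF_8);
                docs.add(fromText(p.getFileName().toString(), text));
            }
        }
        return docs;
    }

    /** An aggregate operation over the collection: the total number of words. */
    static int totalWords(List<SimpleDoc> docs) {
        int total = 0;
        for (SimpleDoc d : docs) {
            total += d.words.size();
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("docs");
        Files.write(dir.resolve("a.txt"), "term document matrix".getBytes(StandardCharsets.UTF_8));
        Files.write(dir.resolve("b.txt"), "bag of features".getBytes(StandardCharsets.UTF_8));

        List<SimpleDoc> docs = readDirectory(dir);
        for (SimpleDoc d : docs) {
            System.out.println(d.title + ": " + d.words);
        }
        System.out.println("total words: " + totalWords(docs));
    }
}
```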
NOTE: The package name dbm is a historical artifact and will probably soon change to something like data.