Interface Summary | |
DataCollection | Interface for data collections. |
Datum | Interface for Objects which can be described by their features. |
Document | Represents a text document as a list of Words with a title. |
Featurizable | Interface for Objects that can be described by their features. |
IndexedSet | List in which no duplicate Objects may be stored |
LabeledDataCollection | Interface for hand-classified data collections. |
MatrixWrapper | Interface for a class of objects which construct and store a Matrix. |
Class Summary | |
AbstractDataCollection | Abstract Data Collection. |
BasicDatum | Basic implementation of the Datum interface that can be constructed with a Collection of features and one or more labels. |
BasicDocument | Basic implementation of Document that should be suitable for most needs. |
BOFDataMatrix | "Bag of Features" Feature Matrix. |
Context | One line, with the word as the first element and its context following it on the same line. |
Contexts | Contains methods dealing with populating a DBM given a file containing one Context per line, where the word to be disambiguated is at the top of the file |
ContextSet | A collection of word contexts that does not allow duplicate words. |
CranDocument | Stores, processes, and allows access to a Document of the format specified in the Cranfield document collection |
Cranfield | Contains methods dealing with populating a DBM given a file containing all Cranfield documents |
DataMatrix | Class with methods to construct a DataMatrix (such as a term-document matrix), from the initial Objects (such as Documents). |
DataSet | A Data Collection that does not allow duplicate data. |
DBIndexedSet | Implementation of IndexedSet which uses a List and a SimpleDatabase: the List to store index-object pairs, and the SimpleDatabase to store object-index pairs |
DSDataMatrix | DataMatrix for a DataSet, where duplicate Data are not allowed. |
FileDataCollection | DataCollection in which the Data and Features are stored in a File or Directory. |
HTMLDocument | The HTMLDocument class implements Document methods for an HTML encoded document. |
LabelMatrix | Wrapper for a Matrix whose columns are Label Vectors of Labeled Objects |
LocusLink | A DataCollection where each Data Item is a LocusLink document about a gene. |
LocusLinkDocument | A LocusLink document about a gene with LocusLink ID locusID. |
Medline | Contains methods dealing with populating a DBM given a file containing all Medline documents |
MedlineDocument | A Medline Document in Medline XML Format. |
Ohsumed | Contains methods dealing with populating a DBM given a file containing all Ohsumed documents. |
OhsumedDocument | Stores, processes, and allows access to a Document of the format specified in the Ohsumed document collection. NOTE: This needs to be converted to work with BasicDocument the way CranDocument does. |
PersistentHashList | Persistent List backed by a SimpleDatabase |
RestrictedDataMatrix | "Restricted Bag of Features" Feature Matrix. |
SimpleCollection | Just like AbstractCollection, but it implements the size() method for you in the obvious way by using the Iterator. |
USPDI | A DataCollection where each Data Item is a USPDIDocument about a drug. |
USPDIDocument | A USPDIDocument is the relevant contents of a query on the USP DI Drug Database for the drug given by drugName. |
Classes for building and operating on documents and data collections. Two of the basic interfaces are Document, for representing a document as a list of words with meta-data, and DataCollection, for representing a collection of documents. The most common document class you will probably use is BasicDocument, which provides support for constructing documents from a variety of input sources. There are several subclasses of BasicDocument that handle special file formats and additional meta-data. The most common DataCollection class you will probably use is FileDataCollection, which manages a group of files and allows you to iterate through them or work with them in aggregate.
NOTE: The package name dbm is a historical artifact and will probably change soon to something like data.
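The split between a document (a titled list of words) and a data collection (a group of documents you can iterate over or summarize) can be sketched roughly as follows. This is only an illustration of the idea: the Document and DataCollection interfaces below are simplified, hypothetical stand-ins defined locally, and SimpleDocument, InMemoryCollection, and totalWords are invented names. None of them reproduce the actual signatures of this package's Document, BasicDocument, DataCollection, or FileDataCollection.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class DocumentCollectionSketch {

    // Hypothetical stand-in for the Document idea: a titled list of words.
    // Not the package's real interface.
    interface Document {
        String title();
        List<String> words();
    }

    // Hypothetical stand-in for the DataCollection idea: an iterable group
    // of documents. Not the package's real interface.
    interface DataCollection extends Iterable<Document> {
        int size();
    }

    // Illustrative document that tokenizes its text on whitespace.
    static final class SimpleDocument implements Document {
        private final String title;
        private final List<String> words;

        SimpleDocument(String title, String text) {
            this.title = title;
            String trimmed = text.trim();
            this.words = trimmed.isEmpty()
                    ? new ArrayList<>()
                    : new ArrayList<>(Arrays.asList(trimmed.split("\\s+")));
        }

        public String title() { return title; }
        public List<String> words() { return words; }
    }

    // Illustrative in-memory collection. A file-backed collection would
    // read its documents from files instead of holding them in a list.
    static final class InMemoryCollection implements DataCollection {
        private final List<Document> docs = new ArrayList<>();

        void add(Document d) { docs.add(d); }
        public int size() { return docs.size(); }
        public Iterator<Document> iterator() { return docs.iterator(); }
    }

    // Working with the collection "in aggregate": total number of words.
    static int totalWords(DataCollection collection) {
        int total = 0;
        for (Document d : collection) {
            total += d.words().size();
        }
        return total;
    }

    public static void main(String[] args) {
        InMemoryCollection collection = new InMemoryCollection();
        collection.add(new SimpleDocument("first", "a short example document"));
        collection.add(new SimpleDocument("second", "another one"));

        // Iterate through the documents one at a time.
        for (Document d : collection) {
            System.out.println(d.title() + ": " + d.words().size() + " words");
        }
        System.out.println("total words: " + totalWords(collection));
    }
}
```

Keeping the per-document view and the aggregate view behind separate interfaces lets code such as totalWords work with any collection, whether it is held in memory or backed by files.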