Interface Summary | |
HtmlLexerConstants |
Class Summary | |
Annotator | A simple Java editor that supports the easy addition of XML-style tags to mark off portions of text. |
HtmlCleaner | HtmlCleaner removes various code elements (style, script, applet, and so on) from an HTML document. |
HtmlLexer | |
HtmlLexerTokenManager | |
PTBLexer | This class is a scanner generated by JFlex 1.3.5 on 8/5/02 12:16 AM from the specification file file:/dfs/hake/0/grow/lexer/jflex/ptblexer.flex |
SimpleCharStream | An implementation of interface CharStream, where the stream is assumed to contain only ASCII characters (without Unicode processing). |
TaggedStreamTokenizer | TaggedStreamTokenizer is similar to java.io.StreamTokenizer, except that it is better suited to deal with documents containing HTML-style tags. |
Token | Describes the input token stream. |
Exception Summary | |
ParseException | This exception is thrown when parse errors are encountered. |
Error Summary | |
TokenMgrError |
A simple markup annotator, originally designed for producing training
data for supervised information extraction systems. The main class for
doing this is Annotator.
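To make the idea of marking off portions of text with XML-style tags concrete, here is a minimal sketch of the kind of output such annotation produces. It does not use or reproduce the Annotator API; the class name TagAnnotationSketch, the method annotate, and the label "person" are all hypothetical and chosen only for illustration.

```java
// Hypothetical illustration only; this is NOT the package's Annotator class.
final class TagAnnotationSketch {

    /**
     * Wraps the characters text[start, end) in an XML-style tag pair
     * named by label, leaving the rest of the text unchanged.
     */
    static String annotate(String text, int start, int end, String label) {
        if (start < 0 || end > text.length() || start > end) {
            throw new IllegalArgumentException("span out of range");
        }
        return text.substring(0, start)
                + "<" + label + ">" + text.substring(start, end) + "</" + label + ">"
                + text.substring(end);
    }

    public static void main(String[] args) {
        String sentence = "John Smith joined Acme in 1999.";
        // Mark the first ten characters ("John Smith") as a person mention.
        System.out.println(annotate(sentence, 0, 10, "person"));
    }
}
```

Text tagged this way can serve as labeled training data, since the tag name records the class of each marked span.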
This package currently also contains a couple of tokenizers, which probably belong in a different package.
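As a rough illustration of why a tag-aware tokenizer is useful for documents containing HTML-style tags, the sketch below treats each tag as a single token and splits the remaining text on whitespace. It is an assumption-laden stand-in built on java.util.regex, not the implementation or API of TaggedStreamTokenizer; the class name TagAwareTokenizerSketch and the method tokenize are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; this is NOT TaggedStreamTokenizer.
final class TagAwareTokenizerSketch {

    // Either a whole tag ("<" ... ">") or a maximal run of characters
    // that are neither whitespace nor "<".
    private static final Pattern TOKEN = Pattern.compile("<[^>]*>|[^\\s<]+");

    /** Returns tags and words in document order; whitespace is dropped. */
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // A tag with an attribute stays together as one token.
        System.out.println(tokenize("<a href=\"x\">some link</a> text"));
    }
}
```

A plain whitespace tokenizer would split a tag such as `<a href="x">` into two pieces; keeping each tag whole is the property this sketch is meant to show. An unmatched "<" with no closing ">" is simply skipped here, which a real tokenizer would likely handle more carefully.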