java.lang.Object
  +--edu.stanford.nlp.trees.SentenceNormalizer
A class for sentence normalization. Part of the job of a SentenceNormalizer is to encode what counts as a sentence end. The default one does no normalization, but implements Penn Treebank rules for a sentence end. Other sentence normalizers will change various node labels. Another operation that a SentenceNormalizer may wish to perform is interning the Strings passed to it. A Singleton. Designed to be overridden.
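As an illustration of the interning use mentioned above, here is a minimal sketch of an overriding subclass. BaseNormalizer is a hypothetical stand-in for SentenceNormalizer, cut down to one method so the example compiles without the library; InterningNormalizer and InterningDemo are likewise names invented for this sketch.

```java
// Hypothetical stand-in for edu.stanford.nlp.trees.SentenceNormalizer,
// reduced to the single method this sketch overrides.
class BaseNormalizer {
    /** Default behaviour as documented: no normalization. */
    public String normalizeString(String word) {
        return word;
    }
}

/** Hypothetical subclass that interns every word it is handed. */
class InterningNormalizer extends BaseNormalizer {
    @Override
    public String normalizeString(String word) {
        // String.intern() returns the canonical instance from the JVM
        // string pool, so equal words share one object.
        return word.intern();
    }
}

class InterningDemo {
    public static void main(String[] args) {
        BaseNormalizer normalizer = new InterningNormalizer();
        // new String(...) forces two distinct objects with equal contents.
        String a = normalizer.normalizeString(new String("the"));
        String b = normalizer.normalizeString(new String("the"));
        System.out.println(a == b); // prints "true"
    }
}
```

With interned words, later comparisons can use reference equality, and repeated tokens are stored only once.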
Constructor Summary
SentenceNormalizer()
Method Summary
boolean endSentenceToken(String token, String prev, String next)
    Returns true if this token represents the end of a sentence.
boolean eolIsSentenceEnd()
    This function can be checked by a SentenceReader so as to know whether an end-of-line is always to be treated as an end-of-sentence.
Sentence normalizeSentence(Sentence sent, LabelFactory lf)
    Normalize a sentence. This method assumes that the argument it is passed is the whole (linguistic) Sentence.
String normalizeString(String word)
    Normalizes a word as it is read (and possibly interns it).
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Constructor Detail
public SentenceNormalizer()
Method Detail
public String normalizeString(String word)
Normalizes a word as it is read (and possibly interns it).
Parameters:
word - The word to normalize
public Sentence normalizeSentence(Sentence sent, LabelFactory lf)
Normalize a sentence. This method assumes that the argument it is passed is the whole (linguistic) Sentence. It is normally implemented as a List-walking routine. It is assumed that the unnormalized sentence can be destructively modified, as it is otherwise unneeded.
Parameters:
sent - The sentence to be normalized
lf - The LabelFactory to create new words (if needed)
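The List-walking, destructive style described for normalizeSentence might look roughly like the sketch below. It is not the library's implementation: the sentence is modelled as a plain List<String>, the LabelFactory is replaced by a UnaryOperator<String> that builds the new word, and lower-casing is an arbitrary example of a normalization. ListWalkSketch is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.ListIterator;
import java.util.function.UnaryOperator;

class ListWalkSketch {
    /**
     * Walks the list once and overwrites each word in place.
     * The wordFactory parameter is an assumed stand-in for the
     * LabelFactory argument of the documented method.
     */
    static List<String> normalizeSentence(List<String> sent,
                                          UnaryOperator<String> wordFactory) {
        ListIterator<String> it = sent.listIterator();
        while (it.hasNext()) {
            String word = it.next();
            // set() replaces the element just returned by next(), so the
            // input list itself is modified and no copy is made.
            it.set(wordFactory.apply(word));
        }
        return sent;
    }

    public static void main(String[] args) {
        List<String> sent = new ArrayList<>(Arrays.asList("The", "Cat", "Sat", "."));
        List<String> out = normalizeSentence(sent, String::toLowerCase);
        System.out.println(out);         // prints "[the, cat, sat, .]"
        System.out.println(out == sent); // prints "true": same list, modified in place
    }
}
```

Because the input list is overwritten, callers that still need the unnormalized words would have to copy the list first.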
public boolean eolIsSentenceEnd()
This function can be checked by a SentenceReader so as to know whether an end-of-line is always to be treated as an end-of-sentence. If this is true, then the endSentenceToken() function is not used.
public boolean endSentenceToken(String token, String prev, String next)
Returns true if this token represents the end of a sentence.
Parameters:
token - The String to be checked
prev - The previous token
next - The next token (lookahead)
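To make the interplay between eolIsSentenceEnd() and endSentenceToken() concrete, here is a hedged sketch of how a reader could consult the two methods. SplitSketch and its nested Normalizer are hypothetical stand-ins, not the library's SentenceReader or SentenceNormalizer, and the end-of-sentence rule used here (".", "?" or "!") is an assumption for illustration, not the Penn Treebank rules the real class implements.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SplitSketch {
    /** Hypothetical stand-in exposing the two documented query methods. */
    static class Normalizer {
        public boolean eolIsSentenceEnd() {
            return false;
        }

        public boolean endSentenceToken(String token, String prev, String next) {
            // Assumed rule for illustration only; prev and next are unused here.
            return token.equals(".") || token.equals("?") || token.equals("!");
        }
    }

    /** Groups tokenized input lines into sentences. */
    static List<List<String>> split(List<List<String>> lines, Normalizer norm) {
        List<List<String>> sentences = new ArrayList<>();
        if (norm.eolIsSentenceEnd()) {
            // Every non-empty line is a sentence; endSentenceToken() is never called.
            for (List<String> line : lines) {
                if (!line.isEmpty()) {
                    sentences.add(new ArrayList<>(line));
                }
            }
            return sentences;
        }
        // Otherwise line breaks are ignored and tokens are tested one by one.
        List<String> tokens = new ArrayList<>();
        for (List<String> line : lines) {
            tokens.addAll(line);
        }
        List<String> current = new ArrayList<>();
        for (int i = 0; i < tokens.size(); i++) {
            String token = tokens.get(i);
            String prev = i > 0 ? tokens.get(i - 1) : null;
            String next = i + 1 < tokens.size() ? tokens.get(i + 1) : null;
            current.add(token);
            if (norm.endSentenceToken(token, prev, next)) {
                sentences.add(current);
                current = new ArrayList<>();
            }
        }
        if (!current.isEmpty()) {
            sentences.add(current); // trailing tokens without a terminator
        }
        return sentences;
    }

    public static void main(String[] args) {
        List<List<String>> lines = Arrays.asList(
                Arrays.asList("It", "rained", ".", "We"),
                Arrays.asList("left", "."));
        System.out.println(split(lines, new Normalizer()));
        // prints "[[It, rained, .], [We, left, .]]"

        Normalizer lineBased = new Normalizer() {
            @Override
            public boolean eolIsSentenceEnd() {
                return true;
            }
        };
        System.out.println(split(lines, lineBased));
        // prints "[[It, rained, ., We], [left, .]]"
    }
}
```

The prev and next arguments are passed through so that an overriding normalizer could use context, for example to avoid splitting after an abbreviation.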