java.lang.Object
  |
  +--edu.stanford.nlp.trees.SentenceNormalizer
A class for sentence normalization. Part of the job of a
SentenceNormalizer is to encode what counts as a sentence end.
The default implementation performs no normalization, but it does
implement the Penn Treebank rules for a sentence end.
Other sentence normalizers will change various node labels.
Another operation that a SentenceNormalizer
may wish to perform is interning the Strings passed to
it. A Singleton. Designed to be overridden.
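As an illustration of the "designed to be overridden" point, here is a minimal sketch of a subclass that interns the strings it receives. To keep it self-contained it does not use the real class: BaseNormalizer is a hypothetical stand-in that mirrors only the documented normalizeString(String) signature, and InterningNormalizer and InterningDemo are likewise invented names.

```java
// Hypothetical stand-in for the documented class. Only normalizeString is
// mirrored here; this is NOT the library's implementation.
class BaseNormalizer {
    /** Default behaviour as described above: no normalization. */
    public String normalizeString(String word) {
        return word;
    }
}

/** Hypothetical subclass that interns every string passed to it. */
class InterningNormalizer extends BaseNormalizer {
    @Override
    public String normalizeString(String word) {
        // String.intern() returns the canonical instance for equal strings.
        return word.intern();
    }
}

class InterningDemo {
    public static void main(String[] args) {
        BaseNormalizer norm = new InterningNormalizer();
        String a = new String("NP"); // two distinct objects, equal contents
        String b = new String("NP");
        System.out.println(a == b);                                             // false
        System.out.println(norm.normalizeString(a) == norm.normalizeString(b)); // true
    }
}
```

After interning, equal labels share one instance, so later comparisons can use reference equality and repeated labels occupy less memory.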
| Constructor Summary |
| SentenceNormalizer() |
| Method Summary | | |
| boolean | endSentenceToken(String token, String prev, String next) | Returns true if this token represents the end of a sentence. |
| boolean | eolIsSentenceEnd() | A SentenceReader can check this method to learn whether an end-of-line is always to be treated as an end-of-sentence. |
| Sentence | normalizeSentence(Sentence sent, LabelFactory lf) | Normalizes a sentence. This method assumes that the argument passed to it is the whole (linguistic) Sentence. |
| String | normalizeString(String word) | Normalizes a word string that has been read (and may intern it). |
| Methods inherited from class java.lang.Object |
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Constructor Detail |
public SentenceNormalizer()
| Method Detail |
public String normalizeString(String word)
Normalizes a word string that has been read (and may intern it).
word - The word to normalize
public Sentence normalizeSentence(Sentence sent,
LabelFactory lf)
Normalizes a sentence. This method assumes that the argument passed
to it is the whole (linguistic) Sentence.
It is normally implemented as a List-walking routine. It is
assumed that the unnormalized sentence can be destructively
modified, as it is otherwise unneeded.
sent - The sentence to be normalized
lf - The LabelFactory to create new words (if needed)
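The "List-walking routine" with destructive modification can be sketched as follows. This is a sketch under stated assumptions, not the library's code: a plain List<String> stands in for Sentence, a Function<String, String> stands in for the LabelFactory, and the lowercasing rule is purely illustrative.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.ListIterator;
import java.util.Locale;
import java.util.function.Function;

class ListWalkSketch {
    /**
     * Walks the list once and overwrites each element in place.
     * 'sent' stands in for Sentence and 'makeLabel' for the LabelFactory;
     * both are assumptions made for this sketch.
     */
    static List<String> normalizeInPlace(List<String> sent,
                                         Function<String, String> makeLabel) {
        ListIterator<String> it = sent.listIterator();
        while (it.hasNext()) {
            String word = it.next();
            it.set(makeLabel.apply(word)); // destructive: replaces the element just read
        }
        return sent; // the same list instance, now normalized
    }

    public static void main(String[] args) {
        List<String> sent = new ArrayList<>(Arrays.asList("The", "Cat", "Sat"));
        // Illustrative rule only: lowercase every word.
        List<String> out = normalizeInPlace(sent, w -> w.toLowerCase(Locale.ROOT));
        System.out.println(out);         // [the, cat, sat]
        System.out.println(out == sent); // true
    }
}
```

Because the input is overwritten rather than copied, a caller that still needs the unnormalized words has to copy the list before calling.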
public boolean eolIsSentenceEnd()
A SentenceReader can check this method to learn whether an
end-of-line is always to be treated as an end-of-sentence.
If it returns true, the endSentenceToken() method is not used.
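The precedence between the two boundary methods can be illustrated with a small reader loop. Everything here is hypothetical: BoundaryRules mirrors only the two documented signatures, PeriodRules uses an invented "a period ends the sentence" rule, and passing null for a missing prev or next token is an assumption of this sketch, not documented behaviour of SentenceReader.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical stand-in mirroring the two documented boundary methods. */
interface BoundaryRules {
    boolean eolIsSentenceEnd();
    boolean endSentenceToken(String token, String prev, String next);
}

/** Illustrative rules only: a lone period ends a sentence. */
class PeriodRules implements BoundaryRules {
    private final boolean eolEnds;
    PeriodRules(boolean eolEnds) { this.eolEnds = eolEnds; }
    public boolean eolIsSentenceEnd() { return eolEnds; }
    public boolean endSentenceToken(String token, String prev, String next) {
        return ".".equals(token);
    }
}

class BoundarySketch {
    /** Splits one line into whitespace-separated tokens. */
    static List<String> tokens(String line) {
        String t = line.trim();
        if (t.isEmpty()) return Collections.emptyList();
        return Arrays.asList(t.split("\\s+"));
    }

    static List<List<String>> split(List<String> lines, BoundaryRules rules) {
        List<List<String>> sentences = new ArrayList<>();
        if (rules.eolIsSentenceEnd()) {
            // Each non-empty line is one sentence; endSentenceToken is never called.
            for (String line : lines) {
                List<String> toks = tokens(line);
                if (!toks.isEmpty()) sentences.add(new ArrayList<>(toks));
            }
            return sentences;
        }
        // Otherwise line breaks are ignored and the token rule decides.
        List<String> all = new ArrayList<>();
        for (String line : lines) all.addAll(tokens(line));
        List<String> current = new ArrayList<>();
        for (int i = 0; i < all.size(); i++) {
            String prev = i > 0 ? all.get(i - 1) : null;              // assumption: null at start
            String next = i + 1 < all.size() ? all.get(i + 1) : null; // assumption: null at end
            current.add(all.get(i));
            if (rules.endSentenceToken(all.get(i), prev, next)) {
                sentences.add(current);
                current = new ArrayList<>();
            }
        }
        if (!current.isEmpty()) sentences.add(current);
        return sentences;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("He left . She", "stayed .");
        System.out.println(split(lines, new PeriodRules(true)));  // [[He, left, ., She], [stayed, .]]
        System.out.println(split(lines, new PeriodRules(false))); // [[He, left, .], [She, stayed, .]]
    }
}
```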
public boolean endSentenceToken(String token,
String prev,
String next)
Returns true if this token represents the end of a sentence.
token - The String to be checked
prev - The previous token
next - The next token (lookahead)
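To show why the method receives prev and next as well as token, here is a hypothetical rule that uses both. It is not the Penn Treebank rule set that the default class implements; the rule (a closing quote '' after final punctuation ends the sentence instead of the punctuation itself), the class name LookaheadRuleSketch, and the helper isFinalPunct are all invented for illustration.

```java
class LookaheadRuleSketch {
    /** True for the three punctuation tokens this sketch treats as sentence-final. */
    static boolean isFinalPunct(String t) {
        return ".".equals(t) || "?".equals(t) || "!".equals(t);
    }

    /**
     * Illustrative rule, not the library's: final punctuation ends the sentence
     * unless a closing quote follows, in which case the quote ends it.
     * A null prev or next (no neighbouring token) is an assumption of this sketch.
     */
    static boolean endSentenceToken(String token, String prev, String next) {
        if (isFinalPunct(token)) {
            return !"''".equals(next); // lookahead: defer to the closing quote
        }
        return "''".equals(token) && isFinalPunct(prev); // look-behind
    }

    public static void main(String[] args) {
        System.out.println(endSentenceToken(".", "home", "''")); // false
        System.out.println(endSentenceToken("''", ".", "He"));   // true
        System.out.println(endSentenceToken(".", "home", null)); // true
        System.out.println(endSentenceToken("home", "went", ".")); // false
    }
}
```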