edu.stanford.nlp.trees
Class BobChrisTreeNormalizer

java.lang.Object
  |
  +--edu.stanford.nlp.trees.TreeNormalizer
        |
        +--edu.stanford.nlp.trees.BobChrisTreeNormalizer

public class BobChrisTreeNormalizer
extends TreeNormalizer

Normalizes trees roughly as in Manning and Carpenter (1997). NB: This implementation is still incomplete! The normalizations performed are:
  (i) terminals are interned;
  (ii) nonterminals are stripped of alternants, functional tags and cross-reference codes (by truncating at |, = or -) and then interned;
  (iii) empty elements (ones with nonterminal label "-NONE-") are deleted from the tree;
  (iv) the null label at the root node is replaced with the label "ROOT".
17 Apr 2001: This was fixed to work with different kinds of labels, by making proper use of the Label interface, after it was moved into the trees module.

The normalizations of the original (Prolog) BobChrisNormalize were:
  1. Remap the root node to be called 'ROOT'
  2. Truncate all nonterminal labels before a -, = or |
  3. Remap the representation of certain leaf symbols
  4. Map to lowercase all leaf nodes
  5. Delete empty/trace nodes (ones marked '-NONE-')
  6. Recursively delete any nodes that do not dominate any words
  7. Delete A over A nodes where the top A dominates nothing else
  8. Remove backquotes from lexical items (the Treebank inserts them to escape slashes (/) and stars (*))
Some of these are purely aesthetic, but 6 and 7 should presumably be added.

14 June 2002: The normalizer now deletes a unary A over A node when both nodes' labels are equal (7); (6) was always part of the Tree.prune() functionality.


Constructor Summary
BobChrisTreeNormalizer()
           
 
Method Summary
 String normalizeNonterminal(String category)
          Normalizes a nonterminal's contents.
 String normalizeTerminal(String leaf)
          Normalizes a leaf's contents.
 Tree normalizeWholeTree(Tree tree, TreeFactory tf)
          Normalize a whole tree -- one can assume that this is the root.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

BobChrisTreeNormalizer

public BobChrisTreeNormalizer()
Method Detail

normalizeTerminal

public String normalizeTerminal(String leaf)
Normalizes a leaf's contents. This implementation interns the leaf.

Overrides:
normalizeTerminal in class TreeNormalizer
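
To illustrate what interning a leaf buys, here is a minimal sketch. The class `InternSketch` and its method `normalizeLeaf` are hypothetical names used only for this example; they are not part of the library. The sketch relies only on the standard `String.intern()` method, which returns a canonical instance for equal strings.

```java
final class InternSketch {

    /** Hypothetical stand-in for a leaf normalizer that only interns. */
    static String normalizeLeaf(String leaf) {
        return leaf.intern();
    }

    public static void main(String[] args) {
        // Two distinct String objects with equal contents.
        String a = new String("dog");
        String b = new String("dog");

        System.out.println(a == b);                               // false
        System.out.println(normalizeLeaf(a) == normalizeLeaf(b)); // true
    }
}
```

After interning, equal leaves share one instance, so repeated words in a large treebank need not each hold their own copy of the string.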

normalizeNonterminal

public String normalizeNonterminal(String category)
Normalizes a nonterminal's contents. This implementation strips functional tags and similar annotations, then interns the nonterminal.

Overrides:
normalizeNonterminal in class TreeNormalizer
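
The truncation at -, = or | can be sketched as follows. This is an illustration under stated assumptions, not the class's actual code: `LabelStripSketch` and `stripLabel` are hypothetical names, and the special handling of a label that begins with one of these characters (so that "-NONE-" survives intact) is an assumption of this sketch.

```java
final class LabelStripSketch {

    /**
     * Truncates a category before the first '-', '=' or '|' and interns it.
     * Assumption: a label that starts with such a character (e.g. "-NONE-")
     * is kept up to and including its matching closing character.
     */
    static String stripLabel(String category) {
        if (category == null) {
            return null;
        }
        boolean openAtZero = false;
        int end = category.length();
        for (int i = 0; i < category.length(); i++) {
            char c = category.charAt(i);
            if (c == '-' || c == '=' || c == '|') {
                if (i == 0) {
                    openAtZero = true;           // label opens with a marker
                } else if (openAtZero && c == category.charAt(0)) {
                    openAtZero = false;          // matching close of that marker
                } else {
                    end = i;                     // annotation starts here
                    break;
                }
            }
        }
        return category.substring(0, end).intern();
    }

    public static void main(String[] args) {
        System.out.println(stripLabel("NP-SBJ-1"));  // NP
        System.out.println(stripLabel("NP=2"));      // NP
        System.out.println(stripLabel("ADVP|PRT"));  // ADVP
        System.out.println(stripLabel("-NONE-"));    // -NONE-
    }
}
```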

normalizeWholeTree

public Tree normalizeWholeTree(Tree tree,
                               TreeFactory tf)
Normalizes a whole tree; one can assume that this is the root. This implementation deletes empty elements (ones with nonterminal label '-NONE-') from the tree. It also works when the tree is null.

Overrides:
normalizeWholeTree in class TreeNormalizer
Parameters:
tree - The tree to be normalized
tf - the TreeFactory to create new nodes (if needed)
Returns:
the normalized tree
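
The whole-tree normalizations described for this class (deleting '-NONE-' elements, dropping nodes that dominate no words, deleting a unary A over A, and relabeling a null root as "ROOT") can be sketched on a toy tree type. Everything here is an assumption made for illustration: `WholeTreeSketch`, `Node`, `normalize` and `clean` are hypothetical names, `Node` is not the library's Tree class, and no TreeFactory is involved.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class WholeTreeSketch {

    /** Minimal stand-in for a tree node; this is NOT the library's Tree class. */
    static final class Node {
        final String label;          // null models the unlabeled root
        final List<Node> children;   // empty for a leaf (a word)

        Node(String label, List<Node> children) {
            this.label = label;
            this.children = children;
        }

        static Node of(String label, Node... kids) {
            return new Node(label, new ArrayList<>(Arrays.asList(kids)));
        }

        boolean isLeaf() {
            return children.isEmpty();
        }

        @Override
        public String toString() {
            if (isLeaf()) {
                return label;
            }
            StringBuilder sb = new StringBuilder("(").append(label);
            for (Node child : children) {
                sb.append(' ').append(child);
            }
            return sb.append(')').toString();
        }
    }

    /** Returns a normalized copy of the tree, or null for a null or empty tree. */
    static Node normalize(Node tree) {
        if (tree == null) {
            return null;
        }
        Node cleaned = clean(tree);
        if (cleaned == null) {
            return null;
        }
        // Replace the null root label with "ROOT".
        return cleaned.label == null ? new Node("ROOT", cleaned.children) : cleaned;
    }

    /** Returns a cleaned copy of n, or null if n dominates no words. */
    private static Node clean(Node n) {
        if (n.isLeaf()) {
            return Node.of(n.label);             // a word: keep it
        }
        if ("-NONE-".equals(n.label)) {
            return null;                         // empty element: delete it
        }
        List<Node> kept = new ArrayList<>();
        for (Node child : n.children) {
            Node cleanedChild = clean(child);
            if (cleanedChild != null) {
                kept.add(cleanedChild);
            }
        }
        if (kept.isEmpty()) {
            return null;                         // dominates no words: delete it
        }
        Node only = kept.get(0);
        if (kept.size() == 1 && !only.isLeaf()
                && n.label != null && n.label.equals(only.label)) {
            return only;                         // unary A over A: keep the lower A
        }
        return new Node(n.label, kept);
    }

    public static void main(String[] args) {
        Node tree = Node.of(null,
            Node.of("S",
                Node.of("NP", Node.of("-NONE-", Node.of("*"))),
                Node.of("VP", Node.of("VP", Node.of("VBD", Node.of("left"))))));
        System.out.println(normalize(tree));     // (ROOT (S (VP (VBD left))))
    }
}
```

In this example the NP disappears because its only child is an empty element, and the doubled VP collapses to a single VP.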


Stanford NLP Group