IEMan (Stanford JavaNLP API)

edu.stanford.nlp.ie.pcfg
Class IEMan

java.lang.Object
  |
  +--edu.stanford.nlp.ie.pcfg.IEMan

public class IEMan
extends Object

Information Extraction Manager. Performs information extraction. Must be fed training data. Each IEMan has its own IE parameters (e.g., smoothing ratios). Altering these parameters allows one to run comparisons between two IEMans on the same data.

Field Summary

List g
          not currently in use

double grammarP1
          the mixing ratio between gram4 (the lexicalized, tagged grammar) and gram3 (the unlexicalized, tagged grammar)

double grammarP2
          the mixing ratio between gram3 (the unlexicalized, tagged grammar) and the uniform grammar (see GLUtil)

List l


double lexiconP
          the mixing ratio between the PNPC lexicon (see PNPC, XPNPC) and the uniform lexicon (see GLUtil)

int maxTagCombinationSize
          the maximum number of tags that can be identified per sentence

int numAfterthoughts
          not used currently

boolean repeat
          If repeat is true, the program reads the PNPC lexicon from rulesFN rather than generating it again.

static boolean sly


Constructor Summary

IEMan()
          Constructs a new IEMan.

Method Summary

List GetBestTagSets(Tree tree, int numParses)
          Gets the "numParses" best tag sets corresponding to a given tree.

List GetGrammar(Tree tree)
          Gets the mixed grammar that will be used to parse this tree.

List GetLexicon(Tree tree)
          Gets the mixed lexicon that will be used to parse this tree.

static HashMap GetMap(List rules)
          Creates a hashmap that links rules (without probabilities) to their XRules (rules + probabilities).

List GetMissingGrammar(Tree tree)
          no longer used

static void main(String[] args)
          Analyzes training data and produces formatted grammars and lexicons.

void ScoreTagSet(Tree tree, XTagSet tagSet)
          Calculates the probability that this tagset corresponds to this tree.

Methods inherited from class java.lang.Object

clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Field Detail

sly

public static boolean sly

grammarP1

public double grammarP1

the mixing ratio between gram4 (the lexicalized, tagged grammar) and gram3 (the unlexicalized, tagged grammar)

grammarP2

public double grammarP2

the mixing ratio between gram3 (the unlexicalized, tagged grammar) and the uniform grammar (see GLUtil)

lexiconP

public double lexiconP

the mixing ratio between the PNPC lexicon (see PNPC, XPNPC) and the uniform lexicon (see GLUtil)
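
The mixing ratios grammarP1, grammarP2, and lexiconP can be read as interpolation weights. The following is a minimal sketch, assuming plain linear interpolation in which each ratio is the weight on the more specific model, and assuming the two grammar mixes are chained (gram3 is first backed off to the uniform grammar, then mixed with gram4). The class MixSketch and its methods are hypothetical and are not part of the IEMan API; the actual mixing is defined in the IEMan source and may differ.

```java
public class MixSketch {
    /**
     * Linearly interpolates two probabilities.
     * ASSUMPTION: ratio is the weight on the more specific model.
     */
    public static double mix(double ratio, double specific, double general) {
        if (ratio < 0.0 || ratio > 1.0) {
            throw new IllegalArgumentException("ratio must be in [0, 1]: " + ratio);
        }
        return ratio * specific + (1.0 - ratio) * general;
    }

    /**
     * ASSUMPTION: the two grammar mixes are chained. gram3 is mixed with the
     * uniform grammar using grammarP2, and the result is mixed with gram4
     * using grammarP1.
     */
    public static double mixGrammar(double grammarP1, double grammarP2,
                                    double pGram4, double pGram3, double pUniform) {
        double backedOffGram3 = mix(grammarP2, pGram3, pUniform);
        return mix(grammarP1, pGram4, backedOffGram3);
    }
}
```

Under these assumptions, a ratio of 1.0 uses only the more specific model and a ratio of 0.0 uses only the more general one, which is one way to set up comparisons between two IEMans on the same data.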

numAfterthoughts

public int numAfterthoughts

not used currently

maxTagCombinationSize

public int maxTagCombinationSize

the maximum number of tags that can be identified per sentence

repeat

public boolean repeat

If repeat is true, the program reads the PNPC lexicon from rulesFN rather than generating it again. This is only valid when parsing a single sentence (or when lexiconP == 0). Generating the PNPC lexicon takes a while (roughly a minute), so setting repeat = true can save time, but keep in mind that generating the PNPC lexicon is a one-time cost per execution.

g

public List g

not currently in use

l

public List l

Constructor Detail

IEMan

public IEMan()
      throws IOException

Constructs a new IEMan. Reads probUntaggedGrammar, probGrammar, probHeadGrammar, and listLexicon from their respective files. Assumes that these files have already been written (see IEMan.main).

Method Detail

GetMap

public static HashMap GetMap(List rules)

Creates a hashmap that links rules (without probabilities) to their XRules (rules + probabilities). Currently uses LookupRules as keys, although this should probably be updated (see LookupRule).
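
The idea of keying a rule-plus-probability object by its probability-free form can be sketched as follows. This is an illustration only: ProbRule is a hypothetical stand-in for XRule, and a plain String stands in for the LookupRule key; neither reflects the real classes' fields or constructors.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class RuleMapSketch {
    /** Hypothetical stand-in for an XRule: a rule plus its probability. */
    public static final class ProbRule {
        public final String rule;  // rule text without probability, e.g. "NP -> DT NN"
        public final double p;     // probability attached to the rule

        public ProbRule(String rule, double p) {
            this.rule = rule;
            this.p = p;
        }
    }

    /** Maps each rule's probability-free form to the full rule object. */
    public static Map<String, ProbRule> buildMap(List<ProbRule> rules) {
        Map<String, ProbRule> map = new HashMap<>();
        for (ProbRule r : rules) {
            map.put(r.rule, r);  // a later duplicate overwrites an earlier one
        }
        return map;
    }
}
```

Looking up a rule by its bare form then returns the object that carries its probability, without scanning the whole rule list.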

GetGrammar

public List GetGrammar(Tree tree)
                throws IOException

Gets the mixed grammar that will be used to parse this tree. See code for mixing details.

Throws: IOException

GetLexicon

public List GetLexicon(Tree tree)
                throws IOException

Gets the mixed lexicon that will be used to parse this tree. See code for mixing details.

Throws: IOException

ScoreTagSet

public void ScoreTagSet(Tree tree,
                        XTagSet tagSet)

Calculates the probability that this tagset corresponds to this tree. Sets tagSet.p (see XTagSet).

GetBestTagSets

public List GetBestTagSets(Tree tree,
                           int numParses)
                    throws IOException

Gets the "numParses" best tag sets corresponding to a given tree ("best" is determined by the parse probability). Assumes that the sentence has already been parsed and lexical information has already been added to the tree.

Throws: IOException
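
Selecting the "numParses" best candidates by probability can be sketched as a sort followed by a truncation. This is a minimal sketch under stated assumptions: ScoredTagSet is a hypothetical stand-in for XTagSet that only carries a label and the probability p, and the candidates are assumed to be scored already. How IEMan actually enumerates and ranks tag sets is not documented here.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class BestTagSetsSketch {
    /** Hypothetical stand-in for XTagSet: a label plus its probability p. */
    public static final class ScoredTagSet {
        public final String label;
        public final double p;

        public ScoredTagSet(String label, double p) {
            this.label = label;
            this.p = p;
        }
    }

    /** Returns up to numParses candidates, highest probability first. */
    public static List<ScoredTagSet> best(List<ScoredTagSet> candidates, int numParses) {
        List<ScoredTagSet> sorted = new ArrayList<>(candidates);  // leave the input untouched
        sorted.sort(Comparator.comparingDouble((ScoredTagSet t) -> t.p).reversed());
        int n = Math.min(Math.max(numParses, 0), sorted.size());
        return new ArrayList<>(sorted.subList(0, n));
    }
}
```

If numParses exceeds the number of candidates, the sketch simply returns all of them in descending order of probability.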

GetMissingGrammar

public List GetMissingGrammar(Tree tree)

no longer used

main

public static void main(String[] args)
                 throws IOException

Analyzes training data and produces formatted grammars and lexicons. The user must run IEMan before performing information extraction. Running "IEMan 2" (i.e., IEMan.main(new String[] {"2"})) will produce a non-tagged, non-lexicalized grammar. Running "IEMan 3" will produce a tagged, non-lexicalized grammar and lexicon. Running "IEMan 4" will produce a tagged, lexicalized grammar. All three steps must be completed before doing any information extraction. NOTE: You might imagine that all three steps could be run in one execution, but I had problems running ExtractPTBRules.main more than once in a single execution.

Throws: IOException
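
Because each training step needs its own execution, one way to automate the three runs is to launch a separate JVM per step. The following is a sketch only: TrainStepsSketch and its methods are hypothetical, the classpath argument must point at wherever the compiled Stanford JavaNLP classes live, and the order 2, 3, 4 is an assumption (the documentation only says that all three steps must be completed).

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

public class TrainStepsSketch {
    /** Builds the command line that runs one IEMan step in a fresh JVM. */
    public static List<String> command(String classpath, String step) {
        return Arrays.asList("java", "-cp", classpath,
                "edu.stanford.nlp.ie.pcfg.IEMan", step);
    }

    /** Runs steps "2", "3", and "4", each in its own process, stopping on failure. */
    public static void runAll(String classpath) throws IOException, InterruptedException {
        for (String step : new String[] {"2", "3", "4"}) {
            int exit = new ProcessBuilder(command(classpath, step))
                    .inheritIO()   // show the child's output on this console
                    .start()
                    .waitFor();
            if (exit != 0) {
                throw new IOException("IEMan step " + step + " exited with code " + exit);
            }
        }
    }
}
```

Running each step in its own process sidesteps the problem of calling ExtractPTBRules.main more than once in a single execution, at the cost of starting a JVM three times.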


Stanford NLP Group
