java.lang.Object
  |
  +--edu.stanford.nlp.ie.merge.AbstractInstanceMerger
        |
        +--edu.stanford.nlp.ie.merge.GenericMerger
Provides basic merging functionality.
Field Summary | |
protected HashMap |
ignoredFields
stores the names of fields that don't participate in the merge |
protected HashMap |
ignoredPenalties
stores the names of fields that are allowed to conflict |
Constructor Summary | |
GenericMerger()
Empty constructor; subclasses can use the constructor to make calls to ignoreField and suppressConflictPenalty |
Method Summary | |
boolean |
compatibleConcept(edu.unika.aifb.kaon.Concept c)
By default, compatible with all concepts. |
protected void |
concatFields(edu.unika.aifb.kaon.Relation r,
edu.unika.aifb.kaon.Instance i1,
edu.unika.aifb.kaon.Instance i2,
Confidence c1,
Confidence c2,
edu.unika.aifb.kaon.Instance newInstance,
Confidence newConfidence)
For the specified relation, takes the value from each of the two instances and concatenates them, delimited by a comma and a space. |
edu.unika.aifb.kaon.Instance |
getBestInstance(edu.unika.aifb.kaon.Instance[] instances,
Confidence[] confidences)
Calls getMergedInstances and returns the best one. |
double |
getConflictPenalty()
Gets the conflict penalty, which is assessed when the same field in two different instances doesn't match. |
void |
getMergedInstances(Vector instances,
Vector confidences)
Finds the best Instance according to getRank. |
protected double |
getMergedRank(edu.unika.aifb.kaon.Instance i1,
edu.unika.aifb.kaon.Instance i2,
Confidence c1,
Confidence c2)
The rank of a merge is the rank of the Instance resulting from the union of the fields of the two input Instances. |
double |
getMergePenalty()
Gets the merge penalty, which is assessed every time two Instances are merged. |
protected double |
getRank(edu.unika.aifb.kaon.Instance i,
Confidence c)
Rank of an instance is the sum of the confidences for each field, as specified by the Confidence object. |
double |
getVacuousMergePenalty()
Gets the "vacuous merge" penalty, which is assessed when a merged instance would contain no values from one of the original instances. |
void |
ignoreField(String fieldName)
For the purposes of ranking, a relational field whose name matches the one passed into this method is ignored. |
protected edu.unika.aifb.kaon.Instance |
merge(edu.unika.aifb.kaon.Instance i1,
edu.unika.aifb.kaon.Instance i2,
Confidence c1,
Confidence c2,
Confidence resultingConfidence)
Merges two Instances as described by getMergedRank. |
protected boolean |
mergeInstances(int index,
Vector instances,
Vector confidences)
Given the index of the starting instance, compares all subsequent instances, looking for the best merge; returns false if no merge is found. |
protected void |
reconcileConflictedField(edu.unika.aifb.kaon.Relation r,
edu.unika.aifb.kaon.Instance i1,
edu.unika.aifb.kaon.Instance i2,
Confidence c1,
Confidence c2,
edu.unika.aifb.kaon.Instance newInstance,
Confidence newConfidence)
Similar to reconcileIgnoredFields except it handles fields specified by suppressConflictPenalty. |
protected void |
reconcileIgnoredFields(edu.unika.aifb.kaon.Instance i1,
edu.unika.aifb.kaon.Instance i2,
Confidence c1,
Confidence c2,
edu.unika.aifb.kaon.Instance newInstance,
Confidence newConfidence)
When two instances are merged, fields specified by ignoreField are ignored until the end, when this method is called. |
void |
setConflictPenalty(double penalty)
Sets the conflict penalty. |
void |
setMergePenalty(double penalty)
Sets the merge penalty. |
protected void |
setOneField(edu.unika.aifb.kaon.Relation r,
edu.unika.aifb.kaon.Instance i1,
edu.unika.aifb.kaon.Instance i2,
Confidence c1,
Confidence c2,
edu.unika.aifb.kaon.Instance newInstance,
Confidence newConfidence)
For the specified relation, takes the value from the instance with the higher confidence, or i1's value if it's a tie. |
void |
setVacuousMergePenalty(double penalty)
Sets the vacuous merge penalty. |
protected void |
sortInstances(Vector instances,
Vector confidences)
Sorts the instances Vector by rank, sorting the confidences Vector in parallel. |
void |
suppressConflictPenalty(String fieldName)
Exempts a particular field from conflict penalties -- that is, when ranking a merger between two Instances, if they disagree on a particular field, a conflict penalty for that mismatch is not assessed if this method was called for that field. |
Methods inherited from class edu.stanford.nlp.ie.merge.AbstractInstanceMerger |
isEmpty, storeMerger |
Methods inherited from class java.lang.Object |
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
protected HashMap ignoredFields
protected HashMap ignoredPenalties
Constructor Detail |
public GenericMerger()
Method Detail |
public boolean compatibleConcept(edu.unika.aifb.kaon.Concept c)
c - the Concept to be checked against the Merger
public double getConflictPenalty()
public void setConflictPenalty(double penalty)
penalty - the new penalty value
public double getMergePenalty()
public void setMergePenalty(double penalty)
penalty - the new penalty value
public double getVacuousMergePenalty()
Gets the "vacuous merge" penalty, which is assessed when a merged instance would contain no values from one of the original instances. This is determined in getMergedRank, so fields exempted by either ignoreField or suppressConflictPenalty do not contribute toward deciding whether a merge is vacuous. The default is 10, but it could conceivably be set to 0 if appropriate. Note that if a normal mergePenalty is specified, it is also assessed for vacuous merges.
public void setVacuousMergePenalty(double penalty)
penalty - the new penalty value
public void ignoreField(String fieldName)
For the purposes of ranking, a relational field whose name matches the one passed into this method is ignored. Calling gm.ignoreField("Address") means that the rank of an Instance or the rank of a merged Instance does not depend on the confidence ranking or contents of the Address field. The method reconcileIgnoredFields, which can contain code that deals with ignored fields accordingly, is called during the actual merger of two instances.
public void suppressConflictPenalty(String fieldName)
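Exempts a particular field from conflict penalties: when ranking a merger between two Instances that disagree on the field, no conflict penalty is assessed for that mismatch. The method reconcileConflictedField is called during the merger of two instances to handle the merging behavior when a conflict arises for one of these fields.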
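A minimal sketch of how a subclass might register these exemptions in its constructor, as the constructor summary suggests. The base class SketchMerger is a hypothetical stand-in that only records field names (in Sets rather than the HashMaps listed above); the real GenericMerger depends on KAON and Stanford NLP classes that are not reproduced here. AddressBookMerger and the field names are illustrative.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for GenericMerger: it only records the exempted field names.
class SketchMerger {
    protected final Set<String> ignoredFields = new HashSet<String>();
    protected final Set<String> ignoredPenalties = new HashSet<String>();

    // Mirrors ignoreField: the named field takes no part in ranking.
    public void ignoreField(String fieldName) {
        ignoredFields.add(fieldName);
    }

    // Mirrors suppressConflictPenalty: the named field may conflict without penalty.
    public void suppressConflictPenalty(String fieldName) {
        ignoredPenalties.add(fieldName);
    }
}

// Illustrative subclass that sets up its exemptions in the constructor.
class AddressBookMerger extends SketchMerger {
    AddressBookMerger() {
        ignoreField("Address");
        suppressConflictPenalty("Phone");
    }

    public static void main(String[] args) {
        AddressBookMerger gm = new AddressBookMerger();
        System.out.println("ignored: " + gm.ignoredFields);
        System.out.println("conflicts allowed: " + gm.ignoredPenalties);
    }
}
```

Registering the exemptions in the constructor keeps the configuration in one place, so every merger created from the subclass ranks instances the same way.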
fieldName - the name of the Relation
protected void reconcileIgnoredFields(edu.unika.aifb.kaon.Instance i1, edu.unika.aifb.kaon.Instance i2, Confidence c1, Confidence c2, edu.unika.aifb.kaon.Instance newInstance, Confidence newConfidence)
When two instances are merged, fields specified by ignoreField are ignored until the end, when this method is called. Ignored fields are concatenated together, delimited by a comma and space, and the higher of the two confidence values is used. Overriding subclasses could instead use some general or specific criteria to choose one of the values. Changes are made to newInstance and newConfidence.
i1 - the first instance to merge
i2 - the other instance to merge
c1 - the confidence object corresponding to i1
c2 - the confidence object corresponding to i2
newInstance - the Instance so far from merging i1 and i2
newConfidence - the Confidence object so far for newInstance
protected void reconcileConflictedField(edu.unika.aifb.kaon.Relation r, edu.unika.aifb.kaon.Instance i1, edu.unika.aifb.kaon.Instance i2, Confidence c1, Confidence c2, edu.unika.aifb.kaon.Instance newInstance, Confidence newConfidence)
Similar to reconcileIgnoredFields except it handles fields specified by suppressConflictPenalty. Note that this method deals with such fields only if a conflict arises. Unlike reconcileIgnoredFields, it does not operate in bulk. The two values for the conflicting field are concatenated together, delimited by a comma and space, and the higher of the two confidence values is used. Overriding subclasses could instead use some general or specific criteria to choose one of the values. Changes are made to newInstance and newConfidence.
r - the relation corresponding to the field with the conflict
i1 - the first instance to merge
i2 - the other instance to merge
c1 - the confidence object corresponding to i1
c2 - the confidence object corresponding to i2
newInstance - the Instance so far from merging i1 and i2
newConfidence - the Confidence object so far for newInstance
protected void concatFields(edu.unika.aifb.kaon.Relation r, edu.unika.aifb.kaon.Instance i1, edu.unika.aifb.kaon.Instance i2, Confidence c1, Confidence c2, edu.unika.aifb.kaon.Instance newInstance, Confidence newConfidence)
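For the specified relation, takes the value from each of the two instances and concatenates them, delimited by a comma and a space. Intended for use by the reconcile* methods.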
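The concatenation behavior can be sketched as follows. This is an illustration under stated assumptions, not the library's code: plain Maps keyed by field name stand in for the KAON Instance and the Confidence object, both instances are assumed to hold a value for the field, and keeping the higher confidence is borrowed from the reconcile* descriptions rather than documented for concatFields itself.

```java
import java.util.HashMap;
import java.util.Map;

class ConcatSketch {
    // Joins the two values for one field as "value1, value2" and keeps the
    // higher of the two confidences. Maps stand in for Instance and Confidence.
    static void concatField(String field,
                            Map<String, String> i1, Map<String, String> i2,
                            Map<String, Double> c1, Map<String, Double> c2,
                            Map<String, String> newInstance,
                            Map<String, Double> newConfidence) {
        newInstance.put(field, i1.get(field) + ", " + i2.get(field));
        newConfidence.put(field, Math.max(c1.get(field), c2.get(field)));
    }

    public static void main(String[] args) {
        Map<String, String> i1 = new HashMap<String, String>();
        Map<String, String> i2 = new HashMap<String, String>();
        Map<String, Double> c1 = new HashMap<String, Double>();
        Map<String, Double> c2 = new HashMap<String, Double>();
        i1.put("Address", "12 Elm St");
        i2.put("Address", "12 Elm Street");
        c1.put("Address", 0.5);
        c2.put("Address", 0.75);

        Map<String, String> merged = new HashMap<String, String>();
        Map<String, Double> mergedConf = new HashMap<String, Double>();
        concatField("Address", i1, i2, c1, c2, merged, mergedConf);
        System.out.println(merged.get("Address") + " @ " + mergedConf.get("Address"));
    }
}
```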
r - the relation corresponding to the field to merge
i1 - the first instance to merge
i2 - the other instance to merge
c1 - the confidence object corresponding to i1
c2 - the confidence object corresponding to i2
newInstance - the Instance so far from merging i1 and i2
newConfidence - the Confidence object so far for newInstance
protected void setOneField(edu.unika.aifb.kaon.Relation r, edu.unika.aifb.kaon.Instance i1, edu.unika.aifb.kaon.Instance i2, Confidence c1, Confidence c2, edu.unika.aifb.kaon.Instance newInstance, Confidence newConfidence)
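For the specified relation, takes the value from the instance with the higher confidence, or i1's value if it's a tie. Intended for use by the reconcile* methods.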
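The selection rule can be sketched as below. Maps keyed by field name are hypothetical stand-ins for Instance and Confidence, both instances are assumed to hold the field, and recording the winning confidence in newConfidence is an assumption, since the description only says which value is taken.

```java
import java.util.HashMap;
import java.util.Map;

class SetOneFieldSketch {
    // Copies the value with the higher confidence into newInstance;
    // a tie goes to i1, as the description states.
    static void setOneField(String field,
                            Map<String, String> i1, Map<String, String> i2,
                            Map<String, Double> c1, Map<String, Double> c2,
                            Map<String, String> newInstance,
                            Map<String, Double> newConfidence) {
        boolean useFirst = c1.get(field) >= c2.get(field);
        newInstance.put(field, useFirst ? i1.get(field) : i2.get(field));
        // Assumption: the confidence of the chosen value is carried over.
        newConfidence.put(field, useFirst ? c1.get(field) : c2.get(field));
    }

    public static void main(String[] args) {
        Map<String, String> i1 = new HashMap<String, String>();
        Map<String, String> i2 = new HashMap<String, String>();
        Map<String, Double> c1 = new HashMap<String, Double>();
        Map<String, Double> c2 = new HashMap<String, Double>();
        i1.put("Phone", "555-0100");
        i2.put("Phone", "555-0199");
        c1.put("Phone", 0.25);
        c2.put("Phone", 0.75);

        Map<String, String> merged = new HashMap<String, String>();
        Map<String, Double> mergedConf = new HashMap<String, Double>();
        setOneField("Phone", i1, i2, c1, c2, merged, mergedConf);
        System.out.println(merged.get("Phone") + " @ " + mergedConf.get("Phone"));
    }
}
```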
r - the relation corresponding to the field to merge
i1 - the first instance to merge
i2 - the other instance to merge
c1 - the confidence object corresponding to i1
c2 - the confidence object corresponding to i2
newInstance - the Instance so far from merging i1 and i2
newConfidence - the Confidence object so far for newInstance
protected double getRank(edu.unika.aifb.kaon.Instance i, Confidence c)
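Rank of an instance is the sum of the confidences for each field, as specified by the Confidence object. Fields specified by ignoreField are not included in the calculation.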
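As a rough sketch of this calculation, assuming the Confidence object can be modeled as a Map from field name to a numeric confidence (a hypothetical simplification), the rank is the sum over all fields that were not passed to ignoreField:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class RankSketch {
    // Sums the per-field confidences, skipping any field named in ignoredFields.
    static double rank(Map<String, Double> confidence, Set<String> ignoredFields) {
        double sum = 0.0;
        for (Map.Entry<String, Double> e : confidence.entrySet()) {
            if (!ignoredFields.contains(e.getKey())) {
                sum += e.getValue();
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        Map<String, Double> confidence = new HashMap<String, Double>();
        confidence.put("Name", 0.5);
        confidence.put("Phone", 0.25);
        confidence.put("Address", 0.75);

        Set<String> ignored = new HashSet<String>();
        ignored.add("Address"); // as if ignoreField("Address") had been called

        System.out.println("rank = " + rank(confidence, ignored));
    }
}
```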
getRank in class AbstractInstanceMerger
i - the Instance to rank
c - the Confidence object describing the Instance
protected double getMergedRank(edu.unika.aifb.kaon.Instance i1, edu.unika.aifb.kaon.Instance i2, Confidence c1, Confidence c2)
The rank of a merge is the rank of the Instance resulting from the union of the fields of the two input Instances. Each field on which the two Instances do not match incurs the conflictPenalty. Fields specified by ignoreField are not included in the calculation.
Past conflictPenalties are taken into account and are reflected in the current merged rank as well. However, there is no memory of which particular conflicting fields caused the previous conflictPenalties, so a given field can cause several penalties over the course of several mergers.
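One way to picture the merged rank is the sketch below. It is an approximation under several assumptions: Maps stand in for Instance and Confidence, a field present in both inputs contributes the higher of its two confidences, and only the conflict penalty is modeled (the merge penalty, the vacuous merge penalty, and carried-over past penalties are left out). The penalty value is passed in because the default is not stated here.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class MergedRankSketch {
    // Ranks the union of two records: ignored fields are skipped, and a
    // mismatched field costs conflictPenalty unless it is in ignoredPenalties.
    static double mergedRank(Map<String, String> i1, Map<String, String> i2,
                             Map<String, Double> c1, Map<String, Double> c2,
                             Set<String> ignoredFields, Set<String> ignoredPenalties,
                             double conflictPenalty) {
        Set<String> fields = new TreeSet<String>(i1.keySet());
        fields.addAll(i2.keySet());
        double rank = 0.0;
        for (String f : fields) {
            if (ignoredFields.contains(f)) {
                continue;
            }
            boolean in1 = i1.containsKey(f);
            boolean in2 = i2.containsKey(f);
            if (in1 && in2) {
                rank += Math.max(c1.get(f), c2.get(f)); // assumption: keep the higher confidence
                if (!i1.get(f).equals(i2.get(f)) && !ignoredPenalties.contains(f)) {
                    rank -= conflictPenalty;
                }
            } else if (in1) {
                rank += c1.get(f);
            } else {
                rank += c2.get(f);
            }
        }
        return rank;
    }

    public static void main(String[] args) {
        Map<String, String> i1 = new HashMap<String, String>();
        Map<String, String> i2 = new HashMap<String, String>();
        Map<String, Double> c1 = new HashMap<String, Double>();
        Map<String, Double> c2 = new HashMap<String, Double>();
        i1.put("Name", "Ann");  c1.put("Name", 0.5);
        i1.put("Phone", "555"); c1.put("Phone", 0.25);
        i2.put("Name", "Ann");  c2.put("Name", 0.75);
        i2.put("Phone", "556"); c2.put("Phone", 0.5);
        System.out.println(mergedRank(i1, i2, c1, c2,
                new HashSet<String>(), new HashSet<String>(), 1.0));
    }
}
```

Comparing this value against the rank of each input before merging is what lets a caller decide whether the merge is worthwhile.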
getMergedRank in class AbstractInstanceMerger
i1 - one of the Instances
i2 - the other Instance
c1 - the Confidence object corresponding to i1
c2 - the Confidence object corresponding to i2
protected edu.unika.aifb.kaon.Instance merge(edu.unika.aifb.kaon.Instance i1, edu.unika.aifb.kaon.Instance i2, Confidence c1, Confidence c2, Confidence resultingConfidence)
Merges two Instances as described by getMergedRank. Ignored fields specified by ignoredFields are dealt with according to reconcileIgnoredFields. This method is not destructive to either input instance or either input confidence. It forces the merge of the two instances, regardless of how bad the resulting instance is, so getMergedRank should probably be called first to determine whether the merge is reasonable.
merge in class AbstractInstanceMerger
i1 - the first instance to merge
i2 - the other instance to merge
c1 - the confidence object corresponding to i1
c2 - the confidence object corresponding to i2
resultingConfidence - an instantiated but empty Confidence object that will store the confidence information of the resulting merged instance
public edu.unika.aifb.kaon.Instance getBestInstance(edu.unika.aifb.kaon.Instance[] instances, Confidence[] confidences)
Calls getMergedInstances and returns the best one.
instances - an array of instances
confidences - an array of confidences indexed parallel to the array of instances
public void getMergedInstances(Vector instances, Vector confidences)
Finds the best Instance according to getRank. Then finds the best merger with that Instance, provided that the resulting Instance does not have a lower score than the original best Instance. Keeps attempting to merge with other Instances while the score doesn't go down. The same process is applied to the next best unmerged Instance, and so forth, until there are no more productive mergers. The resulting merged Instances and any leftover unmerged Instances are returned as the only elements in the input instances Vector, and their confidences are returned in the parallel confidences Vector.
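The greedy procedure described above can be sketched in simplified form. Everything here is a stand-in chosen for illustration: records are Maps of field name to value, every field has confidence 1.0 (so rank is the field count), the conflict penalty of 2.0 is arbitrary, and past penalties are not carried forward as the real class does. The parallel confidences Vector is omitted for brevity.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class GreedyMergeSketch {
    static final double CONFLICT_PENALTY = 2.0; // illustrative value, not the library default

    // One point per field, since every field has confidence 1.0 in this sketch.
    static double rank(Map<String, String> rec) {
        return rec.size();
    }

    // Union of the fields; on a conflict the value from a is kept.
    static Map<String, String> merge(Map<String, String> a, Map<String, String> b) {
        Map<String, String> m = new HashMap<String, String>(b);
        m.putAll(a);
        return m;
    }

    // Rank of the union, minus a penalty for each field whose values differ.
    static double mergedRank(Map<String, String> a, Map<String, String> b) {
        double r = merge(a, b).size();
        for (Map.Entry<String, String> e : a.entrySet()) {
            String other = b.get(e.getKey());
            if (other != null && !other.equals(e.getValue())) {
                r -= CONFLICT_PENALTY;
            }
        }
        return r;
    }

    static void sortByRank(List<Map<String, String>> recs) {
        Collections.sort(recs, new Comparator<Map<String, String>>() {
            public int compare(Map<String, String> x, Map<String, String> y) {
                return Double.compare(rank(y), rank(x)); // decreasing rank
            }
        });
    }

    // Replaces the list contents with merged and leftover records, best first.
    static void mergeAll(List<Map<String, String>> recs) {
        sortByRank(recs);
        for (int i = 0; i < recs.size(); i++) {
            boolean merged = true;
            while (merged) {
                merged = false;
                int best = -1;
                double bestRank = rank(recs.get(i)); // a merge must not lower the score
                for (int j = i + 1; j < recs.size(); j++) {
                    double r = mergedRank(recs.get(i), recs.get(j));
                    if (r >= bestRank) {
                        bestRank = r;
                        best = j;
                    }
                }
                if (best >= 0) {
                    recs.set(i, merge(recs.get(i), recs.get(best)));
                    recs.remove(best);
                    merged = true;
                }
            }
        }
        sortByRank(recs);
    }

    public static void main(String[] args) {
        List<Map<String, String>> recs = new ArrayList<Map<String, String>>();
        Map<String, String> r1 = new HashMap<String, String>();
        r1.put("name", "Ann");
        r1.put("phone", "555");
        Map<String, String> r2 = new HashMap<String, String>();
        r2.put("name", "Ann");
        r2.put("email", "a@x.org");
        Map<String, String> r3 = new HashMap<String, String>();
        r3.put("name", "Bob");
        recs.add(r1);
        recs.add(r2);
        recs.add(r3);
        mergeAll(recs);
        System.out.println(recs.size() + " records remain");
    }
}
```

In this example the two compatible records are merged, while the third is left over because merging it would cost more than it adds.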
instances - a Vector of Instances to be merged; during the run of this method, this Vector is cleared, and when the method returns, the Vector will contain the resulting merged and leftover Instances in decreasing rank order
confidences - a Vector of Confidences parallel to the instances that will be treated the same way as the instances Vector and remain parallel to its contents
protected void sortInstances(Vector instances, Vector confidences)
protected boolean mergeInstances(int index, Vector instances, Vector confidences)