2.1.4.1 Example custom comparison

Custom comparisons may be added to the match library. They are defined in widgets.xml in the same way as processors (widgets). The only restriction is that a comparison must have exactly two inputs and one output. The output must be either a string (for Boolean comparisons) or a number (for comparisons that use Result Bands). Boolean comparisons return "T" for True or "F" for False.
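
As an illustration of the Boolean contract described above, the following is a minimal sketch of a comparison with two inputs and a single string output. The function name exactMatch is hypothetical, and treating a null input as an empty string is an assumption made for this sketch rather than documented product behavior.

```javascript
// Hypothetical helper illustrating the Boolean comparison contract:
// exactly two inputs, one string output that is either "T" or "F".
function exactMatch(input1, input2) {
  // ASSUMPTION: a null or undefined input is treated as an empty string.
  var a = (input1 == null) ? "" : String(input1);
  var b = (input2 == null) ? "" : String(input2);
  return a === b ? "T" : "F";
}
```

A comparison that uses Result Bands would return a number instead of "T" or "F".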

Each custom comparison must be associated with an identifier type: either an existing type (String, Number or Date) or a custom type (see Example custom identifier type).

Associating comparison gadgets with identifier types

Each comparison gadget must be associated with the Identifier types it can be used with. To associate a new comparison with an existing system Identifier type, use one of the following names:

dnm:string for Strings

dnm:number for Numbers

dnm:date for Dates

The following example XML shows a comparison association added to matchlibrary.xml:

  <identifierComparison>
    <ident>dnm:string</ident>
    <gadget>dnm:exactstringmatch</gadget>
  </identifierComparison>

This associates the identifier "dnm:string" with the comparison "dnm:exactstringmatch".

Setting default result bands for comparisons

The following XML shows the default result bands added to matchlibrary.xml for the 'String Edit Distance' comparison:

  <comparisonReturn>
    <widgetId>dnm:stringeditdistance</widgetId>
    <resultBand name="exact" label="Exact Match">0</resultBand>
    <resultBand name="onetypo" label="One Typo">1</resultBand>
    <resultBand name="twotypos" label="Two Typos">2</resultBand>
    <resultBand name="threetypos" label="Three Typos">3</resultBand>
  </comparisonReturn>
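
To make the idea of a numeric comparison result concrete, the following sketch computes a standard Levenshtein edit distance and looks up a band name using the values from the example above. It is an illustration only: the function names editDistance and bandFor are hypothetical, the built-in 'String Edit Distance' comparison may be implemented differently, and the rule that a result falls into the first band whose value it does not exceed is an assumption.

```javascript
// Standard dynamic-programming Levenshtein distance
// (insertions, deletions and substitutions each cost 1).
function editDistance(a, b) {
  var prev = [];
  for (var j = 0; j <= b.length; j++) prev[j] = j;
  for (var i = 1; i <= a.length; i++) {
    var curr = [i];
    for (var k = 1; k <= b.length; k++) {
      var cost = a.charAt(i - 1) === b.charAt(k - 1) ? 0 : 1;
      curr[k] = Math.min(prev[k] + 1, curr[k - 1] + 1, prev[k - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Band names and values copied from the comparisonReturn example above.
var bands = [
  { name: "exact", value: 0 },
  { name: "onetypo", value: 1 },
  { name: "twotypos", value: 2 },
  { name: "threetypos", value: 3 }
];

// ASSUMPTION: a result belongs to the first band whose value it does not
// exceed; results larger than every band value fall into no band (null).
function bandFor(result) {
  for (var i = 0; i < bands.length; i++) {
    if (result <= bands[i].value) return bands[i].name;
  }
  return null;
}
```

Under these assumptions, a plain edit distance counts a swapped pair of adjacent characters as two edits, which is one motivation for a dedicated transposition comparison.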

Complete Example

The following example files may be packaged in a JAR file and used to add a custom 'Character Transposition Match' comparison to the match library. The Character Transposition Match comparison matches strings in which adjacent characters have been transposed. For example, when comparing the values 'Michael' and 'Micheal', a single transposition is counted, so the two values match if the Maximum allowed transpositions option is set to 1 or higher:

Example 2-1 matchlibrary.xml

<?xml version="1.0" encoding="UTF-8"?>
<!--
Custom Match Library Extension
Copyright 2008 Oracle Ltd. All rights reserved.
 -->
<matchLibrary>
  <identifierComparison>
    <ident>dnm:string</ident>
    <gadget>dn:characterTranspositionMatch</gadget>
  </identifierComparison>
</matchLibrary>

Example 2-2 widgets.xml

<?xml version="1.0" encoding="UTF-8"?>
<widgets>
  <comment>Oracle Match example script widgets</comment>
  <copyright>Copyright 2008 Oracle Ltd. All rights reserved.</copyright>
  <widget id="dn:characterTranspositionMatch" class="com.datanomic.director.match.library.util.JavaScriptGadget">
    <guidata>
      <label>%characterTranspositionMatch.gadget</label>
      <group>compare</group>
      <icon>script</icon>
 
    </guidata>

    <!-- inputs -->
    <inputs>
 
      <input id="1" type="string" maxattributes="1">
        <guidata><label>label1</label></guidata>
      </input>
 
      <input id="2" type="string" maxattributes="1">
        <guidata><label>label1</label></guidata>
      </input>
    </inputs>
 
    <!-- outputs -->
    <outputs cardinality="1:1">
      <output id="1" type="string" name="result">
        <guidata><label>resultlabel</label></guidata>
      </output>
    </outputs>
    <properties>
 
      <property name="matchNoDataPairs" type="boolean" required="true">
        <guidata>
        <label>%characterTranspositionMatch.property.matchNoDataPairs.label</label>
        </guidata>
        <default>false</default>
      </property>
      
      <property name="ignoreCase" type="boolean" required="true">
        <guidata>
        <label>%characterTranspositionMatch.property.ignoreCase.label</label>
        </guidata>
        <default>true</default>
      </property>

      <property name="startsWith" type="boolean" required="true">
        <guidata>
        <label>%characterTranspositionMatch.property.startsWith.label</label>
        </guidata>
        <default>false</default>
      </property>
      <property name="maxAllowedTranspositions" type="number" required="true">
        <guidata>
        <label>%characterTranspositionMatch.property.maxAllowedTranspositions.label</label>
        </guidata>
        <default>1</default>
      </property>
    </properties>
    <parameters>
      <parameter name="script">
<![CDATA[
function S(s)
{
  // treat a null input as an empty string
  return (s == null) ? "" : s;
}

function doit()
{
  // no data pairs
  if (S(input1) == "" || S(input2) == "")
  {
    if (matchNoDataPairs)
      output1 = "T";
    else
      output1 = "F";
    return;
  }

  // unless 'starts with' matching is enabled, the lengths must be equal
  if (!startsWith)
  {
    if (input1.length != input2.length)
    {
      output1 = "F";
      return;
    }
  }

  var transpositions = 0;
  var longword = input1.length > input2.length ? input1 : input2;
  var shortword = input1.length > input2.length ? input2 : input1;

  if (ignoreCase)
  {
    // convert to uppercase
    longword = longword.toUpperCase();
    shortword = shortword.toUpperCase();
  }

  for (var i = 0; i < shortword.length; i++)
  {
    if (shortword[i] != longword[i])
    {
      // are we at the end of the string?
      if (i == shortword.length - 1)
      {
        output1 = "F";
        return;
      }

      // not a transposition match?
      if (shortword[i] != longword[i + 1])
      {
        output1 = "F";
        return;
      }

      // compare the next character
      if (shortword[i + 1] != longword[i])
      {
        output1 = "F";
        return;
      }

      transpositions++;

      // too many transpositions?
      if (transpositions > maxAllowedTranspositions)
      {
        output1 = "F";
        return;
      }

      // skip over the characters
      i++;
    }
  }

  output1 = "T";
}
]]>
      </parameter>
      <parameter name="function">doit</parameter>
    </parameters>
  </widget>
</widgets>
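
The script in widgets.xml reads its inputs and properties from variables supplied by the gadget, so it cannot be run on its own. The following sketch wraps the same logic in a standalone function so that it can be tried outside the application. The function name transpositionMatch and its options object are hypothetical and are not part of the product; the option defaults mirror the property defaults in the listing above.

```javascript
// Standalone sketch of the Character Transposition Match logic.
// transpositionMatch and its opts argument are hypothetical names.
function transpositionMatch(a, b, opts) {
  var o = opts || {};
  var matchNoDataPairs = o.matchNoDataPairs === true; // default false
  var ignoreCase = o.ignoreCase !== false;            // default true
  var startsWith = o.startsWith === true;             // default false
  var maxAllowed = (o.maxAllowedTranspositions === undefined)
    ? 1 : o.maxAllowedTranspositions;                 // default 1

  a = (a == null) ? "" : String(a);
  b = (b == null) ? "" : String(b);

  // no data pairs
  if (a === "" || b === "") return matchNoDataPairs ? "T" : "F";

  // without 'starts with' matching, the lengths must be equal
  if (!startsWith && a.length !== b.length) return "F";

  var longword = a.length > b.length ? a : b;
  var shortword = a.length > b.length ? b : a;
  if (ignoreCase) {
    longword = longword.toUpperCase();
    shortword = shortword.toUpperCase();
  }

  var transpositions = 0;
  for (var i = 0; i < shortword.length; i++) {
    if (shortword.charAt(i) !== longword.charAt(i)) {
      // a mismatch on the last character cannot be a transposition
      if (i === shortword.length - 1) return "F";
      // both characters of the pair must be swapped
      if (shortword.charAt(i) !== longword.charAt(i + 1)) return "F";
      if (shortword.charAt(i + 1) !== longword.charAt(i)) return "F";
      transpositions++;
      if (transpositions > maxAllowed) return "F";
      i++; // skip the second character of the swapped pair
    }
  }
  return "T";
}
```

With the default options, transpositionMatch("Michael", "Micheal") returns "T"; passing { maxAllowedTranspositions: 0 } makes the same pair return "F".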

Example 2-3 matchlibrary.properties

[This file is not required in this case because the comparison does not support result bands and does not require new identifiers.]

Example 2-4 widgets.properties

characterTranspositionMatch.gadget = Character Transposition Match
characterTranspositionMatch.property.matchNoDataPairs.label = Match No Data pairs?
characterTranspositionMatch.property.ignoreCase.label = Ignore case?
characterTranspositionMatch.property.startsWith.label = Starts with?
characterTranspositionMatch.property.maxAllowedTranspositions.label = Maximum allowed transpositions

Example 2-5 version.properties

name=Character Transposition Match
version=v8.1.3.(175)
title=Character Transposition Match
type=GADGET