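The following topics provide instructions for each step of creating custom comparators. You might need to create multiple Java files and Java packages for the comparator, depending on the validations, data sources, dependency classes, and curve adjustments you use. Create them in the same directory structure, because you will package them into a ZIP file when you are finished.
Step 1: Create the Custom Comparator Java Class
Step 2: Register the Comparator in the Comparators List
Step 3: Define Parameter Validations (Optional)
Step 4: Define Data Source Handling (Optional)
Step 5: Define Curve Adjustment or Linear Fitting (Optional)
Step 6: Compile and Package the Comparator
Step 7: Import the Comparator Package Into Sun Master Index
Step 8: Configure the Comparator in the Match Configuration File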
Before you create your custom comparators, take into account the following requirements for the comparators.
Determine how many comparators you need to create and whether each will require a different Java class or some can use the same Java class.
Determine what parameters, if any, you need to define for each comparator.
Determine what validations, if any, need to be created.
Determine whether you need to use a data source.
Decide if the comparators you create will have a dependency on any other comparator classes.
Decide whether you will use curve adjustment, linear fitting, or neither.
The first step to creating custom comparators is defining the matching logic in custom comparator Java classes that are stored in the real-time module of the Master Index Match Engine. Follow these guidelines when creating the class:
Create a working directory that will contain all the Java packages and the comparators list file for the new comparators.
The Java classes must implement the com.sun.mdm.matcher.comparators.MatchComparator interface, located in Matcher.jar. This interface includes the methods described below.
Once you create the Java classes, continue to Step 2: Register the Comparator in the Comparators List.
The initialize method initializes the values for the parameters, data sources, and dependency classes used for each custom comparator. It provides the necessary information to access the comparator's configuration in the match configuration file and the comparators list file.
void initialize(Map<String, Map> params, Map<String, Map> dataSources, Map<String, Map> dependClassList)
Parameter | Type | Description |
---|---|---|
params | Map | A mapping of all the parameters associated with a match field in matchConfigFile.cfg. |
dataSources | Map | A mapping of all the data sources associated with a match field in matchConfigFile.cfg. |
dependClassList | Map | A mapping of all the dependency classes associated with a match field in matchConfigFile.cfg. |
Returns: None.
Throws: None.
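The following is a minimal sketch of what an initialize implementation might do: keep copies of the configuration it receives so that later comparisons can read them. It is not the product's implementation. The class name, the CODE_NAME constant, the fixedLength helper, and above all the layout of the nested maps (outer key is the comparator code name, inner map is parameter name to value) are assumptions made for illustration. A real comparator would implement the MatchComparator interface from the matcher JAR.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch only: a real comparator would implement
// com.sun.mdm.matcher.comparators.MatchComparator from the matcher JAR.
class InitSketchComparator {
    // Assumption: parameters are grouped under the comparator code name.
    static final String CODE_NAME = "Exact";

    private Map<String, Object> myParams = new HashMap<>();
    private Map<String, Map> dataSources = new HashMap<>();
    private Map<String, Map> dependClassList = new HashMap<>();

    @SuppressWarnings("unchecked")
    public void initialize(Map<String, Map> params,
                           Map<String, Map> dataSources,
                           Map<String, Map> dependClassList) {
        // Copy what is passed in so later changes by the caller do not leak in.
        Map<String, Object> group = (params == null) ? null : params.get(CODE_NAME);
        this.myParams = (group == null) ? new HashMap<>() : new HashMap<>(group);
        this.dataSources = (dataSources == null)
                ? new HashMap<>() : new HashMap<>(dataSources);
        this.dependClassList = (dependClassList == null)
                ? new HashMap<>() : new HashMap<>(dependClassList);
    }

    // Hypothetical helper: reads the "length" parameter, or falls back to a default.
    int fixedLength(int defaultValue) {
        Object value = myParams.get("length");
        return (value instanceof Integer) ? (Integer) value : defaultValue;
    }
}
```

Copying the maps rather than storing the references is a design choice of this sketch; it keeps the comparator's view of its configuration stable between calls.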
The compareFields method contains all the comparison logic needed to compare two field values and calculate a matching weight that shows how similar the values are.
double compareFields(String recordA, String recordB, Map context)
Parameter | Type | Description |
---|---|---|
recordA | String | A field value from the record against which the reference record is being compared. |
recordB | String | A field value from the reference record. |
context | Map | A set of arguments passed to the comparator. |
Returns: A number between zero and one that indicates how closely two field values match.
Throws: MatchComparatorException
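As an illustration of the contract described above (two field values in, a weight between zero and one out), here is a small sketch that scores two strings by the fraction of character positions that agree. The scoring rule, the class names, and the stand-in exception are all assumptions; the stand-in is modeled as unchecked only to keep the example short, and it is not the product's MatchComparatorException.

```java
import java.util.Map;

// Stand-in for the product's MatchComparatorException (unchecked here for brevity).
class SketchComparatorException extends RuntimeException {
    SketchComparatorException(String message) {
        super(message);
    }
}

// Sketch only: a real comparator would implement the MatchComparator interface.
class PositionalComparatorSketch {
    public double compareFields(String recordA, String recordB, Map context) {
        if (recordA == null || recordB == null) {
            throw new SketchComparatorException("Field values must not be null");
        }
        // Normalize so that case and surrounding whitespace do not affect the weight.
        String a = recordA.trim().toUpperCase();
        String b = recordB.trim().toUpperCase();
        int longer = Math.max(a.length(), b.length());
        if (longer == 0) {
            return 1.0; // two empty values are treated as identical in this sketch
        }
        int shorter = Math.min(a.length(), b.length());
        int same = 0;
        for (int i = 0; i < shorter; i++) {
            if (a.charAt(i) == b.charAt(i)) {
                same++;
            }
        }
        // Dividing by the longer length keeps the result in the range [0, 1].
        return (double) same / longer;
    }
}
```

With this rule, "SMITH" and "SMYTH" agree in four of five positions, so the sketch returns 0.8 for that pair.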
The setRTParameters method sets the runtime parameters for the comparator, which lets you customize each call to the comparator.
void setRTParameters(String key, String value)
Parameter | Type | Description |
---|---|---|
key | String | The key to map the parameter value. |
value | String | The value of the parameter. |
Returns: None.
Throws: None.
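One possible way to handle runtime parameters is to store them in a map and consult that map during comparison. The sketch below assumes a hypothetical runtime key named "ignoreCase"; neither that key nor the helper method comes from the product.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch only: shows one way to hold runtime parameters between calls.
class RuntimeParamsSketch {
    private final Map<String, String> runtimeParams = new ConcurrentHashMap<>();

    public void setRTParameters(String key, String value) {
        if (key == null) {
            return; // nothing to map the value to
        }
        if (value == null) {
            runtimeParams.remove(key); // a null value clears the override
        } else {
            runtimeParams.put(key, value);
        }
    }

    // Hypothetical helper: reads the "ignoreCase" override, defaulting to true.
    boolean ignoreCase() {
        return Boolean.parseBoolean(runtimeParams.getOrDefault("ignoreCase", "true"));
    }
}
```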
The stop method closes any related connections to the data sources used by the comparator.
void stop()
Parameters: None.
Returns: None.
Throws: None.
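A stop implementation typically releases whatever the comparator opened. The sketch below assumes the comparator tracks its open resources as java.io.Closeable objects; the register and openCount methods are hypothetical helpers, not part of the MatchComparator interface.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Sketch only: tracks open data sources and closes them all in stop().
class StopSketch {
    private final List<Closeable> openSources = new ArrayList<>();

    // Hypothetical helper: remember a resource that stop() must close.
    void register(Closeable source) {
        openSources.add(source);
    }

    public void stop() {
        for (Closeable source : openSources) {
            try {
                source.close();
            } catch (IOException e) {
                // Keep closing the remaining sources; a real comparator would log this.
            }
        }
        openSources.clear();
    }

    int openCount() {
        return openSources.size();
    }
}
```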
In order to include new comparators in a master index application, you need to create a comparators list file defining the configuration of the comparators. When you import the comparator package into the master index application, this file is read and the entries are added to the comparators list for the project.
Below is a sample comparators list file. Note that the first comparator includes all possible configurations (parameters, dependency classes, data sources, and curve adjustment). Most comparators will not be that complex. The second comparator class defines two comparators, Approx and Adjust.
    <?xml version="1.0" encoding="UTF-8"?>
    <comparators-list xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:noNamespaceSchemaLocation="comparatorsList.xsd">
      <group description="New group of comparators" path="com.mycomparators.matchcomparators">
        <comparator description="New Exact Comparator">
          <className>NewExactComparator</className>
          <codes>
            <code description="New Exact Comparator" name="Exact" />
          </codes>
          <params>
            <param description="Fixed length" name="length" type="java.lang.Integer" />
            <param description="Data type" name="dataType" type="java.lang.String" />
          </params>
          <data-sources>
            <datasource description="Serial numbers" type="java.io.File" />
          </data-sources>
          <dependency-classes>
            <dependency-class matchfield="Serial" name="com.genericcomparators.StringComparator" />
          </dependency-classes>
          <curve-adjust status="true" />
        </comparator>
        <comparator description="New Approximate Comparator">
          <className>NewApproxComparator</className>
          <codes>
            <code description="New approximate comparator" name="Approx" />
            <code description="New adjustable comparator" name="Adjust" />
          </codes>
        </comparator>
      </group>
    </comparators-list>
In the same folder where you created the custom Java class package, create a new file named comparatorsList.xml.
The comparators list file needs to be in the same working directory you created for the custom comparator Java classes.
Add the following header information to the file. You can copy this from the comparatorsList.xml file in a master index application.
    <?xml version="1.0" encoding="UTF-8"?>
    <comparators-list xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:noNamespaceSchemaLocation="comparatorsList.xsd">
      ...
    </comparators-list>
Define the following properties, using the XML structure described in Master Index Match Engine Comparator Definition List. Use the sample above as an example.
The group description and Java package for the group.
A description for each comparator.
The Java class name for each comparator or comparator subgroup.
The unique identifying name for each comparator.
A list of static parameters for each comparator or comparator subgroup (optional). If you define parameters, you must also perform the steps under Step 3: Define Parameter Validations (Optional).
A list of data sources for each comparator or comparator subgroup (optional). If you define data sources, you must also perform the steps under Step 4: Define Data Source Handling (Optional).
A list of dependency classes for each comparator or comparator subgroup (optional).
Whether to use curve adjustment for each comparator or comparator subgroup (optional). If you set curve adjustment to true, you must perform the steps under Step 5: Define Curve Adjustment or Linear Fitting (Optional).
Continue to Step 3: Define Parameter Validations (Optional).
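Because each comparator needs a unique identifying name, it can be useful to check the comparators list file before you package it. The following sketch uses the JDK's DOM parser to collect the name attribute of every code element and report any name that occurs more than once. The class and method names are hypothetical, and treating uniqueness as file-wide is an assumption of this sketch.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

// Sketch only: a pre-packaging sanity check for comparatorsList.xml content.
class ComparatorsListCheck {
    /** Returns the code names that appear more than once in the given XML text. */
    static List<String> duplicateCodeNames(String xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse comparators list", e);
        }
        NodeList codes = doc.getElementsByTagName("code");
        Set<String> seen = new HashSet<>();
        List<String> duplicates = new ArrayList<>();
        for (int i = 0; i < codes.getLength(); i++) {
            String name = ((Element) codes.item(i)).getAttribute("name");
            // Set.add returns false when the name was already present.
            if (!seen.add(name) && !duplicates.contains(name)) {
                duplicates.add(name);
            }
        }
        return duplicates;
    }
}
```

An empty result means no code name is repeated. The check does not validate the file against comparatorsList.xsd.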
If your custom comparators take parameters, you should create a Java class that validates the parameter properties. You need to perform this step if you defined parameters for the comparator in comparatorsList.xml. You do not need to create this file in the same package as the Java comparator class, but for packaging purposes, create it in the same working folder.
Complete Step 2: Register the Comparator in the Comparators List.
Create a Java class with the same name as the Java class that defines the comparator, with “ParamsValidator” appended.
For example, if the comparator is defined by a class named ExactComparator, the parameter validation class would be ExactComparatorParamsValidator.
In this class, implement com.sun.mdm.matcher.comparators.validator.ParametersValidator.
The method contained in this class is described below.
Continue to Step 4: Define Data Source Handling (Optional).
The ParametersValidator interface contains one method, validateComparatorsParameters, that allows you to validate parameter types, ranges, and other properties. For logging purposes, you can use net.java.hulp.i18n, which is used within matcher.jar, or you can use your own logger.
void validateComparatorsParameters(Map<String, Object> params)
Parameter | Type | Description |
---|---|---|
params | Map | A list of parameters to validate. |
Returns: None.
Throws: MatcherException
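The sketch below shows what a validator for the sample NewExactComparator might look like, checking the two parameters from the sample comparators list file (length as java.lang.Integer and dataType as java.lang.String). The interface and exception declared here are local stand-ins so that the example compiles on its own; in a real project you would implement com.sun.mdm.matcher.comparators.validator.ParametersValidator and throw the product's MatcherException. The specific rules (both parameters required, length positive) are assumptions.

```java
import java.util.Map;

// Stand-in for the product's MatcherException (unchecked here for brevity).
class SketchMatcherException extends RuntimeException {
    SketchMatcherException(String message) {
        super(message);
    }
}

// Stand-in for the ParametersValidator interface described above.
interface ParametersValidatorSketch {
    void validateComparatorsParameters(Map<String, Object> params);
}

// Named after the comparator class with "ParamsValidator" appended.
class NewExactComparatorParamsValidator implements ParametersValidatorSketch {
    @Override
    public void validateComparatorsParameters(Map<String, Object> params) {
        // Type and range check for the "length" parameter.
        Object length = params.get("length");
        if (!(length instanceof Integer) || (Integer) length <= 0) {
            throw new SketchMatcherException("length must be a positive Integer");
        }
        // Type check for the "dataType" parameter.
        Object dataType = params.get("dataType");
        if (!(dataType instanceof String) || ((String) dataType).isEmpty()) {
            throw new SketchMatcherException("dataType must be a non-empty String");
        }
    }
}
```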
If your custom comparators use external data sources to provide additional information for matching weight calculations, you need to create a Java class that lets you load the file to memory or have real-time access to the data file content. You can also define validations to perform. You do not need to create this file in the same package as the Java comparator class, but for packaging purposes, create it in the same working folder.
You need to perform this step if you defined lines similar to the following in comparatorsList.xml:
    <data-sources>
      <datasource description="Serial numbers" type="java.io.File" />
    </data-sources>
Create a Java class with the same name as the Java class that defines the comparator, with “SourcesHandler” appended.
For example, if the comparator is defined by a class named ExactComparator, the data source handler class would be ExactComparatorSourcesHandler.
In this class, implement com.sun.mdm.matcher.comparators.validator.DataSourcesHandler.
The method in this class is described below.
Continue to Step 5: Define Curve Adjustment or Linear Fitting (Optional).
The DataSourcesHandler interface contains one method, handleComparatorsDataSources, that allows you to define properties for the data source. This method takes one parameter, a DataSourcesProperties object. DataSourcesProperties and its methods are described in DataSourcesProperties Class.
Object handleComparatorsDataSources(DataSourcesProperties dataSources)
Parameter | Type | Description |
---|---|---|
dataSources | DataSourcesProperties | A list of properties for the data handler (see DataSourcesProperties Class). |
Returns: Object
Throws: MatcherException, IOException
The DataSourcesProperties interface is used as a parameter to the handleComparatorsDataSources method described in Step 4: Define Data Source Handling (Optional). The methods in the interface are listed and described below.
The getDataSourcesList method returns the comparator's list of associated data source paths.
List getDataSourcesList(String codeName)
Parameter | Type | Description |
---|---|---|
codeName | String | The name of the comparator. The name is defined in comparatorsList.xml in the name attribute of the code element. In the example below, the comparator's code name is “Exact”. <code description="New exact comparator" name="Exact" /> |
Returns: A list of paths and filenames as specified in comparatorsList.xml.
Throws: None.
The isDataSourceLoaded method checks whether a specific file has already been loaded or opened.
boolean isDataSourceLoaded(String sourcePath)
Parameter | Type | Description |
---|---|---|
sourcePath | String | The path and filename of the file to check. |
Returns: A boolean indicator of whether the specified file has already been loaded or opened.
Throws: None.
The setDataSourceLoaded method sets the loading status of a data source.
void setDataSourceLoaded(String sourcePath, boolean status)
Parameter | Type | Description |
---|---|---|
sourcePath | String | The path and filename of the file. |
status | boolean | The load status of the file. Specify true if the file is loaded; otherwise specify false. |
Returns: None.
Throws: None.
The getDataSourceObject method returns the file located at the specified source path.
Object getDataSourceObject(String sourcePath)
Parameter | Type | Description |
---|---|---|
sourcePath | String | The path and filename of the file you want to load. |
Returns: An object containing the data source information.
Throws: None.
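To show how the DataSourcesProperties methods might work together in a data source handler, here is a sketch that loads each not-yet-loaded file for the "Exact" comparator into memory and marks it as loaded. The interface declared here is a local stand-in that mirrors the four methods documented above; the in-memory implementation exists only to exercise the sketch. Reading the file as one value per line and returning a map keyed by path are assumptions, not the product's behavior.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Stand-in mirroring the four DataSourcesProperties methods documented above.
interface DataSourcesPropertiesSketch {
    List getDataSourcesList(String codeName);
    boolean isDataSourceLoaded(String sourcePath);
    void setDataSourceLoaded(String sourcePath, boolean status);
    Object getDataSourceObject(String sourcePath);
}

// Named after the comparator class with "SourcesHandler" appended.
class NewExactComparatorSourcesHandler {
    // Assumption: the handler knows the code name of its own comparator.
    private static final String CODE_NAME = "Exact";

    public Object handleComparatorsDataSources(DataSourcesPropertiesSketch dataSources)
            throws IOException {
        Map<String, Set<String>> valuesByPath = new HashMap<>();
        for (Object entry : dataSources.getDataSourcesList(CODE_NAME)) {
            String path = String.valueOf(entry);
            if (dataSources.isDataSourceLoaded(path)) {
                continue; // already in memory, do not read it again
            }
            Set<String> values = new HashSet<>();
            for (String line : Files.readAllLines(Paths.get(path))) {
                String trimmed = line.trim();
                if (!trimmed.isEmpty()) {
                    values.add(trimmed);
                }
            }
            valuesByPath.put(path, values);
            dataSources.setDataSourceLoaded(path, true);
        }
        return valuesByPath;
    }
}

// Hypothetical in-memory implementation, used only to try out the handler.
class InMemoryDataSources implements DataSourcesPropertiesSketch {
    private final List<String> paths;
    private final Set<String> loaded = new HashSet<>();

    InMemoryDataSources(List<String> paths) {
        this.paths = paths;
    }

    public List getDataSourcesList(String codeName) { return paths; }
    public boolean isDataSourceLoaded(String sourcePath) { return loaded.contains(sourcePath); }
    public void setDataSourceLoaded(String sourcePath, boolean status) {
        if (status) { loaded.add(sourcePath); } else { loaded.remove(sourcePath); }
    }
    public Object getDataSourceObject(String sourcePath) { return new File(sourcePath); }
}
```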
If your custom comparators use curve adjustment or linear fitting to adjust matching weight calculations, you need to create a Java class that defines the curve. You do not need to create this file in the same package as the Java comparator class, but for packaging purposes, create it in the same working folder.
You need to perform this step if you defined the following line in comparatorsList.xml for the comparator:
    <curve-adjust status="true" />
Create a Java class with the same name as the Java class that defines the comparator, with “CurveAdjustor” appended.
For example, if the comparator is defined by a class named ExactComparator, the curve adjustment class would be ExactComparatorCurveAdjustor.
In this class, implement com.sun.mdm.matcher.configurator.CurveAdjustor.
The method in this class is described below.
Continue to Step 6: Compile and Package the Comparator.
The processCurveAdjustment method provides handling for curve adjustment within a specific match comparator.
double[] processCurveAdjustment(String compar, double[] cap)
Parameter | Type | Description |
---|---|---|
compar | String | The name of the comparator, as defined in the name attribute of the code element for the comparator. |
cap | double[] | An array of values that define the curve adjustment. |
Returns: An array of curve adjustment values.
Throws: MatcherException
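The meaning of the values in the cap array is not spelled out here, so the following sketch shows only the shape of a processCurveAdjustment implementation: it rejects unusable input and returns a copy of the values unchanged. The pass-through behavior, the validation rules, and the stand-in exception are assumptions; a real class would implement com.sun.mdm.matcher.configurator.CurveAdjustor and apply whatever adjustment your comparator needs.

```java
// Stand-in for the product's MatcherException (unchecked here for brevity).
class SketchMatcherException extends RuntimeException {
    SketchMatcherException(String message) {
        super(message);
    }
}

// Named after the comparator class with "CurveAdjustor" appended.
class NewExactComparatorCurveAdjustor {
    public double[] processCurveAdjustment(String compar, double[] cap) {
        if (cap == null || cap.length == 0) {
            throw new SketchMatcherException("No curve values supplied for " + compar);
        }
        for (double value : cap) {
            if (Double.isNaN(value) || Double.isInfinite(value)) {
                throw new SketchMatcherException("Curve values must be finite for " + compar);
            }
        }
        // Assumption: values pass through unchanged; return a copy so the
        // caller's array cannot be modified through the result.
        return cap.clone();
    }
}
```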
Before you perform these steps, make sure you have completed Step 1: Create the Custom Comparator Java Class through Step 5: Define Curve Adjustment or Linear Fitting (Optional).
When you are finished defining all the Java classes for the comparators and have registered each comparator in your comparators list file, you can compile the Java code and package the files into a ZIP file that you can then import into a master index application. Compile the classes using the compiler of your choice.
To package the files, create a temporary directory and copy the comparators list file to the directory. Copy all the class folders and files to the same directory. The top level of the temporary directory should include comparatorsList.xml and a com folder (which contains all the Java classes). Create a ZIP file of the directory. For more information about the ZIP package, see About the Comparator Package.
After you compile and package the comparator, continue to Step 7: Import the Comparator Package Into Sun Master Index.
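If you prefer to script the packaging step, the sketch below builds the ZIP file with the JDK's java.util.zip classes. It assumes the temporary directory already contains comparatorsList.xml and the com folder at its top level; the class and method names are hypothetical. Entry names are written relative to that directory, so the two items end up at the top level of the archive.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Sketch only: zips every regular file under sourceDir into zipFile.
class ComparatorPackager {
    static void zipDirectory(Path sourceDir, Path zipFile) throws IOException {
        // Collect the files first so the walk is finished before the ZIP is written.
        List<Path> files;
        try (Stream<Path> walk = Files.walk(sourceDir)) {
            files = walk.filter(Files::isRegularFile).sorted().collect(Collectors.toList());
        }
        try (OutputStream out = Files.newOutputStream(zipFile);
             ZipOutputStream zip = new ZipOutputStream(out)) {
            for (Path file : files) {
                // ZIP entry names use forward slashes regardless of platform.
                String name = sourceDir.relativize(file).toString().replace('\\', '/');
                zip.putNextEntry(new ZipEntry(name));
                Files.copy(file, zip);
                zip.closeEntry();
            }
        }
    }
}
```

Write the ZIP file to a location outside the directory being packaged so that a previous archive is not picked up on a later run.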
You need to import your new comparators into NetBeans to make them available either to all master index applications or only to the current application.
Launch NetBeans, and open the master index project that will use the new comparators.
In the Projects window, expand the main master index project.
Right-click Match Engine, and select Import Comparator Plug-in.
In the dialog box that appears, navigate to the location of the plug-in ZIP file.
Select the file containing the plug-in, and then click Open.
Do one of the following:
To import the plug-in and make it available to all future master index applications, click Yes.
To import the plug-in and make it available only to the current master index application, click No.
The contents of the ZIP file are imported into the Match Engine node and the new comparators are added to the list of comparator definitions in comparatorsList.xml.
In the Match Engine node, navigate to the /lib folder that was added and verify that all of the required files are there.
Open comparatorsList.xml and verify the new comparator definitions are included.
After you import custom comparators, you need to add them to the match configuration file (matchConfigFile.cfg) and define the matching configuration. This makes the comparators available for use in the master index match string. For information about this file, see The Master Index Match Engine Match Configuration File. For instructions on modifying the file, see Configuring the Comparison Functions for a Master Index Application in Configuring Sun Master Indexes.