Understanding the Sun Match Engine

Date Comparators

The Sun Match Engine provides various date comparison functions. When comparing dates, the match engine compares each date component (for example, it compares the year in the first date against the year in the second date, the month against the month, and the day against the day). This allows for multiple transpositions in each date field. The date comparators use the Java date format (java.sql.Date), allowing the comparator to use the Gregorian calendar and to take into account the time zone where the date field originated.

The following comparison functions are available for matching on date fields.

As with the numeric comparison functions, the date comparison functions can use either a direct string comparison or a relative distance calculation. When using a relative distance calculation, the matching weight between two dates decreases as the dates become further apart, until the relative distance is reached. When the difference becomes the relative distance plus one, the dates are considered non-matches. You can specify different relative distances for before and after the given date. Any dates falling outside of the specified time period receive a complete disagreement weight. The relative distances are specified in the smallest unit of time being matched.

Figure 4 illustrates how the weight is decreased as the difference between the two compared fields reaches either the before or after relative distance. In this diagram, the before relative distance is 11, the after relative distance is 5, and the light blue line represents the agreement weight. When the base date is later than the compared date and the difference between the dates reaches 11 (distance before plus one), the fields are considered a non-match and are given the full disagreement weight. When the base date is earlier than the compared date and the difference between the dates reaches 6 (distance after plus 1), the fields are considered a non-match.

Figure 4 Date Relative Distance Comparison

Figure illustrates how weights are assigned when using
relative distance for date comparators.

The date comparison functions take the parameters listed in Table 41.

Table 41 Date Comparison Function Parameters

Parameter 

Description 

distance-or-string

Specifies whether a relative distance calculation or a direct string comparison is used. Specify “y” to use a relative distance calculation; specify “n” to use a string comparison. 

distance-before 

The number of units prior to the reference date/time for which two date fields can still be considered a match. 

distance-after 

The number of units following the reference date/time for which two date fields can still be considered a match. 

Date Comparator - Year only (dY)

This date comparison function takes only the 4-character year into account for matching. If relative distance calculation is specified, the relative distance is specified in years.

Date Comparator - Month-Year (dM)

This date comparison function takes the month and year into account for matching. If relative distance calculation is specified, the relative distance is specified in months.

Date Comparator - Day-Month-Year (dD)

This date comparison function takes the day, month, and year into account for matching. If relative distance calculation is specified, the relative distance is specified in days.

Date Comparator - Hour-Day-Month-Year (dH)

This date comparison function takes the hour, day, month, and year into account for matching. If relative distance calculation is specified, the relative distance is specified in hours.

Date Comparator - Min-Hour-Day- Month-Year (dm)

This date comparison function takes the minute, hour, day, month, and year into account for matching. If relative distance calculation is specified, the relative distance is specified in minutes.

Date Comparator - Sec-Min-Hour-Day- Month-Year (ds)

This date comparison function takes the second, minute, hour, day, month, and year into account for matching. If relative distance calculation is specified, the relative distance is specified in seconds.