1.3.4.10.9 Output Selector: Most Common Value

The Most Common Value output selector selects the Most Common Value for an output attribute from the input attribute values, for all the records being merged together.

Use the Most Common Value output selector when you think the best output value for an attribute is likely to be the value that occurs most often in the records that are being merged together.

The Most Common Value selector is most useful where more than two records are likely to be merged, since with only two differing values there is no meaningful Most Common Value. For example, when deduplicating and selecting a value for a Name attribute, the Most Common Value from "John Lewis" and "John Louis" cannot be determined, because each value occurs only once. However, the Most Common Value from "John Lewis", "John Lewis" and "John Louis" is easily determined: "John Lewis" occurs twice.

Alternatively, when only two records are being merged, this selector can be used to raise a selection error wherever the values differ between the records, so that the value must be selected manually.

Note that where the Most Common Value selector selects from a mixture of Null values and values with data, it ignores the Null values. If all values are Null, however, it will select Null as the output value. The Allow Nulls option then controls whether or not an error is raised if a Null value is selected.
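
The selection rules described above can be sketched in Python. This is an illustrative sketch only, not the product's implementation: the function name most_common_value, the SelectionError class, and the allow_nulls parameter are hypothetical names, Null is modeled as None, and a tie is assumed to raise a selection error rather than being resolved automatically.

```python
from collections import Counter


class SelectionError(Exception):
    """Hypothetical stand-in for a selection error needing manual resolution."""


def most_common_value(values, allow_nulls=True):
    """Return the most frequent non-Null value (hypothetical helper)."""
    # Null values (None here) are ignored when at least one value has data.
    populated = [v for v in values if v is not None]
    if not populated:
        # All inputs are Null: Null is selected, subject to Allow Nulls.
        if allow_nulls:
            return None
        raise SelectionError("Null selected but Null values are not allowed")

    # Rank the distinct values by how often they occur, highest count first.
    ranked = Counter(populated).most_common()
    top_value, top_count = ranked[0]

    # If the runner-up occurs just as often, there is no single winner.
    if len(ranked) > 1 and ranked[1][1] == top_count:
        raise SelectionError("No single most common value")
    return top_value
```

For example, most_common_value(["Lewis", "Lewis", None]) returns "Lewis", while most_common_value(["Francis", "Frances", None]) raises the hypothetical SelectionError.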

The following table describes the configuration options:

Configuration Description

Inputs

Any attributes of any type from any input data sets.

Options

Specify the following options:

  • Use first non-empty value if tied?: this option determines whether a value is still selected when there is a tie, that is, when no value occurs more frequently than any other.

    If a tie is resolved in this way, the first alphabetically sorted value is selected for String values, the lowest value for Numbers, and the earliest value for Dates.

    Type: Yes/No. Default value: Yes.
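
The tie-breaking rule above can be sketched in Python as follows. This is a hedged illustration, not the product's code: the function name pick_if_tied is hypothetical, Null is modeled as None, and Python's min() compares strings by code point, which may differ from the product's alphabetical ordering (for example, in how upper and lower case are treated).

```python
from collections import Counter


def pick_if_tied(values):
    """Select among the most frequent values by sort order (hypothetical helper)."""
    # Ignore Null values (None here); return Null only if nothing has data.
    populated = [v for v in values if v is not None]
    if not populated:
        return None

    counts = Counter(populated)
    top_count = max(counts.values())
    # All values sharing the highest count; a single entry means no tie.
    candidates = [v for v, c in counts.items() if c == top_count]

    # min() yields the first string in sort order, the lowest number,
    # or the earliest date, depending on the type of the values.
    return min(candidates)
```

With this sketch, pick_if_tied(["Francis", "Frances", None]) returns "Frances", because both values occur once and "Frances" sorts first.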

Example

In this example, the Most Common Value output selector is used to select the value for a Surname attribute from all records in each match group.

Example configuration

Use first non-empty value if tied? = No

Example output

The following table shows example output using the Most Common Value selector:

Table 1-108 Example Output Using Most Common Value Selector

Record A   Record B   Record C   Output value (Most Common Value)
Lewis      Lewis      Null       Lewis
Lewis      Lewis      Louis      Lewis
Francis    Frances    Null       Selection error (needs manual resolution)
Francis    Frances    Franciss   Selection error (needs manual resolution)
Lewis      Null       Null       Lewis
Null       Null       Null       Null