1.3.9.14 Using Attribute Tags

The Map subprocessor of the Parse processor defines internal attributes for the Parse processor and maps the actual input attributes to them. The rest of the Parse processor's functionality is defined in terms of the internal attributes, so configured Parse processors can be re-used with a variety of input data sources simply by remapping the inputs. Each internal attribute is identified by a name, which is defined by the user who configures the Parse processor, and an attribute tag, which is automatically generated and cannot be edited. Attribute tags take the form a1, a2, a3 and so on.
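The remapping idea can be illustrated with a minimal Python sketch. It is not the product's implementation: the helper names `assign_tags` and `remap`, and the two source layouts with their field names, are hypothetical and exist only to show how the same internal attributes (and their generated tags) can be fed from differently named inputs.

```python
def assign_tags(internal_names):
    """Pair each internal attribute name with a generated tag: a1, a2, a3, ..."""
    return {f"a{i}": name for i, name in enumerate(internal_names, start=1)}


def remap(record, mapping):
    """Read a source record through a tag -> input field mapping.

    Returns a dict keyed by attribute tag, so later logic never needs
    to know the original input field names.
    """
    return {tag: record.get(field, "") for tag, field in mapping.items()}


# Internal attributes named by the user; tags are generated, not chosen.
tags = assign_tags(["Title", "Forename", "Surname"])

# Two hypothetical data sources with different field names.
crm_mapping = {"a1": "salutation", "a2": "first_name", "a3": "last_name"}
web_mapping = {"a1": "prefix", "a2": "given", "a3": "family"}

crm_row = {"salutation": "Dr", "first_name": "Ann", "last_name": "Lee"}
web_row = {"prefix": "Dr", "given": "Ann", "family": "Lee"}

# Both sources produce the same internal view.
crm_view = remap(crm_row, crm_mapping)
web_view = remap(web_row, web_mapping)
```

Because everything downstream refers only to the tags, switching data source in this sketch means swapping the mapping and nothing else.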

Attribute tags can be used to distinguish between instances of the same token which originated from different input attributes. For example, a Parse processor which is being used to analyze name data may define three internal attributes: Title, Forename and Surname. By using attribute tags, a valid title which was extracted from the 'Title' field can be treated differently from a valid title which was extracted from a 'Forename' field, and so on.

This distinction is made in the Resolve subprocessor. If we consider just one possible token pattern:

<valid Title> <valid Forename> <valid Surname>

we can immediately see that it does not record which input attribute each token came from. Without the use of attribute tags, all four of the following input patterns are resolved in the same way:

Table 1-130 Resolution of Input Patterns

Pattern  Title (a1)     Forename (a2)                                   Surname (a3)
1        <valid Title>  <valid Forename>                                <valid Surname>
2        (empty)        <valid Title> <valid Forename>                  <valid Surname>
3        (empty)        <valid Title> <valid Forename> <valid Surname>  (empty)
4        <valid Title>  (empty)                                         <valid Forename> <valid Surname>

Records that contain correct data in the correct fields are generally considered to be of higher quality than records that contain correct data in the wrong fields. On this basis, we can define a resolution rule for the first pattern that distinguishes it from the other patterns, simply by including the attribute tags in the search criteria. The search term specific to pattern 1 would be:

<valid a1.Title> <valid a2.Forename> <valid a3.Surname>

Including the non-specific search term (the same token pattern without attribute tags) in a subsequent resolution rule would catch patterns 2, 3 and 4, which could then be assigned a different resolution result to reflect their incorrect fielding.

Note:

Because resolution rules are applied in order, the more specific rules must be placed before the more generic rules. Otherwise, the generic rule will act on all the patterns and the specific rule will never be considered.
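The effect of rule order can be shown with a small first-match-wins sketch in Python. This is an illustration under stated assumptions, not the Resolve subprocessor itself: the tuple representation of tokens, the `matches` and `resolve` helpers, and the result labels "Pass" and "Review" are all hypothetical.

```python
# A token is (attribute_tag, token_name).
# A rule term is (required_tag_or_None, token_name); None means "any attribute".

def matches(terms, tokens):
    """True if every term matches the token in the same position."""
    if len(terms) != len(tokens):
        return False
    return all(
        term_name == tok_name and (term_tag is None or term_tag == tok_tag)
        for (term_tag, term_name), (tok_tag, tok_name) in zip(terms, tokens)
    )


def resolve(rules, tokens):
    """Apply rules in order and return the result of the first match."""
    for terms, result in rules:
        if matches(terms, tokens):
            return result
    return "Unresolved"


SPECIFIC = [("a1", "Title"), ("a2", "Forename"), ("a3", "Surname")]
GENERIC = [(None, "Title"), (None, "Forename"), (None, "Surname")]

well_fielded = [("a1", "Title"), ("a2", "Forename"), ("a3", "Surname")]
misfielded = [("a2", "Title"), ("a2", "Forename"), ("a3", "Surname")]

specific_first = [(SPECIFIC, "Pass"), (GENERIC, "Review")]
generic_first = [(GENERIC, "Review"), (SPECIFIC, "Pass")]

print(resolve(specific_first, well_fielded))  # Pass
print(resolve(specific_first, misfielded))    # Review
print(resolve(generic_first, well_fielded))   # Review: the specific rule is never reached
```

With the generic rule first, the correctly fielded record receives the same result as the misfielded one, which is the situation the note warns against.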

Further examples

The following table gives further examples of search strings for resolution rules, and specifies which of the above patterns will match them:

Table 1-131 Further Examples of Search Strings

Search string                                            Matching patterns
<valid a1.Title> <valid a2.Forename> <valid a3.Surname>  1
<valid Title> <valid a2.Forename> <valid a3.Surname>     1, 2
<valid Title> <valid a2.Forename> <valid Surname>        1, 2, 3
<valid Title> <valid Forename> <valid a3.Surname>        1, 2, 4
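The matches listed in Table 1-131 can be reproduced with a short Python sketch. It rests on several assumptions: the regular expression is only a guess at the search string syntax as it appears in this section, the `parse` and `matching_patterns` helpers are hypothetical, and the attribute placement of each token in patterns 2, 3 and 4 is inferred so as to be consistent with the table (the attribute holding the title in pattern 4 is assumed to be a1 and does not affect any of the results).

```python
import re

# Matches "<valid Title>" and "<valid a1.Title>"; the tag part is optional.
TERM = re.compile(r"<valid (?:(a\d+)\.)?(\w+)>")


def parse(search):
    """Turn a search string into a list of (tag_or_None, token_name) terms."""
    return [(tag or None, name) for tag, name in TERM.findall(search)]


# Each pattern lists its tokens in order as (attribute_tag, token_name).
PATTERNS = {
    1: [("a1", "Title"), ("a2", "Forename"), ("a3", "Surname")],
    2: [("a2", "Title"), ("a2", "Forename"), ("a3", "Surname")],
    3: [("a2", "Title"), ("a2", "Forename"), ("a2", "Surname")],
    4: [("a1", "Title"), ("a3", "Forename"), ("a3", "Surname")],  # Title tag assumed
}


def matching_patterns(search):
    """Return the pattern numbers whose tokens satisfy every term."""
    terms = parse(search)
    return [
        number
        for number, tokens in PATTERNS.items()
        if len(terms) == len(tokens)
        and all(
            name == tok_name and tag in (None, tok_tag)
            for (tag, name), (tok_tag, tok_name) in zip(terms, tokens)
        )
    ]


for search in (
    "<valid a1.Title> <valid a2.Forename> <valid a3.Surname>",
    "<valid Title> <valid a2.Forename> <valid a3.Surname>",
    "<valid Title> <valid a2.Forename> <valid Surname>",
    "<valid Title> <valid Forename> <valid a3.Surname>",
):
    print(search, "->", matching_patterns(search))
```

An untagged term matches the token wherever it came from, while a tagged term also requires the originating attribute to agree, so each tag added to a search string can only narrow the set of matching patterns.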