1.3.11.16 Extract Values

The Extract Values processor extracts values, or parts of values, to a new attribute, where those values match a reference list.

The matching against the list may be done in one of five ways:

  • Whole Value

  • Starts With

  • Ends With

  • Contains

  • Delimiter Match

The matching option determines which values are extracted, and which part of each value. For example, to extract Business Suffixes from a Company Name attribute, you may want to extract them only if the value ends with an entry in the list.
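The five matching options can be illustrated with a minimal Python sketch. This is not the processor's implementation: the names `split_on` and `extract_value` are hypothetical, and the sketch assumes that the first matching list entry wins and that the extracted text is returned as it appears in the input.

```python
import re

def split_on(value, delimiters):
    """Split value on any single character in delimiters, dropping empty tokens."""
    tokens, current = [], ""
    for ch in value:
        if ch in delimiters:
            if current:
                tokens.append(current)
            current = ""
        else:
            current += ch
    if current:
        tokens.append(current)
    return tokens

def extract_value(value, reference, mode="Contains", ignore_case=True, delimiters=" "):
    """Return the part of value that matches a reference entry, or None (assumed behavior)."""
    flags = re.IGNORECASE if ignore_case else 0
    if mode == "Delimiter Match":
        # Compare each delimited token against the list as a whole token.
        candidates = split_on(value, delimiters)
        matcher = re.fullmatch
    else:
        candidates = [value]
        matcher = {
            "Whole Value": re.fullmatch,
            "Starts With": re.match,
            "Ends With": re.search,   # anchored to the end of the value below
            "Contains": re.search,
        }[mode]
    for entry in reference:
        pattern = re.escape(entry)    # treat list entries as literal text
        if mode == "Ends With":
            pattern += r"\Z"
        for candidate in candidates:
            found = matcher(pattern, candidate, flags)
            if found:
                return found.group(0)
    return None

suffixes = ["Ltd", "Inc"]
print(extract_value("Acme Widgets Ltd", suffixes, mode="Ends With"))
print(extract_value("Ltd Edition Prints", suffixes, mode="Ends With"))
```

Under these assumptions, the first call extracts the suffix and the second extracts nothing, because "Ltd" appears at the start of the value rather than the end. Contains would match in both cases, which is why the choice of option matters.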

Use Extract Values to create a new attribute containing a distinct part of an input attribute that you want to treat separately.

For example, if you have a Product_Description attribute containing values that represent the units of a product (such as PINTS, PNTS, or PTS), you may want to extract these values to a separate attribute.

The following table describes the configuration options:

Configuration Description

Inputs

Specify one or more String or String Array attributes from which you want to extract values that match a list.

Options

Specify the following value map options:

  • Reference data: list of values to extract. Specified as Reference Data. Default value: None.

Specify the following match options:

  • Ignore Case?: whether or not to ignore case when matching values against the list. Specified as Yes/No. Default value: Yes.

  • Match list by: determines how values are matched against the list. Specified as a Selection (Whole Value/Starts With/Ends With/Contains/Delimiter Match). Default value: Contains.

  • Delimiters: the delimiter characters used to split the data when values are matched to the list by Delimiter Match. Specified as free text. Default value: Space.

Outputs

Describes any data attribute or flag attribute outputs.

Data Attributes

The following data attributes are output:

  • [Attribute Name].ExtractedValue: a new attribute containing the extracted part of the value. Where the input matched the list, the value is the part that matched. Where there was no match, the value is Null.

Flags

The following flags are output:

  • ExtractedFlag: indicates whether data has been extracted. Possible values: Y/N.

The following table describes the statistics produced by the processor:

Statistic      Description

Extracted      The number of records that matched the list, and so where an extraction was performed.

Unextracted    The number of records that did not match the list, and so where no extraction was performed.

Output Filters

The following output filters are available:

  • Records that matched the list

  • Records that did not match the list
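As a rough illustration of how the outputs, statistics, and output filters relate, the following Python sketch applies a Whole Value match with case ignored. The function name `run_extract` and the record layout are assumptions made for this example, not part of the product.

```python
def run_extract(values, reference):
    """Sketch only: Whole Value match, ignoring case, over a list of input values."""
    wanted = {entry.lower() for entry in reference}
    matched, unmatched = [], []
    for value in values:
        hit = value is not None and value.lower() in wanted
        row = {
            "input": value,
            "ExtractedValue": value if hit else None,   # Null when nothing matched
            "ExtractedFlag": "Y" if hit else "N",
        }
        # The two lists stand in for the two output filters.
        (matched if hit else unmatched).append(row)
    stats = {"Extracted": len(matched), "Unextracted": len(unmatched)}
    return matched, unmatched, stats

units = ["PINTS", "PNTS", "PTS"]
matched, unmatched, stats = run_extract(["PTS", "pints", "LITRES", None], units)
print(stats)
```

Every input record lands in exactly one of the two filters, so under these assumptions the Extracted and Unextracted counts always sum to the number of records processed.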

Example

In this example, Extract Values is used to extract the County value from an ADDRESS3 attribute that normally contains only the County, but in some cases contains the County followed by other information, such as a Postcode. The list is matched using the Starts With option, and the matching values are extracted to an output attribute named County:

ADDRESS3.trimmed      County

Cheshire              Cheshire
Kent                  Kent
Surrey, CB0 8YN       Surrey
Herts, AL1 3HL        Herts
Cambridgeshire        Cambridgeshire
Essex, SS2 5QN        Essex
London, WC2E 8JG      London
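The same result can be approximated outside the product with a short Python sketch. The reference list `COUNTIES` and the function `extract_county` are hypothetical: the source does not show the actual list, so it is assumed here to contain the seven counties in the example, matched with case ignored.

```python
import re

# Assumed reference list; the real list used in the example is not shown.
COUNTIES = ["Cheshire", "Kent", "Surrey", "Herts", "Cambridgeshire", "Essex", "London"]

def extract_county(address3, counties=COUNTIES):
    """Starts With match: return the leading county name, or None if there is none."""
    for county in counties:
        found = re.match(re.escape(county), address3, re.IGNORECASE)
        if found:
            return found.group(0)
    return None

rows = [
    "Cheshire",
    "Kent",
    "Surrey, CB0 8YN",
    "Herts, AL1 3HL",
    "Cambridgeshire",
    "Essex, SS2 5QN",
    "London, WC2E 8JG",
]
for row in rows:
    print(f"{row:<20}  {extract_county(row)}")
```

Because the match is anchored at the start of the value, a value such as "AL1 3HL, Herts" would not be extracted in this sketch; Contains or Delimiter Match would be needed for that layout.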