Providing a whitelist to extract terms from attribute values

If you know the exact words you want to extract from an attribute, you can use the Tag from Whitelist transformation to extract them into a new attribute. You can also replace the extracted terms with another term before they are written to the new attribute.

Each line in the editor represents a single tagging action. Types of available tagging actions include:

Tagging action: Extract a single term.
Syntax: Specify the term on its own line.
Example:
red

If the term red is found, it is added to the output attribute.

Tagging action: Extract a single term and replace it with another term.
Syntax: Specify the term on its own line, add a tab, and specify the replacement term.
Example:
red<tab>color

If red is found, it is replaced by color in the output attribute.

Tagging action: Extract several terms and replace them with one term.
Syntax: Give each selected term the same replacement term. This is the same as single-term replacement, but you use a new line for each term and its replacement.
Example:
red<tab>color
yellow<tab>color
blue<tab>color

If red, yellow, or blue is found, the term is extracted and replaced by color in the output attribute.
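
All three tagging actions use the same line format, so a whitelist can be read as a mapping from each term to the value that is written to the output attribute. The following Python sketch illustrates that reading only; it is not how the product is implemented. The helper name parse_whitelist is hypothetical, and the sketch assumes that <tab> in the examples stands for a literal tab character and that blank lines are ignored.

```python
def parse_whitelist(text):
    """Map each whitelist term to the value emitted when the term is found.

    Hypothetical helper for illustration; not part of the product.
    """
    mapping = {}
    for line in text.splitlines():
        # Split on the first tab: term on the left, replacement on the right.
        term, _, replacement = line.partition("\t")
        term = term.strip()
        replacement = replacement.strip()
        if not term:
            continue  # assumption: blank lines carry no tagging action
        # Without a replacement, the term itself is the output value.
        mapping[term] = replacement or term
    return mapping


whitelist = "red\tcolor\nyellow\tcolor\nblue\tcolor\ngreen"
print(parse_whitelist(whitelist))
# {'red': 'color', 'yellow': 'color', 'blue': 'color', 'green': 'green'}
```

In this reading, the three-line replacement example and a single-term line differ only in whether the output value equals the term.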

To provide a whitelist to extract terms from attribute values:

  1. In the Catalog, select a project.
  2. Select Transform.
  3. Locate an attribute, of type String, that has text you want to extract.
  4. From the transform menu, select Advanced > Tag from whitelist.
  5. Click Enter Terms.
  6. Specify the terms you want to extract, using the tagging action syntax described above.
    For example:
    [Figure: the Whitelist Tagger with a three-line replacement example]

  7. If desired, specify whether you want a case-sensitive match or a whole-word match (no substring matches) for the terms.
  8. Click Done.
  9. In New Attribute Name, specify the name of the attribute you want to create.
  10. Click Preview to see the results of running the transformation, or click Add to Script to save the transformation step to the script.
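
To illustrate how the two matching options in step 7 change the result, the following Python sketch applies a small whitelist mapping to an attribute value. It is an approximation under stated assumptions, not the product's implementation: the function name tag_from_whitelist is hypothetical, whole-word matching is approximated with regular-expression word boundaries, and the output is assumed to be a list of distinct values.

```python
import re


def tag_from_whitelist(value, mapping, *, case_sensitive, whole_word):
    """Return the output values for every whitelist term found in value.

    Hypothetical helper for illustration; not part of the product.
    """
    flags = 0 if case_sensitive else re.IGNORECASE
    tags = []
    for term, output in mapping.items():
        pattern = re.escape(term)  # treat the term as literal text
        if whole_word:
            # Assumption: "whole word" means bounded by non-word characters.
            pattern = r"\b" + pattern + r"\b"
        if re.search(pattern, value, flags) and output not in tags:
            tags.append(output)
    return tags


mapping = {"red": "color", "blue": "color", "oak": "oak"}

print(tag_from_whitelist("Red oak table", mapping,
                         case_sensitive=False, whole_word=True))
# ['color', 'oak']
print(tag_from_whitelist("Red oak table", mapping,
                         case_sensitive=True, whole_word=True))
# ['oak']
print(tag_from_whitelist("redwood", mapping,
                         case_sensitive=True, whole_word=False))
# ['color']
```

With whole-word matching turned on, the value redwood produces no tag, because red appears only as a substring.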

If you are done making changes to the project data set, you can commit the changes. See Running the transformation script against a project data set.