Transform Data Using Replace

Transform data in a column using a simple pattern match or a regular (regex) expression. For example, in a State column, you might change CA to California.

You can use any Perl Compatible Regular Expression (PCRE), which are also referred to as regex expressions. Transforms are applied to every applicable row in a dataset.
  1. On the Home page, open a dataset or workbook.
    • Select the dataset, click the Actions menu, and then select Open. In the Data Diagram or Join Diagram, right-click a data source and click Open to display the transform editor.
    • Select the workbook that contains the dataset, click the Actions menu, select Open, then click Data. In the Data Diagram or Join Diagram, right-click a data source and click Open to display the transform editor.

    Note:

    Oracle Analytics displays a Data Diagram for single-table datasets or a Join Diagram for multi-table datasets.
  2. Hover over or select the text column that you want to transform.
  3. Click Options at the base of the quality tile, then click Replace.
    Tip: To replace more than one value at the same time, use the Replace Value List option.
  4. In the Replace step editor, specify how you'd like to match values. For example:
    • Click Use regular expression to match using a complex regular expression (regex).
    • Click Match entire values only to change "male" to "M" but not "female" to "feM".
  5. In the String to replace field and New string field, specify the literal text or regular expressions to search and replace (case-sensitive).
    Examples:
    • To change "2553 Bolman Court" to "2553 #Bolman# #Court#", select Use regular expression, enter ([A-Za-z]+) in the String to replace field, and enter #$1# in the New String field.
    • To change "male" with "M", select Match entire values only, enter male in the String to replace field, and enter M in the New String field. . This changes "male" to "M" but won't change "female" to "feM".

    When you complete both fields, you see a preview of the transformed data in the preview pane. Adjust the replace options until you get the transformation you want in the preview pane.
  6. Click Add Step.
    Oracle Analytics adds a step to the Preparation Script panel.
  7. To save your data preparation changes and apply them to your data, click Save.

Example Replace Transforms Using Regular Expressions

Here are just a few examples of using regular (regex) expressions to transform data.

In these examples, the Search Expression shows what to enter in the String to replace field, and the Replace Value shows what to enter in the New string field.

Table 4-1 Example replace transforms

Search Expression Replace Value Original Text Replaced Text Notes
@([a-z]+)(?=\.[a-z]{3}) @example MichelePFalk@yahoo.com MichelePFalk@example.com This example replaces domain details in email addresses.
^Gray|Grey$ Silver

Grey

Gray

Graystone

Silver

Silver

Graystone

The ^$ characters mean only look for entire string matches. The vertical bar is the regular expression for OR, so in this case the regular expression looks for either "Gray" or "Grey" and replace with "Silver".

The string Graystone isn't transformed because the regular expression is looking for entire value matches only.

\d+ 9999

8398 Park Street

123 Oracle Parkway

9999 Park Street

9999 Oracle Parkway

This regular expression looks for one" \d" or more "+" digits and replaces them with "9999". The replace also works when the original text has only three digits.
([A-z]+)(\d+) $2

UA101654

US829383

101654

829383

This regular expression is looking for one "[A-z]" or more "+" consecutive letters followed by one" \d" or more "+" digits - each of the two expressions is bounded by parentheses, which captures two groups - the first one ([A-z]) and the second one (\d+) - these groups are numbered automatically and can then be used in the replacement by using the dollar sign for the second group, for example, "$2".
([A-z]+)(\d+) Postal Code: $2

UA101654

US829383

Postal Code: 101654

Postal Code: 829383

This example uses the same match expression as the previous example, except that it shows how to insert your own replacement text in addition to a group. Text can be inserted before and after a recalled match group.