Before you Begin

This 10-minute tutorial shows you how to use regular expressions while preparing data for use in projects, data flows, or reports.

Background

In Oracle Analytics while preparing data, you can use regular expressions to match a specific text pattern in your column data, and remove or replace it with a different text string. Oracle Analytics supports using Perl Compatible Regular Expressions (PCRE) syntax.

The replace transforms in this tutorial use the following regular expression syntax:

  • ^ use a caret to indicate the start of a string
  • $ use a dollar sign to indicate the end of a string
  • | use a pipe as an OR operator
  • . use a period to search for any character
  • * use an asterisk to search for zero
  • \d+ use this combination to search for the consecutive digits in the string
  • [A-Z] use brackets around uppercase letters A through Z to search for any uppercase letters
  • $n use to represent a group, specified by the number that replaces n, in your expression containing the value for the new string

What Do You Need?

  • Access to Oracle Analytics Cloud or Oracle Analytics Desktop
  • Download data_replace_2024.xlsx to your computer

Create a Dataset

  1. Sign in to Oracle Analytics.
  2. On the Home page, click Create, and then click Dataset.
  3. In Create Dataset, click Drop data file here or click to browse. In File Upload, select data_replace_2024.xlsx, and then click Open.
  4. In Create Dataset Table from data_replace_2024.xlsx, click OK.


    Description of data_set.png follows
    Description of data_set.png
  5. Click Save Save icon. In Save Dataset As, enter data_replace_2024 in Name, and click OK.

Implement Data Consistency

In this section, you create a regular expression to implement data consistency in the Uniform Color column. In reviewing the column, notice that the same color uses two spellings, gray and grey.

  1. In the data_replace_2024 dataset, click Toggle Quality Tiles Toggle Quality Tiles icon.
  2. Select the Uniform Color column, click Options Options menu icon, and then select Replace.
  3. In Replace, click Use regular expression. Enter ^Gray$|^Grey$ in String to Replace. In New string, enter Silver, and then click Add Step.

    Review the Uniform Color column to verify the change. Silver replaces the colors, gray and grey but doesn't replace the gray in graystone.


    Description of gray_replaced.png follows
    Description of the illustration gray_replaced.png

Obfuscate Numbers in Street Addresses

In this section, you mask the numbers in street addresses by replacing the numbers with 9999.

  1. Select the Street Address column, click Options Options menu icon, and then select Replace.
  2. In Replace, click Use regular expression. Enter \d+ in String to Replace. In New string, enter 9999, and then click Add Step.


    Description of mask_numbers.png follows
    Description of the illustration mask_numbers.png

Remove a Portion of a String

In this section, you use a regular expression to strip the alphabetic country codes such as AU, US, and FR from the postal codes in the Postal_Code column.

  1. Select the Postal_Code column, click Options Options menu icon, and then select Replace.
  2. In Replace, click Use regular expression. Enter ([A-Z]+)(\d+) in String to Replace. In New String, enter $2, and then click Add Step.


    Description of postal_code.png follows
    Description of the illustration postal_code.png

Extract Data from a String

In this section, you extract the domain from the email address in the Customer Preferred Contact column using regular expression groups, and add text to the new string.

  1. Select the Customer Preferred Contact column, click Options Options menu icon, and then select Replace.
  2. In Replace, click Use regular expression. Enter (.*)(@)(.*) in String to Replace. In New String, enter Domain=$3, and then click Add Step.


    Description of email_domain.png follows
    Description of the illustration email_domain.png
  3. Click Save Save icon.

Learn More