Before you Begin
This 10-minute tutorial shows you how to use regular expressions while preparing data for use in projects, data flows, or reports.
Background
While preparing data in Oracle Analytics, you can use regular expressions to match a specific text pattern in your column data, and then remove it or replace it with a different text string. Oracle Analytics supports Perl Compatible Regular Expressions (PCRE) syntax.
The replace transforms in this tutorial use the following regular expression syntax:
- ^ : use a caret to indicate the start of a string
- $ : use a dollar sign to indicate the end of a string
- | : use a pipe as an OR operator
- . : use a period to search for any character
- * : use an asterisk to search for zero or more of the preceding character
- \d+ : use this combination to search for one or more consecutive digits in the string
- [A-Z] : use brackets around uppercase letters A through Z to search for any uppercase letter
- $n : use to represent a group, specified by the number that replaces n, in your expression containing the value for the new string
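To try these syntax elements outside the product, the following is a minimal sketch using Python's built-in re module. It is an approximation rather than Oracle Analytics itself: Python's engine is not PCRE, although these basic constructs behave the same way in both. The sample strings are made up for illustration, and Python writes a group reference as \g<n> where the Replace dialog uses $n.

```python
import re

# Hypothetical sample values, not taken from the tutorial's workbook.
checks = {
    "^Gr": "Gray",        # caret: the string starts with "Gr"
    "ay$": "Gray",        # dollar sign: the string ends with "ay"
    "Gray|Grey": "Grey",  # pipe: either alternative matches
    "Gr.y": "Grey",       # period: any single character
    "Gre*y": "Gry",       # asterisk: zero or more of the preceding "e"
    r"\d+": "Unit 42",    # one or more consecutive digits
    "[A-Z]": "au-US",     # any one uppercase letter
}

# Record the text that each pattern matches in its sample string.
matches = {p: re.search(p, text).group(0) for p, text in checks.items()}
for pattern, found in matches.items():
    print(f"{pattern!r} matched {found!r}")

# Group references: swap two words by referring to groups 1 and 2.
swapped = re.sub(r"(\w+) (\w+)", r"\g<2> \g<1>", "hello world")
print(swapped)  # world hello
```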
What Do You Need?
- Access to Oracle Analytics Cloud or Oracle Analytics Desktop
- Download data_replace_2024.xlsx to your computer
Create a Dataset
- Sign in to Oracle Analytics.
- On the Home page, click Create, and then click Dataset.
- In Create Dataset, click Drop data file here or click to browse. In File Upload, select data_replace_2024.xlsx, and then click Open.
- In Create Dataset Table from data_replace_2024.xlsx, click OK.
Description of data_set.png
- Click Save. In Save Dataset As, enter data_replace_2024 in Name, and click OK.
Implement Data Consistency
In this section, you create a regular expression to implement data consistency in the Uniform Color column. In reviewing the column, notice that the same color uses two spellings, gray and grey.
- In the data_replace_2024 dataset, click Toggle Quality Tiles.
- Select the Uniform Color column, click Options, and then select Replace.
- In Replace, click Use regular expression. Enter ^Gray$|^Grey$ in String to Replace. In New string, enter Silver, and then click Add Step.
Review the Uniform Color column to verify the change. Silver replaces the colors gray and grey, but doesn't replace the gray in graystone.
Description of the illustration gray_replaced.png
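The effect of the anchors in this expression can be checked with a short sketch in Python's re module, used here as a stand-in for the Replace dialog. The column values below are hypothetical. Because each alternative is wrapped in ^ and $, only values that consist entirely of Gray or Grey are replaced.

```python
import re

# Hypothetical values standing in for the Uniform Color column.
colors = ["Gray", "Grey", "Graystone", "Blue"]

# Same expression as in the tutorial step; ^ and $ force a whole-value match.
pattern = re.compile(r"^Gray$|^Grey$")
replaced = [pattern.sub("Silver", color) for color in colors]

print(replaced)  # ['Silver', 'Silver', 'Graystone', 'Blue']
```

Without the anchors, the pattern Gray|Grey would also match the first four letters of Graystone and turn it into Silverstone.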
Obfuscate Numbers in Street Addresses
In this section, you mask the numbers in street addresses by replacing the numbers with 9999.
- Select the Street Address column, click Options, and then select Replace.
- In Replace, click Use regular expression. Enter \d+ in String to Replace. In New string, enter 9999, and then click Add Step.
Description of the illustration mask_numbers.png
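As a rough illustration of this step, the sketch below applies the same expression with Python's re module. The addresses are invented for the example. In Python, re.sub replaces every run of digits in a value, so a unit or apartment number is masked as well; whether that matches your data in Oracle Analytics is worth confirming in the column preview.

```python
import re

# Hypothetical values standing in for the Street Address column.
addresses = ["123 Main Street", "45 Elm Ave Apt 6"]

# \d+ matches each run of one or more consecutive digits.
masked = [re.sub(r"\d+", "9999", address) for address in addresses]

print(masked)  # ['9999 Main Street', '9999 Elm Ave Apt 9999']
```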
Remove a Portion of a String
In this section, you use a regular expression to strip the alphabetic country codes such as AU, US, and FR from the postal codes in the Postal_Code column.
- Select the Postal_Code column, click Options, and then select Replace.
- In Replace, click Use regular expression. Enter ([A-Z]+)(\d+) in String to Replace. In New String, enter $2, and then click Add Step.
Description of the illustration postal_code.png
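The grouping in this expression can be sketched in Python as follows. The postal codes are hypothetical, and Python's re module writes the second group as \g<2> where the Replace dialog uses $2. Group 1 captures the uppercase letters and group 2 captures the digits, so keeping only group 2 drops the country code.

```python
import re

# Hypothetical values standing in for the Postal_Code column.
postal_codes = ["AU2000", "US94065", "FR75001"]

# Group 1 = country letters, group 2 = digits; keep only group 2.
stripped = [re.sub(r"([A-Z]+)(\d+)", r"\g<2>", code) for code in postal_codes]

print(stripped)  # ['2000', '94065', '75001']
```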
Extract Data from a String
In this section, you extract the domain from the email address in the Customer Preferred Contact column using regular expression groups, and add text to the new string.
- Select the Customer Preferred Contact column, click Options, and then select Replace.
- In Replace, click Use regular expression. Enter (.*)(@)(.*) in String to Replace. In New String, enter Domain=$3, and then click Add Step.
Description of the illustration email_domain.png
- Click Save.
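The extraction step can be approximated with the sketch below, again using Python's re module and invented email addresses. The three groups capture the text before the @, the @ itself, and the text after it; the replacement keeps only the third group and prefixes it with literal text. Python writes that reference as \g<3> where the Replace dialog uses $3.

```python
import re

# Hypothetical values standing in for the Customer Preferred Contact column.
contacts = ["pat@example.com", "lee@mail.example.org", "no-email-listed"]

# Group 3 holds everything after the @; values without an @ don't match
# and are returned unchanged.
domains = [re.sub(r"(.*)(@)(.*)", r"Domain=\g<3>", c) for c in contacts]

print(domains)
# ['Domain=example.com', 'Domain=mail.example.org', 'no-email-listed']
```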
Learn More
- Transform Data Using Replace in Oracle Analytics Cloud
- Transform Data Using Replace in Oracle Analytics Desktop
Prepare Data with Regular Expressions in Oracle Analytics
F31220-06
November 2024
Learn how to use regular expressions to replace data values while preparing data in Oracle Analytics.