Before you Begin
This 10-minute tutorial shows you how to modify columns and data in a data flow to create curated datasets in Oracle Analytics. This tutorial uses a spreadsheet as the data source; however, you can use any supported data source.
Background
You might need to change your data before using it in analyses. In a data flow, you can add, remove, change, or merge columns, add calculations, modify the data contained in the columns, and create multiple datasets from one data flow. If you schedule the data flow to run periodically, each run captures updates in the data source and reapplies the transformations to your curated datasets.
After running the data flow, you can use the dataset to analyze the data by creating visualizations.
What Do You Need?
- Access to Oracle Analytics
- The samp_revenue_denorm2024.xlsx file, downloaded to your computer
Create a Dataset and Data Flow
- Sign in to Oracle Analytics Cloud.
- On the Home page, click Create, and then click Dataset. In Create Dataset, click Drop data file here or click to browse, select the samp_revenue_denorm2024.xlsx file, and then click Open.
- In Create Dataset Table from samp_revenue_denorm2024, click OK.
- In the Join Diagram, click the OFFICE_NUMBER column, click Measure, and then click Attribute.
- Click the PROD_NUMBER column, click Measure, and then click Attribute.
- Click the ORDER_NUMBER column, click Measure, and then click Attribute.
- Click Save. In Save Dataset As, enter Sample Revenue 2024, and then click OK.
- Click the samp_revenue_denorm2024 tab. In the dataset, click the No data column, select Options, and then select Delete.
- Click Save. Click Go back.
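The steps above change numeric identifier columns from Measure to Attribute. A measure is aggregated (for example, summed) when you use it in a visualization, whereas an attribute is used to group rows. The following Python sketch illustrates the difference outside Oracle Analytics; the row values are invented for illustration, and none of the variable names are part of the product.

```python
from collections import defaultdict

# Hypothetical sample rows; the values are invented for illustration only.
rows = [
    {"ORDER_NUMBER": 1001, "REVENUE": 250.0},
    {"ORDER_NUMBER": 1001, "REVENUE": 100.0},
    {"ORDER_NUMBER": 1002, "REVENUE": 75.0},
]

# Treated as a measure, ORDER_NUMBER would be summed, which has no meaning.
order_number_as_measure = sum(row["ORDER_NUMBER"] for row in rows)

# Treated as an attribute, ORDER_NUMBER groups the rows,
# and REVENUE is the measure that is summed within each group.
revenue_by_order = defaultdict(float)
for row in rows:
    revenue_by_order[row["ORDER_NUMBER"]] += row["REVENUE"]

print(order_number_as_measure)  # 3004
print(dict(revenue_by_order))   # {1001: 350.0, 1002: 75.0}
```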
Create a Data Flow
In this section, you use the Sample Revenue 2024 dataset to create a data flow that branches to create separate datasets.
- On the Home page, click Create, and then click Data Flow. In Add Data, select Sample Revenue 2024, and then click Add.
- From the Data Flow Steps panel, drag Branch to the Add a step node.
Branch uses 2 as the default number of datasets created from the source dataset. You can increase the number of datasets created when the data flow is run.
- Click Add a step node on the top branch, and then select Merge Columns.
- In Merge Columns, enter Prod_Attribute in New column name. Next to Merge column, click the hyperlink, and then select PROD_ATTRIBUTE1. Next to With, click the hyperlink, and then select PROD_ATTRIBUTE2. From the Delimiter list, select Space ().
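To make the effect of the Merge Columns step concrete, here is a minimal Python sketch of joining two column values with a space delimiter. It is an illustration only, not how Oracle Analytics implements the step: the merge_columns helper and the row values are hypothetical, and the sketch keeps the source columns in the row without asserting what the data flow itself does with them.

```python
def merge_columns(row, left, right, new_name, delimiter=" "):
    """Return a copy of row with a new column that joins two existing columns."""
    merged = dict(row)  # copy so the input row is not modified
    merged[new_name] = f"{row[left]}{delimiter}{row[right]}"
    return merged

# Hypothetical row; the attribute values are invented for illustration.
row = {
    "PROD_NAME": "Widget",
    "PROD_ATTRIBUTE1": "Blue",
    "PROD_ATTRIBUTE2": "Large",
}

result = merge_columns(row, "PROD_ATTRIBUTE1", "PROD_ATTRIBUTE2", "Prod_Attribute")
print(result["Prod_Attribute"])  # Blue Large
```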
Select Columns to Create a Dataset
In this section, you select the columns used to create a PRODUCTS dataset.
- From the Data Flow Steps panel, drag Select Columns to Add a step between Merge Columns and the Save Data node.
- In Select Columns, click Remove all. Hold down the Ctrl key and select the following columns:
- PROD_NAME
- PROD_TYPE
- PROD_LOB
- PROD_BRAND
- PROD_NUMBER
- Prod_Attribute
- Click Add selected.
- Click the top Save Data node. In Save Dataset, enter Products in the Dataset field. In the PROD_NUMBER row, click Measure in the Treat As column, and then select Attribute.
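The Select Columns step above amounts to a projection: only the listed columns reach the Products dataset. The following Python sketch shows the idea with the column names from this section; the select_columns helper and the sample values are hypothetical.

```python
# Column names taken from the Select Columns step in this section.
PRODUCT_COLUMNS = [
    "PROD_NAME", "PROD_TYPE", "PROD_LOB",
    "PROD_BRAND", "PROD_NUMBER", "Prod_Attribute",
]

def select_columns(rows, columns):
    """Keep only the named columns from each row, in the given order."""
    return [{name: row[name] for name in columns} for row in rows]

# Hypothetical source row; the values are invented for illustration.
source_rows = [{
    "PROD_NAME": "Widget", "PROD_TYPE": "Hardware", "PROD_LOB": "Tools",
    "PROD_BRAND": "Acme", "PROD_NUMBER": 42, "Prod_Attribute": "Blue Large",
    "REVENUE": 250.0, "ORDER_NUMBER": 1001,
}]

products = select_columns(source_rows, PRODUCT_COLUMNS)
print(list(products[0]))  # REVENUE and ORDER_NUMBER are no longer present
```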
Create a Second Dataset
- From the Data Flow Steps panel, drag Select Columns to Add a step between the Branch and the second Save Data nodes.
- In Select Columns, click Remove all. Hold down the Ctrl key and select the following columns:
- PROD_NAME
- ORDER_NUMBER
- REVENUE
- UNITS
- DISCNT_VALUE
- BILL_DAY_DT
- ORDER_DAY_DT
- ORDER_STATUS
- ORDER_TYPE
- Click Add selected.
- Drag Add Columns to the Add a step node between Select Columns and Save Data. In Add Columns, enter ACTUAL_REVENUE in Name.
- In the Expression field, start entering Revenue, and then select REVENUE from Available Data. Expand Operators, and double-click the minus sign (-). After the minus sign, start entering DIS, and then select DISCNT_VALUE from Available Data.
- Click Validate, and then click Apply.
- Click the Save Data node on the branch with Add Columns. In Save Dataset, enter Orders in Dataset. In the ORDER_NUMBER row, click Measure in the Treat As column, and then select Attribute.
- Click Save. In Save Data Flow As, enter Sample Revenue DF in Name, and then click OK.
- Click Run Data Flow.
- After the data flow run completes, click Go back.
- On the Home page, select Products, click Actions, and then select Inspect. In the PRODUCTS dataset, click Data Elements to review the dataset. Click Close.
- (Optional) On the Home page, select Orders, click Actions, and then select Inspect. In the ORDERS dataset, click Data Elements to review the dataset. Click Close.
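As a rough model of what this data flow produces, the Python sketch below reads one set of source rows and derives two outputs, the way the Branch step feeds two Save Data nodes. The Orders output adds ACTUAL_REVENUE as REVENUE minus DISCNT_VALUE, matching the expression built in the steps above. The add_actual_revenue helper, the shortened column lists, and the row values are assumptions made for illustration.

```python
def add_actual_revenue(row):
    """Return a copy of row with ACTUAL_REVENUE = REVENUE - DISCNT_VALUE."""
    out = dict(row)
    out["ACTUAL_REVENUE"] = row["REVENUE"] - row["DISCNT_VALUE"]
    return out

# Hypothetical source rows; the values are invented for illustration.
source_rows = [
    {"PROD_NAME": "Widget", "PROD_BRAND": "Acme",
     "ORDER_NUMBER": 1001, "REVENUE": 250.0, "DISCNT_VALUE": 25.0},
    {"PROD_NAME": "Gadget", "PROD_BRAND": "Acme",
     "ORDER_NUMBER": 1002, "REVENUE": 80.0, "DISCNT_VALUE": 0.0},
]

# Branch: both outputs are derived from the same source rows.
# The column lists are shortened here to keep the sketch small.
products = [
    {name: row[name] for name in ("PROD_NAME", "PROD_BRAND")}
    for row in source_rows
]
orders = [
    add_actual_revenue(
        {name: row[name]
         for name in ("PROD_NAME", "ORDER_NUMBER", "REVENUE", "DISCNT_VALUE")}
    )
    for row in source_rows
]

print(orders[0]["ACTUAL_REVENUE"])  # 225.0
print(orders[1]["ACTUAL_REVENUE"])  # 80.0
```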
Schedule the Data Flow Run
In this section, you schedule the data flow to run by defining the repetition, duration, and interval. Because your data might not change frequently, define a schedule that matches how often the source is updated.
- On the Home page, click the Data search tag, enter Sample Revenue 2024 to locate the data flow, and then click Search.
- Select the Sample Revenue 2024 data flow, click Actions, and select New schedule.
- In Schedules, click New.
- In Schedule, enter a Name for the schedule. In Start, click the calendar, and then select a month and day. In Time, click the clock to select the hour and minutes for the run's start time.
- From Repeat, select No Repeat as the frequency to use for running the data flow, and then click OK.
In a production environment, you can select a repeat frequency that suits how often the data flow needs to run.
Learn More
Create Curated Data with Data Flows in Oracle Analytics
F17659-07
February 2024
Copyright © 2024, Oracle and/or its affiliates.
Learn how to modify columns and data from a spreadsheet file with a data flow to create datasets in Oracle Analytics.