What Steps Can I Use to Organize and Integrate My Data?

Use these steps in data flows to organize, integrate, and transform your data. For example, you might merge data sources, aggregate data, or perform geo-spatial analysis.

Steps enable you to transform your data visually without requiring coding skills.

Use the data flow editor to add steps to your data flows.

Add Columns

Add custom columns to your target dataset. For example, you might calculate the value of your stock by multiplying the number of units in a UNITS column by the sale price in a RETAIL_PRICE column (that is, UNITS * RETAIL_PRICE).
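
The step itself is configured visually, but the calculation can be sketched in plain Python to show what the new column contains. The sample rows and the STOCK_VALUE column name are assumptions made for illustration; only UNITS and RETAIL_PRICE come from the example above.

```python
# Hypothetical sample rows; only UNITS and RETAIL_PRICE are named in the text.
rows = [
    {"PRODUCT": "A", "UNITS": 10, "RETAIL_PRICE": 2.5},
    {"PRODUCT": "B", "UNITS": 4, "RETAIL_PRICE": 12.0},
]


def add_stock_value(rows):
    """Return copies of the rows with an extra STOCK_VALUE column."""
    return [
        {**row, "STOCK_VALUE": row["UNITS"] * row["RETAIL_PRICE"]}
        for row in rows
    ]


result = add_stock_value(rows)
for row in result:
    print(row["PRODUCT"], row["STOCK_VALUE"])
```

The input rows are left unchanged, which mirrors how a step produces a new output rather than editing its source.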

Add Data

Add data sources to your data flow. For example, if you're merging two datasets, you add both datasets to your data flow. See Database Support for Data Flows.

Aggregate

Create group totals by applying aggregate functions such as count, sum, or average.
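
As a rough illustration of what grouping and aggregating produces, the following Python sketch computes count, sum, and average per group. The REGION and REVENUE columns and the sample data are hypothetical, and this is not how Oracle Analytics implements the step.

```python
from collections import defaultdict

# Hypothetical sales rows used only for illustration.
sales = [
    {"REGION": "East", "REVENUE": 100},
    {"REGION": "West", "REVENUE": 50},
    {"REGION": "East", "REVENUE": 300},
]


def aggregate(rows, group_col, value_col):
    """Group rows by group_col and compute count, sum, and average of value_col."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_col]].append(row[value_col])
    return {
        key: {
            "count": len(values),
            "sum": sum(values),
            "average": sum(values) / len(values),
        }
        for key, values in groups.items()
    }


totals = aggregate(sales, "REGION", "REVENUE")
print(totals)
```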

Analyze Sentiment

Detect sentiment for a given text column. For example, you might analyze customer feedback to determine whether it's positive or negative. Sentiment analysis evaluates text based on words and phrases that indicate a positive, neutral, or negative emotion. Based on the outcome of the analysis, a new column contains Positive, Neutral, or Negative.

Apply AI Model

Analyze data using an artificial intelligence model. For example, you might perform object detection, image classification, or text detection using a model created in the OCI Vision service. See Use OCI Vision Models in Oracle Analytics. You can also perform language analysis such as sentiment analysis and language detection using models created in OCI Language Service.

Apply Model

Analyze data by applying a machine learning model from Oracle Machine Learning or OCI Data Science. For example, you might have created a classification model to predict whether emails are spam or not spam. See Apply a Predictive or Registered Oracle Machine Learning Model to a Dataset.

Apply Custom Script

Transform your data using a function, such as one defined in Oracle Cloud Infrastructure (OCI). For example, you might use a function to convert English text into Spanish or German. Your Oracle Analytics administrator registers these functions to make them available to you.

AutoML

Use Oracle Autonomous Data Warehouse's AutoML capability to recommend and train a predictive model for you. The AutoML step analyzes your data, calculates the best algorithm to use, and registers a prediction model in Oracle Analytics. The analytics are computed in the database, not in Oracle Analytics. This step is available in the step selector when you're connected to a dataset based on Oracle Autonomous Data Warehouse.

See Train a Predictive Model Using AutoML in Oracle Autonomous Data Warehouse.

Bin

Assign data values into categories, such as high, low, or medium. For example, you might categorize values for RISK into three bins for low, medium, and high.
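
A minimal sketch of binning, assuming RISK is a numeric score from 0 to 100 and that the bin boundaries are 33 and 66. Both the scale and the thresholds are assumptions chosen for illustration; in the step you define your own bins.

```python
def bin_risk(value, low_max=33, medium_max=66):
    """Map a numeric RISK value to one of three bins (thresholds are hypothetical)."""
    if value <= low_max:
        return "low"
    if value <= medium_max:
        return "medium"
    return "high"


risk_values = [10, 33, 50, 90]
bins = [bin_risk(v) for v in risk_values]
print(bins)
```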

Branch

Create multiple outputs from a data flow. For example, if you have sales transaction data based on country, you might save data for United States in the first branch and data for Canada in the second branch.

Create Essbase Cube

Create an Essbase cube from a spreadsheet or database.

Cumulative Value

Calculate cumulative totals, such as a moving aggregate or a running aggregate.
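
The difference between the two kinds of cumulative value can be sketched in Python. Here a running aggregate sums every value so far, and a moving aggregate sums only the most recent values in a window. The function names and the window size are assumptions for illustration.

```python
from itertools import accumulate


def running_total(values):
    """Sum of all values up to and including each position."""
    return list(accumulate(values))


def moving_sum(values, window):
    """Sum of each value and up to (window - 1) values before it."""
    return [
        sum(values[max(0, i - window + 1): i + 1])
        for i in range(len(values))
    ]


monthly_sales = [10, 20, 30, 40]
running = running_total(monthly_sales)
moving = moving_sum(monthly_sales, window=2)
print(running)
print(moving)
```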

Database Analytics

Perform advanced analysis and data mining analysis. For example, you can detect anomalies, cluster data, sample data, and perform affinity analysis. This step is available in the step selector when you're connected to a dataset based on Oracle database or Oracle Autonomous Data Warehouse. The analytics are computed in the database, not in Oracle Analytics. See Database Analytics Functions.

Filter

Select only the data that you're interested in. For example, you might create a filter to limit sales revenue data to the years 2020 through 2022.
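
The year-range filter from the example might look like this in Python. The YEAR and REVENUE columns and the sample rows are hypothetical, and the range is treated as inclusive at both ends.

```python
# Hypothetical revenue rows used only for illustration.
orders = [
    {"YEAR": 2019, "REVENUE": 120},
    {"YEAR": 2020, "REVENUE": 150},
    {"YEAR": 2022, "REVENUE": 90},
    {"YEAR": 2023, "REVENUE": 200},
]


def filter_years(rows, first_year, last_year):
    """Keep rows whose YEAR falls within the inclusive range."""
    return [row for row in rows if first_year <= row["YEAR"] <= last_year]


filtered = filter_years(orders, 2020, 2022)
print(filtered)
```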

Graph Analytics

Perform geo-spatial analysis, such as calculating the distance or the number of hops between two vertices. This step is available in the step selector when you're connected to a dataset based on Oracle database or Oracle Autonomous Data Warehouse. The analytics are computed in the database, not in Oracle Analytics. See Graph Analytics Functions.

Group

Categorize non-numeric data into groups that you define. For example, you might put orders for lines of business Communication and Digital into a group named Technology, and orders for Games and Stream into a group named Entertainment.
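
Conceptually, grouping is a lookup from each value to the group you assigned it to. The sketch below uses the lines of business from the example. The fallback group name "Others" for unassigned values is an assumption.

```python
# Group assignments taken from the example above.
GROUPS = {
    "Communication": "Technology",
    "Digital": "Technology",
    "Games": "Entertainment",
    "Stream": "Entertainment",
}


def assign_group(line_of_business, default="Others"):
    """Return the group for a line of business, or a default if unassigned."""
    return GROUPS.get(line_of_business, default)


lines = ["Digital", "Stream", "Hardware"]
assigned = [assign_group(lob) for lob in lines]
print(assigned)
```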

Join

Combine data from multiple data sources using a database join based on a common column. For example, you might join an Orders dataset to a Customer_orders dataset using a customer ID field.
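
To illustrate joining on a common column, this sketch performs an inner join in plain Python, keeping only rows that match in both inputs. The CUSTOMER_ID column name, the sample rows, and the choice of an inner join are assumptions; the step lets you choose which rows to keep from each input.

```python
# Hypothetical rows standing in for the Orders and Customer_orders datasets.
orders = [
    {"ORDER_ID": 1, "CUSTOMER_ID": "C1"},
    {"ORDER_ID": 2, "CUSTOMER_ID": "C2"},
    {"ORDER_ID": 3, "CUSTOMER_ID": "C9"},
]
customer_orders = [
    {"CUSTOMER_ID": "C1", "CUSTOMER_NAME": "Ana"},
    {"CUSTOMER_ID": "C2", "CUSTOMER_NAME": "Ben"},
]


def inner_join(left, right, key):
    """Return merged rows for every left/right pair that shares the same key value."""
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    return [
        {**left_row, **right_row}
        for left_row in left
        for right_row in index.get(left_row[key], [])
    ]


joined = inner_join(orders, customer_orders, "CUSTOMER_ID")
print(joined)
```

Order 3 is dropped because customer C9 has no match in the second input.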

Merge

Combine multiple columns into a single column. For example, you might merge the street address, street name, state, and ZIP code columns into one column.
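
Merging columns amounts to concatenating their values with a delimiter. A minimal sketch, assuming hypothetical column names, a made-up address, and a space as the delimiter:

```python
def merge_columns(row, columns, delimiter=" "):
    """Concatenate the named columns of a row into one string."""
    return delimiter.join(str(row[column]) for column in columns)


# Hypothetical address row used only for illustration.
address = {
    "STREET_ADDRESS": "12",
    "STREET_NAME": "Main Street",
    "STATE": "CA",
    "ZIP": "94000",
}

full_address = merge_columns(
    address, ["STREET_ADDRESS", "STREET_NAME", "STATE", "ZIP"]
)
print(full_address)
```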

Rename Columns

Change the name of a column to something more meaningful. For example, you might change CELL to Contact Cell Number.

Reorder Columns

Change the ordering of columns in the output dataset. For example, you might want to order columns alphabetically based on column name, or order columns based on data type (character, integer, and so on).

Save Data

Specify where to save the data generated by the data flow. You can save the data in a dataset in Oracle Analytics or in a database. You can also specify runtime parameters, or change the default dataset name. See Database Support for Data Flows.

Select Columns

Specify which columns to include or exclude in your data flow (the default is to include all data columns).

Split Columns

Extract data from within columns. For example, if a column contains 001011Black, you might split this data into two separate columns, 001011 and Black.
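
One way to picture the split in the example is to cut the value where the leading digits end. This Python sketch does that with a regular expression. Splitting at the digit boundary is an assumption; the same result could come from splitting at a fixed position.

```python
import re


def split_code_and_name(value):
    """Split a value such as '001011Black' into its leading digits and the rest.

    Values that do not start with digits followed by other text are returned
    unchanged, with an empty second part.
    """
    match = re.fullmatch(r"(\d+)(\D.*)", value)
    if match is None:
        return value, ""
    return match.group(1), match.group(2)


parts = split_code_and_name("001011Black")
print(parts)
```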

Time Series Forecast

Calculate forecasted values based on historical data. A forecast takes a time column and a target column from a given dataset and calculates forecasted values for the target column.

Train <model type>

Train machine learning models using algorithms for numeric prediction, multi-classification, binary classification, and clustering. See Data Flow Steps for Training Machine Learning Models.

When you've trained a machine learning model, apply it to your data using the Apply Model step.

Transform Column

Change the format, structure, or values of data. For example, you might convert text to uppercase, trim leading and trailing spaces from data, or calculate a percentage increase in value.
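
The three transformations mentioned above can be sketched in Python as follows. The function names and sample values are assumptions for illustration only.

```python
def clean_text(value):
    """Trim leading and trailing spaces, then convert to uppercase."""
    return value.strip().upper()


def percentage_increase(old_value, new_value):
    """Return the increase from old_value to new_value as a percentage."""
    return (new_value - old_value) / old_value * 100


cleaned = clean_text("  widget ")
increase = percentage_increase(80, 100)
print(cleaned, increase)
```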

Union Rows

Merge the rows of two data sources (known as a UNION command in SQL terminology). You can match columns by order or name.
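
The difference between matching by order and matching by name can be sketched in Python. Each table here is a hypothetical pair of column names and row tuples. The sketch keeps every row from both inputs, including duplicates, which is a simplifying assumption.

```python
def union_rows(first, second, match_by="name"):
    """Append the rows of second to first, matching columns by name or by order."""
    first_columns, first_rows = first
    second_columns, second_rows = second
    if len(first_columns) != len(second_columns):
        raise ValueError("Both inputs must have the same number of columns")
    if match_by == "order":
        # Columns are paired by position; names in the second input are ignored.
        return first_columns, first_rows + second_rows
    if match_by == "name":
        # Reorder each row of the second input to follow the first input's columns.
        positions = [second_columns.index(name) for name in first_columns]
        reordered = [tuple(row[p] for p in positions) for row in second_rows]
        return first_columns, first_rows + reordered
    raise ValueError("match_by must be 'name' or 'order'")


first = (["ID", "NAME"], [(1, "a")])
second = (["NAME", "ID"], [("b", 2)])

by_name = union_rows(first, second, match_by="name")
by_order = union_rows(first, second, match_by="order")
print(by_name)
print(by_order)
```

Matching by order puts "b" under ID here, which shows why matching by name is safer when the two inputs list their columns differently.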