Before You Begin
In this tutorial, you create a dataset with multiple tables and select a column for Explain machine learning to analyze. When the dataset has more than one table, Explain reviews the table containing the selected column and the other tables in the dataset for their impact on the selected column.
Background
The Oracle Analytics Explain machine learning algorithm reviews the entire dataset for patterns and facts, and generates visualizations that you can use in your workbook. Explain uses Auto Insights as the basis for choosing columns, so only the most interesting columns are reviewed in the data analyses. By limiting its review to those columns, Explain provides analyses without compromising performance.
When you use Explain with a multiple-table or star schema dataset, you must identify the fact table by setting the Preserve Grain property on that table.
You can change the columns used in Explain by adding or removing the default column selections in Settings. Your column selections in Explain persist for the duration of the Oracle Analytics session, but they are not saved when you close the workbook.
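As a way to reason about which table to mark with Preserve Grain, the following minimal sketch treats the fact table as the one that references other tables but is not referenced by any of them. This is only a heuristic for illustration, not how Oracle Analytics works internally: you still set Preserve Grain by hand. The `FOREIGN_KEYS` mapping and the `find_fact_tables` helper are hypothetical names, and the relationships are assumed from the standard SH sample schema, limited to the six tables used in this tutorial.

```python
# Child table -> parent tables it references.
# Assumption: relationships as defined in the standard SH sample schema,
# restricted to the six tables selected in this tutorial.
FOREIGN_KEYS = {
    "SALES": ["PRODUCTS", "CUSTOMERS", "TIMES", "CHANNELS"],
    "CUSTOMERS": ["COUNTRIES"],
    "PRODUCTS": [],
    "TIMES": [],
    "CHANNELS": [],
    "COUNTRIES": [],
}


def find_fact_tables(foreign_keys):
    """Return tables that reference others but are never referenced themselves."""
    referenced = {parent for parents in foreign_keys.values() for parent in parents}
    return sorted(
        table
        for table, parents in foreign_keys.items()
        if parents and table not in referenced
    )


if __name__ == "__main__":
    print(find_fact_tables(FOREIGN_KEYS))
```

In this mapping, CUSTOMERS also references another table (COUNTRIES), but because SALES references CUSTOMERS, only SALES qualifies as the fact table.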
What Do You Need?
- Access to Oracle Analytics or Oracle Analytics Desktop
When using Oracle Analytics Desktop, you must install machine learning (DVML) to use Diagnostics Analytics (Explain), Machine Learning Studio, or advanced analytics.
- Access to the SH sample schema to perform the steps in this tutorial. See Installing Sample Schemas.
Create a Dataset with Multiple Tables
In this section, you create a dataset from the SH schema. By default, the Auto Join tables option uses the relationships defined in the schema to create the table joins.
This example uses the SH schema from an Oracle Database connection.
- Sign in to Oracle Analytics.
- On the Home page, click Create, and then click Dataset.
- In Create Dataset, select a connection that supports datasets with multiple tables to use as the source.
- In the Connections panel, expand the SH schema, hold down the Ctrl key, and then select the following:
- CHANNELS
- COUNTRIES
- CUSTOMERS
- PRODUCTS
- SALES
- TIMES
- Drag the tables to the Join Diagram.
Description of the illustration sample_sales_ds.png
- Right-click SALES and select Preserve Grain.
Description of the illustration sales_fact_table.png
- Click Save. In Save Dataset As, enter sample_sales in Name and click OK.
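To picture what the joined dataset looks like at the grain of SALES, here is a small sketch that uses Python's built-in `sqlite3` module rather than Oracle Database. The table and column names follow the SH schema, but only two dimension tables are included, the rows are made-up toy values, and the `build_toy_star` and `joined_rows` helpers are hypothetical. It does not reproduce how Oracle Analytics builds or queries the dataset.

```python
import sqlite3


def build_toy_star():
    """Create an in-memory star schema with a few made-up rows."""
    con = sqlite3.connect(":memory:")
    con.executescript(
        """
        CREATE TABLE PRODUCTS (PROD_ID INTEGER PRIMARY KEY, PROD_SUBCATEGORY TEXT);
        CREATE TABLE CHANNELS (CHANNEL_ID INTEGER PRIMARY KEY, CHANNEL_DESC TEXT);
        CREATE TABLE SALES (
            PROD_ID INTEGER REFERENCES PRODUCTS (PROD_ID),
            CHANNEL_ID INTEGER REFERENCES CHANNELS (CHANNEL_ID),
            QUANTITY_SOLD INTEGER,
            AMOUNT_SOLD REAL
        );
        -- Toy values for illustration only; not taken from the SH sample data.
        INSERT INTO PRODUCTS VALUES (1, 'Toy Subcategory A'), (2, 'Toy Subcategory B');
        INSERT INTO CHANNELS VALUES (10, 'Toy Channel X'), (20, 'Toy Channel Y');
        INSERT INTO SALES VALUES (1, 10, 2, 20.0), (1, 20, 1, 10.0), (2, 10, 5, 25.0);
        """
    )
    return con


def joined_rows(con):
    """Join each SALES row to its dimension attributes (one output row per sale)."""
    return con.execute(
        """
        SELECT p.PROD_SUBCATEGORY, c.CHANNEL_DESC, s.QUANTITY_SOLD, s.AMOUNT_SOLD
        FROM SALES s
        JOIN PRODUCTS p ON p.PROD_ID = s.PROD_ID
        JOIN CHANNELS c ON c.CHANNEL_ID = s.CHANNEL_ID
        ORDER BY p.PROD_SUBCATEGORY, c.CHANNEL_DESC
        """
    ).fetchall()


if __name__ == "__main__":
    for row in joined_rows(build_toy_star()):
        print(row)
```

The join returns exactly one row per SALES row, which is the sense in which the fact table sets the grain of the result.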
Create Visualization with Explain
In this section, you create a workbook, select a column for Explain to analyze, and then examine the visualizations.
- Click Create Workbook.
- Close the Auto Insights panel.
- In the Data panel, expand PRODUCTS, right-click PROD_SUBCATEGORY, and then select Explain PROD_SUBCATEGORY.
Description of the illustration basic_facts.png
- In Basic Facts about PROD_SUBCATEGORY, scroll down to view the horizontal bar chart of PROD_SUBCATEGORY by AMOUNT_SOLD.
Description of the illustration basic_fact_amt_sold.png
- Click Settings.
- Expand SALES and remove the checks from TIME_ID levels, CHANNEL_ID, and PROMO_ID. Select QUANTITY_SOLD.
Description of the illustration basic_facts_settings.png
- Click Apply.
Explain generates a horizontal bar visualization of PROD_SUBCATEGORY by QUANTITY_SOLD. The basic facts donut visualization doesn't change.
Description of the illustration qnty_sold_prodsubcat.png
- Hover your cursor over the upper right side of the PROD_SUBCATEGORY by QUANTITY_SOLD and the PROD_SUBCATEGORY by AMOUNT_SOLD visualizations, and then click Select for Canvas. When the check mark selected for the canvas changes to green, click Add Selected.
Oracle Analytics adds the selected visualizations to the canvas and closes Explain.
Description of the illustration bf_added_to_canvas.png
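The two bar charts in this section show the same PROD_SUBCATEGORY values totaled by different measures. The sketch below illustrates that idea with plain Python: it sums a chosen measure per subcategory and sorts the result. The rows are made-up toy values, `TOY_ROWS` and `total_by_subcategory` are hypothetical names, and summation is an assumption about the aggregation; this is not a description of how Explain computes its visualizations.

```python
from collections import defaultdict

# Toy rows for illustration only; not taken from the SH sample data.
TOY_ROWS = [
    {"PROD_SUBCATEGORY": "Toy Subcategory A", "QUANTITY_SOLD": 2, "AMOUNT_SOLD": 20.0},
    {"PROD_SUBCATEGORY": "Toy Subcategory A", "QUANTITY_SOLD": 1, "AMOUNT_SOLD": 10.0},
    {"PROD_SUBCATEGORY": "Toy Subcategory B", "QUANTITY_SOLD": 5, "AMOUNT_SOLD": 25.0},
]


def total_by_subcategory(rows, measure):
    """Sum the given measure per PROD_SUBCATEGORY, largest total first."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["PROD_SUBCATEGORY"]] += row[measure]
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    print(total_by_subcategory(TOY_ROWS, "AMOUNT_SOLD"))
    print(total_by_subcategory(TOY_ROWS, "QUANTITY_SOLD"))
```

With these toy rows the ranking of the subcategories differs between the two measures, which is why it can be worth adding both visualizations to the canvas.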
Examine Key Drivers, Segments, and Anomalies
- In the Data panel, right-click PROD_SUBCATEGORY and select Explain PROD_SUBCATEGORY.
- In Explain, click Key Drivers of PROD_SUBCATEGORY.
Explain examines all tables in the dataset that are related to the PROD_SUBCATEGORY column in the PRODUCTS table. The results include the interaction with FISCAL_MONTH_NAME and CALENDAR_MONTH_NAME from the TIMES table.
Description of the illustration key_drivers.png
- Click Segments that Explain PROD_SUBCATEGORY.
The Explain segment analysis shows the product subcategory values and the impact by CHANNEL_CLASS and CHANNEL_DESC in the CHANNELS table.
Description of the illustration prod_subcat_segments.png
- Click Settings. In Settings, expand COUNTRIES, select COUNTRY_NAME, and then click Apply.
- Click Anomalies of PROD_SUBCATEGORY. Scroll down to When COUNTRY_NAME is Germany..., and then click Select for Canvas.
Description of the illustration germany.png
- In Anomalies of PROD_SUBCATEGORY, next to the message "4 updates available," click Refresh View.
Explain adds four visualizations to Anomalies.
- Hover and select one or two of the visualizations for the canvas. In each visualization, click Select for Canvas, and then click Add Selected.
Oracle Analytics adds another canvas to your workbook with the visualizations selected from Explain Anomalies.
Description of the illustration explain_anomalies.png
Learn More
Analyze Datasets with Multiple Tables using Oracle Analytics Explain
F83957-01
July 2023
Copyright © 2023, Oracle and/or its affiliates.
Learn how to gather information about columns in your dataset with multiple tables using Oracle Analytics Explain.