Explain uses machine learning to find useful insights about your data.
What is Explain?
Explain analyzes the selected column within the context of its data set and generates text descriptions about the insights it finds. Explain creates corresponding visualizations that you can add to your project's canvas.
Explain uses Oracle's machine learning to generate accurate, fast, and powerful information about your data. To use Explain, choose a column in your data set and select the Explain option. Explain automatically applies machine learning and statistical analysis to find the most significant patterns, correlations (drivers), classifications, and anomalies in your data. Explain then returns visualizations displaying the insights it found. You can select a visualization to open it in the project editor, where you can customize it and drill further into the data.
For example, suppose you want to look for information about your company's employee attrition. You create a project using a data set that contains attrition information and various profile attributes about employees who have left the organization compared to employees who are still in the organization. Select Explain for the Attrition column, and Explain reveals that one of the key drivers of employee attrition is marital status.
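To make the idea of a key driver in the attrition example more concrete, the following is a minimal Python sketch, not Oracle's actual algorithm. It assumes a small, made-up list of employee rows and uses hypothetical helper names (`outcome_rate_by_value`, `rate_spread`). It compares the attrition rate across the values of each candidate column. A column whose values show very different rates is a plausible driver.

```python
from collections import defaultdict

# Hypothetical sample rows; column names and values are illustrative only.
employees = [
    {"marital_status": "Single", "department": "Sales", "attrition": "Yes"},
    {"marital_status": "Single", "department": "Engineering", "attrition": "Yes"},
    {"marital_status": "Single", "department": "Sales", "attrition": "Yes"},
    {"marital_status": "Single", "department": "Engineering", "attrition": "No"},
    {"marital_status": "Married", "department": "Sales", "attrition": "No"},
    {"marital_status": "Married", "department": "Engineering", "attrition": "No"},
    {"marital_status": "Married", "department": "Sales", "attrition": "No"},
    {"marital_status": "Married", "department": "Engineering", "attrition": "Yes"},
]


def outcome_rate_by_value(rows, column, outcome_column, outcome_value):
    """Return {column value: share of rows that have the given outcome}."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for row in rows:
        totals[row[column]] += 1
        if row[outcome_column] == outcome_value:
            hits[row[column]] += 1
    return {value: hits[value] / totals[value] for value in totals}


def rate_spread(rows, column, outcome_column, outcome_value):
    """Gap between the highest and lowest outcome rate across a column's values."""
    rates = outcome_rate_by_value(rows, column, outcome_column, outcome_value)
    return max(rates.values()) - min(rates.values())


# Rank candidate columns by how strongly the attrition rate varies across their values.
candidates = ["marital_status", "department"]
ranked = sorted(
    candidates,
    key=lambda col: rate_spread(employees, col, "attrition", "Yes"),
    reverse=True,
)
print(ranked)
```

In this made-up sample, the attrition rate differs sharply between marital status values but not between departments, so marital status ranks first. A real analysis would use a proper statistical measure of association and far more data.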
Explain is for data analysts who might not know what data trends they're looking for, and don't want to spend time experimenting by either dragging and dropping columns onto the canvas, or using data flows to train and apply predictive models.
Explain is also a useful starting point for data analysts to confirm a trend that they're looking for in their data, and then use that information to create and tune predictive models to apply to other data sets.
What Are Insights?
Insights are categories that describe the selected column within the context of its data set.
The insights that Explain delivers are based on the column type or aggregation that you chose and will vary according to the aggregation rule set for the chosen metric. Explain generates only the insights that make sense for the column type that you chose.
|Basic Facts|Displays the basic distribution of the column's values. Column data is broken down against each of the data set's measures. This insight is available for all column types.|
|Key Drivers|Shows the columns in the data set that have the highest degree of correlation with the selected column's outcome. Charts display the distribution of the selected value across each correlated attribute's values. This tab displays only when explaining attribute columns, or when explaining a metric column that has an average aggregation rule.|
|Segments|Displays the key segments (or groups) from the column values. Explain runs a classification algorithm on the data to determine data value intersections and identifies ranges of values across all dimensions that generate the highest probability for a given outcome of the attribute. For example, a group of individuals in a certain age range, from a certain set of locations, with a certain range of years of education might form a segment that has a very high probability of purchasing a given product. This tab displays only when explaining attribute columns.|
|Anomalies|Identifies a series of values where one of the (aggregated) values deviates substantially from what the regression algorithms expect.|
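As a rough illustration of the Anomalies insight, the sketch below fits a straight line to a series of aggregated values and flags any value that lies far from what the line predicts. This is only an assumption about the general technique. The function names, the least-squares line, the two-standard-deviation threshold, and the sample figures are all hypothetical and are not taken from Oracle Analytics.

```python
from statistics import mean, pstdev


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def find_anomalies(ys, threshold=2.0):
    """Return positions whose residual exceeds `threshold` residual standard deviations."""
    xs = list(range(len(ys)))
    slope, intercept = fit_line(xs, ys)
    residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
    spread = pstdev(residuals)
    if spread == 0:
        return []  # every value sits exactly on the fitted line
    return [i for i, r in enumerate(residuals) if abs(r) > threshold * spread]


# Hypothetical aggregated values (for example, monthly totals) with one spike.
monthly_sales = [100, 110, 120, 130, 240, 150, 160, 170, 180, 190]
print(find_anomalies(monthly_sales))  # prints [4]
```

The value at position 4 is flagged because it is far above the trend that the other values follow.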
Use Explain to Discover Data Insights
When you select a column and choose the Explain feature, Oracle Analytics uses machine learning to analyze the column in the context of the data set. For example, Explain searches the selected data for key drivers and anomalies.
- On the Home page, click Create and then Project to create a new project.
- Click Visualize to open the Visualize canvas.
- In the Data Panel, right-click a column and select Explain <Data Element>. For Explain to successfully analyze an attribute, the attribute must have three to 99 distinct values. The Explain dialog displays basic facts, anomalies, and other information about the selected column.
- (Optional) In the Segments view, select the segments (or groups) that predict outcomes for the column you selected.
Click one or more columns to see how they impact the column's outcome.
Sort how the information is displayed in the Segments view. For example, sort by confidence from high to low or from low to high.
- (Optional) If your results contain too many correlated and highly ranked columns (for example, ZIP code with city and state), then exclude some columns from the data set so that Explain can identify more meaningful drivers. To do this, exit Explain, go to the Prepare canvas and either hide or delete columns, return to the Visualize canvas, locate and right-click the column, and select Explain <Data Element>.
- For each visualization that you want to include in your project's canvas, hover over it and click its checkmark.
- After you've selected all the visualizations that you want to include in the canvas, click Add Selected. You can manage the Explain (data insight) visualizations like any other visualizations you’ve manually created on the canvas.
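Before you run Explain on an attribute, it can help to check whether the column meets the distinct-value requirement mentioned in the steps above (three to 99 distinct values). The following is a minimal sketch that assumes the column's values have been exported to a Python list. The function name `can_explain_attribute` is hypothetical, and how Oracle Analytics counts nulls or blank values isn't covered here.

```python
def can_explain_attribute(values, min_distinct=3, max_distinct=99):
    """Return True if the column has between 3 and 99 distinct values (inclusive)."""
    distinct_count = len(set(values))
    return min_distinct <= distinct_count <= max_distinct


# Hypothetical column values, for illustration only.
marital_status = ["Single", "Married", "Divorced", "Married", "Single"]
employee_id = list(range(1000))  # a unique ID column has too many distinct values

print(can_explain_attribute(marital_status))  # prints True
print(can_explain_attribute(employee_id))     # prints False
```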
Create a Local Subject Area to Use with Explain
Explain isn't available for subject areas. You can work around this issue by using a subject area to create a local subject area. You can then use Explain to analyze the columns in the local subject area.
- On the Home page, click Create, and then click Data Set.
- Click Local Subject Area.
- In the Add Data Set editor, double-click a subject area and then add columns to the local subject area.
- (Optional) Click the Filters step and select the column and values that you want to filter on.
- (Optional) Click the Sample step and in the Description field update the description. In the Data Access field, select Live or Automatic Caching.
- Click Add. In the Results dialog, click Create Project to create a project now, or use the Create button on the Home page to create a project later. In the project's Prepare canvas, right-click a column and select Explain <Data Element>.