Predictive Query nodes enable you to score data dynamically without a predefined model. Predictive Queries use in-database scoring technology.
Note:
Predictive Query Nodes require Oracle Database 12c or later.Scoring using Predictive Query nodes has the following limitations:
The transient models created during the running of Predictive Query node are not available for inspection or fine tuning.
If it is necessary to inspect the model, correlate scoring results with the model, specify special algorithm settings, or run multiple scoring queries that use the same model, then a predefined model must be created.
The output of a Predictive Query is the output of an Apply operation.
There are several Predictive Query nodes:
See Also:
"Apply Node Output"An Anomaly Detection Query node analyses the input for anomalies. That is, it detects unusual cases in data.
Note:
Predictive Query Nodes require Oracle Database 12c or later.Anomaly Detection Query can run in parallel.
This section about Anomaly Detection consists of the following topics:
To create an Anomaly Detection Query in an existing workflow:
In the Components pane, go to Workflow Editor. If the Components pane is not visible, then go to View and click Components.
Alternately, press Ctrl+Shift+P to dock the Components pane.
Create the Data Source node containing the input data.
Expand the Predictive Query section in the Components pane.
Drag and drop the Anomaly Detection Query node from the Components pane to the workflow pane.
Connect the Data Source node to the Anomaly Detection Query Node.
Edit the Anomaly Detection Query node.
Run the Predictive Query node and view the data for a Predictive Query node.
To save the results of the query, use a Create Table or View node.
To edit an Anomaly Detection Query Node, either double-click the node or right-click the node and select Edit. In the Select Columns window, specify the following information on the Anomaly Predictions tab:
Double-click the Anomaly Detection Query node or right-click the node and click Edit. The Edit Anomaly Detection Query Node dialog box opens.
In the Anomaly Detection Query Node dialog box, enter the details in the following tabs:
Anomaly Predictions tab:
In the Case ID field, select a case ID from the drop-down list (Optional). It is recommended that you specify a case ID to uniquely define each record. The case ID helps with model repeatability and is consistent with good data mining practices.
You can edit the outputs of an Anomaly Prediction (optional). Oracle Data Miner automatically determines the output for the query. You can modify the output.
Partition tab: You can perform the following tasks.
Add one or more Partition attributes. This is optional. Selecting a Partition attribute directs the predictive query to build a virtual model for each unique partition.
Select partitions. To select partitions, click the Partition tab and click . Use the Add Partitioning Columns dialog box to select the partitions. You can also specify partitioning expressions.
Input tab: You can perform the following tasks:
Add and modify input. Click Input to add and modify input.
Remove input.
Change the mining type.
Additional Output tab: You can add output (optional). By default, all target columns, the Case ID column, and partitioning columns are automatically added to additional output. To make changes, click Additional Output.
Click OK.
Oracle Data Miner automatically selects output for query. The default output is listed in the Anomaly Prediction Outputs section in the Anomaly Predictions tab. The defaults are:
Prediction
Prediction Details
Prediction Probability
You can select Prediction Set and edit parameters of the output functions. You can perform the following tasks:
Delete: To delete an output, select the output and click .
Add: To add an output, click . Use the Add Anomaly Function dialog box to select an output.
Edit: To edit an output, either double-click the function or select the function and click . Use Edit Anomaly Function dialog box to make changes.
The output of a Predictive Query is the output of an Apply (Scoring) operation.
To add Anomaly Function:
In the Function field, select a function from the drop-down list. The options are:
Prediction
Prediction Probability
Prediction Details
Prediction Set
To specify a default name instead of using the default name, deselect Auto. This turns off automatic selection.
Click OK.
To edit an Anomaly Function:
Select the change that you want to make to the function.
To specify a name instead of using the default name, deselect Auto. This turns off automatic selection.
Click OK.
The Anomaly Detection Query Properties enables you to view and change information about this Predictive Query node. If Properties is closed, then go to View and click Properties. Alternately, right-click the node and click Go to Properties.
The Anomaly Detection Query Node Properties has these sections:
Displays the predictions produced by the query.
See Also:
"Edit an Anomaly Detection Query"Displays the output specified.
See Also:
"Edit Anomaly Prediction Output"To view the Anomaly Detection Query node context menu, right-click the node. The following options are available in the context menu:
Edit. Opens Edit an Anomaly Detection Query dialog box.
View Data. It performs two functions:
Runs the node on a small sample of the data.
Opens the View Data for a Predictive Query dialog box.
Parallel Query. See "About Parallel Processing" for more information.
Show Runtime Errors. Displayed only if there is an error.
Show Validation Errors. Displayed if there are validation errors.
A Clustering Query node returns the clusters in the input.
Note:
Predictive Query nodes require Oracle Database 12c or later.A Clustering Query can run in parallel.
This section about Clustering Query node contains the following topics:
See Also:
To create a Clustering Query in an existing workflow:
In the Components pane, go to Workflow Editor. If the Components pane is not visible, then go to View and click Components.
Alternately, press Ctrl+Shift+P to dock the Components pane.
Create a Data Source node containing the input data.
Expand the Predictive Queries section in the Components pane.
Drag and drop a Clustering Query node to the workflow.
Connect the Data Source node to the Clustering Query node.
Edit the Clustering Query node.
Run the Predictive Query node and view the data.
To save the results of the query, use a Create Table or View node.
To edit a Clustering Query node:
Double-click the Clustering Query node or right-click the node and click Edit. The Edit Clustering Query Node dialog box opens.
In the Edit Clustering Query Node dialog box, enter the details in the following tabs:
Cluster Predictions tab: In the Case ID field, select a case ID from the drop-down list. The case ID is optional. It is recommended that you specify a Case ID to uniquely define each record. The case ID helps with model repeatability and is consistent with good data mining practices.
In the Case ID field, select a case ID from the drop-down list. The case ID is optional. It is recommended that you specify a case ID to uniquely define each record. The case ID helps with model repeatability and is consistent with good data mining practices.
In the Number of Clusters to Compute field, specify the number to compute. Default is 10.
Edit Cluster Prediction Outputs. Oracle Data Miner automatically determines the output for the query. You can modify the output.
Partition tab: You can perform the following tasks:
Add one or more Partition attributes. This is optional. Selecting a Partition attribute directs the predictive query to build a virtual model for each unique partition.
Select Partitions: To select partitions, click the Partitions tab. Then, click . Then use the Add Partitioning Columns to select the partitions. You can also specify partitioning expressions.
Input tab: You can modify Input. This is optional. You can add or remove inputs and change the mining types of inputs. Click Input.
Additional Output tab: You can add outputs (optional). By default, all target columns, the Case ID column, and partitioning columns are automatically added to Additional Output. To make changes, click Additional Output.
Click OK.
The output of a Predictive Query is the output of an Apply (scoring) operation. See "Apply Functions" for more information about Apply output.
Oracle Data Miner automatically selects output for query. The default outputs are listed in the Cluster Prediction Outputs section in the Cluster Predictions tab. The defaults are:
Cluster Details
Cluster Distance
Cluster ID
Cluster Probability
You can also select Cluster Set, and edit parameters of the output functions. You can perform the following tasks:
Delete: To delete an output, select the output and click .
Add: To add an output, click . Use Add Cluster Function dialog box to select an output.
Edit: To edit an output, either double-click the function or select the function and click . Use the Edit Cluster Function dialog box to make changes.
In the Add Cluster Function dialog box, you can add cluster functions. To add Cluster function:
In the Function field, select a function from the drop-down list. The options are:
Cluster ID
Cluster Probability
Cluster Details
Cluster Distance
Cluster Set
To specify a default name instead of using the default name, deselect Auto. This turns off automatic selection.
Click OK.
To edit Cluster function:
Select the change that you want to make to the function.
To specify a name instead of using the default name, deselect Auto. This turns off automatic selection.
Click OK.
The Clustering Query properties enables you to view and change information about this Predictive Query node. If the Properties pane is closed, then go to View and click Properties. Alternately, right-click the node and click Go to Properties.
The Clustering Query Properties pane has these sections:
Displays the predictions produced by the query.
See Also:
"Edit a Clustering Query"Displays the output specified.
See Also:
"Edit Cluster Prediction Outputs"To view the Clustering Query node context menu, right-click the node. The following options are available in the context menu:
Edit. Opens Edit a Clustering Query dialog box.
View Data. It perform two functions:
Runs the node on a small sample of data.
Opens the View Data for a Predictive Query dialog box.
Parallel Query. See "About Parallel Processing" for more information.
Show Runtime Errors. Displayed only if there is an error.
Show Validation Errors. Displayed if there are validation errors.
A Feature Extraction Query extracts features from the input.
Note:
Predictive Query nodes require Oracle Database 12c or later.A Feature Extraction Query node can run in parallel.
This section on Feature Extraction Query node contains the following topics:
To create a Feature Extraction Query to an existing workflow:
In the Components pane, go to Workflow Editor. If the Components pane is not visible, then go to View and click Components.
Alternately, press Ctrl+Shift+P to dock the Components pane.
Create a Data Source node containing the input data.
Expand the Predictive Queries section in the Components pane.
Drag and drop a Feature Extraction Query node from the Components pane to the Workflow pane.
Connect the Data Source node to the Feature Extraction Query node.
Edit the Feature Extraction Query node.
Run the Predictive Query node and view the data.
To save the results of the query, use a Create Table or View node.
To edit a a Feature Extraction Query Node:
Double-click the Feature Extraction Query node or right-click the node and click Edit. The Edit Feature Extraction Query Node dialog box opens.
In the Feature Extraction Query Node dialog box, enter the details in the following tabs:
Feature Predictions tab:
In the Case ID field, select a case ID from the drop-down list. The case ID is optional. It is recommended that you specify a case ID to uniquely define each record. The case ID helps with model repeatability and is consistent with good data mining practices.
In the Number of Features to Extract field, specify an input. The default is 10.
Edit Feature Prediction Outputs. Oracle Data Miner automatically determines the output for the query. You can modify the output.
Partitions tab:
Add one or more Partition attributes. This is optional. Selecting a Partition attribute directs the predictive query to build a virtual model for each unique partition.
To add partitions, click in the Partitions tab. Use the Add Partitioning Columns dialog box.
Specify partitioning expressions.
Input tab: You can perform the following tasks:
Add input
Modify input
Change the mining type
Additional Output tab: You can add output. This is optional. By default, all target columns, the Case ID column, and partitioning columns are automatically added to Additional Output. To make changes, click Additional Output.
Click OK.
The output of a Predictive Query is the output of an Apply (scoring) operation. Oracle Data Miner automatically selects output for query. The default output is Feature Set. You can also select feature ID, feature details, and feature value. You can edit the parameters of the functions and perform the following tasks:
Delete: To delete an output, select the output and click .
Add: To add an output, click . Use the Add Feature Function dialog box to select an output.
Edit: To edit an output, either double-click the function or select the function and click . Use the Edit Feature Function dialog box to make changes.
To add a Feature Function:
In the Function field, select a function from the drop-down list. The options are:
Feature ID
Feature Value
Feature Details
Feature Set
To specify a default name instead of using the default name, deselect Auto. This turns off automatic selection.
Click OK.
To edit a Feature Function:
Select the change that you want to make to the function.
To specify a name instead of using the default name, deselect Auto. This turns off automatic selection.
Click OK.
The Feature Extraction Query properties enables you to view and change information about this Predictive Query node. If the Properties pane is closed, then go to View and Properties. Alternately, right-click the node and click Go to Properties.
The Feature Extraction Query Properties has these sections:
Displays the predictions produced by the query.
See Also:
"Edit Feature Extraction Query"Displays the output specified.
See Also:
"Edit Feature Prediction Outputs"To view the Feature Extraction Query node context menu, right-click the node. The following options are available in the context menu:
Edit. Opens Edit Feature Extraction Query dialog box.
View Data. It perform two functions:
Runs the node on a small sample of the data.
Opens the View Data for a Predictive Query dialog box.
Parallel Query. See "About Parallel Processing" for more information.
Show Runtime Errors. Displayed only if there is an error.
Show Validation Errors. Displayed if there are validation errors.
A Prediction Query node performs classification and regression using the input. The data type of the target determines whether classification or regression is performed.
A Prediction Query can run in parallel.
Note:
Predictive Query nodes require Oracle Database 12c or higher.This section on Prediction Query node contains the following topics:
To create a Prediction Query in an existing workflow:
In the Components pane, go to Workflow Editor. If the Components pane is not visible, then go to View and click Components.
Alternately, press Ctrl+Shift+P to dock the Components pane.
Create a Data Source node containing the input data.
Expand the Predictive Queries section in the Components pane.
Drag and drop the node from the Components pane to the Workflow pane.
Connect the Data Source Node to the Prediction Query Node.
Edit the Prediction Query node.
Run the Predictive Query node and view the date.
To save the results of the query, use a Create Table or View node.
To edit a Prediction Query node:
Double-click the Prediction Query node or right-click the node and click Edit. The Edit Prediction Query Node dialog box opens.
In the Edit Prediction Query Node dialog box, enter the details in the following tabs:
Predictions tab:
In the Case ID field, select a case ID from the drop-down list. The case ID is optional. It is recommended that you specify a case ID to uniquely define each record. The case ID helps with model repeatability and is consistent with good data mining practices.
In the Targets section, you can add one or more targets. To add target, click . Use the Add Target dialog box to define targets.
In the Targets section, change the Mining Type of a target, if necessary. Each attribute has an associated data type and mining type. The mining type defines how the attribute is treated in the predictive query.
For Regression query analysis, the mining type must be Numerical.
For Classification query analysis, the mining type must be Categorical.
If you make changes, click OK.
In the Prediction Output section, you can edit the prediction output. Oracle Data Miner automatically determines the output for each target. You can modify the output.
Partition tab: You can perform the following tasks:
Add one or more Partition attributes. This is optional. Selecting a Partition attribute directs the predictive query to build a virtual model for each unique partition.
To add partitions, click . Use the Add Partitioning Columns dialog box to select partitions.
Specify partitioning expressions.
Input tab: You can perform the following tasks:
Modify input
Add or remove inputs
Change mining types of inputs
Additional Output tab: You can add outputs (optional). By default, all target columns, the Case ID column, and partitioning columns are automatically added to Additional Output. To make changes, click Additional Output.
Click OK.
Tip:
You must add at least one target. Targets can have different mining types. To add a target:
Select one or more attributes in the Available Attributes list to serve as prediction targets. The targets do not have to have the same data type.
You can select nested attributes as targets if they are of type ODMR_NETSTED_*
. For example, you could use a join to create a nested attribute consisting of all products purchased by a customer; this attribute would be nested and have the data type ODMR_NESTED_VARCHAR2.
Move the target columns to the Selected Attributes list using the arrows.
Click OK. The selected attributes are added to the Targets list.
Partitioning columns result in building a virtual model for each unique partition. Because the virtual model uses data only from a specific partition, it can potentially predict cases more accurately than if you did not select a partition.
In addition to selecting attributes, you can specify partitioning expressions. Partitioning expressions are concatenated and the result expression is the same for all predictive functions.
Select one or more attributes in the Available Attributes list to serve as partitions.
Move the selected columns to the Selected Attributes list using the arrows.
Click OK. The attributes are moved to the Partition list.
Optionally, you can add partitioning expressions.
See Also:
"Add Partitioning Expressions"To specify a partitioning expression, click . Use Expression Builder to create an expression.
Suppose one of the partitions is AGE. Here is a sample partitioning expression:
CASE WHEN AGE < 20 THEN 1 WHEN AGE >=20 AND AGE < 40 THEN 2 WHEN AGE >=40 AND AGE < 60 THEN 3 ELSE 4 END
Suppose this expression is named Expression_1
. After you run the node, the output includes a column titled Expression_1
. This column will contain the value 1
if AGE
is less than 20
, 2
if AGE
is 20
or larger but less than 40
, and so forth.
See Also:
"Expression Builder"The output of a Predictive Query is the output of an Apply (Scoring) operation. Oracle Data Miner automatically selects output for the target. The default output is listed in the Prediction Outputs section in the Predictions tab.
For Classification, the default output are:
Prediction
Prediction Details
Prediction Probability
You can also select Prediction Set.
For Regression, the default outputs are:
Prediction
Prediction Details
For Classification or Regression, you can edit the parameters of functions and the output for each target one at a time. You can perform the following tasks:
To edit Prediction Output Function, select a target and click . You edit output for each target one at a time. Use the Edit Prediction Function dialog box to make changes.
To delete an output, select the output and click .
To add an output, select the target in the Targets section and click . Use the Add Prediction Output Function dialog box to select an output.
To add Prediction Output Function:
In the Function field, select a function from the drop-down list.
To specify a default name instead of using the default name, deselect Auto. This turns off automatic selection.
Click OK.
To edit Prediction Function:
Select the change that you want to make to the function.
To specify a name instead of using the default name, deselect Auto. This turns off automatic selection.
Click OK.
The Input tab shows all columns that are used as input for the Predictive Query. Target (for Prediction Query) and Case ID columns are identified with special icons. The Rule column in the grid explains why an attribute is not used.
By default, Determine inputs automatically (using heuristics) is selected. After you run the node, you can click the link to View Heuristic Results Details. To change the inputs, deselect Determine inputs automatically (using heuristics). You can perform the following tasks:
Override defaults, and add or remove input columns.
Change Mining Types: To change the mining type of a column, click the Mining Type entry for the column, and select a new mining type from the drop-down list.
Ignore columns: If you do not want to use a column as input, click the Input entry for the attribute and select from the drop-down list. It ignored the selected column, and not used for input.
To use a column, select from the drop-down list.
Search column: To search for columns, use the Find field.
Provides detailed information about automatic changes made to the input. For example, the mining type is changed to Categorical
when the number of unique values is less than the threshold value of 5.
A column that has a constant value is excluded (not used as input).
The Output tab shows the columns that will be used in the output to identify the prediction data. By default, all target columns for Prediction Query, the Case ID column, and partitioning columns are automatically added to Additional Output. You can perform the following tasks:
Add Additional Output: To add additional output, click Automatic to turn off automatic selection. Then click . Use the Add Supplemental Dialog to add columns to the output.
Remove Output Columns: To remove columns, select the column and click .
Enables you to move columns from the Available Attributes list to the Selected Attributes list.
To run Predictive Query nodes, right-click the node and select either Run or View Data. Virtual models may take a while to be formulated. The View Data option generates a small sample output of the query.
Regardless of how you run it, select View Data to view the results of the query.
See Also:
"View Data for a Predictive Query"To view data for a Predictive Query node:
Right-click the node and select View Data. For Predictive Query Nodes, View Data displays the output from a node that has run or displays the results when the query is applied to a small subset of the data.
You can either sort or filter the data.
Click OK.
You can view prediction details to see Prediction Details in a separate window.
To view the prediction details:
Click the details that you want to view.
Then, click . The details are displayed in the View Value dialog box. You can also search for specific values.
The Prediction Query properties enables you to view and change information about this Predictive Query node. If the Properties pane is closed, then go to View and click Properties. Alternately, right-click the node and click Go to Properties.
The Prediction Query Node Properties has these sections:
Displays the predictions produced by the query.
See Also:
"Modify Input"Displays the columns selected to be used for partitioning. You can also delete partitioning columns and edit expressions for partitioning columns.
See Also:
"Add Partitioning Columns"Displays the output specified.
See Also:
"Add Prediction Output Function"To view the Prediction Query node context menu, right-click the node. The following options are available in the context menu:
Edit. Opens Edit a Prediction Query dialog box.
View Data. It perform two functions:
Runs the node on a small sample of the data.
Opens the View Data for a Predictive Query dialog box.
Parallel Query. See "About Parallel Processing" for more information.
Show Runtime Errors. Displayed only if there is an error.
Show Validation Errors. Displayed if there are validation errors.