9 Evaluate and Apply Nodes

Oracle Data Mining enables you to test Classification and Regression models. A Test node is one of several ways to test a model. After you build a model, you apply the model to new data using an Apply node. Evaluate and Apply data must be prepared in the same way that build data was prepared.

The Evaluate and Apply nodes are:

Apply Node
Test Node

See Also:

"Evaluate and Apply Data"

9.1 Apply Node

The Apply node takes a collection of models and returns a single score. The Apply node produces a query as the result. The result can be further transformed or connected to a Create Table or View node to save the data as a table.

To make predictions using a model, you must apply the model to new data. This process is also called scoring the new data.

An Apply node generates the SQL for Scoring using one or more models. The SQL includes pass-through (supplemental) attributes and columns generated using Scoring functions.
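
As a rough illustration of the kind of query described above, the following Python sketch assembles a scoring SELECT statement from pass-through columns and scoring-function columns. The helper `build_apply_sql`, the table name, the model name, and the column aliases are hypothetical. `PREDICTION` and `PREDICTION_PROBABILITY` are Oracle SQL scoring functions, but the SQL that an Apply node actually generates may look different.

```python
def build_apply_sql(table, model, passthrough_columns, data_columns_first=True):
    """Assemble a scoring query for one model (illustrative only)."""
    # Columns produced by SQL scoring functions, with hypothetical aliases.
    apply_columns = [
        f"PREDICTION({model} USING *) AS PRED_{model}",
        f"PREDICTION_PROBABILITY({model} USING *) AS PROB_{model}",
    ]
    # Pass-through (supplemental) attributes are selected unchanged.
    data_columns = list(passthrough_columns)
    if data_columns_first:
        select_list = data_columns + apply_columns
    else:
        select_list = apply_columns + data_columns
    return "SELECT " + ",\n       ".join(select_list) + f"\nFROM {table}"


# Hypothetical table, model, and case ID column.
sql = build_apply_sql("NEW_CUSTOMERS", "CLAS_SVM_1", ["CUST_ID"])
print(sql)
```

Passing `data_columns_first=False` places the scoring columns ahead of the pass-through columns.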

Note:

You cannot apply Association or Attribute Importance models.

This section on Apply node contains the following topics:

Apply Preferences
Apply Node Input
Apply Node Output
Creating an Apply Node
Apply and Output Specifications
Edit Apply Node
Apply Node Properties
Apply Node Context Menu

See Also:

"About Parallel Processing"

9.1.1 Apply Preferences

To apply preferences to an Apply Node:

In the Tools menu, click Preferences.
In the Preferences dialog box, click Data Miner. You can view and change preferences for Apply operations. The default preferences for Data Miner are:
- Automatic Apply Settings
- Data Columns First
Click OK.

9.1.2 Apply Node Input

An Apply node requires the following input:

One or more of the following:
- Model node
- Model Build node
You must specify at least one model to apply. You can apply several models at the same time.
Any node that generates data as an output such as a Data node, a Transforms node, or an appropriate Text node.

Only one input node is permitted.

When you apply a model to new data, the new data must be transformed in the same way as the data used to build the model.

Note:

You cannot apply Association or Attribute Importance models.

9.1.3 Apply Node Output

An Apply node generates a data flow based on the Apply and Output Specifications.

9.1.4 Creating an Apply Node

An Apply node must be connected to a Data node and to a Model node or Build node. To create an Apply node:

Open the Components pane and select Workflow Editor.
If the Components pane is not visible, then go to View and click Components.
Either identify Apply Data or create a Data Source Node containing the Apply Data. Ensure that the Apply Data is prepared in the same way as the Build Data.
Create a Model Node, a Model Build node (such as a Classification Node), or a combination of these nodes. At least one model must be successfully built before it can be applied.
You cannot apply Association or Attribute Importance models.
In the Workflow Editor, expand Evaluate and Apply, and click Apply.
Drag and drop the Apply node in the Workflow pane.
Link the Data node, Model nodes, and Build nodes to the Apply Node.

See Also:

"Evaluate and Apply Data"
"Edit Apply Node"

9.1.5 Apply and Output Specifications

There are two ways to create Apply and Output specifications:

Use the Automatic Settings.
Create specifications using the Edit Apply Node dialog box.

You can also add Additional Output to help identify the output.

9.1.5.1 Automatic Settings

By default, Automatic Settings is used.

9.1.5.2 Edit Apply Node

To edit or view an Apply specification, either double-click the Apply node or right-click the Apply node and select Edit. The Edit Apply Node dialog box opens.

The Edit Apply Node dialog box has two tabs:

Predictions: Defines the Apply Scoring specifications.

An Apply specification consists of several Output Apply columns. The column names are generated automatically.
- You can specify names. The names must not be more than 30 characters.
- You can then select a model from the list of models in all Input nodes and an Apply function.
  The Apply functions that you can select depend on the selected models.
Additional Output: Specifies pass-through columns from the Input node. You can select as many columns as you want. You can specify that these selected columns are displayed before the Apply columns (the default) or after the Apply columns.

These columns are often used to identify the Apply output. For example, you can use the Case ID column to identify the Apply output.

The default is to not specify any additional output.

The Default Column Order, at the bottom of the Edit Apply Node dialog box, is Data Columns First in the output. You can change this to Apply Columns First.

9.1.5.2.1 Predictions

To define specific Apply settings or to edit the default settings, deselect Automatic Settings. You then add new Apply functions or edit existing ones.

You can edit settings in several ways:

Add a setting: Click the add icon to open the Add Output Apply Column dialog box.
Edit an existing setting: Select the setting and click the edit icon. The Edit Output Data Column dialog box opens.
Delete a specification: Select it and click the delete icon.
Define Apply Columns: Click the Define Apply Columns icon. The Define Apply Columns Wizard opens.

See Also:

"Add Output Apply Column Dialog"
"Apply Functions"
"Edit Output Data Column Dialog"
"Define Apply Columns Wizard"

9.1.5.2.2 Apply Functions

The Apply functions that you can choose depend on the models that you apply.

Note:

Certain Apply functions are available only if you are connected to Oracle Database 12c.

The Apply functions, arranged by model type, are:

Anomaly Detection Models
- Prediction: An automatic setting that returns the best prediction for the model. The data type returned depends on the target value type used during the build of the model. For Regression models, this function returns the expected value. The function returns the lowest cost prediction using the stored cost matrix if a cost matrix exists. If no stored cost matrix exists, then the function returns the highest probability prediction.
- Prediction Details: Returns prediction details. The return value describes the attributes of the prediction. For Anomaly Detection, the returned details refer to the highest probability class or the specified class value.
  
  Note:
  Prediction Details requires a connection to Oracle Database 12c.
  
  The defaults for Prediction Details are:
  - Target Value: Most Likely
  - Sort by Weights: Absolute value
  - Maximum Length of Ranked Attribute List: 5
  Prediction Details output is in XML format (XMLType data type). You must parse the output to find the data that you need.
- Prediction Probability: An automatic setting that returns the probability associated with the best prediction.
- Prediction Set: Returns a varray of objects containing all classes in a multiclass classification scenario. The object fields are named PREDICTION, PROBABILITY, and COST. The data type of the PREDICTION field depends on the target value type used during the build of the model. The other two fields are both Oracle NUMBER. The elements are returned in the order of best prediction to worst prediction.
Clustering Models
- Cluster Details: The return value describes the attributes of the highest probability cluster or the specified cluster ID. If you specify a value for TopN, then the function returns the N attributes that most influence the cluster assignment (the score). If you do not specify TopN, then the function returns the five most influential attributes.
  
  Note:
  Cluster Details requires a connection to Oracle Database 12c.
  
  The defaults for Cluster Details are as follows:
  - Cluster ID: Most Likely
  - Sort by Weight: Absolute value
  - Maximum Length of Ranked Attribute List: 5
  The returned attributes are ordered by weight. The weight of an attribute expresses its positive or negative impact on cluster assignment. A positive weight indicates an increased likelihood of assignment. A negative weight indicates a decreased likelihood of assignment.
  
  Cluster Details output is in XML format (XMLType data type). You must parse the output to find the data that you need.
- Cluster Distance: Returns a cluster distance for each row in the selection. The cluster distance is the distance between the row and the centroid of the highest probability cluster or the specified cluster ID.
  
  Note:
  Cluster Distance requires a connection to Oracle Database 12c.
  
  The defaults for Cluster Distance are as follows:
  - Cluster ID: Most Likely
- Cluster ID: An automatic setting that returns the NUMBER of the most probable cluster ID. If the cluster ID has been renamed, then a VARCHAR2 is returned instead.
- Cluster Probability: An automatic setting that returns a measure of the degree of confidence of membership (NUMBER) of an input row in a cluster associated with the specified model.
- Cluster Set: Returns a varray of objects containing all possible clusters that a given row belongs to, given the parameter specifications. Each object in the varray is a pair of scalar values containing the cluster ID and the cluster probability. The object fields are named CLUSTER_ID and PROBABILITY, and both are Oracle NUMBER. Applies to Clustering models only.
Feature Extraction Models
- Feature ID: Returns an Oracle NUMBER that is the identifier of the feature with the highest value for the row.
- Feature Set: An automatic setting that is similar to Cluster Set.
- Feature Value: Returns the value of a given feature. If you omit the feature ID argument, then the function returns the highest feature value.
- Feature Details: The return value describes the attributes of the highest value feature or the specified feature ID. If you specify a value for TopN, the function returns the N attributes that most influence the feature value. If you do not specify TopN, the function returns the 5 most influential attributes.
  
  Note:
  Feature Details requires a connection to Oracle Database 12c.
  
  The returned attributes are ordered by weight. The weight of an attribute expresses its positive or negative impact on the value of the feature. A positive weight indicates a higher feature value. A negative weight indicates a lower feature value.
  
  The defaults for Feature Details are as follows:
  - Feature ID: Most Likely
  - Sort by Weight: Absolute value
  - Maximum Length of Ranked Attribute List: 5
  Feature Details output is in XML format (XMLType data type). You must parse the output to find the data that you need.
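
Because the Details functions (Prediction Details, Cluster Details, and Feature Details) return XML, the output has to be parsed before the ranked attributes can be used. The following Python sketch shows one way to do that. The sample document is hypothetical: it assumes a root `Details` element with `Attribute` children carrying `name`, `actualValue`, `weight`, and `rank` attributes, and the layout returned by a particular database release may differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample of a details document; the real layout may differ.
SAMPLE_DETAILS = """
<Details algorithm="Support Vector Machines" class="1">
  <Attribute name="EDUCATION" actualValue="Assoc-A" weight=".159" rank="1"/>
  <Attribute name="AGE" actualValue="41" weight="-.063" rank="2"/>
</Details>
"""


def parse_details(xml_text):
    """Return (name, actual value, weight, rank) tuples ordered by rank."""
    root = ET.fromstring(xml_text.strip())
    rows = []
    for attr in root.findall("Attribute"):
        rows.append((
            attr.get("name"),
            attr.get("actualValue"),
            float(attr.get("weight")),
            int(attr.get("rank")),
        ))
    return sorted(rows, key=lambda row: row[3])


for name, value, weight, rank in parse_details(SAMPLE_DETAILS):
    print(rank, name, value, weight)
```

A positive weight in the parsed rows indicates an attribute that pushes toward the result, and a negative weight indicates one that pushes away from it.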
Classification and Regression Models
- Prediction: An automatic setting that returns the best prediction for the model. The data type returned depends on the target value type used during the build of the model.
  - For Regression models, this function returns the expected value.
  - For Classification models, the returned details refer to the highest probability class or the specified class value.
    
    The function returns the lowest cost prediction using the stored cost matrix if a cost matrix exists. If no stored cost matrix exists, then the function returns the highest probability prediction.
- Prediction Bounds: For generalized linear models (GLM), returns an object with two NUMBER fields, LOWER and UPPER. If the GLM was built using Ridge Regression, or if the Covariance Matrix is found to be singular during the build, then this function returns NULL for both fields.
  - For a Regression mining function, the bounds apply to the value of the prediction.
  - For a Classification mining function, the bounds apply to the probability value.
- Prediction Bounds Lower: Same as Prediction Bounds but only returns the lower bounds as a scalar column. Automatic Setting for GLM models.
- Prediction Bounds Upper: Same as Prediction Bounds but only returns the upper bounds as a scalar column. Automatic Setting for GLM models.
- Prediction Details: Requires a connection to Oracle Database 12c, except for Decision Tree.
  
  The defaults for Prediction Details for Classification are as follows:
  - Target Value: Most Likely
  - Sort by Weights: Absolute value
  - Maximum Length of Ranked Attribute List: 5
  The defaults for Prediction Details for Regression are as follows:
  - Sort by Weights: Absolute value
  - Maximum Length of Ranked Attribute List: 5
  DT Prediction Details: Returns a string containing model-specific information related to the scoring of the input row. In Oracle Data Miner releases earlier than 4.0, the return value is in the form <Node id = "integer"/>.
  
  Note:
  DT Prediction Details requires a connection to Oracle Database 11g Release 2 (11.2).
Classification
- Prediction Costs: Returns a measure of cost for a given prediction as a NUMBER. Classification models only. Automatic Setting for DT models.
- Prediction Probability: Returns the probability associated with the best prediction.
  The Automatic Setting is Most Likely.
- Prediction Set: Returns a varray of objects containing all classes in a multiclass classification scenario. The object fields are named PREDICTION, PROBABILITY, and COST. The data type of the PREDICTION field depends on the target value type used during the build of the model. The other two fields are both Oracle NUMBER. The elements are returned in the order of best prediction to worst prediction.
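
The rule that the Prediction function follows (lowest cost when a cost matrix is stored, otherwise highest probability) can be sketched in Python as follows. This is an illustration only: the function `best_prediction`, the dictionary layout of the cost matrix, and the use of expected cost to decide the "lowest cost" class are assumptions, not a description of the database implementation.

```python
def best_prediction(probabilities, cost_matrix=None):
    """Pick a class from a {class: probability} mapping.

    cost_matrix[actual][predicted] is the cost of predicting `predicted`
    when the true class is `actual` (a hypothetical layout).
    """
    if cost_matrix is None:
        # No stored cost matrix: the highest probability class wins.
        return max(probabilities, key=probabilities.get)
    # Expected cost of each candidate prediction, weighted by class probability.
    expected_cost = {
        predicted: sum(
            probabilities[actual] * cost_matrix[actual][predicted]
            for actual in probabilities
        )
        for predicted in probabilities
    }
    return min(expected_cost, key=expected_cost.get)


probs = {"yes": 0.3, "no": 0.7}
# Missing a true "yes" is five times as costly as a false alarm.
costs = {"yes": {"yes": 0, "no": 5}, "no": {"yes": 1, "no": 0}}
print(best_prediction(probs))
print(best_prediction(probs, costs))
```

With these sample numbers the cost matrix changes the answer, because predicting "no" carries a higher expected cost even though "no" is the more probable class.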

See Also:

"Apply Functions Parameters"

9.1.5.2.3 Apply Functions Parameters

The following Apply function parameters can be specified:

Cluster ID: The default is Most Probable. No other parameters are supported.
Cluster Probability: The default is Most Probable.
You can also select a specific cluster ID or specify NULL or Most Likely to return the probability for the most likely cluster.
Cluster Set: The default is All Clusters.
You can also specify either or both of the following:
- TopN: Where N is between one and the number of clusters. The optional TopN argument is a positive integer that restricts the set of clusters to those that have one of the top N probability values. If there is a tie at the Nth value, then the database still returns only N values. If you omit this argument, then the function returns all clusters.
- Probability Cutoff: A number strictly greater than zero and less than or equal to one. The optional cutoff argument restricts the returned clusters to only those that have a probability greater than or equal to the specified cutoff. To filter only by cutoff, specify Null for TopN and the desired value for the cutoff.
Feature ID: The default is Most Probable. No other values are supported.
Feature Set: The default is All Feature IDs. You can also specify either or both of the following:
- TopN: Where N is between one and the number of features. The optional TopN argument is a positive integer that restricts the set of features to those that have one of the top N values. If there is a tie at the Nth value, then the database still returns only N values. If you omit this argument, then the function returns all features.
- Probability Cutoff: It is a number strictly greater than zero and less than or equal to one. The optional cutoff argument restricts the returned features to only those that have a feature value greater than or equal to the specified cutoff. To filter only by cutoff, specify Null for TopN and the desired cutoff.
Feature Value: The default is Highest Value.
You can also select a specific feature ID value or specify any one of the following values to return the value for the most likely feature:
- NULL
- Most Likely
Prediction: The default is Best Prediction, which takes the cost matrix into account.
Prediction Upper Bounds or Prediction Lower Bounds: The default is Best Prediction with Confidence Level 95%.
You can change Confidence Level to any number strictly greater than zero and less than or equal to one. For Classification models only, you can use the Target Value Selection dialog box option to pick a specific target value. You can also specify Null or Most Likely to return the bounds for the most likely target value.
Prediction Costs: The default is Best Prediction.
Applicable for Classification models only. You can use the Target Value Selection option to pick a specific target value.
Prediction Details: The only value is the details for the Best Prediction.
Prediction Probability: The default is Best Prediction.
Applicable for Classification models only. You can use the Target Value Selection option to pick a specific target value.
Prediction Set: The default is All Target Values.
You can also specify one or both of the following:
- bestN: Where N is between one and the number of targets. The optional bestN argument is a positive integer that restricts the returned target classes to the N having the highest probability, or lowest cost if cost matrix clause is specified. If multiple classes are tied in the Nth value, then the database still returns only N values.
  To filter only by cutoff, specify Null for this parameter.
- Probability Cutoff: Is a number strictly greater than zero and less than or equal to one. The optional cutoff argument restricts the returned target classes to those with a probability greater than or equal to (or a cost less than or equal to if cost matrix clause is specified) the specified cutoff value.
  To filter only by bestN, specify Null for this parameter.
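
The way bestN and Probability Cutoff narrow a Prediction Set can be sketched in Python as follows. The function `prediction_set` is hypothetical and handles only the probability case (no cost matrix). It is meant to illustrate the filtering rules described above, not to reproduce the database function.

```python
def prediction_set(probabilities, best_n=None, cutoff=None):
    """Return (class, probability) pairs ordered from best to worst."""
    if cutoff is not None and not 0 < cutoff <= 1:
        raise ValueError("cutoff must be greater than 0 and at most 1")
    # Sort by probability, highest first.
    ranked = sorted(probabilities.items(), key=lambda item: item[1], reverse=True)
    if cutoff is not None:
        # Keep only classes whose probability reaches the cutoff.
        ranked = [(cls, p) for cls, p in ranked if p >= cutoff]
    if best_n is not None:
        # Ties at the Nth value do not extend the result beyond N entries.
        ranked = ranked[:best_n]
    return ranked


scores = {"low": 0.2, "medium": 0.3, "high": 0.5}
print(prediction_set(scores))                # all target values
print(prediction_set(scores, best_n=2))      # filter by bestN only
print(prediction_set(scores, cutoff=0.25))   # filter by cutoff only
```

Leaving either argument as `None` corresponds to specifying Null for that parameter.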

9.1.5.2.4 Default Apply Column Names

The syntax of the default Apply column name is:

        <FUNCTION ABBREVIATION>_<MODEL NAME><SEQUENCE>

SEQUENCE is used only if necessary to avoid a conflict. A sequence number may force the model name to be partially truncated.

FUNCTION ABBREVIATION is one of the following:

Cluster Details: CDET
Cluster Distance: CDST
Cluster ID: CLID
Cluster Probability: PROB
Cluster Set: CSET
Feature Details: FDET
Feature ID: FEID
Feature Set: FSET
Feature Value: FVAL
Prediction: PRED
Prediction Bounds: PBND
Prediction Upper Bounds: PBUP
Prediction Lower Bounds: PBLW
Prediction Costs: PCST
Prediction Details: PDET
Prediction Probability: PROB
Prediction Set: PSET

Specific target, feature, or cluster default names are abbreviated in one of two ways.

The first approach attempts to integrate the value of the target, feature, or cluster into the column name. This approach is used if the length of the longest target, cluster, or feature value does not exceed the remaining character spaces available in the name. The name must be 30 or fewer characters.
The second approach substitutes the target, cluster, or feature with a sequence ID. This approach is used if the first approach is not possible.
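
The basic naming pattern can be sketched in Python as follows. This is a simplified illustration: the function `default_column_name` is hypothetical, only a few abbreviations from the list above are included, and the handling of specific target, feature, or cluster values is left out. The truncation step reflects the statement that a sequence number may force the model name to be partially truncated.

```python
MAX_NAME_LENGTH = 30

# A few of the function abbreviations listed above.
ABBREVIATIONS = {
    "Cluster ID": "CLID",
    "Prediction": "PRED",
    "Prediction Probability": "PROB",
    "Prediction Set": "PSET",
}


def default_column_name(function, model_name, sequence=None):
    """Build <FUNCTION ABBREVIATION>_<MODEL NAME><SEQUENCE>, capped at 30 characters."""
    prefix = ABBREVIATIONS[function] + "_"
    suffix = "" if sequence is None else str(sequence)
    # Truncate the model name so that the whole name fits in 30 characters.
    room = MAX_NAME_LENGTH - len(prefix) - len(suffix)
    return prefix + model_name[:room] + suffix


print(default_column_name("Prediction", "CLAS_SVM_1"))
print(default_column_name("Prediction", "CLASSIFICATION_DECISION_TREE_MODEL", 1))
```

The model names used here are made up for the example.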

9.1.5.2.5 Add or Edit Apply Output Column

The Add Apply Output dialog box or the Edit Apply Output dialog box enables you to manually add or edit a single-column Apply definition. You can add or edit Apply definitions one at a time.

Before you add or edit columns, you must deselect Automatic Settings.

You can perform the following tasks:

Add an Apply Output column: Click the add icon.
Edit an Apply Output column: Click the edit icon. When you edit a column, only the Function selection box and its parameters can be edited.

The following controls are available:

Column: Name of column to be generated.
Auto:
- If selected, you cannot edit the name of the column.
- If deselected, auto naming is disabled and you can rename the column. Column names are validated to ensure that they are unique.
Node: List of Model Input nodes connected to the Apply node. If there is only one Input node, then it is selected by default.
Model: List of models for the selected node. If there is only one model, then it is selected by default.
Function: List of model scoring functions for the selected model.
Parameters: Displays 0 or more controls necessary to support the parameter requirements of the selected function.

When you have finished defining the output column, click OK.

See Also:

"Default Apply Column Names"
"Apply Functions"

9.1.5.2.6 Add Output Apply Column Dialog

The default is to automatically name the output column.

To add a column:

In the Column field, provide a name.
Deselect Auto.
In the Node field, select one of the nodes connected to the Apply node. The type of the node that you select determines the choices in the Model and Function fields.
In the Model field, select a model.
In the Function field, select a function.
After you are done, click OK.

See Also:

"Apply and Output Specifications"

9.1.5.3 Define Apply Columns Wizard

The Define Apply Columns Wizard has two steps:

Models
Output Specifications

9.1.5.3.1 Models

In the Model section:

Select the models for which you want to define output specifications.
Click Next.

9.1.5.3.2 Output Specifications

In Output Specifications, the possible output specifications are listed with the defaults selected.

Make the changes as applicable.
Click Finish to complete the definition.

See Also:

"Apply and Output Specifications"

9.1.5.4 Additional Output

Additional Output consists of columns that are passed unchanged through the Apply operation.

See Also:

"Additional Output"

9.1.6 Evaluate and Apply Data

Test and Apply data for a model must be prepared in the same way that build data for the model was prepared. To properly prepare test and apply data, duplicate the transformation chains of build data for test and apply data by copying and pasting build Transforms nodes.
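
The idea of reusing one transformation chain for build, test, and apply data can be sketched in Python as follows. The transforms, column names, and sample rows are hypothetical. In a workflow, the equivalent step is copying and pasting the build Transforms nodes.

```python
def apply_chain(rows, transforms):
    """Run every transform, in order, over a list of row dictionaries."""
    for transform in transforms:
        rows = [transform(dict(row)) for row in rows]
    return rows


def bin_age(row):
    # Hypothetical binning transform.
    row["AGE_BIN"] = "young" if row["AGE"] < 30 else "older"
    return row


def normalize_income(row):
    # Hypothetical scaling transform.
    row["INCOME"] = row["INCOME"] / 100000
    return row


# The same chain object is used for both data sets.
chain = [bin_age, normalize_income]
build_rows = apply_chain([{"AGE": 25, "INCOME": 50000}], chain)
apply_rows = apply_chain([{"AGE": 52, "INCOME": 80000}], chain)
print(build_rows)
print(apply_rows)
```

Because both data sets pass through the same chain, they end up with the same columns in the same form, which is what the model expects.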

9.1.7 Edit Apply Node

The Edit Apply Node dialog box has the following tabs:

Predictions
Additional Output

The default value for Default Column Order is Data Columns First, which means that any data columns that you add come first in the output. You can change this to Apply Columns First.

9.1.7.1 Apply Columns

To create an Apply specification, deselect Automatic Settings. By default, Automatic Settings is selected.

You can perform the following tasks:

To define Apply columns, click the Define Apply Columns icon.
The Define Apply Columns Wizard opens.
To add an Output Apply column, click the add icon.
The Add Output Apply Column dialog box opens.
To delete an Output Apply column, select it and click the delete icon.
To edit an Output Apply column specification, select the specification and click the edit icon. The Add or Edit Apply Output Column dialog box opens.

See Also:

"Add Output Apply Column Dialog"
"Add or Edit Apply Output Column"

9.1.7.2 Additional Output

In the Additional Output tab, you can specify pass-through attributes from Data Source nodes.

To add columns:

Click the add icon. The Edit Output Data Column dialog box opens.
In the Edit Apply Node dialog box, the Default Column Order is Data Columns First. You can change this to Apply Columns First.
After you are done, click OK.

See Also:

"Edit Output Data Column Dialog"

9.1.7.2.1 Edit Output Data Column Dialog

By default, no data columns are specified. To specify or include data columns:

Move the attributes from the Available Attributes list to the Selected Attributes list.
Click OK. Data columns are passed through the apply operation unchanged. Certain attributes, such as the case ID, can be useful in interpreting apply output.

9.1.8 Apply Node Properties

To view properties of the Apply node, right-click the node and click Go to Properties.
Alternatively, go to View and click Properties.

Apply Node Properties has the following sections:

Predictions: Displays the Output Apply columns defined on the Apply Columns. You can edit these details. The option Automatic Selection is selected if selections were not modified.

For each Output Apply Column, the Name, Function, Parameters, and Node are listed.
Additional Output: Lists the Output Data Columns that are passed through. For each column, the Name, Alias (if any), and Data Type are listed.
Cache
Details: Displays the name of the node and comments.

9.1.9 Apply Node Context Menu

To view the Apply node context menu, right-click the node. The following options are available in the context menu:

Connect
Edit. Opens Edit Apply Node.
Validate Parents
Run
View Data. Opens Apply Data Viewer.
Force Run
Deploy
Show Graph
Generate Apply Chain
Cut
Copy
Paste
Select All
Parallel Query. See "About Parallel Processing" for more information.
Show Event Log. Displayed only if the running of the node fails.
Show Runtime Errors. Displayed only if there is an error.
Show Validation Errors. Displayed only if there is an error.
Navigate

9.1.9.1 Apply Data Viewer

The Apply Data viewer opens in a new tab. The viewer has these tabs:

Data: Displays rows of data. The default is to view the cache data. You can perform the following tasks:
- View actual data.
- Sort data.
- Filter data with a SQL expression.
- Refresh the display. To refresh, click the refresh icon.
Columns: Lists the columns in the Apply output.
SQL: Lists the SQL used to generate the Apply Output.

9.2 Test Node

Oracle Data Mining enables you to test Classification and Regression models. You cannot test other kinds of models.

Note:

All of the models tested in a node must be either Classification or Regression models. You cannot test both kinds of models in the same Test node.

A Test node can test several models using the same test set.

If Automatic Settings are on, the test node specification is generated when you connect the input nodes.

A Test node can run in parallel.

This section on Test node consists of the following topics:

Support for Testing Classification and Regression Models
Test Node Input
Automatic Settings
Creating a Test Node
Edit Test Node
Compare Test Results Viewer
Test Node Properties
Test Node Context Menu

9.2.1 Support for Testing Classification and Regression Models

Oracle Data Miner supports testing of Classification or Regression models in the following ways:

Test the model as part of the Build node using any one of the following ways:
- Split the Build data into build and test subsets.
- Use all of the Build data as test data.
- Connect a second Data Source node, the test Data Source node, to the Build node.
Test the model in a Test node. In this case, the test data can be any table that is compatible with the Build data.
After you have tested a Classification Model, you can tune it.

Note:
You cannot tune Regression models.

9.2.2 Test Node Input

A Test node has the following input:

At least one node that identifies one or more models. The nodes can be a Model node, a Classification node, or a Regression node. A Model node must contain either Classification or Regression models, but not both.
Any node that generates data as an output such as a Data node, a Transform node, or an appropriate Text node. This node contains the test data.
It is recommended that a case ID is specified. If you do not specify a Case ID, then the processing will take longer.

You can test several Classification or several Regression Models at the same time. The models to be tested can be in different nodes. The models to be tested must satisfy these conditions:

The nodes that contain the models must have the same function type. That is, they must be all Classification Build nodes or all Regression Build nodes.

Classification Models must also have the same list of target attribute values.
The models must have the same target attribute with the same data type.
The Data Source node for Test must contain the target of the models.
The test data must be compatible with the models. That is, it should have been transformed in the same way as the data used to build the model.
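
The conditions above can be expressed as a simple check. The following Python sketch is an illustration only: the `ModelInfo` record and the `check_testable_together` function are hypothetical and are not part of Oracle Data Miner, and the sketch does not verify that the test data was transformed in the same way as the build data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelInfo:
    name: str
    function: str            # "CLASSIFICATION" or "REGRESSION"
    target: str
    target_type: str
    target_values: frozenset = frozenset()


def check_testable_together(models, test_columns):
    """Return a list of problems; an empty list means the set looks testable."""
    problems = []
    first = models[0]
    for model in models:
        if model.function not in ("CLASSIFICATION", "REGRESSION"):
            problems.append(f"{model.name}: function cannot be tested")
        if model.function != first.function:
            problems.append(f"{model.name}: different function type")
        if (model.target, model.target_type) != (first.target, first.target_type):
            problems.append(f"{model.name}: different target or target data type")
        if (model.function == "CLASSIFICATION"
                and model.target_values != first.target_values):
            problems.append(f"{model.name}: different target values")
    # The test data must contain the target of the models.
    if first.target not in test_columns:
        problems.append("test data does not contain the target column")
    return problems


svm = ModelInfo("SVM_1", "CLASSIFICATION", "BUY", "VARCHAR2", frozenset({"Y", "N"}))
nb = ModelInfo("NB_1", "CLASSIFICATION", "BUY", "VARCHAR2", frozenset({"Y", "N"}))
glm = ModelInfo("GLM_1", "REGRESSION", "AGE", "NUMBER")
print(check_testable_together([svm, nb], {"CUST_ID", "BUY", "AGE"}))
print(check_testable_together([svm, glm], {"CUST_ID", "BUY", "AGE"}))
```

The first call reports no problems. The second reports that the Regression model cannot be tested alongside the Classification model.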

9.2.3 Automatic Settings

By default, the option Automatic Settings is selected for a Test node. Automatic Settings result in the following behavior:

When a model input node is connected, all models are added to the specification.
When a model input node is disconnected, all models are removed from the specification. The test node may become invalid.
When a model input node is edited in the following ways, the resultant behavior is as follows:
- If models are added, then model specifications are automatically added to the Test node.
- If models are removed, then the specifications are removed from the Test node.
- If models are changed, then the following is done:
  - The Test node is updated to ensure the algorithm is consistent.
  - If the target changes and there is only one node as input to the Test node, then the node is updated to reflect the new target and keep all the models. Also, the test input data is validated to ensure that it still has the new column target.
  - If there are multiple Model nodes as input to the Test nodes, then the models with the changed target are automatically removed.

If Automatic Settings is deselected, then you must edit the node to reflect all changes to the input. Models are validated if they are added.

9.2.4 Creating a Test Node

A Test node must be connected to a Data Source node and to a Model node or Build node. To create a Test node:

In the Components pane, click Workflow Editor.
If the Components pane is not visible, then go to View and click Components.
Either identify or create a Data Source Node containing the test data.
Ensure that the test data is prepared in the same way as the build data.
Select at least one Model Node, Classification Node, or Regression Node.
A model must be successfully built before it can be tested.

Note:
You can test either Classification or Regression models but not both kinds of models in one Test node.
In the Workflow Editor, expand Evaluate and Apply, and click Test.
Drag and drop the Test node from the Workflow Editor to the workflow pane.
Link the Data node and the Model node or Build node to the Test node.
Characteristics of the Test node are set by default. You can also edit the node.

See Also:

"Link Nodes"
"Edit Test Node"
"Evaluate and Apply Data"

9.2.5 Edit Test Node

To edit the Test Node, right-click the node and select Edit or double-click the node. The Edit Test Node dialog box opens.

The Edit Test Node dialog box displays the following:

Function (CLASSIFICATION or REGRESSION)
Target and the Target type (data type of the target)
The Case ID (if there is one)

It is recommended that you specify a case ID. If you do not specify a case ID, processing will be slower. The case ID that you specify for the Test node should be the same as the case ID specified for the Build node.

By default, the option Automatic Settings is selected.

You can perform the following tasks:

Compare test results and view individual models even when Automatic Settings is selected. The models tested are listed in the Selected Models grid.
Make changes to the list of models. Deselect Automatic Settings and make changes in the Selected Models grid.

See Also:

"No Case ID"
"Automatic Settings"

9.2.5.1 Selected Models

For each model, the grid lists the following:

The model name
The node containing the model
Test status of the model
The algorithm that is used to build the model

You can perform the following tasks:

View Models: You can view models that are successfully built. Select the model in the grid and click the View Models icon.
Compare Test Results: Click the Compare Test Results icon. The test results are displayed in Compare Test Results Viewer.
Add Model: Click the Add Model icon. You can only add models that have the same function.
Before adding a model, deselect Automatic Settings.
Delete Model: Select it and click the Delete Model icon.
Before deleting a model, deselect Automatic Settings.

9.2.5.2 Select Model

The Select Model dialog box lists the models that are available for testing. To select models:

Move the Models from Available Models to Selected Models.
Click OK.

9.2.6 Compare Test Results Viewer

The Compare Test Results Viewer displays test results for one or more models in the same node.

9.2.7 Test Node Properties

To view properties, right-click the node and select Go to Properties. If the Properties pane is closed, then go to View and click Properties.

The Test node Properties pane has these sections:

Models: Lists the models to test in the Selected Models grid.
Test
Details: Displays the name of the node and comments.

9.2.7.1 Models

The Models tab lists the models to test in the Selected Models grid.

9.2.7.2 Test

The Test section describes how testing is performed. Test contains the following information:

Function: CLASSIFICATION or REGRESSION.
Target: The name of the target.
Data Type: The data type of the target.
For CLASSIFICATION, these test results are calculated by default:
- Performance Metrics
- ROC Curve (Binary Target Only)
- Lift and Profit
You can deselect Metrics.

By default, the top 100 target values by frequency are specified. To change this value, click Edit. Edit the value in the Target Values Selection dialog box.
For REGRESSION, Accuracy Matrix and Residuals are selected. You can deselect Metrics.
- The Performance Metrics are the metrics displayed on the Performance tab of the Test Viewer.
- Residuals are displayed on the Residual tab of the Test Viewer.

9.2.7.2.1 Target Values Selection

The Target Values Selection dialog box displays the number of target values selected. The default setting is Automatic.

It uses the top 10 Target Class Values by frequency. You can change the number of target values by changing Frequency Count. You can also select Use Lowest Occurring.

To select custom values:

Select Custom.
Move the values from Available Values to Selected Values.
After you are done, click OK.
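
Selecting target values by frequency can be sketched in Python as follows. The function `select_target_values` is hypothetical. Treating Use Lowest Occurring as "take the least frequent values instead of the most frequent" is an assumption made for this illustration.

```python
from collections import Counter


def select_target_values(target_column, frequency_count, use_lowest_occurring=False):
    """Return up to `frequency_count` target values ranked by frequency."""
    counts = Counter(target_column)
    # most_common() orders values from most to least frequent.
    ranked = [value for value, _ in counts.most_common()]
    if use_lowest_occurring:
        # Assumed meaning of Use Lowest Occurring: least frequent values first.
        ranked.reverse()
    return ranked[:frequency_count]


targets = ["A", "A", "A", "B", "B", "C"]
print(select_target_values(targets, 2))
print(select_target_values(targets, 1, use_lowest_occurring=True))
```

The `frequency_count` argument plays the role of Frequency Count in the dialog box.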

9.2.8 Test Node Context Menu

To view the Test node context menu, right-click the node. The following options are available in the context menu:

Connect
Edit. Opens Edit Test Node.
Validate Parents
Run
Force Run
View Models
View Test Results
Compare Test Results. Opens the Compare Test Results Viewer.
Generate Apply Chain
Cut
Copy
Paste
Select All
Parallel Query. See "About Parallel Processing" for more information.
Navigate
Show Runtime Errors. Displayed only if there is an error.
Show Validation Errors. Displayed only if there are validation errors.
Show Event Log. Displayed only if there is an error.