5.4.1.1 Model Definition Maintenance

This topic provides the systematic instructions to maintain the use case details, define the use case type, and specify the data source details.

Specify the User ID and Password, and log in to the Home screen.
  1. On the Home screen, click Machine Learning. Under Machine Learning, click Model Definition.
  2. On the View Model Definition screen, click the Unlock icon on the use case tile to unlock an existing definition, or click the Add button to create a new model definition.
    The Model Definition screen displays.
  3. Specify the fields on the Model Definition screen.

    Note:

    The fields marked as Required are mandatory.
    For more information on fields, refer to the field description table.

    Table 5-7 Model Definition – Field Description

    Field Description
    Use Case Name Specify the name of the Use Case.
    Description Specify the description of the Use Case.
    Use Case Type Select the type of Use Case.

    Refer to Frameworks Supported for details.

    Product Processor Select the product to which the use case belongs.
    Training Data Source Specify the Table or View name used as the data source to train the model.
    Unique Identifier Select the column name to uniquely identify a record.

    Note:

    Column name is a function of table/view design.
    Target Column Select the column whose value is predicted by training the model.

    Note:

    Column name is a function of table/view design.
    Positive Target Value This field is enabled only if the selected Use Case Type is CLASSIFICATION; it is disabled for REGRESSION. It displays the distinct values from the target column.
    Tablespace Specify a valid tablespace. All model-related data is persisted in this tablespace.
    Inference Data Source Specify the Table or View that captures the data to be used for making predictions.

    The inference data source holds the current data for which the target is predicted using the built model, unlike the training data, where the target is already provided (see the sketch following this procedure).

    Partition Column Names Specify the column names to slice data.

    Refer to Partitioned Model for details.

    Selected Algorithm Select the algorithm from the list to build the model.

    For REGRESSION, this field should be left null to allow the framework to select the best-fit algorithm to build the model.

    Model Error Statistics Select the model error statistics.

    By default, RMSE is selected for REGRESSION.

    The user can also select MAE.

    Note:

    This field is disabled for CLASSIFICATION.
  4. Click Save to save the details.
    The user can view the configured details in the Model Definition screen.
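
The difference between the Training Data Source and the Inference Data Source in Table 5-7 can be illustrated with a minimal sketch. The table names, column names, and use of pandas below are illustrative assumptions only and are not part of the product's schema.

import pandas as pd

# Hypothetical training data source: the target column (DEFAULT_FLAG) is
# already populated, so the model can learn from known outcomes.
training_df = pd.DataFrame({
    "CUSTOMER_ID":   [101, 102, 103],       # unique identifier
    "ANNUAL_INCOME": [52000, 81000, 43000],
    "LOAN_AMOUNT":   [10000, 25000, 15000],
    "DEFAULT_FLAG":  [0, 1, 0],             # target column (known values)
})

# Hypothetical inference data source: the same feature columns, but the
# target is unknown; these are the records the trained model will score.
inference_df = pd.DataFrame({
    "CUSTOMER_ID":   [201, 202],
    "ANNUAL_INCOME": [67000, 39000],
    "LOAN_AMOUNT":   [20000, 12000],
})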

Cost Matrix:

This button is enabled ONLY for CLASSIFICATION type of use cases.

Any classification model can make two kinds of errors:

Table 5-8 Classification Type - Error

Actual Value Predicted Value Error Type
1 0 False Negative
0 1 False Positive

This screen is used to bias the model toward minimizing one of the error types by adding a penalty cost.

All penalty costs must be positive.

Table 5-9 Classification Type - Penalty

Actual Value Predicted Value Penalty Cost
1 0 6
0 1 2

The default is zero cost for all combinations.

Biasing the model is a trade-off with prediction accuracy. The business determines whether a classification model is required to be biased.
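
As an illustration of how a cost matrix biases a classifier, the sketch below chooses the class with the lowest expected cost rather than the highest probability. The probabilities, the helper function, and the Python form are assumptions for illustration only; the framework applies the cost matrix internally during training and scoring.

# Penalty values follow Table 5-9; the probability is hypothetical.
cost = {            # cost[(actual, predicted)]
    (1, 0): 6.0,    # false negative penalty
    (0, 1): 2.0,    # false positive penalty
    (0, 0): 0.0,
    (1, 1): 0.0,
}

def predict_with_cost(p_positive: float) -> int:
    """Return the class (0 or 1) with the lower expected misclassification cost."""
    expected_cost_if_0 = p_positive * cost[(1, 0)]        # risk of missing a true 1
    expected_cost_if_1 = (1 - p_positive) * cost[(0, 1)]  # risk of flagging a true 0
    return 0 if expected_cost_if_0 < expected_cost_if_1 else 1

# With the penalties above, a record with only a 30% positive probability is
# still predicted as 1, because a missed positive (cost 6) is treated as worse
# than a false alarm (cost 2).
print(predict_with_cost(0.30))   # prints 1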

  1. Click Cost Matrix button to launch the screen.
    The Cost Matrix screen displays.
  2. On the Cost Matrix screen, specify the relevant penalty cost.
  3. Click Save to save and close the Cost Matrix screen and return to the Model Definition screen.

Correlation:

Multicollinearity occurs when two or more independent variables are highly correlated with one another in a model.

Multicollinearity may not affect the accuracy of the model as much, but it reduces the reliability of model interpretation.

Irrespective of CLASSIFICATION or REGRESSION, all use cases must be evaluated for Correlation.

This button displays an orange mark if the evaluation is pending.
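
The evaluation performed on the Correlation Analysis screen is, in essence, a pairwise correlation check against a threshold. The sketch below shows that idea, assuming pandas and hypothetical feature names; the actual screen computes this from the training data source. Moving a feature to the Ignore Features list corresponds to dropping its column and re-running the same check until no pair exceeds the threshold.

import pandas as pd

def correlated_pairs(df: pd.DataFrame, threshold: float = 0.5, method: str = "pearson"):
    """Return feature pairs whose absolute correlation exceeds the threshold."""
    corr = df.corr(method=method)
    cols = list(corr.columns)
    pairs = []
    for i in range(len(cols)):
        for j in range(i + 1, len(cols)):
            value = corr.iloc[i, j]
            if abs(value) > threshold:
                pairs.append((cols[i], cols[j], round(value, 3)))
    return pairs

# Hypothetical training features; LOAN_AMOUNT and EMI move almost in lockstep,
# so at least that pair is expected to be reported as correlated.
features = pd.DataFrame({
    "ANNUAL_INCOME": [52, 81, 43, 60, 75],
    "LOAN_AMOUNT":   [10, 25, 15, 18, 24],
    "EMI":           [1.1, 2.6, 1.5, 1.9, 2.5],
})
print(correlated_pairs(features, threshold=0.5))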

  1. Click Correlation button to launch the screen.
    The Correlation Analysis screen displays.

    Figure 5-5 Correlation Analysis



  2. Select the required fields on Correlation Analysis screen.
    For more information on fields, refer to the field description table.

    Table 5-10 Correlation Analysis – Field Description

    Field Description
    Threshold Value Select the threshold value.

    The value can be set between 0.1 and 0.9.

    Note:

    By default, the value is set as 0.5.
    Type of Correlation Select the type of correlation.

    By default, Pearson is selected.

    The formula used for the calculation is different for each type.

    Pairwise Correlation Displays the output of the Correlation Validation.
    Analyzed Features Displays the distinct analyzed features from the Pairwise Correlation.
    Ignore Features A user-defined list created from the Analyzed Features.
  3. Click Refresh to initiate the evaluation process.
    The Correlation Analysis - Pairwise Correlation screen displays.

    Figure 5-6 Correlation Analysis - Pairwise Correlation



  4. Move ONE of the Analyzed Features to the Ignore Features list.
  5. Click Refresh and re-evaluate the Correlation as described in the preceding steps.
  6. Repeat the previous two steps, adding one feature at a time to the Ignore Features list, until Pairwise Correlation displays zero correlated pairs.
  7. Attempting to exit the screen midway without achieving zero Pairwise Correlation will display an error message.
    The Error Message screen displays.
  8. After successful Correlation Evaluation, the orange highlight on the Correlation button is removed.
  9. Complete the Correlation Evaluation and, for CLASSIFICATION use cases, the Cost Matrix definition.
  10. Click Save to create the new Model Definition.
    The user can view the configured details in the View Model Definition screen.

Model Metrices

Once the user has successfully trained the Machine Learning model, the user can score/predict the model outcomes as required by the use case. The user can view the Model Metrices screen only after training the model successfully. Refer to the Model Training and Scoring section for training the model.

  1. Click Model Metrices to view the Model Metrices details.
    The Model Metrices screen displays. For more information on fields, refer to the field description table.

    Table 5-11 Model Metrices – Field Description

    Field Description
    Model Partitions Select the model partitions from the drop-down list.

    If the model has been designed with partitions, the list displays the partition values based on the underlying data of the defined partition column; otherwise, it displays FULL MODEL.

    Metrices Displays the various model attributes, as per the best model identified and trained. The number of model attributes is a function of the algorithm and the underlying pattern of the data.

    Some attributes are common for all models, as listed below (a brief sketch of the train/test error statistics follows this table).

    Model Name

    Algorithm

    INF_TIME (Inference Time)

    <Model metric>(Train)

    <Model metric>(Test)

    Value Displays the value of the attribute.
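
For REGRESSION use cases, the <Model metric>(Train) and <Model metric>(Test) attributes typically carry the selected model error statistic, RMSE by default or MAE (see Table 5-7). A minimal sketch of how these two statistics are computed, using hypothetical actual and predicted values, is shown below; the values displayed on the Model Metrices screen are produced by the framework itself.

import math

def rmse(actual, predicted):
    """Root Mean Squared Error: penalizes large deviations more heavily."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mae(actual, predicted):
    """Mean Absolute Error: the average magnitude of the prediction errors."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical target values from a held-out test partition.
actual    = [100.0, 150.0, 200.0, 250.0]
predicted = [110.0, 140.0, 205.0, 240.0]

print(f"RMSE(Test): {rmse(actual, predicted):.2f}")  # approximately 9.01
print(f"MAE(Test):  {mae(actual, predicted):.2f}")   # 8.75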