8 Using PL/SQL API to Manage Workflows

This chapter explains how to manage workflows using the PL/SQL APIs.

About PL/SQL APIs

Oracle Data Miner ships with a set of repository PL/SQL APIs that enable applications to manage Oracle Data Miner projects and workflow directly.

The project PL/SQL APIs is in the ODMR_PROJECT package, and the workflow PL/SQL APIs is in the ODMR_WORKFLOW package. Both the packages are defined in the ODMRSYS schema in the Oracle Data Miner repository.

The PL/SQL APIs enable you to:

  • Manage Data Miner projects and workflows

  • Schedule workflows

  • Run workflows

  • Query project and workflow information

  • Monitor workflow execution status

  • Query generated results

PL/SQL APIs

You can use the PL/SQL APIs to manage projects and workflows.

Use the PL/SQL APIs to perform the following tasks:

PROJECT_CREATE

The function PROJECT_CREATE creates a project using the project name that you provide. The function returns a project ID. If the project already exists, the function raises an exception.

Function:

FUNCTION PROJECT_CREATE(p_project_name IN VARCHAR2,
p_comment IN VARCHAR2 DEFAULT NULL) RETURN NUMBER

Table 8-1 lists the parameters that are used in the PROJECT_CREATE function.

Table 8-1 List of Parameters for PROJECT_CREATE Function

Parameter Description

p_project_name

Assign a name to the project that is created.

p_comment

Specify any comment that is to be applied to the project.

PROJECT_RENAME

The PROJECT_RENAME procedure renames an existing project. If a project with the new name already exists, then the procedure raises an exception.

Procedure:

PROCEDURE PROJECT_RENAME(p_project_id IN NUMBER, p_project_name IN VARCHAR2)

Table 8-2 lists the parameters that are used in the PROJECT_RENAME procedure.

Table 8-2 List of Parameters for PROJECT_RENAME Procedure

Parameters Description

p_project_id

Specify the project ID of the project to rename.

p_project_name

Specify the new name for the project.

PROJECT_DELETE

The procedure PROJECT_DELETE enables you to delete one or more projects along with the workflows contained in it. If any workflow is running or is opened by Oracle Data Miner, then the procedure raises an exception.

Procedure to delete a project:

PROCEDURE PROJECT_DELETE(p_project_id IN NUMBER)

Procedure to delete multiple projects:

PROCEDURE PROJECT_DELETE(p_project_ids IN ODMR_OBJECT_IDS)

Table 8-3 lists the parameters that are used in the PROJECT_DELETE procedure.

Table 8-3 List of Parameters for PROJECT_DELETE procedure

Parameters Description

p_project_id

Specify the project ID of the project to delete.

p_project_ids

Specify the project IDs of the projects to delete.

WF_RUN

The function WF_RUN that runs a workflow contains signatures that accepts names, project IDs, workflow and specific nodes to run.

The project ID, workflow ID, and node IDs can be queried using the ODMR_USER_WORKFLOW_NODES view.

WF_RUN Parameters

You can execute the WF_RUN function using different parameter combinations:

  • WF_RUN with project name, workflow name and node name

  • WF_RUN with project ID, workflow ID and node IDs

  • WF_RUN with project name, workflow name, node name and time interval

  • WF_RUN with project ID, workflow ID, node ID and time interval

The RERUN_WORKFLOW RUN mode runs all nodes in a workflow regardless of how these nodes are connected. If a workflow contains two or more separate lineage of nodes, all lineages will be run, but the order of lineage execution is not deterministic. That is, the user cannot set the order for the lineage to run.

Table 8-4 lists the parameters that are used in the WF_RUN function.

Table 8-4 List of Parameters in the WF_RUN function

Parameters Description

P_PROJECT_NAME

Specify the project name that the workflow was created in.

P_PROJECT_ID

Specify the project ID that the workflow was created in.

P_WORKFLOW_NAME

Specify the workflow name to run.

P_WORKFLOW_ID

Specify the workflow ID to run.

P_NODE_NAMES

Specify the node names in the workflow to run.

P_NODE_IDS

Specify the node IDs in the workflow to run.

P_RUN_MODE

  • VALIDATE_ONLY: Validates parent nodes of the specified nodes.

  • RUN_NODE_ONLY: Runs the specified nodes. If the nodes have already run, they will not run again. If parent nodes have not run, then they will be run, otherwise they will be ignored.

  • RERUN_NODE_ONLY: Resets the status of the specified nodes to READY state. Then these nodes are run again.

  • RERUN_NODE_CHILDREN: Resets the status of the specified nodes and their children nodes to READY state. Then these nodes are run again.

  • RERUN_NODE_PARENTS: Resets the status of the specified nodes and their parent nodes to READY state. Then these nodes are run again.

  • RERUN_WORKFLOW: Resets the status of all nodes to READY state. Then the nodes are run (complete workflow run). Note: p_node_names is ignored.

P_MAX_NUM_THREADS

Specify the maximum number of parallel model builds across all workflows. Specify NULL for system determined. Use this parameter only if your system has plenty of resources, otherwise set this value to NULL to use the default value.

P_SCHEDULE

Specify existing schedule object defined in the Scheduler. If no value is specified for P_SCHEDULE, then the workflow is scheduled to run as soon as possible.

P_START_DATE

Specify the date and time on which this workflow is scheduled to start for the first time. If P_START_DATE and P_REPEAT_INTERVAL are set to NULL, then the workflow is scheduled to run as soon as possible.

P_REPEAT_INTERVAL

Specify how often the workflow repeats. You can specify the repeat interval by using the calendar or PL/SQL expressions. The expression specified is evaluated to determine the next time the workflow should run. If P_REPEAT_INTERVAL is not specified, then the workflow runs only once at the specified start date.

P_END_DATE

Specify the date and time after which the workflow expires and is no longer run. If no value for P_END_DATE is specified, then the job repeats indefinitely.

P_JOB_CLASS

Specify existing job class to run the workflow. If no value for P_JOB_CLASS is specified, then the default job class is used.

WF_RUN with Project Name, Workflow Name, and Node Name

The WF_RUN function with the project name, workflow name and node name parameters:

FUNCTION WF_RUN(P_PROJECT_NAME IN VARCHAR2,
                                P_WORKFLOW_NAME IN VARCHAR2,
                                P_NODE_NAMES IN ODMR_OBJECT_NAMES,
                                P_RUN_MODE IN VARCHAR2 DEFAULT 'RUN_NODE_ONLY',
                                P_MAX_NUM_THREADS IN NUMBER DEFAULT NULL,
                                P_SCHEDULE IN VARCHAR2 DEFAULT NULL,
                                P_JOB_CLASS IN VARCHAR2 DEFAULT NULL
RETURN VARCHAR2

WF_RUN with Project Name, Workflow Name Node Name and Time Interval

The WF_RUN function with the name parameters and start date and end date:

FUNCTION WF_RUN(P_PROJECT_NAME IN VARCHAR2,
                   P_WORKFLOW_NAME IN VARCHAR2,
                   P_NODE_NAMES IN ODMR_OBJECT_NAMES,
                   P_RUN_MODE IN VARCHAR2 DEFAULT 'RUN_NODE_ONLY',
                   P_MAX_NUM_THREADS IN NUMBER DEFAULT NULL,
                   P_START_DATE IN TIMESTAMP WITH TIME ZONE DEFAULT NULL,
                   P_REPEAT_INTERVAL IN VARCHAR2 DEFAULT NULL,
                   P_END_DATE IN TIMESTAMP WITH TIME ZONE DEFAULT NULL,
                   P_JOB_CLASS IN VARCHAR2 DEFAULT NULL)
RETURN VARCHAR2

WF_RUN with Project ID, Workflow ID, Node ID and Time Interval

The WF_RUN function with the IDs and start date and end date parameters:

FUNCTION WF_RUN(P_PROJECT_ID IN NUMBER,
                P_WORKFLOW_ID IN NUMBER,
                P_NODE_IDS IN ODMR_OBJECT_IDS,
                P_RUN_MODE IN VARCHAR2 DEFAULT 'RUN_NODE_ONLY',
                P_MAX_NUM_THREADS IN NUMBER DEFAULT NULL,
                P_START_DATE IN TIMESTAMP WITH TIME ZONE DEFAULT NULL,
                P_REPEAT_INTERVAL IN VARCHAR2 DEFAULT NULL,
                P_END_DATE IN TIMESTAMP WITH TIME ZONE DEFAULT NULL,
                P_JOB_CLASS IN VARCHAR2 DEFAULT NULL)
RETURN VARCHAR2

WF_RUN with Project ID, Workflow ID and Node IDs

The WF_RUN function with the project ID, workflow ID and node ID parameters:

FUNCTION WF_RUN(P_PROJECT_ID IN NUMBER,
                                 P_WORKFLOW_ID IN NUMBER,
                                 P_NODE_IDS IN ODMR_OBJECT_IDS,
                                 P_RUN_MODE IN VARCHAR2 DEFAULT 'RUN_NODE_ONLY',
                                 P_MAX_NUM_THREADS IN NUMBER DEFAULT NULL,
                                 P_SCHEDULE IN VARCHAR2 DEFAULT NULL,
                                 P_JOB_CLASS IN VARCHAR2 DEFAULT NULL)
RETURN VARCHAR

WF_STOP

The procedure WF_STOP enables you to stop or cancel a workflow that is scheduled to run. If the workflow is not already running or scheduled, then the procedure raises an exception.

The procedure is:

PROCEDURE WF_STOP(p_workflowId IN NUMBER)

Table 8-5 lists the parameters that are used in the WF_STOP procedure.

Table 8-5 List of Parameters for WF_STOP Procedure

Parameter Description

p_workflow_id

Specify the workflow ID of the workflow to cancel.

WF_RENAME

The procedure WF_RENAME renames an existing workflow.

The procedure raises an exception under the following conditions:

  • If a workflow with the new name already exists.

  • If the workflow is either already running or opened by Oracle Data Miner.

The procedure to rename the workflow:

PROCEDURE WF_RENAME(p_workflowId IN NUMBER,
                                  p_workflow_name IN VARCHAR2,
                                  p_mode IN CHAR DEFAULT 'R')

Table 8-6 lists the parameters that are used in the WF_RENAME procedure.

Table 8-6 List of Parameters for WF_RENAME Procedure

Parameters Description

p_workflow_id

Specify the workflow ID to rename.

p_workflow_name

Specify the new workflow name.

p_mode

This parameter is for internal use only.

WF_DELETE

The procedure WF_DELETE deletes a workflow along with all the generated objects such as tables, views, models, test results, and so on. If the workflow is either already running or opened by the Oracle Data Miner, then it raises an exception.

Procedure:

PROCEDURE WF_DELETE(p_workflowId IN NUMBER)

Table 8-7 lists the parameters that are used in the WF_DELETE procedure.

Table 8-7 List of Parameters for WF_DELETE Procedure

Parameter Description

p_workflow_id

Specify the ID of the workflow that is to be deleted.

WF_IMPORT

The WF_IMPORT function imports a workflow (exported by the Oracle Data Miner) to the specified project. Since the workflow is backward compatible, you can import an older version of a workflow to a newer Oracle Data Miner Repository.

During import, the function detects if the workflow has any object name conflicts with the existing workflows in the repository. The p_force parameter determines whether to terminate the import or not.

Exceptions are raised under the following conditions:

  • If the project does not exist

  • If the workflow metadata is invalid or incompatible with the current repository

  • If a workflow with the same name already exists

Function:

FUNCTION WF_IMPORT(p_project_id IN NUMBER,
                       p_workflow_name IN VARCHAR2,
                       p_workflow_data IN XMLType,
                       p_comment IN VARCHAR2,
                       p_force IN BOOLEAN DEFAULT FALSE) RETURN NUMBER;

Table 8-8 lists the parameters that are used in the WF_IMPORT function.

Table 8-8 List of Parameters for WF_IMPORT Function

Parameters Description

p_project_id

Specify the ID of the project in to which the workflow will be imported.

p_workflow_name

Specify the workflow to import.

p_workflow_data

Specify the workflow metadata. This workflow should be previously exported by the Oracle Data Miner and the workflow version should not be newer than what the Repository supports.

p_comment

Specify the comment to be applied to the workflow.

p_force

Determines whether to force import if the workflow has object name conflicts with existing workflows in the repository. The applicable values are:

  • If p_force = FALSE, then it raises an exception with a list of conflicting object names.

  • If p_force = TRUE, then it proceeds with the import of the workflow.

WF_EXPORT

The WF_EXPORT function exports a specified workflow. If the workflow is either already running or opened by the Oracle Data Miner, then it raises an exception. Alternatively, you can query the ODMR_USER_PROJECT_WORKFLOW for workflows to export.

Function:

FUNCTION WF_EXPORT(p_workflow_id IN NUMBER) RETURN XMLType;

Table 8-9 lists the parameters that are used in the WF_EXPORT function.

Table 8-9 List of Parameters for WF_EXPORT Function

Parameters Description

p_workflow_id

Specify the ID of the workflow to export.

Repository Views

In the Repository View, you can query information related to workflows and projects, and also monitor the status.

The Repository Views enable you to:

  • Query workflow and project information

  • Monitor workflow execution status

  • Query generated results

The following repository APIs or views are available:

ODMR_USER_PROJECT_WORKFLOW

You can query all workflows that belong to a specific project or all projects by using the ODMR_USER_PROJECT_WORKFLOW repository view. This view provides information about the workflow, such as status, creation time, update time, and so on.

Table 8-10 provides more information about this view.

Table 8-10 ODMR_USER_PROJECT_WORKFLOW Repository View

Column Data Type Description

PROJECT_ID

NUMBER

This is the ID of the project in which the workflow was created in.

PROJECT_NAME

VARCHAR2 (128 CHAR)

This is the name of the project in which the workflow was created in.

PJ_CREATION_TIME

TIMESTAMP (6)

This is the project creation time.

PJ_LAST_UPDATED_TIME

TIMESTAMP (6)

Project last modified time stamp.

PJ_COMMENTS

VARCHAR2 (4000 CHAR)

These are comments related to the project, if any.

WORKFLOW_ID

NUMBER

This is the workflow ID.

WORKFLOW_NAME

VARCHAR2 (128 CHAR)

This is the name of the workflow.

WORKFLOW_DATA

XMLTYPE

This is the workflow metadata in XML format.

CHAIN_NAME

VARCHAR (128 CHAR)

This is for internal use only.

STATUS

VARCHAR (30 CHAR)

  • INACTIVE: Indicates that the workflow is idle.

  • ACTIVE: Indicates that the workflow is running.

  • QUEUED: Indicates that the workflow is in queue to run.

  • STOPPING: Indicates that the workflow is being stopped.

  • STOPPED: Indicates that the running workflow has stopped.

  • SCHEDULED: Indicates that the workflow is scheduled to run.

WF_CREATION_TIME

TIMESTAMP (6)

This is the workflow creation time stamp.

WF_LAST_UPDATED_TIME

TIMESTAMP (6)

This is the workflow last modified time stamp.

WF_COMMENTS

VARCHAR2 (4000 CHAR)

These are comments related to the workflow, if any.

ODMR_USER_WORKFLOW_ALL

You can query individual workflow node status after the workflow is complete by using the ODMR_USER_WORKFLOW_ALL repository view.

For example, you can query the nodes that failed along with the associated error details, in the last workflow run. Table 8-11 provides more information about this view.

Table 8-11 ODMR_USER-WORKFLOW_ALL Repository View

Column Data Type Description

WORKFLOW_ID

NUMBER

This is the ID of the workflow.

WF_JOB_NAME

VARCHAR2 (261)

This is the Scheduler Job that runs the workflow.

LOG_DATE

TIMESTAMP (6) WITH TIME ZONE

This is the log entry time stamp.

LOG_ID

NUMBER

This is the log entry ID.

NODE_ID

VARCHAR2 (261)

This is the workflow node ID.

SUBNODE_ID

VARCHAR2 (261)

This is the workflow sub node. For example, a Model node in a Build node.

NODE_STATUS

VARCHAR2 (11)

  • RUNNING: Indicates that the node is running.

  • SUCCEEDED: Indicates that the node run has completed successfully.

  • FAILED: Indicates that the node has failed to complete.

  • NOT_STARTED: Indicates that the node is waiting to run.

  • SCHEDULED: Indicates that the node is scheduled to run.

  • PAUSED: Indicates that the node is paused. This is an exception.

  • STOPPED: Indicates that the node run has stopped.

  • STALLED: Indicates that the node has stalled. This is an exception.

SUBNODE_STATUS

VARCHAR2 (30)

Same as Node status

NODE_START_TIME

TIMESTAMP (6) WITH TIME ZONE

This is the workflow node start time stamp.

NODE_RUN_TIME

INTERVAL DAY (9) TO SECOND (6)

This is the node run time.

ERROR_CODE

NUMBER

This is the error code returned by the node.

LOG_MESSAGE

VARCHAR2 (4000 CHAR)

This is the log message generated by the node.

ODMR_USER_WORKFLOW_LOG

You can query the logs of workflow run by using the ODMR_USER_WORKFLOW_LOG repository view.

Oracle Data Miner uses this view to extract and display the workflow event log. Table 8-12 provides more information about this view.

Table 8-12 ODMR_USER_WORKFLOW_LOG Repository Views

Column Data Type Description

LOG_ID

NUMBER

This is the log entry ID.

JOB_NAME

VARCHAR2 (128 CHAR)

This is the Scheduler Job that runs the workflow.

PROJ_NAME

VARCHAR2 (128 CHAR)

This is the project in which the workflow was created in.

PRO_ID

NUMBER

This is the project ID in which the workflow was created in.

WF_NAME

VARCHAR2 (128 CHAR)

This is the workflow name.

WF_ID

NUMBER

This is the workflow ID.

NODE_NAME

VARCHAR2 (128 CHAR)

This is the name of the node in the workflow.

NODE_ID

VARCHAR2 (30)

This is the node ID.

SUBNODE_NAME

VARCHAR2 (128 CHAR)

This is the workflow sub node name. For example, Model name in a Build node.

SUBNODE_ID

VARCHAR2 (30)

This is the workflow sub node ID. for example, the Model ID in a Build Node.

LOG_TIMESTAMP

TIMESTAMP (6) WITH TIME ZONE

This is the log entry time stamp.

LOG_DURATION

INTERVAL DAY (3) TO SECOND (0)

This is the log entry duration in days and seconds.

LOG TYPE

VARCHAR2 (30 CHAR)
  • WARN: Indicates warning.

  • ERR: Indicates error.

  • INFO: Indicates informational content.

LOG SUBTYPE

VARCHAR2 (30 CHAR)
  • START: Indicates start of a task.

  • END: Indicates end of a task.

LOG_MESSAGE

NVARCHAR2 (2000)

This is the log message generated by the node.

LOG_MESSAGE_DETAILS

VARCHAR2 (4000 CHAR)

This is the log message details generated by the node.

LOG_TASK

VARCHAR2 (30 CHAR)

When a node is running, it performs one or more of the following tasks:

  • PROJECT

  • WORKFLOW

  • NODE

  • SUBNODE

  • VALIDATE

  • SAMPLE

  • CACHE

  • STATISTICS

  • FEATURES

  • DATAPREP

  • BUILD

  • TEST

  • APPLY

  • TRANSFORM

  • TEXT

  • BUILDTEXT

  • APPLYTEXT

  • OUTPUT

  • CLEANUP

  • CREATE EXPLORE STATISTICS

  • CREATE HISTOGRAM

  • CREATE SAMPLE DATA

  • CREATE HISTOGRAM SAMPLE

  • CREATE DATA GUIDE

  • CREATE PROJECT

  • DELETE PROJECT

  • RENAME PROJECT

  • SET COMMENT

  • CREATE WORKFLOW

  • RUN WORKFLOW

  • RENAME WORKFLOW

  • DELETE WORKFLOW

  • IMPORT WORKFLOW

  • EXPORT WORKFLOW

ODMR_USER_WORKFLOW_NODES

You can query information about the individual nodes, such as node name, node status, node ID and so on, that are part of a workflow, by using the ODMR_USER_WORKFLOW_NODES repository view.

Table 8-13 provides more information about this view.

Table 8-13 ODMR_USER_WORKFLOW_NODES Repository Views

Column Data Types Description

PROJECT_ID

NUMBER

This is the ID of the project in which the workflow was created in.

PROJECT_NAME

VARCHAR2 (128 CHAR)

This is the name of the project in which the workflow was created in.

WORKFLOW_ID

NUMBER

This is the ID of the workflow.

WORKFLOW_NAME

VARCHAR2 (128 CHAR)

This is the name of the workflow.

NODE_TYPE

VARCHAR2 (30 CHAR)

This is the node type.

NODE_NAME

VARCHAR2 (128 CHAR)

This is the name of the node.

NODE_ID

NUMBER

This is the ID of the node.

NODE_STATUS

VARCHAR2 (10 CHAR)

  • INVALID: Indicates that the node is not valid, and it cannot be run.

  • WARNING: Indicates that the node has run but with warning.

  • READY: Indicates that the node is ready to run.

  • FAILURE: Indicates that the node run has failed.

  • COMPLETE: Indicates that the node run has completed successfully.

ODMR_USER_WORKFLOW_MODELS

You can query mining models that belong to a specific Build or Model node of a workflow by using the ODMR_USER_WORKFLOW_MODELS repository view.

Table 8-14 provides more information about this view.

Table 8-14 ODMR_USER_WORKFLOW_MODELS Repository Views

Column Data Types Description

PROJECT_ID

NUMBER

This is the project ID in which the workflow was created in.

PROJECT_NAME

VARCHAR2 (128 CHAR)

This is the name of the project in which the workflow was created in.

WORKFLOW_ID

NUMBER

This is the ID of the workflow.

WORKFLOW_NAME

VARCHAR2 (128 CHAR)

This is the name of the workflow.

NODE_TYPE

VARCHAR2 (30 CHAR)

This is the node type.

NODE_ID

NUMBER

This is the workflow node ID.

NODE_NAME

VARCHAR2 (128 CHAR)

This is the name of the workflow node.

NODE_STATUS

VARCHAR2 (30 CHAR)

  • INVALID: Indicates that the node is not valid, and cannot be run.

  • WARNING: Indicates that the node has run but with warning.

  • READY: Indicates that the node is ready to run.

  • FAILURE: Indicates that the node run has failed.

  • COMPLETE: Indicates that the node run has completed successfully.

MODEL_TYPE

VARCHAR2 (30 CHAR)

This is the model type. For example, Naive Bayes Model.

MODEL_ID

NUMBER

This is the ID of the model.

MODEL_NAME

VARCHAR2 (128 CHAR)

This is the name of the model.

MODEL_STATUS

VARCHAR2 (30 CHAR)

Same as node status.

MODEL_CREATIONDATE

VARCHAR2 (30 CHAR)

This is the date when the model is created.

ODMR_USER_WF_CLAS_TEST_RESULTS

By using the ODMR_USER_WF_CLAS_TEST_RESULTS repository view, you can query the generated classification results for a specific mining model in the last workflow run.

Table 8-15 provides more information about this .

Table 8-15 ODMR_USER_WF_CLAS_TEST_RESULTS

Column Data Type Description

PROJECT_ID

NUMBER

This is the ID of the project in which the workflow was created in.

PROJECT_NAME

VARCHAR2 (128 CHAR)

This is the name of the project in which the workflow was created in.

WORKFLOW_ID

NUMBER

This is the ID of the workflow.

WORKFLOW_NAME

VARCHAR2 (128 CHAR)

This is the name of the workflow.

NODE_TYPE

VARCHAR2 (30 CHAR)

This is the node type.

NODE_ID

NUMBER

This is the ID of the node.

NODE_NAME

VARCHAR2 (128 CHAR)

This is the name of the node.

NODE_STATUS

VARCHAR2 (10 CHAR)

  • INVALID: Indicates that the node is not ready to be run.

  • WARNING: Indicates that the node has run completely, but with warning.

  • READY: Indicates that the node is ready to be run.

  • FAILURE: Indicates that the node run has failed.

  • COMPLETE: Indicates that the node has run successfully.

MODEL_ID

NUMBER

This is the ID of the model.

MODEL_NAME

VARCHAR2 (128 CHAR)

This is the name of the model.

MODEL_STATUS

VARCHAR2 (10 CHAR)

  • WARNING: Indicates that the model has run, but with warning.

  • READY: Indicates that the model is ready to be run.

  • FAILURE: Indicates that the model run has failed.

  • COMPLETE: Indicates that the model run has completed successfully.

MODEL_CREATIONDATE

VARCHAR2 (30 CHAR)

This is the creation time of the model.

TEST_METRICS

VARCHAR2 (128 CHAR)

The test metric result table that contains Predictive Confidence, accuracy, and so on.

CONFUSION_MATRIX

VARCHAR2 (128 CHAR)

The Confusion Matrix result table.

LIFTS

DM_NESTED_CATEGORICALS

The table of DM_NESTED_CATEGORICAL, where:

  • ATTRIBUTE_NAME contains target class

  • VALUE contains lift result table

ROCS

DM_NESTED_CATEGORICALS

The table of DM_NESTED_CATEGORICAL, where:

  • ATTRIBUTE_NAME contains target class

  • VALUE contains ROC result table

ROC_AREA

DM_NESTED_NUMERICALS

The table of DM_NESTED_NUMERICAL, where:

  • ATTRIBUTE_NAME contains target class

  • VALUE contains ROC area under curve value

ODMR_USER_WF_REGR_TEST_RESULTS

You can query the generated regression results for a specific mining model in the last workflow run by using the ODMR_USER_WF_REGR_TEST_RESULTS repository view.

Table 8-16 provides more information about this view.

Table 8-16 ODMR_USER_WF_REGR_TEST_RESULTS Repository Views

Column Data Type Description

PROJECT_ID

NUMBER

This is the ID of the project under which the workflow is created.

PROJECT_NAME

VARCHAR2 (128 CHAR)

This is the name of the project under which the workflow is created.

WORKFLOW_ID

NUMBER

This is the workflow ID.

WORKFLOW_NAME

VARCHAR2 (128 CHAR)

This is the name of the workflow.

NODE_TYPE

VARCHAR2 (30 CHAR)

This is the type of the node. For example, Build node, Model node, and so on.

NODE_ID

NUMBER

This is the ID of the node.

NODE_NAME

VARCHAR2 (128 CHAR)

This is the name of the node.

NODE_STATUS

VARCHAR2 (10 CHAR)

  • INVALID: Indicates that the node is not ready to be run.

  • WARNING: Indicates that the node has run completely, but with warning.

  • READY: Indicates that the node is ready to be run.

  • FAILURE: Indicates that the node run has failed.

  • COMPLETE: Indicates that the node has run successfully.

MODEL_ID

NUMBER

This is the ID of the model.

MODEL_NAME

VARCHAR2 (128 CHAR)

This is the name of the model.

MODEL_STATUS

VARCHAR2 (10 CHAR)

  • WARNING: Indicates that the model has run, but with warning.

  • READY: Indicates that the model is ready to be run.

  • FAILURE: Indicates that the model run has failed.

  • COMPLETE: Indicates that the model run has completed successfully.

MODEL_CREATION_DATE

VARCHAR2 (30 CHAR)

This is the creation date of the model.

TEST_METRICS

VARCHAR2 (128 CHAR)

This is the test metrics result table that contains Predictive Confidence, Root Mean Square error, and so on.

RESIDUAL_PLOT

VARCHAR2 (128 CHAR)

This is the test Residual Plot table.

ODMR_USER_WF_TEST_RESULTS

You can query the combined results of the ODMR_USER_WF_CLAS_TEST_RESULTS and ODMR_USER_WF_REGR_TEST_RESULTS by using the ODMR_USER_WF_TEST_RESULTS repository view.

Table 8-17 provides more information about this view.

Table 8-17 ODMR_USER_WF_TEST_RESULTS Repository Views

Column Data Type Description

PROJECT_ID

NUMBER

The project ID of the project under which the workflow is created.

PROJECT_NAME

VARCHAR2 (128 CHAR)

The name of the project under which the workflow is created.

WORKFLOW_ID

NUMBER

This is the workflow ID.

WORKFLOW_NAME

VARCHAR2 (128 CHAR)

This is the name of the workflow.

NODE_TYPE

VARCHAR2 (30 CHAR)

This is the type of the node. For example, Build node, Model node and so on.

NODE_ID

NUMBER

This is the ID of the node.

NODE_NAME

VARCHAR2 (128 CHAR)

This is the name of the node.

NODE_STATUS

VARCHAR2 (10 CHAR)

  • INVALID: Indicates that the node is not ready to be run.

  • WARNING: Indicates that the node has run completely, but with warning.

  • READY: Indicates that the node is ready to be run.

  • FAILURE: Indicates that the node run has failed.

  • COMPLETE: Indicates that the node has run successfully.

MODEL_ID

NUMBER

This is the ID of the model.

MODEL_NAME

VARCHAR2 (128 CHAR)

This is the name of the model.

MODEL_STATUS

VARCHAR2 (10 CHAR)

  • WARNING: Indicates that the model has run, but with warning.

  • READY: Indicates that the model is ready to be run.

  • FAILURE: Indicates that the model run has failed.

  • COMPLETE: Indicates that the model run has completed successfully.

MODEL_CREATION_DATE

VARCHAR2 (30 CHAR)

This is the creation date of the model.

TEST_METRICS

VARCHAR2 (128 CHAR)

This is the test metrics result table that contains Predictive Confidence, Root Mean Square error and so on.

CONFUSION_MATRIX

VARCHAR2 (128 CHAR)

This is the test Confusion Matrix result table.

LIFTS

DM_NESTED_CATEGORICALS

Table of DM_NESTED_CATEGORICAL, where:

  • ATTRIBUTE_NAME contains target class

  • VALUE contains lift result table

ROCS

DM_NESTED_CATEGORICALS

Table of DM_NESTED_CATEGORICAL, where:

  • ATTRIBUTE_NAME contains target class

  • VALUE contains ROC result table

ROC_AREA

DM_NESTED_NUMERICALS

Table of DM_NESTED_NUMERICAL, where:

  • ATTRIBUTE_NAME contains target class

  • VALUE contains ROC area under curve value

RESIDUAL_PLOT

VARCHAR2 (128 CHAR)

This is the test Residual Plot table.

ODMR_USER_WORKFLOW_ALL_POLL

You can use the ODMR_USER_WORKFLOW_ALL_POLL view the status of Oracle Data Miner workflows, and thereby minimize the use of Oracle Scheduler views.

However, you must use Oracle Scheduler view to determine the status of workflow jobs. The ODMR_USER_WORKFLOW_ALL_POLL repository view can provide complete information about the status of failed workflows. The repository view ODMR_USER_WORKFLOW_ALL_POLL for workflow status polling contains the following repository properties:

  • POLLING_IDLE_RATE: Determines the rate at which the client will poll the database when there are no workflows detected as running.

  • POLLING_ACTIVE_RATE: Determines the rate at which the client will poll the database when there are workflows detected running.

  • POLLING_IDLE_ENABLED: Determines automatic updates to scheduled workflow jobs.

  • POLLING_COMPLETED_WINDOW: Determines the time required to include completed workfows in the polling query result.

  • PURGE_WORKFLOW_SCHEDULER_JOBS: Purges old Oracle Scheduler objects generated by the running of Data Miner workflows.

  • PURGE_WORKFLOW_EVENT_LOG: Controls how many workflow runs are preserved for each workflow in the event log. The events of the older workflow are purged to keep within the limit.

PL/SQL APIs Use Cases

The PL/SQL API use cases demonstrate how to run the PL/SQL APIs to schedule and run a Build workflow and an Apply workflow.

The use cases are:

Premise of the PL/SQL Use Cases

The PL/SQL API use case is built on two predefined workflows, which are available in the SQL Developer installation location.

Predefined Workflows

The two predefined workflow files are:
  • apply_workflow.xml: Uses the Model node to reference the model built by the build_workflow. Then it uses it for scoring.

  • build_workflow.xml: Builds a Server Vector Machine Classification model, and then stores the model details or coefficients to a table.

Location of the Demonstration Files

The workflow files apply_workflow.xml and build_workflow.xml which contain the predefined workflows are available in the SQL Developer installation location at: sqldeveloper home\dataminer\demos\workflows.

Use case to Schedule and Run Apply Workflows

This use case demonstrates how to run a lineage of the APPLY workflow, which is scheduled to run the WF_RUN_API.

To run the lineage, specify the INSUR_CUST_LTV_SAMPLE APPLY node and use the RERUN_NODE_CHILDREN run mode. The use case, as demonstrated in the example, does the following:

  • Schedules the APPLY workflow to run daily from mid night of 12/31/2014 to 12/31/2015 in EST zone

  • Executes the WF_RUN API

  • Polls the status of the workflow from the ODMR_USER_PROJECT_WORKFLOW view

  • Prints the failed nodes

Example 8-1 Schedule and Run the Apply Workflow, Poll Status and Print Node Failures

CONNECT DMUSER/DMUSER
SET SERVEROUTPUT ON 
DECLARE
    v_jobId VARCHAR2(30) := NULL;
    v_status VARCHAR2(30) := NULL;
    v_projectName VARCHAR2(30) := 'Project';
    v_workflow_name VARCHAR2(30) := 'apply_workflow';
    v_node VARCHAR2(30) := 'INSUR_CUST_LTV_SAMPLE APPLY';
    v_run_mode VARCHAR2(30) := ODMRSYS.ODMR_WORKFLOW.RERUN_NODE_CHILDREN;
    v_failure NUMBER := 0;
    v_nodes ODMRSYS.ODMR_OBJECT_NAMES := ODMRSYS.ODMR_OBJECT_NAMES();
BEGIN
    v_nodes.extend();
    v_nodes(v_nodes.count) := v_node;
    v_jobId := ODMRSYS.ODMR_WORKFLOW.WF_RUN(p_project_name => v_projectName,
                       p_workflow_name => v_workflow_name,
                       p_node_names => v_nodes,
                       p_run_mode => v_run_mode,
                       p_start_date => '31-DEC-14 12.00.00 AM AMERICA/NEW_YORK',
                       p_repeat_interval => 'FREQ= DAILY',
                       p_end_date => '31-DEC-15 12.00.00 AM AMERICA/NEW_YORK'));
    DBMS_OUTPUT.PUT_LINE('Job: '||v_jobId);
    -- wait for workflow to run to completion
    LOOP
      SELECT STATUS INTO v_status FROM ODMR_USER_PROJECT_WORKFLOW
      WHERE WORKFLOW_NAME = v_workflow_name;
      IF (v_status IN ('SCHEDULED', 'ACTIVE')) THEN
         DBMS_LOCK.SLEEP(10); -- wait for 10 secs
      ELSE
        EXIT; -- workflow run completes
      END IF;
    END LOOP;
    -- print all failed nodes (see example above)
  EXCEPTION WHEN OTHERS THEN
     DBMS_OUTPUT.PUT_LINE('Error: '||SUBSTR(DBMS_UTILITY.FORMAT_ERROR_STACK(), 1, 4000));
END;

Querying Scoring Result

You can query the scoring result after the workflow run completes successfully.

To query the scoring results:

SELECT * FROM SCORED_CUSTOMERS_TBL

The output of the query is displayed in a table, as shown in Figure 8-1.

Figure 8-1 Query Output for Scoring Results


Description of Figure 8-1 follows
Description of "Figure 8-1 Query Output for Scoring Results"

Use Case to Schedule and Run a Build Workflow

This use case demonstrates how to run the WF_RUN_API in a lineage of a workflow, poll the status and print node failure from the event log.

You can run the WF_RUN_API in a lineage of a workflow, using the run modes RERUN_NODE_ONLY or RERUN_NODE_PARENTS.

The methods to run the WF_RUN_API using the two run modes generate the same result, where all the four nodes in the lineage are run. The methods are:

  • Select all nodes in the lineage and use the RERUN_NODE_ONLY run mode.

  • Select the MODEL_COEFFICIENTS node and use the RERUN_NODE_PARENTS run mode.

The use case, as demonstrated in Example 8-2, does the following:

  • Schedules the workflow to run monthly on the last day of the month (BYMONTHDAY=-1) starting at mid night from 12/31/2014 to 12/31/2015 in EST zone.

  • Executes the WF_RUN API. In this use case, the API WF_RUN with Project Name, Workflow Name Node Name and Time Interval schedules the workflow to run.

  • Polls the status of the workflow from the ODMR_USER_PROJECT_WORKFLOW view to determine whether the workflow run is complete.

  • Prints out any node failure from the event log along with error message.

Example 8-2 Schedule and Run a Workflow, Poll Status and Print Node Failures

CONNECT DMUSER/DMUSER
SET SERVEROUTPUT ON 
DECLARE
   v_jobId VARCHAR2(30) := NULL;
   v_status VARCHAR2(30) := NULL;
   v_projectName VARCHAR2(128) := 'Project';
   v_workflow_name VARCHAR2(128) := 'build_workflow';
   v_node VARCHAR2(30) := 'MODEL_COEFFCIENTS';
   v_run_mode VARCHAR2(30) := ODMRSYS.ODMR_WORKFLOW.RERUN_NODE_PARENTS;
   v_failure NUMBER := 0;
   v_nodes ODMRSYS.ODMR_OBJECT_NAMES := ODMRSYS.ODMR_OBJECT_NAMES();
BEGIN
   v_nodes.extend();
   v_nodes(v_nodes.count) := v_node;
   v_jobId := ODMRSYS.ODMR_WORKFLOW.WF_RUN(p_project_name => v_projectName,
           p_workflow_name => v_workflow_name,
           p_node_names => v_nodes,
           p_run_mode => v_run_mode,
           p_start_date => '31-DEC-14 12.00.00 AM AMERICA/NEW_YORK',
           p_repeat_interval => 'FREQ=MONTHLY;BYMONTHDAY=-1',
           p_end_date => '31-DEC-15 12.00.00 AM AMERICA/NEW_YORK');
   DBMS_OUTPUT.PUT_LINE('Job: '||v_jobId);
   -- wait for workflow to run to completion
LOOP
   SELECT STATUS INTO v_status FROM ODMR_USER_PROJECT_WORKFLOW
   WHERE WORKFLOW_NAME = v_workflow_name;
   IF (v_status IN ('SCHEDULED', 'ACTIVE')) THEN
      DBMS_LOCK.SLEEP(10); -- wait for 10 secs
   ELSE
      EXIT; -- workflow run completes
   END IF;
END LOOP;
-- print all failed nodes from the event log
FOR wf_log IN (
   SELECT node_id, node_name, subnode_id, subnode_name, log_message, log_message_ details
   FROM ODMR_USER_WORKFLOW_LOG
   WHERE job_name=v_jobId and log_type='ERR' and log_message IS NOT NULL)
LOOP
   DBMS_OUTPUT.PUT_LINE('Node Id: '||wf_log.node_id||', '||'Node Name: '||wf_log.node_name);
   IF (wf_log.subnode_id IS NOT NULL) THEN
      DBMS_OUTPUT.PUT_LINE( 
         'Subnode Id: '||wf_log.subnode_id||', '||'Subnode Name: '||wf_log.subnode_name);
   END IF;
   DBMS_OUTPUT.PUT_LINE('Message: '||wf_log.log_message);
   v_failure := v_failure + 1;
 END LOOP;
IF (v_failure = 0) THEN
   DBMS_OUTPUT.PUT_LINE('Workflow Status: SUCCEEDED');
 ELSE
   DBMS_OUTPUT.PUT_LINE('Workflow Status: FAILURE');
 END IF;
EXCEPTION WHEN OTHERS THEN
   DBMS_OUTPUT.PUT_LINE('Error: '||SUBSTR(DBMS_UTILITY.FORMAT_ERROR_STACK(), 1, 4000));
END;

Querying the MODEL_COEFFICIENT Table

You can fetch data from the MODEL_COEFFICIENTS table by running a query.

To query data from the MODEL_COEFFICIENTS table, run the following query:

SELECT * FROM MODEL_COEFFCIENTS

The output of this query is displayed in a table, as shown in Figure 8-2.

Figure 8-2 Query Output for MODEL_COEFFICIENT Table



Querying Named Objects

After the workflow run completes successfully, you can query all named objects generated by the workflow, such as table or view in the Create Table node, models in the Build nodes, and so on.

The following query returns the model CLAS_SVM_MODEL_2 information.

SELECT * FROM USER_MINING_MODELS WHERE MODEL_NAME = 'CLAS_SVM_MODEL_2

The query result is displayed in Figure 8-3.

Figure 8-3 Query Output for Named Objects



Querying Test Results

You can query test results for Confusion Matrix and Test Metrics from ODMR_USER_WF_CLAS_TEST_RESULTS and ODMR_USER_WF_REGR_TEST_RESULTS repository views.

You can query the test results from:

  • ODMR_USER_WF_CLAS_TEST_RESULTS Repository View

  • ODMR_USER_WF_REGR_TEST_RESULTS Repository View

This section contains examples to query the following:

Example 8-3 Querying Test Metrics and Confusion Matrix Results

SELECT TEST_METRICS, CONFUSION_MATRIX FROM ODMR_USER_WF_CLAS_TEST_RESULTS WHERE WORKFLOW_NAME = 'build_workflow' AND NODE_NAME = 'Class Build'

The output of this query is shown in the figure below. The query fetches the Test Metrics and Confusion Matrix from the ODMR_USER_WF_CLAS_TEST_RESULTS.

Figure 8-4 Query Output for Test Metrics and Confusion Matrix



Example 8-4 Querying TEST_METRICS

SELECT * FROM ODMR$18_51_18_106346IFHRNMF

The output of this query is shown in the screenshot below. It queries the Test Metrics ODMR$18_51_18_106346IFHRNMF. It fetches the Metric name, the metric VARCHAR value and the metric NUM value.

Figure 8-5 Query Output for TEST METRICS


Description of Figure 8-5 follows
Description of "Figure 8-5 Query Output for TEST METRICS"

Example 8-5 Querying CONFUSION_MATRIX

SELECT * FROM ODMR$18_51_17_954530VMUXPWL

The output of this query is shown in the screenshot below. It queries the Confusion Matrix ODMR$18_51_17_954530VMUXPWL. It fetches the actual target name, and the predicted target value.

Figure 8-6 Query Output for CONFUSION_MATRIX

Description of Figure 8-6 follows
Description of "Figure 8-6 Query Output for CONFUSION_MATRIX"

Example 8-6 Querying Lift Result Table from CLAS_SVM_MODEL_2

SELECT
MODEL_NAME, a.ATTRIBUTE_NAME "target value", a.VALUE "lift result table"
FROM
ODMR_USER_WF_CLAS_TEST_RESULTS, TABLE(LIFTS) a
WHERE
WORKFLOW_NAME = 'build_workflow' AND NODE_NAME = 'Class Build' AND ATTRIBUTE_NAME='Yes'

The output of this query is shown in the screenshot below. It queries the Lift Result table from the CLAS_SVM_MODEL_2.

Figure 8-7 Query Output for Lift Result Table from CLAS_SVM_MODEL_2

Description of Figure 8-7 follows
Description of "Figure 8-7 Query Output for Lift Result Table from CLAS_SVM_MODEL_2"

Related Topics