Step 7: Submit the data mining run

After you name the data mining run that you are creating, confirm the run parameters and submit the run.

  1. On the Confirm Run Parameters page, review the parameters chosen for your run to make sure that they are correct. The parameters displayed differ for MGPS runs and logistic regression runs.
  2. To change any parameters, click Back until you are on the appropriate page and make the necessary changes. Then click Next until you are back on the Confirm Run Parameters page.
  3. When you are satisfied with the run parameters, click Submit.
  4. Click Continue.

    The run status appears on the Data Mining Runs page. To monitor the progress of a run, you can view jobs for the run.

    All data mining runs are batch jobs, which continue to run on the Oracle Empirica Signal server even if you log out of the application. The next time you log in to the Oracle Empirica Signal application, you can check on the Data Mining Runs home page to determine if your run has completed. If you selected the Email me when complete run option, the application notifies you by email when your run completes.

MGPS run parameters

Field Description

Type

MGPS.

Name

Name supplied for the run.

Description

Description supplied for the run.

Project

Name of the project to which the run is assigned.

Configuration

Name of the data configuration used for the run.

Configuration description

Description of the data configuration used for the run.

As of date

As Of date for the run, if the run is for timestamped data.

Database restriction

Database restriction, if any, associated with the run.

Item variables

Names of the item variables to be used in the run.

Drug Hierarchy

Name and version of the drug hierarchy used by this run if the data configuration specifies a drug hierarchy.

Event Hierarchy

Name and version of the event hierarchy used by this run if the data configuration specifies an event hierarchy.

Custom terms

Custom terms, if any, specified for the run.

Stratification variables

Stratification variables, if any, to be used for the run.

Subsets

Subset variable, if any, as well as whether the subsets are cumulative, the order of subsets, and the subset labels and values.

Highest dimension

The maximum number of ways in which items are combined. See Specify data mining parameters.

Minimum count

Minimum number of cases in which a combination of items must occur in order for the combination to be included in the run's MGPS computations. See Specify data mining parameters.

Calculate PRR

Whether the run includes PRR computations.

Calculate ROR

Whether the run includes ROR computations.

Base counts on cases

For a run that includes PRR and ROR computations, indicates if counts are based on cases rather than drug-event combinations.

Use "all drugs" comparator

For a run that includes PRR computations, indicates whether the drug of interest is included in the comparator set.

Apply Yates correction

For a run that includes PRR computations, indicates whether the Yates correction is applied.

Stratify PRR and ROR

For a run that includes PRR and ROR computations, indicates whether the PRR or ROR computations are stratified.

Include IC

Whether the run includes Information Component computations.

Include RGPS

Whether the run includes RGPS computations.

Calculate RGPS interactions

Whether the run includes Drug+Drug RGPS interaction scores.

Minimum interaction count

Minimum number of times that a drug must appear in Drug+Event reports for the application to calculate RGPS interaction estimates for the drug.

Fill in hierarchy values

Whether the run option to use hierarchy information was checked.

Limit results to

Limitations, if any, on which results will be kept based on statistical thresholds or specified values of item variables. See Specify data mining parameters.

Exclude single itemtypes

Whether the run excludes combinations of items of the same type. See Specify data mining parameters.

Fit separate distributions

Indicates the run's setting for the advanced parameter to fit separate distributions for the different item type combinations.

Save intermediate files

Whether intermediate processing files for the run are saved. See Define data mining run options.

Source database

Information about the source data (from the source description table).

Scheduled to run

Date and time at which the run is scheduled to be run. See Define data mining run options.
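
The PRR, ROR, "all drugs" comparator, and Yates correction settings above can be illustrated with a minimal Python sketch based on the textbook 2x2 disproportionality formulas. This is not the application's own implementation, which this page does not describe. The TwoByTwo class, the function names, and the treatment of the comparator option are assumptions made for illustration only, and cells with zero counts are not handled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Hypothetical 2x2 table of counts for one drug-event pair."""
    a: int  # drug of interest, event of interest
    b: int  # drug of interest, all other events
    c: int  # all other drugs, event of interest
    d: int  # all other drugs, all other events


def prr(t: TwoByTwo, all_drugs_comparator: bool = False) -> float:
    """Proportional reporting ratio (textbook definition)."""
    rate_drug = t.a / (t.a + t.b)
    if all_drugs_comparator:
        # Assumption: the comparator set includes the drug of interest.
        rate_comparator = (t.a + t.c) / (t.a + t.b + t.c + t.d)
    else:
        # Comparator set is every drug except the drug of interest.
        rate_comparator = t.c / (t.c + t.d)
    return rate_drug / rate_comparator


def ror(t: TwoByTwo) -> float:
    """Reporting odds ratio (textbook definition)."""
    return (t.a * t.d) / (t.b * t.c)


def chi_square(t: TwoByTwo, yates: bool = True) -> float:
    """Pearson chi-square for the 2x2 table, optionally Yates-corrected."""
    n = t.a + t.b + t.c + t.d
    diff = abs(t.a * t.d - t.b * t.c)
    if yates:
        # Yates continuity correction shrinks the difference by n/2.
        diff = max(diff - n / 2, 0.0)
    denom = (t.a + t.b) * (t.c + t.d) * (t.a + t.c) * (t.b + t.d)
    return n * diff ** 2 / denom


if __name__ == "__main__":
    table = TwoByTwo(a=20, b=80, c=100, d=9800)  # made-up counts
    print("PRR:", prr(table))
    print("PRR, all drugs comparator:", prr(table, all_drugs_comparator=True))
    print("ROR:", ror(table))
    print("Chi-square with Yates:", chi_square(table, yates=True))
```

In this sketch, including the drug of interest in the comparator pulls the PRR toward 1, and the Yates correction always yields a chi-square value no larger than the uncorrected one.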

Logistic regression run parameters

Field Description

Type

LR. For runs completed prior to the installation of Oracle Empirica Signal version 7.1, LR (Legacy) appears.

Name

Name supplied for the run.

Description

Description supplied for the run.

Project

The name of the project to which the run is assigned.

LR type

Indicates the algorithm type selected for the run: standard or extended. See Logistic regression computations.

For runs completed prior to the installation of Oracle Empirica Signal version 7.1, an Extended logistic regression field appears instead, with a Yes or No value.

Configuration

Name of the data configuration used for the run.

Configuration description

Description of the configuration used for the run.

As of date

As Of date for the run, if the run is for timestamped data.

Database restriction

Database restriction, if any, associated with the run.

Item variables

Names of the run's selected event and drug variables.

Custom terms

Custom terms, if any, specified for the run.

Covariates

Variables, if any, selected as covariates for the run.

Drug values

Explicitly specified values of the drug variable included in the run, even if they do not meet the minimum number of times a drug must occur in combination with specified events.

Event values

Values of the event variable used in the run.

Minimum count

Minimum number of cases in which a drug must occur in combination with specified events in order to be included in the run (except for drugs specifically selected as Drug values). See Select drugs for logistic regression.

Number of events

Number of event values specified.

Save intermediate files

Indicates whether intermediate processing files for the run are saved. See Define data mining run options.

Run interactions

Indicates whether the run calculates statistics for two predictors (such as Drug+Drug or Drug+Covariate) and a response.

Save coefficients

Indicates whether the lr_coefficients.log file produced for the run includes the coefficient and standard deviation values calculated for the run.

Source database

Information about the source data (from the source description table).

Scheduled to run

Date and time at which the run is scheduled to be run. See Define data mining run options.
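
The relationship between the Minimum count and Drug values fields described above can be sketched in a few lines of Python. This is an illustration of the rule as stated in the table, not the application's code. The select_drugs function, its parameters, and the sample counts are hypothetical.

```python
def select_drugs(
    counts: dict[str, int],
    min_count: int,
    drug_values: set[str],
) -> list[str]:
    """Return the drugs that would be included in a logistic regression run.

    counts maps each drug to the number of cases in which it occurs in
    combination with the specified events (hypothetical input).
    """
    # Drugs that meet the minimum count on their own.
    qualifying = {drug for drug, n in counts.items() if n >= min_count}
    # Explicitly specified Drug values are included regardless of count.
    return sorted(qualifying | drug_values)


if __name__ == "__main__":
    sample_counts = {"DrugA": 150, "DrugB": 40, "DrugC": 5}  # made-up data
    print(select_drugs(sample_counts, min_count=100, drug_values={"DrugC"}))
```

With these made-up inputs, DrugA is included because it meets the minimum count, DrugC is included because it was explicitly selected, and DrugB is excluded.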