After naming a data mining run that you are creating, you confirm run parameters and submit the run.
1. On the Confirm Run Parameters page, review the parameters chosen for your run to make sure that they are correct. Different parameters display for MGPS runs and LR runs.
2. To change any parameters, click Back until you are on the appropriate page and make the necessary changes. Then click Next until you are back on the Confirm Run Parameters page.
3. When you are satisfied with the run parameters, click Submit.
The application notes that the run has been submitted.
4. Click Continue.
The run status appears on the Data Mining Runs page. To monitor the progress of a run, you can view jobs for the run.
All data mining runs are batch jobs, which continue to run on the Empirica Signal server even if you log out of the application. The next time you log in to the Empirica Signal application, you can check on the Data Mining Runs tab to determine if your run has completed. If you selected the Email me when complete run option, the application notifies you by email when your run completes.
Field | Description |
Type | MGPS. |
Name | Name supplied for the run. |
Description | Description supplied for the run. |
Project | Name of the project to which the run is assigned. |
Configuration | Name of the data configuration used for the run. |
Configuration description | Description of the data configuration used for the run. |
As of date | As Of date for the run, if the run is for timestamped data. |
Database restriction | Database restriction, if any, associated with the run. |
Item variables | Names of the item variables to be used in the run. |
Drug Hierarchy | Name and version of the drug hierarchy used by this run, if the data configuration specifies a drug hierarchy. |
Event Hierarchy | Name and version of the event hierarchy used by this run, if the data configuration specifies an event hierarchy. |
Custom terms | Custom terms, if any, specified for the run. |
Stratification variables | Stratification variables, if any, to be used for the run. |
Subsets | Subset variable, if any, as well as whether the subsets are cumulative, the order of subsets, and the subset labels and values. |
Highest dimension | The maximum number of ways in which items are combined. See Specifying data mining parameters. |
Minimum count | Minimum number of cases in which a combination of items must occur for the combination to be included in the run's MGPS computations. See Specifying data mining parameters. |
Calculate PRR | Whether the run includes PRR computations. |
Calculate ROR | Whether the run includes ROR computations. |
Base counts on cases | For a run that includes PRR and ROR computations, indicates whether counts are based on cases rather than drug-event combinations. |
Use "all drugs" comparator | For a run that includes PRR computations, indicates whether the drug of interest is included in the comparator set. |
Apply Yates correction | For a run that includes PRR computations, indicates whether the Yates correction is applied. |
Stratify PRR and ROR | For a run that includes PRR and ROR computations, indicates whether the PRR and ROR computations are stratified. |
Include IC | Whether the run includes Information Component computations. |
Include RGPS | Whether the run includes RGPS computations. |
Calculate RGPS interactions | Whether the run includes Drug+Drug RGPS interaction scores. |
Minimum interaction count | Minimum number of times that a drug must appear in Drug+Event reports for the application to calculate RGPS interaction estimates for the drug. |
Fill in hierarchy values | Whether the run option to use hierarchy information was checked. |
Limit results to | Limitations, if any, on which results are kept, based on statistical thresholds or specified values of item variables. See Specifying data mining parameters. |
Exclude single itemtypes | Whether the run excludes combinations of items of the same type. See Specifying data mining parameters. |
Fit separate distributions | The run's setting for the advanced parameter to fit separate distributions for the different item type combinations. |
Save intermediate files | Whether intermediate processing files for the run are saved. See Defining data mining run options. |
Source database | Information about the source data (from the source description table). |
Scheduled to run | Date and time at which the run is scheduled to run. See Defining data mining run options. |
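For readers who want to see what the Calculate PRR, Calculate ROR, Use "all drugs" comparator, and Apply Yates correction settings refer to, the following is a minimal sketch of the textbook 2x2 disproportionality formulas. It is not the Empirica Signal implementation: the `TwoByTwo` class and all function names are hypothetical, and the handling of the "all drugs" comparator (adding the drug of interest's counts into the comparator row) is an assumption based on the field description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts for one drug-event pair (hypothetical container for this sketch)."""
    a: int  # drug of interest, event of interest
    b: int  # drug of interest, all other events
    c: int  # comparator drugs, event of interest
    d: int  # comparator drugs, all other events


def with_all_drugs_comparator(t: TwoByTwo) -> TwoByTwo:
    """Assumed meaning of the "all drugs" option: the comparator row also
    counts reports for the drug of interest."""
    return TwoByTwo(t.a, t.b, t.a + t.c, t.b + t.d)


def prr(t: TwoByTwo) -> float:
    """Proportional reporting ratio: event proportion for the drug divided by
    the event proportion in the comparator."""
    return (t.a / (t.a + t.b)) / (t.c / (t.c + t.d))


def ror(t: TwoByTwo) -> float:
    """Reporting odds ratio: odds of the event for the drug divided by the
    odds of the event in the comparator."""
    return (t.a * t.d) / (t.b * t.c)


def chi_square(t: TwoByTwo, yates: bool = True) -> float:
    """Pearson chi-square for the 2x2 table, optionally with the Yates
    continuity correction (subtract N/2 from |ad - bc|, floored at zero)."""
    n = t.a + t.b + t.c + t.d
    diff = abs(t.a * t.d - t.b * t.c)
    if yates:
        diff = max(diff - n / 2, 0.0)
    denom = (t.a + t.b) * (t.c + t.d) * (t.a + t.c) * (t.b + t.d)
    return n * diff ** 2 / denom


# Illustrative counts only; they do not come from any real data set.
table = TwoByTwo(a=20, b=80, c=100, d=9800)
print(prr(table), ror(table), chi_square(table))
print(prr(with_all_drugs_comparator(table)))
```

Including the drug of interest in the comparator pulls the PRR toward 1, and the Yates correction always lowers the chi-square value, so both settings make a given drug-event pair look less extreme than it would otherwise.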
Field | Description |
Type | LR. For runs completed prior to the installation of Empirica Signal version 7.1, LR (Legacy) appears. |
Name | Name supplied for the run. |
Description | Description supplied for the run. |
Project | Name of the project to which the run is assigned. |
LR type | The algorithm type selected for the run: standard or extended. See Logistic regression computations. For runs completed prior to the installation of Empirica Signal version 7.1, an Extended logistic regression field appears instead, with a Yes or No value. |
Configuration | Name of the data configuration used for the run. |
Configuration description | Description of the data configuration used for the run. |
As of date | As Of date for the run, if the run is for timestamped data. |
Database restriction | Database restriction, if any, associated with the run. |
Item variables | Names of the run's selected event and drug variables. |
Custom terms | Custom terms, if any, specified for the run. |
Covariates | Variables, if any, selected as covariates for the run. |
Drug values | Explicitly specified values of the drug variable. These drugs are included in the run even if they do not meet the minimum number of times a drug must occur in combination with specified events. |
Event values | Values of the event variable used in the run. |
Minimum count | Minimum number of cases in which a drug must occur in combination with specified events to be included in the run (except for drugs specifically selected as Drug values). See Selecting drugs for logistic regression. |
Number of events | Number of event values specified. |
Save intermediate files | Whether intermediate processing files for the run are saved. See Defining data mining run options. |
Run interactions | Whether the run calculates statistics for two predictors (such as Drug+Drug or Drug+Covariate) and a response. |
Save coefficients | Whether the lr_coefficients.log file produced for the run includes the coefficient and standard deviation values calculated for the run. |
Source database | Information about the source data (from the source description table). |
Scheduled to run | Date and time at which the run is scheduled to run. See Defining data mining run options. |
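The interaction between the Minimum count and Drug values fields for LR runs can be sketched as a simple selection rule. This is only an illustration of the rule as described in the table above, not the application's code: the function name, its parameters, and the sample counts are hypothetical, and the sketch assumes that explicitly specified drugs are always kept, whatever their counts.

```python
def select_drugs(case_counts: dict[str, int],
                 minimum_count: int,
                 drug_values: set[str]) -> list[str]:
    """Return the drugs that would enter the run under the described rule.

    case_counts maps each drug to the number of cases in which it occurs in
    combination with the specified events (hypothetical input shape).
    """
    # Drugs that meet the Minimum count threshold.
    eligible = {drug for drug, n in case_counts.items() if n >= minimum_count}
    # Explicitly specified Drug values bypass the threshold (assumption).
    return sorted(eligible | drug_values)


# Illustrative counts only.
counts = {"DrugA": 120, "DrugB": 3, "DrugC": 45}
print(select_drugs(counts, minimum_count=10, drug_values={"DrugB"}))
print(select_drugs(counts, minimum_count=10, drug_values=set()))
```

In the first call, DrugB is kept despite occurring in only 3 cases because it was named explicitly; in the second call it is dropped by the threshold.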