Step 1: Select the run type and data configuration

To create a data mining run, you start by selecting the type of run and the data configuration to use. After these selections, you specify a series of parameters that tell the application how to restrict or combine data, and which types of statistics to compute. You then submit the run and it is executed in batch mode. When the run is complete, you can view the results in tabular or graphical format.

If you have superuser permissions, you can create a run definition file to execute MGPS or logistic regression data mining runs from outside of the application. To do so, use the Create Definition File link. For more information about executing data mining runs externally, contact Oracle.

Note: If you have exceeded the disk space quota for your username, an error message appears when you attempt to create a run. To proceed, delete one or more of your existing runs or contact your system administrator to increase your quota.

1.         In the left navigation pane, hover over the Data Analysis icon, then click Data Mining Runs.

2.         At the top left of the Data Mining Runs home page, click Create Run.

3.         On the Create Data Mining Run page, select the type of run to create:

·      MGPS data mining run—The analytical core of Empirica Signal is a high-performance implementation of the MGPS (Multi-Item Gamma Poisson Shrinker) algorithm. MGPS is based on the metaphor of the market basket problem, in which a database of transactions (adverse event reports) is mined for the occurrence of interesting (unexpectedly frequent) itemsets. For more information about MGPS runs, see MGPS computations.

·      Logistic regression data mining run—Logistic regression is a statistical tool for modeling how the probability of a response depends on the presence of multiple predictors, or risk factors. In Empirica Signal, the predictors are drugs and, optionally, covariates such as report year, gender, or age group, and the responses are adverse events. For more information about logistic regression, see Logistic regression computations.

4.         If you selected Logistic regression (LR) data mining run, select the LR Type:

·      Extended—Use the best alpha computational model.

·      Standard—Use 0.5 as the alpha value for every response.

5.         In the Configuration drop-down, enter a value or click Browse to select from a list of data configurations.

Tip: For a logistic regression data mining run, we recommend a data configuration that includes concomitant drugs, such as an AERS S+C configuration. For more information, see Logistic regression computations.

6.         To limit the run to only those cases that meet specified criteria, check Define database restriction.

7.         Click Next.

·      If you selected a data configuration that supports timestamped data, the Select As Of Date page appears so that you can specify the As Of date.

·      If you checked Define database restriction, follow the steps in Define a database restriction.

·      Then select variables.

Note: Once you have selected a run type and continued to the next page, you cannot change the run type.
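
Example: To make the market basket metaphor behind MGPS runs (step 3) more concrete, the following minimal Python sketch counts how often each drug-event pair occurs in a handful of reports and compares that count with the count expected if drugs and events were reported independently. It is an illustration only, not the MGPS algorithm and not Empirica Signal code: it computes a raw observed/expected ratio and omits the shrinkage step that gives MGPS its name. The report data, the drug and event names, and the function name relative_reporting_ratios are all hypothetical.

```python
from collections import Counter
from itertools import product

# Hypothetical toy data: each report lists the drugs and events it mentions.
reports = [
    {"drugs": {"DrugA"}, "events": {"Nausea"}},
    {"drugs": {"DrugA"}, "events": {"Nausea", "Rash"}},
    {"drugs": {"DrugA", "DrugB"}, "events": {"Nausea"}},
    {"drugs": {"DrugB"}, "events": {"Rash"}},
    {"drugs": {"DrugB"}, "events": {"Headache"}},
    {"drugs": {"DrugC"}, "events": {"Headache"}},
]


def relative_reporting_ratios(reports):
    """Return observed/expected ratios for every drug-event pair seen.

    Assumption for this sketch: the expected count of a pair is
    (reports with the drug) * (reports with the event) / (all reports),
    i.e. what independence of drug and event would predict.
    """
    n_total = len(reports)
    drug_counts = Counter()
    event_counts = Counter()
    pair_counts = Counter()

    for report in reports:
        drug_counts.update(report["drugs"])
        event_counts.update(report["events"])
        # Every drug on a report is paired with every event on that report.
        pair_counts.update(product(report["drugs"], report["events"]))

    ratios = {}
    for (drug, event), observed in pair_counts.items():
        expected = drug_counts[drug] * event_counts[event] / n_total
        ratios[(drug, event)] = observed / expected
    return ratios


if __name__ == "__main__":
    ratios = relative_reporting_ratios(reports)
    # List the most "unexpectedly frequent" pairs first.
    for pair, ratio in sorted(ratios.items(), key=lambda kv: -kv[1]):
        print(pair, round(ratio, 2))
```

In this toy data the highest ratio belongs to a pair that appears on a single report, which shows why raw ratios are unstable for small counts. For how the application actually computes its statistics, see MGPS computations.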