Step 3: Select the variables
A variable is a data item in the source data referenced by the configuration that you selected for the data mining run.
- An MGPS data mining run includes a drug (or vaccine) variable and an event (or symptom) variable.
- For logistic regression runs, you select the drug (predictor) and event (response) variables. You can also select one or more variables as covariates (additional predictors).
- In the left navigation pane, hover on the Data Analysis icon, then click Data Mining Runs.
- At the top left of the Data Mining Runs home page, click Create Run.
- On the Create Data Mining Run page, select the type of run to create and click Next.
- (For timestamped data) On the Select "As Of" Date page, choose the latest date and time or select Other to choose a date and time. Then click Next.
For an MGPS data mining run:
- On the Select Variables page, select at least one Item Variable by clicking its name. 
                        Tip: To study drug-event combinations, select two or more items that are listed as Drug or Event variables.
-  If you want to use stratification, select one or more Stratification variables.
                        - You cannot select the same variable as both an item variable and a stratification variable.
- A warning message appears if the selected stratification variables have more than 500 values in total. Too many stratification values can reduce the sensitivity of the EBGM and EB05 statistics. To avoid reduced sensitivity, the number of cases divided by the number of unique stratification values should be at least 100. For example, if there are 5,000 cases in the source data, there should be no more than 50 stratification values.
- To define custom terms, follow the steps in Define custom terms.
- To define subsets, check Define subsets.
- Click Next. 
                        If you didn't choose to define custom terms or subsets, the Data Mining Parameters page appears and you can go on to Step 4 for MGPS runs: Specify data mining parameters. 
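The stratification guidance in the steps above (at least 100 cases per unique stratification value, a warning above 500 values, and no variable used as both an item and a stratification variable) can be checked with a little arithmetic before you create the run. The following is a minimal sketch only; the function names, the parameter names, and the idea of running such a check outside the application are assumptions for illustration and are not part of the product.

```python
# Hypothetical pre-check of a variable selection; not part of the application.
MIN_CASES_PER_STRATUM = 100  # guideline: cases / unique stratification values >= 100
WARNING_THRESHOLD = 500      # a warning appears above this many stratification values


def max_recommended_strata(n_cases: int) -> int:
    """Largest number of stratification values that keeps the ratio at 100 or more."""
    return n_cases // MIN_CASES_PER_STRATUM


def check_selection(item_vars, strat_vars, n_cases, n_strata):
    """Return a list of problems found; an empty list means none were found."""
    problems = []

    # A variable cannot be both an item variable and a stratification variable.
    overlap = sorted(set(item_vars) & set(strat_vars))
    if overlap:
        problems.append("item and stratification variable: " + ", ".join(overlap))

    # More than 500 stratification values triggers the warning described above.
    if n_strata > WARNING_THRESHOLD:
        problems.append(f"{n_strata} stratification values exceeds {WARNING_THRESHOLD}")

    # Fewer than 100 cases per stratification value risks reduced sensitivity.
    if n_strata > 0 and n_cases / n_strata < MIN_CASES_PER_STRATUM:
        problems.append(
            f"{n_cases} cases / {n_strata} values is below {MIN_CASES_PER_STRATUM}; "
            f"use at most {max_recommended_strata(n_cases)} values"
        )
    return problems


if __name__ == "__main__":
    # The example from the text: 5,000 cases allow at most 50 stratification values.
    print(max_recommended_strata(5000))
    print(check_selection(["Drug", "Event"], ["Gender"], n_cases=5000, n_strata=50))
```

Here `n_strata` is the number of unique stratification values as the application would count them; how that count is derived from several stratification variables is not specified in this topic, so the sketch takes it as an input rather than computing it.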
For a logistic regression data mining run:
- On the Select Variables page, click one variable in the Event variable list and one variable in the Drug variable list. 
                        Tip: To treat combination drugs as if the subject had taken separate drugs, select a variable, such as Single Ingredient, as the drug variable.
- To include more variables, select them from the Additional covariates list.
                        Tip: Typically, a logistic regression run includes one or more additional covariates as predictors. Otherwise, the logistic regression can be subject to confounding due to potentially highly associated covariates being left out. Recommended covariates for an AERS-based configuration include the AgeGroup4, Gender, and Subset_Year variables.
- (Optional) To set up a custom term for this data mining run, check Define custom terms.
- Click Next.
- If you selected Define custom terms, follow the steps in Define custom terms.
- Click Next to go to Step 4 for logistic regression runs: Select the events and specify drugs.
Parent topic: Create a Data Mining Run