Data mining results for logistic regression runs

A logistic regression run produces a results table and, when the run includes them, tables of covariate results and interaction results.

By default, a limited set of columns is included in each of these tables. To include more columns in the results table, click Columns above the table and, optionally, check Show All Columns.

Note:

Hover over a column heading to display a description of the column.

Results Table

The following columns are available for inclusion in the results table for a logistic regression run. For information on the columns available for an MGPS run, see Data mining results for MGPS runs or Logistic regression computations.

Column Description

<predictor variable>, <response variable>

Logistic regression runs include a drug variable (the predictor) and an event variable (the response). The results table includes a column with the name of each variable selected for the run; for example, Single Ingredient and PT.

Logistic regression runs can also include covariate variables as predictors. If the run includes covariates, click View Covariates to review results for each covariate-event pair.

N

Observed number of cases with the combination of items. You can click the value of N to display a menu from which you can drill down to view or download case information or run reports.

Note:

On the Covariates table, the value of N is not hyperlinked.

E

The expected number of cases with the combination, computed as:

(Observed # cases with predictor / Total # cases) x (Observed # cases with response / Total # cases) x Total # cases

When E < .001, the value displays in scientific notation.

RR

Relative Ratio. Observed number of cases with the combination divided by the expected number of cases with the combination (that is, N/E). This is a sampling estimate of the true Relative Ratio for the particular combination of predictor and response: the ratio that would be observed if the database were much larger but drawn from the same conceptual population of reports. RR ignores the logistic regression (LR) adjustments for covariates and other drugs.

LROR

Logistic Regression Odds Ratio. Estimated odds ratio relating occurrence of response to occurrence of predictor, adjusting for other predictors.

LR05

Lower 5% confidence limit for the logistic regression odds ratio.

LR95

Upper 95% confidence limit for the logistic regression odds ratio.

The interval from LR05 to LR95 is the 90% confidence interval.

JOB_ID

The identifier assigned by the listener to the sub-job in the run.

ID

Identifies the unique row number assigned as Oracle loads data from the run's output files.

ROW_NUM

Identifies the row number assigned in one of the run's output files. The value in this column does not have to be unique in the results table for a given run.
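The E and RR definitions above can be checked by hand with a few lines of Python. This is a minimal sketch, not the application's implementation: the function names and the counts are hypothetical, and the number of digits used in `format_expected` is an assumption (the column description states only that values below .001 display in scientific notation).

```python
def expected_count(n_predictor: int, n_response: int, n_total: int) -> float:
    """E = (cases with predictor / total) x (cases with response / total) x total."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return (n_predictor / n_total) * (n_response / n_total) * n_total


def relative_ratio(n_observed: int, expected: float) -> float:
    """RR = N / E, with no adjustment for covariates or other drugs."""
    if expected <= 0:
        raise ValueError("expected must be positive")
    return n_observed / expected


def format_expected(expected: float) -> str:
    """Scientific notation when E < .001; the digit count is an assumption."""
    return f"{expected:.3e}" if expected < 0.001 else f"{expected:.3f}"


# Hypothetical counts, for illustration only.
e = expected_count(n_predictor=200, n_response=500, n_total=100_000)
rr = relative_ratio(n_observed=12, expected=e)
print(format_expected(e), round(rr, 2))
```

LROR, LR05, and LR95 come from the fitted logistic regression model and cannot be reproduced from these counts alone.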

Covariates Table

For logistic regression runs that include computations for covariates, such as gender, age group, or report receipt year, you can view a results table for each covariate-event combination. The rows in this table reflect the scores calculated for each event and each possible value for the covariates of the run, with the exception of the value that occurs most frequently. (The value that occurs most frequently for each covariate is used as the standard for the odds ratio calculated for each of the other values.) For example, if you use Seriousness as a covariate and Seriousness has the unique values YES and NO, with NO occurring most frequently, this table only includes rows for YES in combination with each event.
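The choice of reference value can be illustrated with a short Python sketch. The function names and sample data are hypothetical. How the application breaks ties between equally frequent values is not stated here, so the tie behavior of this sketch (first value encountered wins) is an assumption.

```python
from collections import Counter


def reference_level(values):
    """Return the most frequent covariate value, used as the odds ratio standard."""
    return Counter(values).most_common(1)[0][0]


def covariate_rows(values, events):
    """List the (covariate value, event) pairs that would appear as table rows."""
    ref = reference_level(values)
    levels = sorted(set(values) - {ref})  # every value except the reference
    return [(level, event) for level in levels for event in events]


# Hypothetical Seriousness values across five reports; NO occurs most often.
seriousness = ["NO", "NO", "YES", "NO", "YES"]
rows = covariate_rows(seriousness, ["Rash", "Nausea"])
print(rows)
```

With this data, only YES is paired with each event, which matches the Seriousness example above.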

Note:

Only the event values specified as results criteria apply to the table of covariate results.

In addition to the columns shown in the results table, the following additional columns are available in the table of covariate results:

Column Description

P_ITEM

The name of the selected covariate variable, such as Gender or Seriousness.

ITEM

The value of the covariate, such as M, F, or U for a covariate of Gender, or YES or NO for a covariate of Seriousness.

Interactions Table

For logistic regression runs that include the computation of interactions, you can view a results table of scores calculated for each predictor+predictor+response (drug+drug+event, drug+covariate+event, or covariate+covariate+event).

Only the drugs and events that you selected as criteria are included in the table of interaction results. To include only scores that exceed a minimum threshold value in this table, click the Columns and Rows link above the table and use the Limit to field to select a statistic and supply the minimum value. By default, this limit is set to N > 0.0. To include rows in this table for reports in which the ITEM1 value and ITEM2 value did not occur with the event value, change this limit to a negative number, such as N > -1.0.

If the user preference Include SQL WHERE Clause for Advanced Results Selection is checked, you can also specify a SQL WHERE clause. For example, to view only rows that have a certain drug name as a value for either the first or second predictor, enter a SQL WHERE clause of ITEM1='<drug name>' OR ITEM2='<drug name>'.
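To experiment with the Limit to condition and a WHERE clause outside the application, you can reproduce their effect on a small in-memory table. The following Python sketch uses SQLite; the table name, its rows, and the helper functions are hypothetical, and combining the limit and the WHERE clause with AND is an assumption about how the application applies them. The name mapping reflects the underlying column names given in the interaction column descriptions in this section.

```python
import re
import sqlite3

# Display name -> underlying column name, per the interaction column descriptions.
UNDERLYING_NAMES = {
    "N_TOT": "INTSS", "INT_LOF": "EXCESS2", "INT_REGR": "EXCESS",
    "INT_TOT": "LROR", "INT05": "LR05", "INT95": "LR95",
    "LROR1": "PRR_A", "LROR2": "PRR_B",
}
_NAME_PATTERN = re.compile(
    r"\b(" + "|".join(map(re.escape, UNDERLYING_NAMES)) + r")\b"
)


def to_underlying(where_sql: str) -> str:
    """Rewrite display column names to underlying names in a single pass.

    Naive by design: it would also rewrite a matching name inside a quoted value.
    """
    return _NAME_PATTERN.sub(lambda m: UNDERLYING_NAMES[m.group(1)], where_sql)


# Hypothetical table and rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE interactions (ITEM1 TEXT, ITEM2 TEXT, N REAL, LROR REAL)")
conn.executemany(
    "INSERT INTO interactions VALUES (?, ?, ?, ?)",
    [
        ("DrugA", "DrugB", 4.0, 3.1),
        ("DrugB", "DrugC", 0.0, 0.9),
        ("DrugC", "DrugD", 2.0, 1.5),
    ],
)


def select_rows(limit_sql: str, where_sql: str = "") -> list:
    """Apply the limit and an optional WHERE clause (joined with AND, an assumption)."""
    sql = f"SELECT ITEM1, ITEM2 FROM interactions WHERE ({limit_sql})"
    if where_sql:
        sql += f" AND ({to_underlying(where_sql)})"
    return conn.execute(sql + " ORDER BY ITEM1").fetchall()


default_rows = select_rows("N > 0.0")    # default limit hides the N = 0 row
all_rows = select_rows("N > -1.0")       # negative limit keeps it
drug_b_rows = select_rows("N > -1.0", "ITEM1='DrugB' OR ITEM2='DrugB'")
strong_rows = select_rows("N > 0.0", "INT_TOT > 2")  # rewritten to LROR > 2
```

The sketch builds SQL by string interpolation only to mirror typing a clause into the application; do not use that pattern with untrusted input.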

The following columns are available in the table of interaction results:

Column Description

P_ITEM1

For the value that appears in the ITEM1 column, provides either the name of the covariate variable or the configuration prefix for drug (D).

ITEM1

The value of the first predictor in the interaction. This value can be a covariate (if the run included one or more covariates) or a drug.

P_ITEM2

For the value that appears in the ITEM2 column, provides the configuration prefix for drug (D) or the name of the covariate variable.

ITEM2

The value of the second predictor in the interaction. This value can be a drug or a covariate (if the run included multiple covariates).

<response variable name>

Event (response) variable selected for the run, such as PT.

N

Observed number of cases with the combination of all three values (ITEM1, ITEM2, response). You can click the value of N to display a menu from which you can drill down to view or download case information or run reports.

N_TOT

Observed number of cases in the data that have both ITEM1 and ITEM2.

To specify a SQL WHERE clause that includes this column, use the underlying column name of INTSS.

E

The predicted value of N based on LR without interactions: the sum, over all N_TOT reports that have both ITEM1 and ITEM2, of the predicted probability of the response using the LR prediction formula.

When E < .001, the value displays in scientific notation.

INT_LOF

Empirical Bayes shrinkage estimate of additional interaction due to lack of fit to the LR model, based on the ratio N/E.

To specify a SQL WHERE clause that includes this column, use the underlying column name of EXCESS2.

INT_REGR

Predicted ratio of the probability of the response given both ITEMs to the probability given the worst ITEM, assuming the LR model is correct.

To specify a SQL WHERE clause that includes this column, use the underlying column name of EXCESS.

INT_TOT

Total interaction, computed as INT_REGR * INT_LOF.

To specify a SQL WHERE clause that includes this column, use the underlying column name of LROR.

INT05

Lower 5% confidence limit for INT_TOT.

To specify a SQL WHERE clause that includes this column, use the underlying column name of LR05.

INT95

Upper 95% confidence limit for INT_TOT.

To specify a SQL WHERE clause that includes this column, use the underlying column name of LR95.

LROR1

Logistic regression odds ratio relating the response event to ITEM1.

To specify a SQL WHERE clause that includes this column, use the underlying column name of PRR_A.

LROR2

Logistic regression odds ratio relating the response event to ITEM2.

To specify a SQL WHERE clause that includes this column, use the underlying column name of PRR_B.

JOB_ID

The identifier assigned by the listener to the sub-job in the run.

ID

Identifies the unique row number assigned as Oracle loads data from the output files of the run.

ROW_NUM

Identifies the row number assigned in one of the output files of the run. The value in this column does not have to be unique in the results table for a given run.
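The relationship among N_TOT, E, and INT_TOT in the interaction results can be sketched in Python as follows. This is an illustration under stated assumptions, not the application's computation: the coefficients and reports are hypothetical, the prediction uses the standard logistic formula, and the empirical Bayes shrinkage behind INT_LOF is not reproduced (only the raw ratio N/E that it is based on).

```python
import math


def predicted_probability(intercept, coefficients, indicators):
    """Standard logistic prediction: 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    linear = intercept + sum(b * x for b, x in zip(coefficients, indicators))
    return 1.0 / (1.0 + math.exp(-linear))


def expected_n(intercept, coefficients, reports):
    """E: sum of predicted probabilities over the N_TOT reports with both items."""
    return sum(predicted_probability(intercept, coefficients, r) for r in reports)


def total_interaction(int_regr, int_lof):
    """INT_TOT = INT_REGR * INT_LOF."""
    return int_regr * int_lof


# Hypothetical model without interaction terms: ITEM1, ITEM2, one other predictor.
intercept = -3.0
coefficients = [0.8, 0.5, 0.2]

# The reports that contain both ITEM1 and ITEM2 (so N_TOT = 3).
reports = [[1, 1, 0], [1, 1, 1], [1, 1, 0]]
n_tot = len(reports)

n_observed = 2  # hypothetical N: reports that also contain the response
e = expected_n(intercept, coefficients, reports)
raw_ratio = n_observed / e  # INT_LOF is a shrinkage estimate based on this ratio

print(n_tot, round(e, 4), round(raw_ratio, 2))
print(total_interaction(int_regr=1.5, int_lof=2.0))
```

Because each predicted probability lies between 0 and 1, E always falls between 0 and N_TOT.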