Subset variables
When you use a subset variable for an MGPS data mining run, Oracle Empirica Signal divides source data into separate backgrounds, one for each value of the subset variable, and performs a separate run for each of those different backgrounds.
You use subsets to study how the strength of a combination of variables evolved temporally. You can also use subsets to study the strength of variable combinations for age differences, gender differences, and so on. For example, to review separate signal scores for males and for females, you subset the data mining run by gender. Two separate sets of statistics result, one set computed against the background of cases involving males and the other against the background of cases with females.
You can select only one subset variable for a MGPS data mining run. That variable must have only one value for each case in the database. For example, indication, route, or dose cannot be used directly as subset variables because any case with more than one drug has multiple values for these variables, one for each reported drug. After selecting a subset variable, choose which values of that variable to use for subsetting. The application generates a set of results for each subset.
Note:
You cannot select the same variable as both an item variable and a subset variable.Example
The database contains information on the age of patients reporting adverse reactions. The age has been categorized as Pediatric, Adult, and Elderly, and stored in a variable named AgeGroup. To compare associations of variables for different age categories, you can select AgeGroup as the subset variable (assuming that AgeGroup has been made available as a subset variable by the configuration).
With subsetting, the application computes expected counts, Relative Ratio, and EBGM for each subset separately:
All Cases
(Pediatric, Adult, Elderly)Cases for Pediatric
Actual counts and expected countsCases for Pediatric
Relative Ration and EBGM
Cases for Adult
Actual counts and expected countsCases for Adult
Relative Ration and EBGM
Cases for Elderly
Actual counts and expected countsCases for Elderly
Relative Ration and EBGM

For a data mining run that uses cumulative subsets, there is one set of results for each subset and the computations to determine results for each subset include data from all previous subsets. A previous subset may or may not be previous chronologically. For more information, see Define subset values. Typically, a variable used for cumulative subsetting represents a time period.
For example, suppose that you use Report Year as the subset variable and the possible report years are 2002 through 2004:

All Cases
(2002, 2003, 2004)Cases received in 2002
Actual counts and expected countsCases received in 2002
Relative Ratio and EBGM
Cases received in 2002-2003
Actual counts and expected countsCases received in 2002-2003
Relative Ratio and EBGM
Cases received in 2002-2004
Actual counts and expected countsCases received in 2002-2004
Relative Ratio and EBGM
Parent topic: Define subsets for an MGPS run