1.2.3.3 Dashboard Indexes

Indexes aggregate Rule Results, though it is also possible to aggregate indexes hierarchically to create an index of indexes. For example, a data quality index could be constructed for each of a number of source systems, or each of a number of types of data (customer, product etc.). An overall data quality index could then be constructed as an aggregation of these indexes.

Indexes are always configured in Dashboard Administration.

Index Calculation

The index value means little in isolation. However, as the score is calculated from the results of a number of executions of a process or processes (over time), trend analysis will allow the business user to monitor whether the index has gone up or down. This is analogous to monitoring the FTSE100 index.

A higher index value represents a higher data quality score. By default, a 'perfect' DQ index score is 1000.

Index of Rule Results

Where an index is made up of a number of Rule Results, it is calculated as a weighted average across the contributing results.

For example, a Customer Data DQ index may be made up of the following Rule Results and Weightings:

Table 1-8 Rule Results and Weightings

Contributing Rule Weighting

Validate email address

12.5%

Validate address

25%

Title/gender mismatches

37.5%

Validate name

25%

In this configuration, the Validate address and Validate name rules have the default weighting of 25% (a quarter of the overall weight across four rules), but the administrator has specified different weightings for the other rules – the Validate email address rule is interpreted as less important, and the Title/Gender mismatch as more important.

The actual index score is then calculated as a weighted average across internally calculated index scores for each contributing rule.

For each rule, an index score out of 1000 (or the configured base perfect score) is calculated as follows, where 10 points are awarded for a pass, 5 points for a warning, and no points are awarded for an alert:

(((# of passes * 10) + (# of warnings * 5)) / (# of checks *10)) * 1000

For example, if the results of the contributing rules are as follows:

Table 1-9 Results of Contributing Rules

Rule Checks Passes Warnings Alerts

Validate email address

1000

800 (80%)

100 (10.0%)

100 (10.0%)

Validate address

1000

800 (80%)

0 (0%)

200 (20.0%)

Title/gender mismatches

1000

800 (80%)

0 (0%)

200 (20.0%)

Validate name

1000

800 (80%)

0 (0%)

200 (20.0%)

The index scores of each contributing rule will be as shown below:

Table 1-10 Index Scores

Rule Index Score Calculation Index Score

Validate email address

800 passes * 10 points = 8000 +

100 warnings * 5 points = 500

Total = 8500

1000 checks * 10 = 10000

8500/10000 = 0.85 * 1000 = 850

850

Validate address

800 passes * 10 points = 8000+

0 warnings * 5 points = 0

Total = 8000

1000 checks * 10 = 10000

8000/10000 = 0.8 * 1000 = 800

800

Title/gender mismatches

800 passes * 10 points = 8000+

0 warnings * 5 points = 0

Total = 8000

1000 checks * 10 = 10000

8000/10000 = 0.8 * 1000 = 800

800

Validate name

800 passes * 10 points = 8000+

0 warnings * 5 points = 0

Total = 8000

1000 checks * 10 = 10000

8000/10000 = 0.8 * 1000 = 800

800

The overall index score is then calculated using the weightings, as follows:

Validate email address score (850) * Validate email address weight (0.125) = 106.25 +
Validate address score (800) * Validate address weight (0.25) = 200 +
Title/gender mismatch score (800) * Title/gender mismatch weight (0.375) = 300 +
Validate name score (800) * Validate name weight (0.25) = 200

The total Customer Data DQ index score is 806.25, and is rounded up to 806.3 for display purposes.

Index of Indexes

If an index is created to aggregate other indexes, the index is calculated simply as a weighted average of the contributing indexes. For example, the user might set up an index across a number of other indexes as follows:

Table 1-11 Contributing Indexes

Contributing Index Weighting

Customer data index

50%

Contact data index

25%

Order data index

25%

If the index values of each indexes are as follows:

Table 1-12 Weighted Average of Contributing Indexes

Contributing Index Index Score

Customer data index

825.0

Contact data index

756.8

Order data index

928.2

The index would be calculated as follows:

Customer data index (825) * Customer data index weight (0.50) = 412.5 +
Contact data index (756.8) * Contact data index weight (0.25) = 189.2 +
Order data index (928.2) * Order data index weight (0.25) = 232.5

The overall data quality index would have a value of 834.2.

Indexes of Staggered Audit Results

Indexes may aggregate results from a number of processes. Normally, it is expected that this form of aggregation will be used when the processes are executed at the same intervals. However, this cannot be guaranteed. In some cases, the processes contributing to an index will be out of step. For example, two data quality audit processes are executed. An index is configured to aggregate rule results from both processes, and results for the index history are published as follows:

Table 1-13 Results for Index History

Date Results from Customer audit process run on Results from Contact audit process run on

12/06/05

12/06/05

12/06/05

13/06/05

13/06/05

12/06/05

14/06/05

13/06/05

14/06/05

15/06/05

15/06/05

14/06/05

16/06/05

16/06/05

16/06/05

This works by recalculating the results for the index every time one of its contributing processes is run. The results from the last run of each process are then used, and any previously calculated index results for a distinct date (day) are overwritten.