Specific Check

This check is used to define conditions based on individual checks on a single column.

Figure 7-42 Check Type pane


This image displays the Check Type pane.

If Specific Check is selected, perform the following steps:

  1. Select Table and Base Column Name from the drop-down list. The list displays all the tables that are marked for Data Quality Rule in a data model; that is, based on ENABLE_CLASSIFICATION parameter. For more information, see Table Classification.
  2. Click and select the Identifier Columns. The list displays all PK columns of the selected base table. This feature allows you to view the DQ results report based on the selected identifier columns apart from the PK columns. You can select up to 8 Identifier columns including the PK columns. It is mandatory to select the PK Columns.
  3. If the selected Base Column is of Varchar/Char data type, select the Substring checkbox and enter numeric values in the Parameters Position and Length characters fields.
  4. Click and define the Filter condition using the Specify Expression window. For more information, see Specify Expression.

    Note:

    NOTE While defining the filter condition, you can also include the Runtime Parameter name, which you will be specifying in Additional Parameters condition while executing the DQ Rule.
  5. Define the required Validation Checks by selecting the appropriate grid and specify the details. You can define nine specific validation checks based on Range, Data Length, Column Reference/Specific Value, List of Value/Code, Null Value, Blank Value, Referential Integrity, Duplicity, and Custom Check/Business.

    Note:

    A minimum of one Validation check must be defined to generate a query.
    • Ensure that you select the Enable checkbox for every check to be applied as a part of rule.
    • While defining any of the validation checks, you must specify the Severity (Error, Warning, or Information. You can add an Assignment only when the Severity is selected as Warning or Information. Assignments are added when you want to correct or update record(s) in the base column data / selected column data. However, selecting severity as Error indicates there are no corrections and only facilitates in reporting the quantity of bad records.

Figure 7-43 Validation Checks window


This image displays the Validation Checks window.

Table 7-15 Fields in the Validation Checks window and their Descriptions

Check Type Description
Range Check

Range Check identifies if the base column data falls outside a specified range of Minimum and Maximum value.

Example: If the Base Table is STG_CASA, Base Column is N_MIN_BALANCE_YTD, Minimum value is 9, and Maximum value is 99, then the check with the Inclusive checkbox enabled (by default) is defined as ‘STG_CASA.N_MIN_BALANCE_YTD < 9 and STG_CASA.N_MIN_BALANCE_YTD > 99’. Here the base column data less than 9 and greater than 99 are identified as invalid.

If the Inclusive checkbox is not selected for Minimum and Maximum, then the check is defined as, ‘If STG_CASA.N_MIN_BALANCE_YTD <= 9 and STG_CASA.N_MIN_BALANCE_YTD >= 99’. Here the base column data less than 10 and greater than 98 are identified as invalid, where 9 and 99 are also included in the validation and considered as invalid.

  • Select Enabled checkbox. This option is available only if the selected Base Column is either of Date or Number data type.

Select the Severity as Error, Warning, or Information.

If the selected Base Column is of “Date” type, select Minimum and Maximum date range using the Calendar. If the selected base column is of “Number” type, enter the Range value. You can specify numeric, decimal, and negative values for number Data type. The Inclusive checkbox is selected by default and you can deselect the same to include the specified date/value during the validation check.

Click and specify an expression for Additional Condition using the Specify Expression window. For more information, see Define Expression.

(Optional) If the Severity is set to Warning/Information:

  1. Select the Assignment checkbox.
  2. Select the Assignment Type from the drop-down list. For more information, see Populating Assignment Type Details in the References section.
  3. Specify the Assignment Value.
  4. Select the Message Severity as 1 or 2 from the drop-down list.
  5. Select the Message to be displayed from the drop-down list.
Data Length Check

Data Length Check checks for the length of the base column data using a minimum and maximum value and identifies if it falls outside the specified range.

Example: If the Base Table is STG_CASA, Base Column is N_MIN_BALANCE_YTD, the Minimum value is 9, and the Maximum value is 12, then the check is defined as ‘If the length of STG_CASA.N_MIN_BALANCE_YTD < 9 and > 12’. Here the base column data with characters less than 9 and greater than 12 are identified as invalid.

Select Enabled checkbox.

Select the Severity as Error, Warning, or Information.

Specify the Minimum data length characters.

Specify the Maximum data length characters.

Click and specify an expression for Additional Condition using Specify Expression window. For more information, see Define Expression.

Column Reference / Specific Value Check

Column Reference / Specific Value Check compares the base column data with another column of the base table or with a specified direct value using the list of pre-defined operators.

Example: If the Base Table is STG_CASA, Base Column is N_MIN_BALANCE_YTD, and if Column Reference check is defined against a specific value ‘100’ with the operator ‘>=’ then the check is defined as, ‘If STG_CASA.N_MIN_BALANCE_YTD < 100’. Here the base column data with value less than 100 are considered as invalid.

Or, if Column Reference check is defined against another column N_MIN_BALANCE_MTD with the operator ‘=’ then the check is defined as, ‘If STG_CASA.N_MIN_BALANCE_YTD <> STG_CASA.N_MIN_BALANCE_MTD’. Here the reference column data not equal to the base column data is considered as invalid.

Select Enabled checkbox. This option is available only if the selected Base Column is either of Date or Number data type.

Select the Severity as Error, Warning, or Information.

Select the Mathematical Operator from the drop-down list.

Select the Filter Type as one of the following:

Select Specific Value and specify the Value. You can specify numeric, decimal, and negative values for number Data type.

Select Another Column and select Column Name form the drop-down list.

Click and specify an expression for Additional Condition using Specify Expression window. For more information, see Define Expression.

(Optional) If the Severity is set to Warning/Information:

Select the Assignment checkbox.

Select the Assignment Type from the drop-down list. For more information, see Populating Assignment Type Details in Reference section.

Specify the Assignment Value.

Select the Message Severity from the drop-down list.

Select the Message from the drop-down list.

List of Value / Code Check

List of Value / Code Check can be used to verify values where a dimension / master table is not present. This check identifies if the base column data does not matches with any value or code specified in a list of values.

Example: If the Base Table is STG_CASA, Base Column is N_MIN_BALANCE_YTD, and the list of values is mentioned are “100, 101, 102, 103, 104”, then the check is defined as, ‘If STG_CASA.N_MIN_BALANCE_YTD is NOT IN (‘100, 101, 102, 103, 104)’. Here the base column data apart from the one specified (i.e. 100, 101, 102, 103, 104) are considered as invalid.

Or, for Code Check,

If the Base Table is CURRENCY_MASTER, Base Column is COUNTRY_CODE, and the list of values is mentioned are ‘IN’, ‘US’, ‘JP’, then the check is defined as, ‘If CURRENCY_MASTER.COUNTRY_CODE is NOT IN (‘IN’, ‘US’, ‘JP’)’. Here the base column data apart from the one specified (i.e. ‘IN’, ‘US’, ‘JP’) are considered as invalid.

Select Enabled checkbox.

Select the Severity as Error, Warning, or Information.

Select the Filter Type as one of the following:

Select Input Values and specify the List of Values. You can specify numeric, decimal, string (Varchar /char), and negative values.

Select Code and click in the List of Values column. The Code Selection window is displayed. Select the required code and click . You can also click to select all the available codes. Click OK.

Click and specify an expression for Additional Condition using Specify Expression window. For more information, see Define Expression.

(Optional) If the Severity is set to Warning or Information:

Select the Assignment checkbox.

Select the Assignment Type from the drop-down list. For more information, see Populating Assignment Type Details in the References section.

Specify the Assignment Value.

Select the Message Severity from the drop-down list.

Select the Message from the drop-down list.

Null Value Check

Null Value Check identifies if “NULL” is specified in the base column.

Example: If the Base Table is STG_CASA and the Base Column is N_MIN_BALANCE_YTD, then the check is defined as, ‘If STG_CASA.N_MIN_BALANCE_YTD is NULL’. Here the base column data, which is null, are considered as invalid.

Select Enabled checkbox.

Select the Severity as Error, Warning, or Information.

Click and specify an expression for Additional Condition using Specify Expression window. For more information, see Define Expression.

(Optional) If the Severity is set to Warning or Information:

Select the Assignment checkbox.

Select the Assignment Type from the drop-down list. For more information, see Populating Assignment Type Details in the References section.

Specify the Assignment Value.

Select the Message Severity from the drop-down list.

Select the Message from the drop-down list.

Note: The Null Check support TIMESTAMP datatype.

Blank Value Check

Blank Value Check identifies if the base column is blank without any values considering the blank space.

Example: If the Base Table is STG_CASA and Base Column is N_MIN_BALANCE_YTD, then the check is defined as, ‘If Length of data of STG_CASA.N_MIN_BALANCE_YTD after trim is null’. Here the base column data that is blank/empty are considered as invalid.

Select Enabled checkbox.

Select the Severity as Error, Warning, or Information.

Click and specify an expression for Additional Condition using Specify Expression window. For more information, see Define Expression.

(Optional) If the Severity is set to Warning or Information:

Select the Assignment checkbox.

Select the Assignment Type from the drop-down list. For more information, see Populating Assignment Type Details in the References section.

Specify the Assignment Value.

Select the Message Severity from the drop-down list.

Select the Message from the drop-down list.

Note: The Blank Check support TIMESTAMP datatype.

Referential Integrity Check

Referential Integrity Check identifies all base column data which has not been referenced by the selected column of the referenced table. Here, the reference table and columns are user specified.

Example: If the Base Table is STG_CASA, Base Column is N_MIN_BALANCE_YTD, Reference table is STG_CASA_TXNS, and reference column is N_TXN_AMOUNT_NCY, then the check is defined as, ‘(not exists (select STG_CASA_TXNS.N_TXN_AMOUNT_NCY from STG_CASA_TXNS where STG_CASA_TXNS.N_TXN_AMOUNT_NCY=STG_CASA.n_min_

balance_ytd))’. Here, if the STG_CASA. N_MIN_BALANCE_YTD column value does not match with STG_CASA_TXNS. N_TXN_AMOUNT_NCY, then those base table records are considered as invalid.

This check can be used to validate attributes like Geography dimension, currency dimension, and so on.

Select Enabled checkbox.

Select the Severity as Error, Warning, or Information.

Select the Table (Referential Integrity Check dimension table) from the drop-down list.

The base table selected under the Select grid is excluded from the drop-down list.

Select the Column from the drop-down list.

The list displays those columns that have the same Data Type as that of the Base Column selected under Select grid.

Select the Is Composite Key checkbox if the base column is part of a Composite Key.

Enter the Additional Reference Condition for the Composite Key. For example, baseTable.column2=refTable.column2 and baseTable.column3=refTable.column3 where column1, column2, column3 are part of the Composite Keys, baseTable.column1 is the base column and refTable.column1 is the reference column.

Click and specify an expression for Additional Condition using Specify Expression window. For more information, see Define Expression.

Note: SELECT privilege should be granted to METADOM (atomic schema) user on Base Table and Reference Table for all DQ rules which are defined on “Data Management Sources”.

Duplicate Check

Duplicate Check can be used when a combination of column is unique and identifies all the duplicate data of the base table in terms of the columns selected for the duplicate check.

Example: If the Base Table is STG_CASA, base column is N_MIN_BALANCE_YTD, and duplicity columns are selected as N_MIN_BALANCE_MTD and N_MIN_BALANCE_ITD, then the check is defined as, ‘If there are duplicate values for the combination of columns STG_CASA. N_MIN_BALANCE_YTD, STG_CASA.N_MIN_BALANCE_MTD, and STG_CASA. N_MIN_BALANCE_ITD are considered as invalid’.

Select Enabled checkbox.

Select the Severity as Error, Warning, or Information.

Click in Column list and select the required column.

Click and specify an expression for Additional Condition using Specify Expression window. For more information, see Define Expression.

Custom Check/Business Check

Custom Check/Business Check is a valid SQL query to identify the data with the query specified as the Custom/business SQL. You can define the SQL, but the Select clause of the query has to follow the order as specified in the template of the Custom Check panel.

Example: When you want all the bad records based on two column selection from same table, such as - Identify all the error records from Investments table where the account number is not null and account group code is null:

  • select PK_NAMES,PK_1,PK_2,PK_3,PK_4,PK_5,PK_6,PK_7,PK_8,ERROR_COLUMN from (SELECT NULL PK_NAMES, NULL PK_1,NULL PK_2,NULL PK_3,NULL PK_4,NULL PK_5,NULL PK_6,ACCOUNT_NUMBER PK_7, ACCOUNT_GROUP_CD PK_8,1 ERROR_COLUMN FROM FSI_D_INVESTMENTS WHERE ACCOUNT_GROUP_CD IS NULL AND ACCOUNT_NUMBER IS NOT NULL)
  • Select Enabled checkbox.
  • Select the Severity as Error, Warning, or Information.
  • Enter the Custom/Business Check parameters within the brackets. Ensure that each parameter is separated by a comma.

Note: Threshold check is performed based on the value set to Y for the following parameter DQ_ENABLE_CUSTOM_THRESHOLD. By default, the value is N.

  1. Click Generate Query.

    The details are validated and the validated query along with the status is displayed in the Generated Query section.

  2. Click Save.

    The defined Data Quality Rule definition is displayed in the Data Quality Rule Summary window with the Status as “Saved” and Active as "N". After it is approved, it becomes active.

  3. Additional conditions would be appended to the RI check criteria, that is, to the NOT EXISTS clause in conjunction with an AND.

Note:

For all checks except Referential Integrity Check, the additional condition is expected to be defined on the base table; whereas for RI check, it can be done on the base table as well as the reference table.