5 Data Validation Framework

The foundation file interfaces (such as product and organization hierarchies) have a set of validations and error checking jobs that execute with them to ensure the data is accurate, complete, and follows all basic requirements for RAP application usage. Review the contents of this chapter to understand what validations exist and how to reconfigure them per your implementation needs.

Architecture Overview

The validation framework consists of POM batch jobs that execute the validations, and database tables that control the types of validation rules and what happens when the rule is triggered. Some validation rules may cause the POM job to fail, which means the data has a critical issue that needs to be corrected before the batch process can continue. Other rules will simply write warnings to the database but allow the batch to proceed. In both cases, there are tables that can be queried to check the validation results and determine what actions need to be taken.

The table below summarizes the POM jobs that execute the validations:

Table 5-1 Data Validation POM Jobs

Job Name Summary

DIM_ORG_VALIDATOR_JOB

Executes validations on the Organization Hierarchy data

DIM_PROD_VALIDATOR_JOB

Executes validations on the Product Hierarchy and Product dimension data

DIM_CALENDAR_VALIDATOR_JOB

Executes validations on the Calendar Hierarchy staging data

DIM_CALENDAR_LOAD_VALIDATOR_JOB

Executes validations on the Calendar Period data after the load has been run

DIM_EXCH_RATE_VALIDATOR_JOB

Executes validations on the Exchange Rate staging data

DIM_PROD_ATTR_VALIDATOR_JOB

Executes validations on the Product Attribute staging data

FACT_POSFACT_VALIDATOR_JOB

Executes validations on the positional fact staging data (inventory, POs, and so on). Refer to the POS DATA set of validation rules, such as POSINVATA_R1 later in this section, for details on what this job is looking for.

The jobs are included both in the nightly batch process flow and in separate ad hoc processes that can be executed as part of your historical data loads.

The configuration tables for the validation rules are called C_DIM_RULE_LIST and C_FACT_RULE_LIST. You can access these tables from the Control & Tactical Center’s Manage System Configurations screen. The tables allows you to edit the following fields:

  • Set the error message resulting from a validation rule (ERROR_DESC)

  • Set whether the POM job should have a hard failure or only capture a warning message (ERROR_TYPE) with a value of F or W

  • Set whether it is turned on or off (ON_IND) with a value of Y or N

The other important field in this table is the BAD_TBL_NAME, which tells you where the results of the validations will be written in the case of any errors or warnings. If a failure or warning does occur, you can directly query the database table listed in BAD_TBL_NAME using Data Visualizer or APEX.

Any time you execute one or more of the dimension validation jobs, there is also a database view that summarizes the results from the job executions. This view is RI_DIM_VALIDATION_V and can also be queried from DV as needed. An example of the data in this view is shown below:

Validation Database View

Using a combination of the data in RI_DIM_VALIDATION_V and the specified BAD_TBL_NAME table data, you will be able to identify the issues and take corrective action on the source data. In the case of job failures, you will need to reload the data file to proceed. It is also possible to skip the failed validation job in POM, but this should only be done if you have carefully reviewed the validation results and are confident the data will not cause any problems in your target applications.

Resolving Validation Issues

The validation rules scan your input data for a variety of common problems that may result in failures or inconsistencies in downstream applications such as AI Foundation or Planning modules. The table below describes what the rules are checking for and how to resolve the issues.

Table 5-2 Validation Rule Details

Rule ID Explanation Resolution

CAL_R1

The W_MCAL_PERIOD_D table does not contain any data after loading a calendar file. Your calendar file may have format or data issues that require correction, such as an incorrect value for MCAL_CAL_ID or missing dates that prevent it from loading properly.

Reload a corrected data file after reviewing the contents. All start/end date fields must be populated and all other fields should exactly match the file requirements as documented.

CAL_R2

The start and end dates for the fiscal periods, quarters, or years are overlapping, which will result in an invalid calendar.

Create and load a new calendar file where the period/quarter/year start and end dates are exactly aligned and don’t overlap or have gaps. Ensure all periods in one quarter/year have the same dates for those columns.

CAL_R3

The START_DT parameter set on C_ODI_PARAM_VW is not less than or equal to your first calendar period start date. This may result in missing calendar data.

Update the START_DT parameter from the Control Center to be earlier than your fiscal calendar start date. Detailed explanation of the start and end dates is provided in Chapter 2 of the RAP Implementation Guide.

CAL_R4

Your calendar file does not contain at least 2 years prior to the current system date. Many applications on the platform require at least 2 years before and after the current calendar year (5 years total).

Load a new calendar file having at least 2 years of fiscal periods prior to the current year.

PROD_STG_R1

Null descriptions were detected for attributes having a non-null ID column for one of the following: BRAND_DESC, SUPPLIER_DESC

Populate valid descriptions for all rows in the product data where BRAND_NAME or SUPPLIER_NUM are populated, and then reload the file.

PROD_R1

Many-to-many relationships exist in your product hierarchy, which is not allowed. This is generally due to the same child ID appearing below multiple parent IDs.

Review all hierarchy levels for instances of the same ID appearing under multiple parents (such as a department belonging to two different divisions or groups) and modify the data to remove the multi-parent issues.

PROD_R2

The same product hierarchy node has multiple descriptions on different rows of the input file.

Modify your product hierarchy file such that any given hierarchy ID has the same description on all rows.

PROD_R3

A node of the product hierarchy has no children under it. This could be due to a reclass that didn’t delete the old nodes, or when a new node is added but no items were created yet. Initially, it is set as a Failure (F) rule, but you may change it to a warning (W) in C_DIM_RULE_LIST if you are okay with the data in its current format.

If possible, remove all cases of nodes having no children (for example, if all items are reclassed out of a subclass, delete the old subclass). Some AI Foundation functionality may fail if you attempt to run it on empty nodes; for example, if a user attempts to run CDT on an empty subclass, it cannot complete the run because there will be no data found. MFCS may allow hierarchies to exist without items, but it can lead to user confusion in AIF, so it’s best to clean up this data regularly.

PROD_R4

Your product hierarchy levels use alphanumeric characters for the level IDs. This is not allowed if you are implementing Retail Insights; all levels must be numbers.

If you are implementing RI, you must alter your hierarchy to only use numbers for every level above item. Other characters are not allowed.

PROD_R5

You are attempting to delete an item while also sending data for that item in other files on the same batch run. You cannot delete an item if it is still actively sending data on other input interfaces.

Re-send the deleted item file, removing any items that are still active or posting new data to RI. If the item should be deleted, then re-send the other files having that item’s data to remove the item from all other files.

PROD_R6

This is an informational message and not a hard failure. Null values are being checked for product attribute columns that are critical to the operation of multiple RAP applications. The warning message columns map to the item level (ATTR11), tran level (ATTR12), diff aggregate (ATTR16), and item/parent/grandparent (ATTR13,14,15) fields in the PRODUCT.csv file. (which loads into the W_PRODUCT_ATTR_DS staging table)

Null values in the ATTR11 and ATTR12 counts means that the ITEM_LEVEL and TRAN_LEVEL input fields are not populated, and these must be populated for 100% of items at all levels. ATTR13/14/15 represent the item hierarchy of multi-level items and some null values are normal. The only check to perform is whether 100% of rows have nulls; if there are, then no parent/child relationship was set up for any items. ATTR16 matches with the DIFF_AGGREGATE input field (or DIFF columns in MFCS) and it must have non-null values for all fashion items (for example, the color IDs for SKUs within a style). MFCS allows you to configure fashion items to not have a diff aggregate, but it is invalid for RAP applications that require diff aggregates to function.

PROD_R7

Invalid hierarchy relationships exist for two or more SKUs having the same item-parents or grandparents but different hierarchy levels. This will break downstream integrations with AIF and Planning.

Correct the hierarchy levels so that all SKUs having the same item-parents also have the same subclass and above hierarchy levels. This includes both the incoming item data and items already in the database from prior loads; you must ensure consistent parents and hierarchy levels across all of them. The BAD_PRODUCT_DS table shows multiple messages for this validation depending on where the issue is, such as PRODUCT L3 AND ITEMPARENT L2 SUBCLASS MISMATCH for item level 2 and 3 items not having the same subclasses.

PROD_R8

You have more than 1 top level (company) ID, which is not allowed.

Correct the TOP_PRODCAT_ID to contain only one value on all rows.

PROD_R9

Your input file contains a different top level (company) ID than what is already in the database.

Correct the TOP_PRODCAT_ID to match the company ID already in the system, or erase the data in the system to perform a clean load of new hierarchy data.

PROD_R10

Your input file contains a different top level (company) domain member ID (on W_DOMAIN_MEMBER_DS_TL) than what is already in the database.

Correct the TOP_PRODCAT_ID to match the company ID already in the system, or erase the data in the system to perform a clean load of new hierarchy data.

PROD_R11

You have more than 1 top level (company) description, which is not allowed.

Correct the TOP_PRODCAT_DESC to contain only one value on all rows.

PROD_R12

You have duplicate item IDs when ignoring the case, such as XYZ and xyz, which is not allowed on the platform. Item IDs are not case-sensitive.

Delete duplicate items such that you only have a single ID across all lowercase/uppercase variations.

PROD_R13

You have a one-to-many relationship between the display IDs for a hierarchy level and their unique (UID) identifiers, such as LVL4_PRODCAT_UID and LVL4_PRODCAT_ID.

For a given value in the unique (UID) columns, you must have one and only one value in the associated display ID column. Correct either the UID or ID columns to ensure they align.

ORG_STG_R1

Null descriptions were detected for attributes having a non-null ID column, for one of the following: CHANNEL_NAME, PLANNING_CHANNEL_NAME, STORE_FORMAT_DESC

Populate valid values for all descriptions where CHANNEL_ID, PLANNING_CHANNEL_ID or STORE_FORMAT_ID has a non-null value, and then reload the file.

ORG_R1

Many-to-many relationships exist in your organization hierarchy, which is not allowed. This is generally due to the same child ID appearing below multiple parent IDs.

Review all hierarchy levels for instances of the same ID appearing under multiple parents (such as a district belonging to two different regions or areas) and modify the data to remove the multi-parent issues.

ORG_R2

The same organization hierarchy node has multiple descriptions on different rows of the input file.

Modify your organization hierarchy file such that any given hierarchy ID has the same description on all rows.

ORG_R3

A node of the organization hierarchy has no children under it. This could be due to a reclass that didn’t delete the old nodes, or when a new node is added but no stores were created yet.

If possible, remove all cases of nodes having no children (for example, if all stores are reclassed out of a district, delete the old district). Some AI Foundation functionality will fail if you attempt to run it on empty nodes.

ORG_R4

Your organization hierarchy levels use alphanumeric characters for the level IDs. This is not allowed if you are implementing Retail Insights; all levels must be numbers.

If you are implementing RI, you must alter your hierarchy to only use numbers for every level of the organization hierarchy. Other characters are not allowed.

ORG_R5

Your location type code (ORG_TYPE_CODE) contains invalid values.

Only specific codes S, W, or E are allowed in the ORG_TYPE_CODE field, so you must correct any other values and reload the file.

ORG_R6

You have more than 1 top level (company) ID, which is not allowed.

Correct the ORG_TOP_NUM to contain only one value on all rows of your input file and reload it.

ORG_R7

Your input file contains a different top level (company) ID than what is already in the database, which is not allowed.

Correct the ORG_TOP_NUM to match the company ID already in the database (W_INT_ORG_DH table) or erase the data in the system to perform a clean load of new hierarchy data, if you do not want to keep your existing dataset.

ORG_R8

Your input file contains a different top level (company) domain member ID or description (on W_DOMAIN_MEMBER_DS_TL) than what is already in the database.

Correct the ORG_TOP_NUM and ORG_TOP_DESC to match the company data already in the database (W_DOMAIN_MEMBER_LKP_TL table) or erase the data in the system to perform a clean load of new hierarchy data, if you do not want to keep your existing dataset.

ORG_R9

You have more than 1 top level (company) description, which is not allowed.

Correct the ORG_TOP_DESC to contain only one value on all rows of your input file and reload it.

ORG_R10

You are attempting to load a location in W_INT_ORG_DS which is already in the data warehouse and flagged inactive or deleted.

You may not reload or reuse location IDs that are already in the data warehouse and have been deleted or inactivated. Delete the location from your input data that is specified in BAD_ORG_INT_ORG_DS. If you believe this validation is incorrect you may raise an SR for assistance.

ORG_R11

You are attempting to load a location in W_INT_ORG_DHS which is already in the data warehouse and flagged inactive or deleted.

You may not reload or reuse location IDs that are already in the data warehouse and have been deleted or inactivated. Delete the location from your input data that is specified in BAD_ORG_INT_ORG_DHS. If you believe this validation is incorrect you may raise an SR for assistance.

EXCH_RATE_R1

Exchange rate dates are overlapping for the same conversion, which will result in multiple rates active for the same date and currency.

Modify the start/end dates for the exchange rate records to ensure there are no overlapping dates. Only one rate may be effective per day/currency combination.

EXCH_RATE_R2

Exchange rate dates have gaps which will result in no rate being active for one or more dates.

Modify the start/end dates for the exchange rate records to ensure they have no gaps between one end date and the next start date, for any given currency rate.

EXCH_RATE_R3

You have provided currency conversion in one direction (for example, USD > CAD) but you did not provide it in the alternate direction (CAD > USD).

The system requires that you provide currency rates going in both directions for each currency code pair, to ensure we are always able to convert into and out of any supported currency.

ATTR_R1

There is mismatched data between the ATTR.csv and PROD_ATTR.csv files

The ATTR.csv file must have a header record for all attribute groups and values found in PROD_ATTR.csv. Correct the ATTR.csv file to match exactly with PROD_ATTR.csv or delete the mismatched rows from PROD_ATTR.csv.

ATTR_R2

PROD_ATTR.csv column ATTR_GRP_TYPE contains an invalid type code.

The only valid codes for ATTR_GRP_TYPE are (ITEMDIFF, ITEMUDA, ITEMLIST, COLOR, and PRODUCT_ATTRIBUTES). Correct the PROD_ATTR.csv file and reload the data.

ATTR_R3

ATTR.csv column ATTR_TYPE_CODE contains an invalid type code.

The only valid codes for ATTR_TYPE_CODE are (FF, LV, DT, SIZE, FABRIC, SCENT, FLAVOR, STYLE, COLOR, and DIFF). Correct the ATTR.csv file and reload the data.

ATTR_R4

ATTR.csv columns ATTR_GROUP_ID or ATTR_VALUE_ID contain invalid characters.

Attribute group and value IDs are used as a hierarchy in Planning apps and are restricted from having any spaces, colons, or quotation marks as part of the IDs. Correct the ATTR.csv and PROD_ATTR.csv files and reload the data.

POSINVDATA_R1

Dates other than the current business date were found on W_RTL_INV_IT_LC_DY_FS. The DAY_DT is being compared to “select MCAL_NUM from W_RTL_CURR_MCAL_G where MCAL_TYPE = 'DT'”.

Daily positional fact data must only contain a DAY_DT value matching the current business date; correct the data and reload. This can be a sign that an incorrect file was used in the batch or the dates are out of sync.

POSINVUDATA_R2

Dates other than the current business date were found on W_RTL_INVU_IT_LC_DY_FS. The DAY_DT is being compared to “select MCAL_NUM from W_RTL_CURR_MCAL_G where MCAL_TYPE = 'DT'”.

Daily positional fact data must only contain a DAY_DT value matching the current business date; correct the data and reload. This can be a sign that an incorrect file was used in the batch or the dates are out of sync.

POSPRICEDATA_R3

Dates other than the current business date were found on W_RTL_PRICE_IT_LC_DY_FS. The DAY_DT is being compared to “select MCAL_NUM from W_RTL_CURR_MCAL_G where MCAL_TYPE = 'DT'”.

Daily positional fact data must only contain a DAY_DT value matching the current business date; correct the data and reload. This can be a sign that an incorrect file was used in the batch or the dates are out of sync.

POSNCOSTDATA_R4

Dates other than the current business date were found on W_RTL_NCOST_IT_LC_DY_FS. The DAY_DT is being compared to “select MCAL_NUM from W_RTL_CURR_MCAL_G where MCAL_TYPE = 'DT'”.

Daily positional fact data must only contain a DAY_DT value matching the current business date; correct the data and reload. This can be a sign that an incorrect file was used in the batch or the dates are out of sync.

POSBCOSTDATA_R5

Dates other than the current business date were found on W_RTL_BCOST_IT_LC_DY_FS. The DAY_DT is being compared to “select MCAL_NUM from W_RTL_CURR_MCAL_G where MCAL_TYPE = 'DT'”.

Daily positional fact data must only contain a DAY_DT value matching the current business date; correct the data and reload. This can be a sign that an incorrect file was used in the batch or the dates are out of sync.

POSPOONORDDATA_R6

Dates other than the current business date were found on W_RTL_PO_ONORD_IT_LC_DY_FS. The DAY_DT is being compared to “select MCAL_NUM from W_RTL_CURR_MCAL_G where MCAL_TYPE = 'DT'”.

Daily positional fact data must only contain a DAY_DT value matching the current business date; correct the data and reload. This can be a sign that an incorrect file was used in the batch or the dates are out of sync.

POSPOONALCDATA_R7

Dates other than the current business date were found on W_RTL_PO_ONALC_IT_LC_DY_FS. The DAY_DT is being compared to “select MCAL_NUM from W_RTL_CURR_MCAL_G where MCAL_TYPE = 'DT'”.

Daily positional fact data must only contain a DAY_DT value matching the current business date; correct the data and reload. This can be a sign that an incorrect file was used in the batch or the dates are out of sync.

POSCOMPPRICEDATA_R8

Dates other than the current business date were found on W_RTL_COMP_PRICE_IT_LC_DY_FS. The DAY_DT is being compared to “select MCAL_NUM from W_RTL_CURR_MCAL_G where MCAL_TYPE = 'DT'”.

Daily positional fact data must only contain a DAY_DT value matching the current business date, correct the data and reload. This can be a sign that an incorrect file was used in the batch or the dates are out of sync.

In all cases where you need to correct your file and reload it, you are expected to push only the corrected files into the system using the data reprocessing ad hoc programs in the POM AIF DATA schedule. Refer to the AI Foundation Data Standalone Processes chapter for details on these programs:

  1. REPROCESS_ZIP_FILE_PROCESS_ADHOC to upload your corrected file(s)

  2. CSV_REPROCESS_ADHOC or DAT_REPROCESS_ADHOC to import the corrected files and transform the data from staging to target tables (only running the required programs, not the entire ad hoc process)