Fact and Dimension Relationships

When a fact interface such as SALES.csv is loaded into the data warehouse, the programs perform many joins between the incoming data and the associated dimensions and hierarchies. The data warehouse enforces strict data integrity rules across tables; this requires that, for every fact record, all associated dimension records are present and active. For a record to be considered the active dimension record, it must have effective dates that encompass the same date on the fact record. For example:

  • A record in the SALES.csv file for a given nightly batch cycle has key values of:

    • ITEM = 1285001

    • ORG_NUM = 4400

    • DAY_DT = 05-MAY-23

  • For this record to be loaded successfully, the item, location, and date must all be present and active within the associated dimension tables:

    • For the item, a record must exist in W_PRODUCT_D_RTL_TMP having a matching item number in column PROD_IT_NUM, and the value of DAY_DT from the sale must be between SRC_EFF_FROM_DT and SRC_EFF_TO_DT.

    • For the location, a record must exist in W_INT_ORG_DH_RTL_TMP having a matching location number in column ORG_NUM, and the value of DAY_DT from the sale must be between EFFECTIVE_FROM_DT and EFFECTIVE_TO_DT.

    • For the date, a record must exist in W_MCAL_DAY_D having the same date on MCAL_DAY_DT.

  • If the fact record has other key columns such as suppliers, buyers, reasons, statuses, and so on, then similar dimension joins may be used across all of them.

If any join between the fact table and dimension tables is unable to find matches based on the above criteria, then the fact record will not load into the data warehouse and will instead be rejected and moved into separate tables for review.