Understanding Source System Deletes and the Source-Delete Diagnostic Staging Jobs
Depending on your business practices, it is sometimes necessary to delete records from your PeopleSoft source transaction system. Previously, incremental load jobs used in the ETL process might not reflect a deleted source record in an EPM Warehouse, which could cause the two systems to become unsynchronized. When the two systems are unsynchronized, an EPM Warehouse may produce reports with incorrect results. Therefore, to ensure synchronicity between the EPM Warehouses and your PeopleSoft source system, PeopleSoft provides new source-delete diagnostic staging jobs that identify physically deleted source records and reflect those changes in EPM fact tables.
To ensure synchronicity between the EPM Warehouses and your PeopleSoft source system, source-delete diagnostic staging jobs identify physically deleted source records and reflect those changes in EPM fact tables. Physically deleted source records are flagged in OWS tables and passed to MDW tables, where they are physically deleted from the table. This process varies slightly depending on whether it is executed in a staging job that uses the DTTM (date time stamp) or a staging job that uses Cyclic Redundancy Check (CRC). Both processes are discussed in more detail below.
If you wish to retain deleted source records in the Campus Solutions Warehouse, you can disable the source-delete diagnostic staging jobs. This process is discussed in more detail later in this documentation.
Note: MDW dimension load jobs do not use the source-delete diagnostic feature because doing so can cause linked facts to become orphaned and generate incorrect results in reports.
Identifying Source Record Deletes with CRC Staging Jobs
In CRC staging jobs, the ETL logic for the source-delete diagnostic feature is contained in a separate delete strategy (DEL_STRATEGY) path:
Image: DEL_STRATEGY path
This example illustrates the DEL_STRATEGY path.
This section uses the CRC staging job, J_Stage_PS_AGING_CATEG_TBL_FSCM91_EPM91, as an example.
Populating the Delete Hashed File
When you first implement the source-delete diagnostic feature, you run a separate setup job (J_DELSTRATEGY_InitialLoad) that populates the delete hashed file with the complete set of records from the corresponding HASH_CRC hashed file. The process also assigns each record a default value of ‘E’ for the Delete Flag (DELFLAG) column.
The following graphic depicts a hypothetical delete hashed file after it is populated with all the records from its corresponding HASH_CRC hashed file:
Image: Hypothetical delete hashed file
This example illustrates the Hypothetical delete hashed file after being populated with all the records from its corresponding hashed file.
Note that every record in the delete hashed file is assigned a value of 'E' for the Delete Flag (DELFLAG) column.
For more information about required steps to implement the source-delete diagnostic feature, see Enabling the Source-Delete Diagnostic Feature
Phase One: Trans_Check_CRC Stage in the CRC Staging Job
When you run a CRC staging job, the Trans_Check_CRC transformer stage uses the constraint P_HANDLE_DELETES = Y to determine whether the job should utilize the delete strategy (DEL_STRATEGY) path by allowing data loading to the delete hashed file:
Image: Trans_Check_CRC transformer stage, handle deletes constraint
Trans_Check_CRC transformer stage, handle deletes constraint
If the parameter for the P_HANDLE_DELETES is set to ‘Y,’ the Trans_Check_CRC transformer stage loads the delete hashed file with source records:
Image: Trans_Check_CRC transformer stage
This example illustrates the Trans_Check_CRC transformer stage.
Phase Two: HASH_DEL_PS_AGING_CATEG_TBL in the CRC Staging Job
The load process for the delete hashed file uses the business key (BUS_KEY) and source system ID (SRC_SYS_ID) values to compare incoming source records against the records in the delete hashed file.
If an incoming source record matches an existing record in the delete hashed file, the Delete Flag (DELFLAG) value for that record is overwritten with ‘EX.’ When a record is deleted in the source, its corresponding record in the delete hashed file will remain unchanged, with its Delete Flag (DELFLAG) value remaining ‘E.’
The following graphic depicts a hypothetical delete hashed file after being updated by the CRC staging job:
Image: Hypothetical delete hashed file
This example illustrates the Hypothetical delete hashed file after being updated by the CRC staging job.
Note that two records were deleted from the source and provide no match for the corresponding records in the delete hashed file. Hence, the DELFLAG values for the records remain ‘E.’
The DELFLAG value is used to determine which records will be selected for deletion from the fact table. The following graphic shows the selection criteria in the delete hashed file output, which selects only records with the DELFLAG value equal to P_CURR_DEL_FLAG:
Image: Selection criteria in the delete hashed file output
Selection criteria in the delete hashed file output
In our example, the P_CURR_DEL_FLAG value is ‘E.’ Therefore, only records with ‘E’ for the DELFLAG are selected for deletion and passed to the Trans_Delete transformer stage.
The current value for P_CURR_DEL_FLAG is determined using the routine RtnGetDelFlag:
Image: RtnGetDelFlag
This example illustrates the RtnGetDelFlag.
See Phase Six: J_Handle_Staging_Success Job for information about the methodology used to update delete hashed file records flagged for deletion.
Phase Three: Trans_Delete Transformer in the CRC Staging Delete Job
At this point in the process, only records with ‘E’ for the DELFLAG are moved to the Trans_Delete transformer stage for deletion:
Image: Trans_Delete Transformer
This example illustrates the Trans_Delete Transformer.
The Trans_Delete transformer stage calls the subroutine SBRtnWriteDelFlag, which updates the current DELFLAG value (stored in HASHED_STORE_DEL hashed file) from ‘E’ to ‘EX.”
Image: Trans_Delete transformer stage logic to call the subroutine SBRtnWriteDelFlag
Trans_Delete transformer stage logic to call the subroutine SBRtnWriteDelFlag
SBRtnWriteDelFlag logic:
Image: SBRtnWriteDelFlag logic
This example illustrates the SBRtnWriteDelFlag logic.
This way, when the job runs again in the future, the starting DELFLAG value will be ‘EX.’ For example:
Image: Sample Delete hashed file
This example illustrates the Sample Delete hashed file.
This swapping process occurs each time you run the CRC staging job.
Phase Four: Delete_Rows Out Link in the CRC Staging Delete Job
Since the target fact table uses the value in the DATA_ORIGIN column to identify records to delete, the Delete_Rows out link populates the DATA_ORIGIN column for outbound records with the value ‘D.’
Image: Trans_Delete Transformer to DRS_TGT_PS_GM_AWARD_DELETE
This example illustrates the Trans_Delete Transformer to DRS_TGT_PS_GM_AWARD_DELETE.
Image: Trans_DeleteStage
This example illustrates the Trans_DeleteStage.
Phase Five: DRS_TGT_PS_AGING_CATEG_TBL_DELETE in the CRC Staging Delete Job
The DRS target delete stage contains a generated SQL update action that updates DATA_ORIGIN value to 'D,' which identifies records to delete from the fact table.
Image: DRS target delete stage
This example illustrates the DRS target delete stage.
Phase Six: J_Handle_Staging_Success Job
After the delete strategy portion of the CRC staging job completes, the SEQ_J_Stage_PS_AGING_CATEG_TBL sequencer job calls the job J_Handle_Staging_Success:
Image: SEQ_J_Stage_PS_AGING_CATEG_TBL sequencer job
This example illustrates the SEQ_J_Stage_PS_AGING_CATEG_TBL sequencer job.
One of the tasks for the job J_Handle_Staging_Success is to update the delete flag for the Delete hashed file:
Image: J_Handle_Staging_Success job
This example illustrates the J_Handle_Staging_Success job.
Image: J_Handle_Staging_Success job detail
This example illustrates the J_Handle_Staging_Success job detail.
Recall that the delete hashed file in our example CRC staging job output records with delete flag equal to ‘E,’ since ‘E’ indicated records for deletion. However, the delete hashed file also contained records flagged with ‘EX,’ which indicated records to keep.
Also recall that the subroutine ‘SBRtnWriteDelFlag’ updated the starting delete flag value to ‘EX,’ so that the next time the CRC staging job runs, records with the delete flag value equal to ‘EX’ will be deleted (as opposed to records with ‘E’). Therefore, the delete hashed file records flagged with ‘E’ must be modified in some way to avoid causing conflict during the next run of the CRC staging job. The job J_Handle_Staging_Success updates these delete hashed file records from ‘E’ to ‘DEL.’ Delete hashed file records flagged with ‘DEL’ are not included in the delete strategy process during subsequent runs of the CRC staging job.
Identifying Source Record Deletes with Date-Time Staging Jobs
Date-Time staging jobs process source deletes in a two-phased approach:
Incremental load staging jobs include source and target row count logic to determine whether a source record delete occurred.
If a source record delete occurred, a separate staging delete job processes the records for deletion.
Phase One: Incremental Load Staging Jobs
Incremental load staging jobs first use a hashed file lookup (HASH_INCREMENTAL_LOAD) to compare row counts from the source with existing row counts from the hashed file:
Image: Incremental Load Staging Job
This example illustrates the Incremental Load Staging Job.
The DRS source stage contains a SQL select statement that obtains row count from the source table:
Image: DRS source stage
This example illustrates the DRS source stage.
The source row count value is output to the hashed file via the SRC_Row_Count out link:
Image: SourceRowCount column - DRS source stage
This example illustrates the SourceRowCount column - DRS source stage.
The Trans_Check_Deletes transformer stage contains the stage variables necessary to obtain source and target row counts:
Image: Trans_Check_Deletes Transformer Stage
This example illustrates the Trans_Check_Deletes Transformer Stage.
Image: Stage Variables for Trans_Check_Deletes Transformer
This example illustrates the Stage Variables for Trans_Check_Deletes Transformer.
Also note that the svHandleDeletes stage variable equals ‘Y’ when the target row count is greater than the source row count (due to source record deletes) and the constraint P_HANDLE_DELETES = Y.
The RunDelJob stage variable will then call the corresponding staging delete job (using the routine UtilityRunJob) when the svHandleDeletes stage variable is set to ‘Y.’
Phase Two: Staging Delete Job
The staging delete job performs a key compare between source records and hashed table records to determine which records require deletion:
Image: staging delete job
This example illustrates the staging delete job.
The Trans_Split_Keys transformer stage uses the date time key (DTTM_KEY), source system ID (SRC_SYS_ID), and delete flag (DELFLAG) values to update the incoming source records to the DTTM_DELETE_HASH hashed file.
If an incoming source record matches an existing record in the DTTM_DELETE_HASH hashed file, the Delete Flag (DELFLAG) value for that record is overwritten. When a record is deleted in the source, its corresponding record in the DTTM_DELETE_HASH hashed file will remain unchanged, thereby flagging it for deletion.
The following graphic shows the selection criteria in the DTTM_DELETE_HASH hashed file output, which selects only records with the DELFLAG value equal to P_CURR_DEL_FLAG:
Image: SelectionCriteria for DTTM_DELETE_HASH
This example illustrates the SelectionCriteria for DTTM_DELETE_HASH.
The Delete_Rows out link populates the DATA_ORIGIN column for outbound records with the value ‘D’:
Image: Trans_Delete Transformer to DRS_TGT_PS_GM_AWARD_DELETE
This example illustrates the Trans_Delete Transformer to DRS_TGT_PS_GM_AWARD_DELETE.
Image: Trans_Delete Transformer Stage
This example illustrates the Trans_Delete Transformer Stage.
The DRS target delete stage contains a generated SQL update action that updates the DATA_ORIGIN value to ‘D.’
Image: Generated SQL for DRS_TGT_PS_GM_AWARD_DELETE
This example illustrates the Generated SQL for DRS_TGT_PS_GM_AWARD_DELETE.