6 Support Utilities

Some support utilities will be exposed for implementers directly in APEX, allowing you to run functions such as database cleanup without Oracle involvement. These utilities may also be used by Oracle Support when responding to Service Requests on your RAP environments. If a process documented here is intended only for Oracle Support usage and not for customers directly, it will be noted in the detailed description.

Data Cleanup Utility

Because foundation data is always loaded first through the Retail Insights data warehouse, implementers often need to erase data from the RI tables in preparation for a new load. Database functions have been exposed to APEX to allow targeted deletion of data by table name. The deletion requires generation of a unique token value to pass into the commands. This token generation (and the delete functions in general) can be disabled by Oracle upon request if you do not want the functionality exposed after customer go-live.

Note:

These utilities are only for data loaded using the AIF DATA schedule jobs in POM. If you have loaded data into one or more AIF applications (AIF APPS schedule) then there is a cleanup job named AIF_APPS_MAINT_DATA_CLEANUP_ADHOC_PROCESS in the AIF APPS schedule. Refer to the Data Cleanup Utility section in the AI Foundation Applications Standalone Processes chapter.

Before using the utility, run the token generation to verify the package is available, as shown below.

The command to run in APEX to generate tokens is:

select ri_support_util.generate_token from dual
Generate Tokens Command in APEX

Once you have verified one time that the PL/SQL package is working without error, you can use the commands below to perform table cleanup. Specify the schema name and table name to be truncated, then run the PL/SQL block.

DECLARE
  TOKEN_VALUE VARCHAR2(200);
  SCHEMANAME VARCHAR2(200);
  TABLENAME VARCHAR2(200);
BEGIN
    TOKEN_VALUE := ri_support_util.generate_token;
    SCHEMANAME := 'RADM01';
    TABLENAME := 'W_RTL_SLS_TRX_IT_LC_DY_FS';
    RI_SUPPORT_UTIL.CLEAR_SELECTED_RI_TABLES(
        TOKEN_VALUE => TOKEN_VALUE,
        SCHEMANAME => SCHEMANAME,
        TABLENAME => TABLENAME
    );
END;

If the process is successful, you will see that the PL/SQL block was successfully executed with no further message or results. If the process encounters any error, it will display the error details in the results panel in APEX. For example, if the token used is not valid it will show the following error:

Invalid Token Error Returned by PL/SQL

To quickly clean the entire database schema instead of individual tables, you may instead call the following command. This command erases all customer data except for the calendar, system configuration tables, and seed data records. It also deletes user data from the C_HIST_LOAD_STATUS table, which is generated during history loads. Use this command if you need to reset the environment in preparation for a new data load using a different dataset:

DECLARE
  TOKEN_VALUE VARCHAR2(200);
  SCHEMANAME VARCHAR2(200);
BEGIN
    TOKEN_VALUE := ri_support_util.generate_token;
    SCHEMANAME := 'RADM01';
    RI_SUPPORT_UTIL.CLEAR_SELECTED_RI_TABLES(
        TOKEN_VALUE => TOKEN_VALUE,
        SCHEMANAME => SCHEMANAME
    );
END;

Calendar removal is provided as a separate function, because you cannot remove calendar information without also erasing all partitions (which are specific to your currently loaded calendar). The function name is CLEAR_RI_MCAL_TABLES; it is called the same way as the schema clear script above, passing in the token and schema name as the inputs (see the example after the list below). Before you perform any calendar cleanup, review the following:

  • Partition removal is based on the current partition configuration in C_MODULE_ARTIFACT; it will not modify tables that are not enabled for partitioning. Ensure the configuration table reflects your current cleanup needs.

  • Because calendar cleanup includes partition removal, you cannot use the system for a new data load without first re-partitioning the system. Refer to the RAP Implementation Guide for the steps to reload the calendar and partition the database.

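A minimal sketch of the calendar cleanup call is shown below. The named parameters TOKEN_VALUE and SCHEMANAME are assumed to follow the same pattern as the other ri_support_util calls in this section, so confirm them in your environment before running it.

DECLARE
  TOKEN_VALUE VARCHAR2(200);
  SCHEMANAME VARCHAR2(200);
BEGIN
    -- Generate a fresh token, then clear the calendar tables and their partitions.
    -- Parameter names are assumed to match the other ri_support_util procedures.
    TOKEN_VALUE := ri_support_util.generate_token;
    SCHEMANAME := 'RADM01';
    RI_SUPPORT_UTIL.CLEAR_RI_MCAL_TABLES(
        TOKEN_VALUE => TOKEN_VALUE,
        SCHEMANAME => SCHEMANAME
    );
END;
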
There is a function named RI_SUBJECTAREA_TABLE that erases functional areas of the data warehouse one by one, which can be useful for targeted cleanup of related groups of tables. The function uses the list of tables and subject area names from the database table C_RI_SUBJECTAREA, which you can query from APEX to identify which values you want to use. If you run the command for the Price or Inventory Position subject areas, then it will also clean up the C_HIST_LOAD_STATUS table for all related entries.

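For example, you can list the available subject area names and their associated tables with a simple query from APEX before deciding what to clean up:

select * from C_RI_SUBJECTAREA;
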
The command syntax for RI_SUBJECTAREA_TABLE is shown below.

DECLARE
  TOKEN_VALUE VARCHAR2(200);
  SUBJECTAREA_NAME VARCHAR2(200);
  OWNER_NAME VARCHAR2(200);
BEGIN
    TOKEN_VALUE := ri_support_util.generate_token;
    OWNER_NAME := 'RADM01';
    SUBJECTAREA_NAME := 'Organization';
    RI_SUPPORT_UTIL.RI_SUBJECTAREA_TABLE(
        TOKEN_VALUE => TOKEN_VALUE,
        SUBJECTAREA_NAME => SUBJECTAREA_NAME,
        OWNER_NAME => OWNER_NAME
  );
END;

When you run any of the cleanup commands above, it may take an hour or longer to complete depending upon the amount of data in your schema and the number of partitions requiring deletion. Due to APEX limitations, your session may expire or time out while waiting for the command to complete, but the process will continue to run in the database. To monitor the activity after a session timeout, you can query the RI_LOG_MSG table and check for new log messages:

select * from ri_log_msg order by msg_ts desc;

If the process is still running, you will see new log entries being added for ri_support_util methods. If no recent entries are added and the last set of messages shows the END messages for a process step, then you can verify that all your tables are cleared and proceed with your implementation activities. When verifying table counts and contents after a cleanup script is run, you must include a hint in your SQL to prevent cached results or stale statistics from being returned.

For example:

SELECT /*+ OPT_PARAM('_optimizer_answering_query_using_stats' 'FALSE') */  COUNT(*) from w_product_d;

Aggregation Utility

Retail Insights has over 100 different tables for pre-calculating data at higher levels of aggregation, mainly for the purpose of BI reporting and analytics. These tables do not need to be populated during initial historical data loads but would be needed before end-users begin accessing data in RI. To populate these tables after history loads are complete, an aggregation utility is provided that can use the base intersection of a functional area to calculate all of the higher-level tables.

The utility currently supports the following subject areas in RI:

Name Module Code
Sales SLS
Sales Pack SLSPK
Sales Promotion SLSPR
Sales Wholesale SLSWF
Inventory INV
Inventory Receipts INVRC
Inventory Transfers INVTSF
Inventory Reclasses INVRECLASS
Inventory Return to Vendor INVRTV
Markdowns MKDN
Net Profit NPROF
Customer Loyalty Transactions CUST

Within these subject areas, the aggregation does have some limitations on which columns are populated (relative to nightly batches). Aggregate columns that are derived by joining multiple tables together during batch processing are not included in this utility because the data may not be available or accurate for the calculations. This includes:

  • Inventory availability columns, such as the counts and amounts based on presentation stock and demo stock levels

  • Inventory age and weeks-in-store calculations based on new receipt activity

  • Any columns that join to the clearance dimension to get the clearance indicator and markdown event ID for a specific inventory or transaction record

Prerequisites for using the utility (all of the following steps must be completed each time you use it):

  1. Partitioning has been run for the target functional areas such as sales (SLS), inventory (INV), and so on. Follow the steps in the RAP Implementation Guide to perform additional partitioning as needed. If you have not used the utility since the last time you received a product patch, you should re-run the partitioning process again to ensure all tables are partitioned.

  2. The base fact for the functional area has already been loaded with data for the entire date range you want to aggregate on. For example, the W_RTL_SLS_TRX_IT_LC_DY_F table is loaded before attempting to aggregate it to W_RTL_SLS_IT_LC_WK_A.

  3. Database statistics have been collected recently using REFRESH_RADM_JOB and ANALYZE_TEMP_TABLES_JOB (either as part of an ad hoc data load or automatically as part of the nightly batch).

The configuration table to control the utility is C_RI_AGGREGATION_MAP, which is available from the Control Center in the AI Foundation user interface. It contains a list of aggregate tables in RI that can be processed by the utility. For each table you want to load, set the START_DT as the earliest date to process and the END_DT to the final date to process. The tables are grouped by functional area such as MODULE_CODE=SLS so you can update all tables relating to that fact.

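For illustration, the date range for all sales tables might be set with an update like the one below. This is only a sketch: the dates are placeholders, and it assumes your user is allowed to update the configuration table directly (otherwise make the same change through the Control Center screen).

-- Sketch: aggregate all sales (SLS) tables across an example history window.
-- Column names START_DT, END_DT, and MODULE_CODE are described in the text above.
UPDATE C_RI_AGGREGATION_MAP
   SET START_DT = DATE '2021-01-31',
       END_DT   = DATE '2023-12-30'
 WHERE MODULE_CODE = 'SLS';
COMMIT;
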
When specifying the start/end dates, make sure to consider the calendar level of the table. Day-level tables can have any start/end dates because they use daily partitions. Week-level (WK) tables should use week starting/ending dates to ensure each full week of data is always aggregated into the table. Similarly, Gregorian month (GMH) tables should use month start/end dates. The utility also has functions to auto-extend your date ranges to encompass full weeks and months even if you make mistakes in the configuration. By default, dates are always auto-extended so that full weeks/months are loaded where needed. This can be changed using the parameter RI_AGG_FULL_LOAD_TYPE on C_ODI_PARAM_VW if you only want the dates you specify to be included in the aggregations (an example follows the list of valid values below). Valid values include:

  • F – full auto-extend of dates

  • FE – extend end dates only

  • FS – extend start dates only

  • N/A (or any other value) – Use only the dates in the mapping table

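As an example, the parameter could be reviewed and changed with statements like the following. This is a sketch that assumes C_ODI_PARAM_VW exposes the same PARAM_NAME and PARAM_VALUE columns as C_ODI_PARAM and is updatable from your session; otherwise change the value through the Control & Tactical Center.

-- Check the current auto-extend behavior.
SELECT PARAM_NAME, PARAM_VALUE
  FROM C_ODI_PARAM_VW
 WHERE PARAM_NAME = 'RI_AGG_FULL_LOAD_TYPE';

-- Sketch: extend end dates only (value FE from the list above).
UPDATE C_ODI_PARAM_VW
   SET PARAM_VALUE = 'FE'
 WHERE PARAM_NAME = 'RI_AGG_FULL_LOAD_TYPE';
COMMIT;
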
Once the necessary updates are performed, you will execute an ad hoc process in POM named AGGREGATION_UTILITY_ADHOC. This process is a first-time manual run to validate that the configuration is working as intended and to set up the temp tables. The process contains three jobs:

  • AGG_UTILITY_PRE_JOB – Calculates a temporary lookup table for product hierarchy relationships

  • AGG_UTILITY_ORG_PRE_JOB – Calculates a temporary lookup table for organization hierarchy relationships

  • AGG_UTILITY_JOB – Performs an aggregation action for a specific table name and run type

The AGG_UTILITY_JOB requires two input parameters: the name of the table as found in C_RI_AGGREGATION_MAP and the type of aggregation to perform (FRESH or RESTART). When FRESH is specified, the job aggregates the entire date range specified in the configuration table, even if it has been run before. When RESTART is specified, it runs only from the last completed period (the job aggregates one quarter at a time, so it will not re-run earlier quarters that already completed). Also use RESTART if you changed END_DT to a later date and only want to process the incomplete dates resulting from that change. In most cases you can specify RESTART and the job will perform the required actions.

Example Postman message body for the process call:

{
  "cycleName":"Adhoc", 
  "flowName":"Adhoc", 
  "requestType":"POM Scheduler", 
  "processName":"AGGREGATION_UTILITY_ADHOC",
  "requestParameters":"jobParams.AGG_UTILITY_JOB= W_RTL_SLS_CS_IT_LC_DY_A RESTART"
}

Once you have issued the command to start the process, you may monitor the detailed run status by querying the table C_BULK_LOAD_STATUS from APEX. A record will be inserted for each calendar quarter that has been processed until the entire date range is aggregated. The POM job will complete successfully after the table is loaded for all dates. You may then compare the base fact table with the target aggregate and confirm the values have been rolled up as expected.

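For example, the progress and the final rollup could be checked with queries like these. SLS_QTY is an assumed measure column used only for illustration; substitute the columns you actually want to validate.

-- Monitor aggregation progress (one record per processed calendar quarter).
select * from C_BULK_LOAD_STATUS;

-- Sketch of a rollup check: compare a measure between the base fact and the
-- week-level aggregate (SLS_QTY is an assumed column name).
SELECT 'BASE' AS src, SUM(SLS_QTY) AS total_qty FROM W_RTL_SLS_TRX_IT_LC_DY_F
UNION ALL
SELECT 'WK_AGG' AS src, SUM(SLS_QTY) AS total_qty FROM W_RTL_SLS_IT_LC_WK_A;
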
The aggregate tables must be populated in a specific sequence based on the value in the AGGREGATION_LEVEL column in C_RI_AGGREGATION_MAP. For each MODULE_CODE, the level 1 tables must be populated first, then the level 2 tables, and so on. To automate this execution sequence, there is a separate job available in POM, named AGG_SRVC_JOB. The aggregation service job accepts a single input parameter for the MODULE_CODE value. The job will execute all tables in the associated record set in C_RI_AGGREGATION_MAP for that module, following the AGGREGATION_LEVEL sequence as needed. The two PRE jobs (AGG_UTILITY_PRE_JOB and AGG_UTILITY_ORG_PRE_JOB) are prerequisites for this job, so ensure you’ve already run those at least once before using AGG_SRVC_JOB.

Example Postman message body for the process call:

{
  "cycleName":"Adhoc", 
  "flowName":"Adhoc", 
  "requestType":"POM Scheduler", 
  "processName":"AGGREGATION_SRVC_ADHOC",
  "requestParameters":"jobParams.AGG_SRVC_JOB=INV"
}

If any processes in the AGG_SRVC_JOB have a failure or are taking too long to run, you may also check the following tables for more information:

  • C_RI_SRVC_REQ_QUEUE – Contains the status of individual service calls invoked by the aggregation process. A status of 6 means success while 7 means failed.

  • RI_LOG_MSG – If a process does fail, the detailed trace logs will be written to this table to help you identify the problem. Look for records where PROGRAM_UNIT = RI_AGGREGATION_UTIL (see the example queries after this list).

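For example, the following queries can be run from APEX or Innovation Workbench to review the service queue and the aggregation utility's trace messages:

-- Service call status: a status of 6 means success, 7 means failed.
select * from C_RI_SRVC_REQ_QUEUE;

-- Detailed trace messages for the aggregation utility, newest first.
select *
  from RI_LOG_MSG
 where PROGRAM_UNIT = 'RI_AGGREGATION_UTIL'
 order by MSG_TS desc;
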
Database Statistics Utility

A critical part of working with large datasets in Oracle Database is the collection of statistics on your database tables. The POM processes used to load data generally include a job to collect statistics on the entire database schema to ensure stats are always up-to-date. The drawback of this program is that it can take a significant amount of time to run, even if you only need to refresh statistics on a single table. To help implementers collect statistics on specific tables, a utility is provided using the POM standalone program COLLECT_STATS_JOB.

The COLLECT_STATS_JOB accepts a single input parameter for the database module code you wish to gather stats on. The module codes are defined in the configuration table C_MODULE_DBSTATS, which is available from the Control & Tactical Center in the AI Foundation UI. The table comes pre-defined with some core modules that often need stats collected, using the codes SLS, INV, and PRICE. You can also insert new rows to define your own custom MODULE_CODE values, each specifying one or more tables you plan to collect stats on; you then pass that MODULE_CODE value into the job parameters to collect stats on your chosen list of tables. TABLE_NAME and MODULE_CODE are the only required values for tables in the RADM01 schema. If you are collecting stats on a temp table (in the RABE01USER schema), then you must also populate OWNER_TYPE as BATCH.

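For illustration, a custom module covering additional tables might be registered with inserts like the ones below. This is a sketch only: the module code MYTABLES and the temp table name are hypothetical, and C_MODULE_DBSTATS may contain additional columns, so verify its structure in the Control & Tactical Center before inserting rows directly.

-- Hypothetical custom module code 'MYTABLES' for targeted stats collection.
INSERT INTO C_MODULE_DBSTATS (MODULE_CODE, TABLE_NAME)
VALUES ('MYTABLES', 'W_RTL_SLS_TRX_IT_LC_DY_F');

-- Temp tables owned by the batch user (RABE01USER) also need OWNER_TYPE set to BATCH.
-- The table name below is a placeholder for one of your own temp tables.
INSERT INTO C_MODULE_DBSTATS (MODULE_CODE, TABLE_NAME, OWNER_TYPE)
VALUES ('MYTABLES', 'MY_CUSTOM_TMP_TABLE', 'BATCH');
COMMIT;
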
After reviewing the configuration, you may invoke the job from POM or Postman, providing your MODULE_CODE as the only input parameter.

Example Postman message body for the process call:

{
  "cycleName":"Adhoc", 
  "flowName":"Adhoc", 
  "requestType":"POM Scheduler", 
  "processName":"COLLECT_STATS_ADHOC",
  "requestParameters":"jobParams.COLLECT_STATS_JOB=SLS"
}

External Table Load Logs

The first step of importing a file into RAP applications is to map the raw file as an external table on the Oracle database. The file is then pulled from the external table into an actual staging table in the target database schema. From a batch job perspective, the external table steps are performed by the jobs having STG in the name, such as W_RTL_CMP_CLOSED_DS_STG_JOB or STG_SI_ORGANIZATION_JOB. Issues that occur during the external table setup and load process result in rejected records on the application server that are not immediately visible to the database, since no data is yet loaded into the system.

To access rejected records from external tables, a temporary link is created in the database that points to the log files. You must use a procedure in the ri_support_util package to access this data. The procedure is named get_file_load_result and it accepts two input parameters:

  1. The log file type, using values LOG or BAD. LOG files are the detailed log messages, while BAD files are the actual rejected records from the source data.

  2. The numerical sequence of the database object linked to the logs. This is obtained from the error message when a job fails in POM.

Here is an example log message you might get from a failed job in POM:

Status check shows failed job, due to [ORA-20003: Reject limit reached, query table "RADM01"."COPY$124_LOG" for error details

The table referenced in this message is actually an external table link to a log file on the server. To access the data, log into Innovation Workbench and call the support utility with this command:

create table BATCH_LOG124 as select * from table (ri_support_util.get_file_load_result('LOG', '124'));

Creating a table allows you to preserve the logs without re-querying the application server. If there are rejected records associated with the same load, then there will also be a BAD table, which can use the same command but replacing LOG with BAD. External table logs are temporary, and they will be erased frequently by automated processes. You will need to extract the relevant data from the logs the same day the job fails, or it may be deleted.

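For example, the rejected records for the same failure could be captured by swapping LOG for BAD (the table name BATCH_BAD124 is just an example):

create table BATCH_BAD124 as select * from table (ri_support_util.get_file_load_result('BAD', '124'));
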
Rejected Record Reprocessing

Some subject areas have support utilities to aid with reloading records that were rejected because of bad or missing data in nightly batch executions. When records are rejected, they will first be placed into separate tables prefixed with E$. From here, you may review the data for issues and go back to the source systems to make corrections and avoid future batch problems. The rejected record utilities support the following areas:

Subject Area Module Code Rejected Records Table
Sales SLS E$_W_RTL_SLS_TRX_IT_LC_DY_TMP
Sales Promotion SLSPR E$_W_RTL_SLSPR_TX_IT_LC_DY_TMP
Inventory INV E$_W_RTL_INV_IT_LC_DY_TMP
Price PRICE E$_W_RTL_PRICE_IT_LC_DY_TMP

As a prerequisite to running these processes, some one-time cleanup must be done. The inventory and price reload jobs use the C_HIST_LOAD_STATUS table in the same manner as historical loads. For this reason, you must erase the values from the MAX_COMPLETED_DATE and HIST_LOAD_STATUS columns of this table. All rows should show as null values for these fields. This cleanup can be done using the Control & Tactical Center UI.

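If you prefer to perform this cleanup with SQL (for example, from Innovation Workbench), a minimal sketch is shown below; it assumes your user is allowed to update the table directly.

-- Reset the history load tracking columns so the reload jobs start clean.
UPDATE C_HIST_LOAD_STATUS
   SET MAX_COMPLETED_DATE = NULL,
       HIST_LOAD_STATUS   = NULL;
COMMIT;
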
Before attempting to reload any rejections, you will also need to perform another batch run or ad hoc load to correct the associated dimensions, such as adding any missing items or locations. For example, for records that are rejected on Day 1, you must fix the source data and run a normal batch on Day 2; then you are ready to reprocess the older rejections on Day 2+ after the batch cycle. Once the dimensions are fixed, you can follow the steps below to reload the rejected records.

  1. Run the E_FS_RELOAD_JOB in the process E_FS_RELOAD_PROCESS_ADHOC. This job accepts three input values: module code, start date, and end date. The module code is required and comes from the table above. The dates are optional and specify the range of CHECK_DATE values to extract from the E$ table. If no dates are provided to the job, then it will use trunc(sysdate-1) to trunc(sysdate) as the start and end date. The values should be entered as parameters on the job in the following format (see the sample Postman call after this list):
    SLS 20230808 20230809
  2. The set of records found for the provided input parameters will be moved from the associated E$ table to another table prefixed with ERR. This table will keep the history of reprocessed records so the data is not lost. From there, it will be moved to the staging (FS) table. The specific tables used for each module are listed in the configuration table C_MODULE_REJECT_TABLE.

  3. Verify the FS table now contains the data you want to load. You also have the ability to directly update the data in the staging tables from Innovation Workbench if any further changes need to be done to make it load successfully.
  4. Run the reload process for the data to move the records from the staging table into the data warehouse tables. This will be one of the following ad hoc processes (make sure all jobs in these processes are enabled in Batch Administration before trying to run them):
    • E_SLS_RELOAD_PROCESS_ADHOC
    • E_SLSPR_RELOAD_PROCESS_ADHOC
    • E_INV_RELOAD_PROCESS_ADHOC
    • E_PRICE_RELOAD_PROCESS_ADHOC

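A sample Postman message body for step 1 is shown below. The parameter string is an assumption that follows the same jobParams convention as the other ad hoc calls in this chapter, so confirm the exact syntax for your environment:

{
  "cycleName":"Adhoc",
  "flowName":"Adhoc",
  "requestType":"POM Scheduler",
  "processName":"E_FS_RELOAD_PROCESS_ADHOC",
  "requestParameters":"jobParams.E_FS_RELOAD_JOB=SLS 20230808 20230809"
}
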
Sales and Sales Promotion are kept separate because it is possible to have transactions that were loaded to the base sales tables but were rejected from the promotional sales tables. In this case, you might only reprocess the SLSPR module, which will not load any new data into the base sales transaction tables. If you see rows rejected in both the SLS and SLSPR E$ tables, then you should reprocess both modules, as the tables being loaded are different.

For Inventory and Price, you can see the status of the reload using the C_HIST_LOAD_STATUS table, similar to how the historical load is performed. Once a range of dates is loaded successfully in this manner, you cannot go back and reprocess the same records again: the job will not allow you to insert any item/locations that already exist in the fact tables. If you have different, rejected item/location records that still need to be reprocessed, then you must first reset C_HIST_LOAD_STATUS to allow past dates to be reprocessed. The inventory reload is also used only to populate the core tables that are common to RAP (W_RTL_INV_IT_LC_G, W_RTL_INV_IT_LC_DY_F and W_RTL_INV_IT_LC_WK_A). For any other inventory aggregates that need to be reloaded, the Aggregation Utility must be used.

The intermediate ERR tables used to hold the reload history have a DELETE_FLG column to indicate that they’ve been reloaded to FS tables once and should not be used again on future runs of the jobs. If you do want to reprocess the same set of records again to FS tables, there is a separate process named E_FS_RESET_DELETE_FLG_PROCESS_ADHOC with one job (E_RESET_DELETE_FLG_JOB) that accepts a module code, start date, and end date similar to the E_FS_RELOAD_JOB parameters. This will change the delete flag back to N only for that subset of records, allowing you to start over from step 1 above.

Rejected Record Notifications

When records are rejected during a nightly batch, the jobs themselves do not generally fail. The batch will be allowed to complete but the rejected records are placed in separate tables for review. You may enable notifications that will alert your administrator users any time rejections happen in the AIF DATA batch cycles. The notifications are visible anywhere the Notifications panel is available on the left side of the screen, such as in the Retail Home and AI Foundation user interfaces. You can also customize the recipients and behavior from the Manage Notifications screen in Retail Home. Refer to "Notifications Administration" in the Retail Home Administration Guide for details. These notifications will be present under the Retail Insights application in the dropdown menu.

To enable the rejected record notifications, you must first enable the nightly jobs below in the AIF DATA nightly batch schedule. By default, all of these jobs should be enabled except E_INV_REJECT_DATA_NOTIF_JOB, which is not needed unless you are configuring the job to fail when rejections are found (using the options described further below).

Job Purpose
E_REJECT_DATA_NOTIF_JOB Checks for rejected records and triggers notifications when they are found. Will only check for rejections in data loaded in the last 24 hours, so that it does not trigger repeatedly for older data.
E_INV_REJECT_DATA_NOTIF_JOB Checks for rejected records in inventory data and optionally causes the batch to fail immediately, giving you the opportunity to correct the data and reload it before resuming the batch. The batch failure trigger is configured separately, based on the severity configuration.
E_DIMM_LKUP_PROD_CHK_JOB Compares the loaded product dimension data with the internal lookup table used for downstream integrations and logs any differences for review. Mismatches in the lookup table can cause missing data later in AIF or RPAS apps. The missing records will be in the table RABE01USER.W_RTL_REJECT_DIMENSION_TMP with ERR_PROD_LKP_TMP as the TABLE_NAME.
E_DIMM_LKUP_ORG_CHK_JOB Compares the loaded organization dimension data with the internal lookup table used for downstream integrations and logs any differences for review. Mismatches in the lookup table can cause missing data later in AIF or RPAS apps. The missing records will be in the table RABE01USER.W_RTL_REJECT_DIMENSION_TMP with ERR_ORG_LKP_TMP as the TABLE_NAME.

Once enabled, the notifications are issued based on the configuration table C_MODULE_REJECT_TABLE. This table has two columns that can be updated (a sample update follows the list below):

  • E_NOTIFICATION_ON – Set to N to disable the notifications or Y to enable them
  • E_SEVERITY_LEVEL – Set to 1 to mark the notification as Critical, which will also cause the POM job to fail. Set to 2 or 3 for lower severity messages, which will not cause the POM job to fail but will still issue notification messages.

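For example, to turn on notifications for a module at a non-critical severity, an update along the following lines could be used. This is a sketch: the MODULE_CODE filter column and the numeric type of E_SEVERITY_LEVEL are assumptions, so verify the table structure in the Control & Tactical Center.

-- Enable notifications for inventory rejections at severity 2 (no batch failure).
UPDATE C_MODULE_REJECT_TABLE
   SET E_NOTIFICATION_ON = 'Y',
       E_SEVERITY_LEVEL  = 2
 WHERE MODULE_CODE = 'INV';
COMMIT;
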
When you set a notification to severity 1 and it causes the batch job to fail, then you must also mark that notification as read to prevent it from causing the job to keep failing in future runs. This can be done on the notifications user interface, either by clicking the X icon to clear the notification from the task panel or by opening the full notifications tab and marking them as read using the UI action for it. Refer to the Retail Home User Guide chapter on "Notifications" for details on marking notifications as read.

In addition to the Notifications panel in the UI, the messages are also logged in the database if you wish to access them from Innovation Workbench or create a custom service using the data. The table used for the messages themselves is RADM01.RAF_NOTIFICATION; you must use the database username as a prefix, because the same table exists for all Oracle users (RASE01 could be used instead for AIF Apps notifications). The table used to log the job activity is RI_LOG_MSG; you may query this table where PROGRAM_UNIT = ‘RI_NOTIFICATION_UTIL’ to see when messages are triggered.

Database Hints for SQL Jobs

Oracle Support may need to alter or add to the Oracle SQL hints used by specific programs to improve performance on your dataset. All RI jobs in ODI support configurable hints using rows added to the C_ODI_PARAM table.

The general process is to insert a row into C_ODI_PARAM with PARAM_NAME set to 'IKM_OPTIMIZER_HINT_INSERT' or 'IKM_OPTIMIZER_HINT_SELECT' and with INTEGRATION_ID set to 'Step/Interface Name'.

Insert statement template:

  INSERT INTO c_odi_param (
        row_wid,
        scenario_name,
        scenario_version,
        param_name,
        param_value,
        integration_id,
        created_on_dt,
        change_on_dt
    )
        ( SELECT
            2,
            $ODI_SCENARIO_NAME,
            '001',
            'IKM_OPTIMIZER_HINT_INSERT',
            $HINT_DEFINITION,
            $STEP_NAME,
            sysdate,
            sysdate
        FROM
            dual
        )

As an example, we want to add a hint for job step SIL_Retail_SalesTransactionFact inside the scenario 'SIL_RETAIL_SALESTRANSACTIONFACT'. We would run the following statement to add the hint:

INSERT INTO c_odi_param (
    row_wid,
    scenario_name,
    scenario_version,
    param_name,
    param_value,
    integration_id,
    created_on_dt,
    change_on_dt
)
    ( SELECT
        2,
        'SIL_RETAIL_SALESTRANSACTIONFACT',   -- "Scenario_Name"
        '001',
        'IKM_OPTIMIZER_HINT_INSERT',
        '/*+ APPEND */',
        'SIL_Retail_SalesTransactionFact',   -- "Step/Interface Name"
        sysdate,
        sysdate
    FROM
        dual
    )

Once a row is added for the first time, it should not be inserted again. Instead, update the param_value with the new hint SQL.