6 Support Utilities

Some support utilities will be exposed for implementers directly in APEX, allowing you to run functions such as database cleanup without Oracle involvement. These utilities may also be used by Oracle Support when responding to Service Requests on your RAP environments. If a process documented here is intended only for Oracle Support usage and not for customers directly, it will be noted in the detailed description.

Data Cleanup Utility

Because foundation data is always loaded first through the Retail Insights data warehouse, implementers often need to erase data from the RI tables in preparation for a new load. Database functions have been exposed to APEX to allow targeted deletion of data by table name. The deletion requires a two-step process to prevent accidental execution of the command. The first step is to request an authentication token for the utility, which will return a string value. You must then pass the token into the subsequent call to the utility to execute the data cleanup step. Executing the utility with an expired or invalid token will return an error message.

The command to run in APEX to generate tokens is:

select ri_support_util.generate_token from dual;
Generate Tokens Command in APEX

Take the return value from this query and copy it into the TOKEN_VALUE variable in the PL/SQL block below. Specify the schema name and table name to be truncated, then run the PL/SQL block.

DECLARE
  TOKEN_VALUE VARCHAR2(200);
  SCHEMANAME  VARCHAR2(200);
  TABLENAME   VARCHAR2(200);
BEGIN
    TOKEN_VALUE := 'ABCDEF';                      -- token string returned by ri_support_util.generate_token
    SCHEMANAME  := 'RADM01';                      -- schema that owns the table
    TABLENAME   := 'W_RTL_SLS_TRX_IT_LC_DY_FS';   -- table to be truncated
    RI_SUPPORT_UTIL.CLEAR_SELECTED_RI_TABLES(
        TOKEN_VALUE => TOKEN_VALUE,
        SCHEMANAME  => SCHEMANAME,
        TABLENAME   => TABLENAME
    );
END;

If the process is successful, the PL/SQL block completes with no further messages or results. If the process encounters an error, the error details are displayed in the results panel in APEX. For example, if the token is not valid, the following error is shown:

Invalid Token Error Returned by PL/SQL

To quickly clean the entire database schema instead of individual tables, you may instead call the following command. This command erases all customer data except for the calendar, system configuration tables, and seed data records. Use it if you need to reset the environment in preparation for a new data load using a different dataset:

DECLARE
  v_token    VARCHAR2(200);
  SCHEMANAME VARCHAR2(200);
BEGIN
    v_token := ri_support_util.generate_token;   -- generate and consume the token in a single block
    SCHEMANAME := 'RADM01';                      -- schema to be cleared
    RI_SUPPORT_UTIL.CLEAR_SELECTED_RI_TABLES(
        TOKEN_VALUE => v_token,
        SCHEMANAME  => SCHEMANAME
    );
END;

Calendar removal is provided as a separate function because you cannot remove calendar information without also erasing all partitions (which are specific to your currently loaded calendar). The function is named CLEAR_RI_MCAL_TABLES and is called the same way as the schema clear script above, passing in the token and schema name as the inputs (see the sketch after the following list). Before you perform any calendar cleanup, review the following:

  • Partition removal is based on the current partition configuration in C_MODULE_ARTIFACT; it will not modify tables that are not enabled for partitioning. Ensure the configuration table reflects your current cleanup needs.

  • Because calendar cleanup includes partition removal, you cannot use the system for a new data load without first re-partitioning the system. Refer to the RAP Implementation Guide for the steps to reload the calendar and partition the database.
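
For reference, here is a minimal sketch of a calendar cleanup call. It follows the same token pattern as the schema clear block above and passes the token and schema name as the two inputs, as described; verify the schema name for your environment before running it:

DECLARE
  v_token    VARCHAR2(200);
  SCHEMANAME VARCHAR2(200);
BEGIN
    v_token := ri_support_util.generate_token;   -- generate and consume the token in one block
    SCHEMANAME := 'RADM01';                      -- schema whose calendar data and partitions will be removed
    RI_SUPPORT_UTIL.CLEAR_RI_MCAL_TABLES(v_token, SCHEMANAME);
END;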

When you run either of the cleanup commands above, it may take an hour or longer to complete, depending upon the amount of data in your schema and the number of partitions requiring deletion. Due to APEX limitations, your session may expire or time out while waiting for the command to complete, but the process will continue to run in the database. To monitor the activity after a session timeout, query the RI_LOG_MSG table and check for new log messages:

select * from ri_log_msg order by msg_ts desc;

If the process is still running, you will see new log entries being added for ri_support_util methods. If no new entries are being added and the most recent messages show the END messages for a process step, you can verify that all of your tables were cleared and proceed with your implementation activities.

Aggregation Utility

Retail Insights has over 100 different tables for pre-calculating data at higher levels of aggregation, mainly for the purpose of BI reporting and analytics. These tables do not need to be populated during initial historical data loads but would be needed before end-users begin accessing data in RI. To populate these tables after history loads are complete, an aggregation utility is provided that can use the base intersection of a functional area to calculate all of the higher-level tables.

Prerequisites for using the utility (all steps must be completed every time you want to use the utility):

  1. Partitioning has been run for the target functional areas, such as sales (SLS), inventory (INV), and so on. Follow the steps in the RAP Implementation Guide to perform additional partitioning as needed. If you have not used the utility since the last time you received a product patch, re-run the partitioning process to ensure all tables are partitioned.

  2. The base fact for the functional area has already been loaded with data for the entire date range you want to aggregate on. For example, the W_RTL_SLS_TRX_IT_LC_DY_F table is loaded before attempting to aggregate it to W_RTL_SLS_IT_LC_WK_A.

  3. Database statistics have been collected recently using REFRESH_RADM_JOB and ANALYZE_TEMP_TABLES_JOB (either as part of an ad hoc data load or automatically as part of the nightly batch).

The configuration table that controls the utility is C_RI_AGGREGATION_MAP, which is available from the Control & Tactical Center in the AI Foundation user interface. It contains the list of aggregate tables in RI that can be processed by the utility. For each table you want to load, set START_DT to the earliest date to process and END_DT to the final date to process. The tables are grouped by functional area, such as MODULE_CODE=SLS, so you can update all tables relating to that fact at once.
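
Conceptually, the configuration amounts to an update such as the following (normally performed through the Control & Tactical Center screens rather than direct SQL; the dates shown are placeholders, and only the START_DT, END_DT, and MODULE_CODE columns described above are assumed):

-- Set the aggregation window for all sales (SLS) aggregate tables
UPDATE c_ri_aggregation_map
   SET start_dt = TO_DATE('20210104','YYYYMMDD'),
       end_dt   = TO_DATE('20230101','YYYYMMDD')
 WHERE module_code = 'SLS';
COMMIT;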

When specifying the start/end dates, make sure to consider the calendar level of the table. Day-level tables can have any start/end dates because they use daily partitions. Week-level (WK) tables should use week starting/ending dates to ensure each full week of data is always aggregated into the table. Similarly, Gregorian month (GMH) tables should use month start/end dates. The utility also has functions to auto-extend your date ranges to encompass full weeks and months even if the configuration contains mistakes. By default, dates are always auto-extended so that full weeks/months are loaded where needed. This behavior can be changed using the parameter RI_AGG_FULL_LOAD_TYPE in C_ODI_PARAM_VW if you want only the dates you specify to be included in the aggregations (an example follows the list of values below). Valid values include:

  • F – full auto-extend of dates

  • FE – extend end dates only

  • FS – extend start dates only

  • N/A (or any other value) – Use only the dates in the mapping table
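
For example, to use only the configured dates with no auto-extension, the parameter would be set to N/A. A minimal sketch, assuming the parameter is exposed through the standard PARAM_NAME/PARAM_VALUE columns of C_ODI_PARAM_VW (this setting is also typically maintained through the Control & Tactical Center):

-- Disable auto-extension so only the configured START_DT/END_DT values are used
UPDATE c_odi_param_vw
   SET param_value = 'N/A'
 WHERE param_name = 'RI_AGG_FULL_LOAD_TYPE';
COMMIT;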

Once the necessary updates are performed, execute the ad hoc process in POM named AGGREGATION_UTILITY_ADHOC. This process is a first-time manual run to validate that the configuration works as intended and to set up the temp tables. The process contains three jobs:

  • AGG_UTILITY_PRE_JOB – Calculates a temporary lookup table for product hierarchy relationships

  • AGG_UTILITY_ORG_PRE_JOB – Calculates a temporary lookup table for organization hierarchy relationships

  • AGG_UTILITY_JOB – Performs an aggregation action for a specific table name and run type

The AGG_UTILITY_JOB requires two input parameters: the name of the table as found in C_RI_AGGREGATION_MAP and the type of aggregation to perform (FRESH or RESTART). When FRESH is specified, the job aggregates the entire date range specified in the configuration table, even if it has been run before. When RESTART is specified, it runs only from the last completed period (the job aggregates one calendar quarter at a time, so it will not re-run earlier quarters that have already completed). Also use the RESTART option if you changed END_DT to a later date and want to process only the incomplete periods resulting from the change. In most use cases you can specify RESTART and the job will perform the required actions.

Example Postman message body for the process call:

{
  "cycleName":"Adhoc", 
  "flowName":"Adhoc", 
  "requestType":"POM Scheduler", 
  "processName":"AGGREGATION_UTILITY_ADHOC",
  "requestParameters":"jobParams.AGG_UTILITY_JOB= W_RTL_SLS_CS_IT_LC_DY_A RESTART"
}

Once you have issued the command to start the process, you may monitor the detailed run status by querying the table C_BULK_LOAD_STATUS from APEX. A record will be inserted for each calendar quarter that has been processed until the entire date range is aggregated. The POM job will complete successfully after the table is loaded for all dates. You may then compare the base fact table with the target aggregate and confirm the values have been rolled up as expected.
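
For example, a simple query from APEX shows the rows inserted for each processed quarter (no columns beyond the table name are assumed here):

select * from c_bulk_load_status;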

The aggregate tables must be populated in a specific sequence based on the value in the AGGREGATION_LEVEL column in C_RI_AGGREGATION_MAP. For each MODULE_CODE, the level 1 tables must be populated first, then the level 2 tables, and so on. To automate this execution sequence, there is a separate job available in POM, named AGG_SRVC_JOB. The aggregation service job accepts a single input parameter for the MODULE_CODE value. The job will execute all tables in the associated record set in C_RI_AGGREGATION_MAP for that module, following the AGGREGATION_LEVEL sequence as needed. The two PRE jobs (AGG_UTILITY_PRE_JOB and AGG_UTILITY_ORG_PRE_JOB) are prerequisites for this job, so ensure you’ve already run those at least once before using AGG_SRVC_JOB.

Example Postman message body for the process call:

{
  "cycleName":"Adhoc", 
  "flowName":"Adhoc", 
  "requestType":"POM Scheduler", 
  "processName":"AGGREGATION_SRVC_ADHOC",
  "requestParameters":"jobParams.AGG_SRVC_JOB=INV"
}

If any process run by the AGG_SRVC_JOB fails or takes too long to run, you may also check the following tables for more information:

  • C_RI_SRVC_REQ_QUEUE – Contains the status of individual service calls invoked by the aggregation process. A status of 6 means success while 7 means failed.

  • RI_LOG_MSG – If a process does fail, the detailed trace logs are written to this table to help you identify the problem. Look for records where PROGRAM_UNIT = RI_AGGREGATION_UTIL (see the sample query after this list).
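
For example, the following query returns the most recent trace messages written by the aggregation utility:

select * from ri_log_msg where program_unit = 'RI_AGGREGATION_UTIL' order by msg_ts desc;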

Database Statistics Utility

A critical part of working with large datasets in Oracle Database is the collection of statistics on your database tables. The POM processes used to load data generally include a job to collect statistics on the entire database schema to ensure stats are always up-to-date. The drawback of this program is that it can take a significant amount of time to run, even if you only need to refresh statistics on a single table. To help implementers collect statistics on specific tables, a utility is provided using the POM standalone program COLLECT_STATS_JOB.

The COLLECT_STATS_JOB accepts a single input parameter for the database module code you wish to gather stats on. The module codes are defined in the configuration table C_MODULE_DBSTATS, which is available from the Control & Tactical Center in the AI Foundation UI. The configuration table comes pre-defined with some core modules that often need stats collected, using the codes SLS, INV, and PRICE. You can insert new rows into the table to define your own custom values for MODULE_CODE. You may specify any value you wish for the MODULE_CODE, along with one or more tables you plan to collect stats on, and then pass the MODULE_CODE value into the job parameters to collect stats on your chosen list of tables. TABLE_NAME and MODULE_CODE are the only required values for tables in the RADM01 schema. If you are collecting stats on a temp table (in the RABE01USER schema), you must also set OWNER_TYPE to BATCH.
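
For illustration, a custom module definition might look like the following inserts (the module code MY_STATS and the temp table name are examples only, the table is normally maintained through the Control & Tactical Center, and only the MODULE_CODE, TABLE_NAME, and OWNER_TYPE columns described above are assumed):

-- Custom stats group for a fact table in the RADM01 schema
INSERT INTO c_module_dbstats (module_code, table_name)
VALUES ('MY_STATS', 'W_RTL_SLS_TRX_IT_LC_DY_F');

-- Temp tables owned by RABE01USER additionally require OWNER_TYPE = 'BATCH'
-- (the table name below is a hypothetical example)
INSERT INTO c_module_dbstats (module_code, table_name, owner_type)
VALUES ('MY_STATS', 'W_RTL_SLS_TRX_IT_LC_DY_TMP', 'BATCH');
COMMIT;

The value MY_STATS would then be passed as the COLLECT_STATS_JOB parameter, in the same way SLS is used in the example below.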

After reviewing the configuration, you may invoke the job from POM or Postman, providing your MODULE_CODE as the only input parameter.

Example Postman message body for the process call:

{
  "cycleName":"Adhoc", 
  "flowName":"Adhoc", 
  "requestType":"POM Scheduler", 
  "processName":"COLLECT_STATS_ADHOC",
  "requestParameters":"jobParams.COLLECT_STATS_JOB=SLS"
}

External Table Load Logs

The first step of importing a file into RAP applications is to map the raw file as an external table on the Oracle database. The file is then pulled from the external table into an actual staging table in the target database schema. From a batch job perspective, the external table steps are performed by the jobs having STG in the name, such as W_RTL_CMP_CLOSED_DS_STG_JOB or STG_SI_ORGANIZATION_JOB. Issues that occur during the external table setup and load process result in rejected records on the application server that are not immediately visible to the database, since no data is yet loaded into the system.

To access rejected records from external tables, a temporary link is created in the database that points to the log files. You must use a procedure in the ri_support_util package to access this data. The procedure is named get_file_load_result and it accepts two input parameters:

  1. The log file type, using values LOG or BAD. LOG files are the detailed log messages, while BAD files are the actual rejected records from the source data.

  2. The numerical sequence of the database object linked to the logs. This is obtained from the error message when a job fails in POM.

Here is an example log message you might get from a failed job in POM:

Status check shows failed job, due to [ORA-20003: Reject limit reached, query table "RADM01"."COPY$124_LOG" for error details

The table referenced in this message is actually an external table link to a log file on the server. To access the data, log into Innovation Workbench and call the support utility with this command:

create table BATCH_LOG124 as select * from table (ri_support_util.get_file_load_result('LOG', '124'));

Creating a table allows you to preserve the logs without re-querying the application server. If there are rejected records associated with the same load, there will also be a BAD table, which can be queried with the same command, replacing LOG with BAD (see the example below). External table logs are temporary and are erased frequently by automated processes, so extract the relevant data from the logs the same day the job fails, or it may be deleted.
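
For example, the rejected records for the same load could be preserved with a similar statement (the target table name is arbitrary):

create table BATCH_BAD124 as select * from table (ri_support_util.get_file_load_result('BAD', '124'));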

Database Hints for SQL Jobs

Oracle Support may need to alter or add to the Oracle SQL hints used by specific programs to improve performance on your dataset. All RI jobs in ODI support configurable hints using rows added to the C_ODI_PARAM table.

The general process is to insert a row into C_ODI_PARAM with PARAM_NAME set to 'IKM_OPTIMIZER_HINT_INSERT' or 'IKM_OPTIMIZER_HINT_SELECT' and with INTEGRATION_ID set to 'Step/Interface Name'.

Insert statement template:

  INSERT INTO c_odi_param (
        row_wid,
        scenario_name,
        scenario_version,
        param_name,
        param_value,
        integration_id,
        created_on_dt,
        change_on_dt
    )
        ( SELECT
            2,
            $ODI_SCENARIO_NAME,
            '001',
            'IKM_OPTIMIZER_HINT_INSERT',
            $HINT_DEFINITION,
            $STEP_NAME,
            sysdate,
            sysdate
        FROM
            dual
        )

As an example, we want to add a hint for job step SIL_Retail_SalesTransactionFact inside the scenario 'SIL_RETAIL_SALESTRANSACTIONFACT'. We would run the following statement to add the hint:

INSERT INTO c_odi_param (
    row_wid,
    scenario_name,
    scenario_version,
    param_name,
    param_value,
    integration_id,
    created_on_dt,
    change_on_dt
)
    ( SELECT
        2,
        'SIL_RETAIL_SALESTRANSACTIONFACT',    -- Scenario_Name
        '001',
        'IKM_OPTIMIZER_HINT_INSERT',
        '/*+ APPEND */',
        'SIL_Retail_SalesTransactionFact',    -- Step/Interface Name
        sysdate,
        sysdate
    FROM
        dual
    )

Once a row has been added for a given step, do not insert it again; instead, update the PARAM_VALUE with the new hint SQL.
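
For example, to change the hint applied to the step shown above (the hint text itself is only an illustration):

UPDATE c_odi_param
   SET param_value = '/*+ APPEND PARALLEL(8) */'
 WHERE param_name     = 'IKM_OPTIMIZER_HINT_INSERT'
   AND scenario_name  = 'SIL_RETAIL_SALESTRANSACTIONFACT'
   AND integration_id = 'SIL_Retail_SalesTransactionFact';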