Oracle® Retail Insights Cloud Service Suite Implementation Guide Release 17.0.004 F10061-01
This chapter describes how Retail Insights implements customized aggregation by using the Retail Insights Aggregation Framework.
The Retail Insights Aggregation Framework is a PL/SQL-based tool designed to simplify the Retail Insights aggregation process by leveraging the existing Retail Insights aggregation programs that are mandatory to the Retail Insights application. It provides a framework for end users to populate customized aggregation tables in order to gain better performance on front-end reporting.
Based on the setting in the ra.env file, the framework either generates a SQL DML file, executes the SQL DML statement, or does both to populate customized aggregation tables. The client needs to populate the configuration table with enough mapping information for the framework to generate the DML statement. This customized process using the Aggregation Framework can be included in the client's batch scheduler by calling the wrapper script aggplp.ksh. Besides providing the regular Retail Insights ETL logging and program status control capability, the framework also generates a SQL DML file, a message file, and an error file under the Retail Insights database utlfile folder to help the end user with verification.
The framework can also aggregate data across product attributes in addition to the product hierarchy, and likewise either generate the SQL DML file, execute the SQL DML statement, or do both based on the setting in the ra.env file. This attribute aggregation process can be included in the client's batch scheduler by calling the wrapper script attraggplp.ksh.
For security reasons, the database connection for the framework is managed by Retail Insights ODI in the same way as for the regular Retail Insights batch programs.
Perform the following steps for the Aggregation Framework installation and initial setup:
Note: The <STAGING_DIR> mentioned below is the Retail Insights installer staging directory. Please refer to the Oracle Retail Insights Administration Guide for additional information.
Perform the following procedure to import Aggregation Framework ODI components:
Make sure $ODI_HOME/bin/odiparams.sh is configured correctly.
Copy odi_import.ksh to the following folders:
Copy <STAGING_DIR>/ora/installer/ora142/mmhome/full/src/odi_import.ksh to <STAGING_DIR>/ora/installer/Aggregation_Framework/odi folder.
Copy <STAGING_DIR>/ora/installer/ora142/mmhome/full/src/odi_import.ksh to <STAGING_DIR>/ora/installer/Aggregation_Framework/odi/odi_parent folder.
Execute <STAGING_DIR>/ora/installer/Aggregation_Framework/odi/odi_parent/odi_import.ksh, which imports the following Aggregation Framework ODI component:
FOLD_Aggregation_Framework.xml
Note: Before executing odi_import.ksh, please read the comments inside the script on how to use it, and set up the ODI_HOME and LOGDIR environment variables correctly.
Execute <STAGING_DIR>/ora/installer/Aggregation_Framework/odi/odi_import.ksh, which imports the following Aggregation Framework ODI components (a sketch of the overall import sequence follows this list):
VAR_RA_UTLPATH.xml
VAR_RA_AGG_EXEC_MODE.xml
FOLD_PLP_RetailAggregationDaily.xml
PACK_PLP_RetailAggregationDaily.xml
PACK_PLP_RetailAggregationReclass.xml
PACK_PLP_RetailAttrAggregationDaily.xml
TRT_RetailAggregationDaily.xml
TRT_RetailAggregationReclass.xml
TRT_RetailAttrAggregationDaily.xml
TRT_RetailAggregationDaily_Debug.xml
TRT_RetailAggregationReclass_Debug.xml
TRT_RetailAttrAggregationDaily_Debug.xml
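The following is a minimal sketch of the overall import sequence, assuming odi_import.ksh takes no additional arguments; the authoritative usage and any required arguments are described in the comments inside the script, and the directory values are placeholders.

    # Hypothetical sketch; see the comments inside odi_import.ksh for
    # the authoritative usage.
    export ODI_HOME=<ODI installation directory>
    export LOGDIR=<log directory>

    # Import the Aggregation Framework parent folder component first.
    cd <STAGING_DIR>/ora/installer/Aggregation_Framework/odi/odi_parent
    ./odi_import.ksh

    # Then import the remaining Aggregation Framework components.
    cd <STAGING_DIR>/ora/installer/Aggregation_Framework/odi
    ./odi_import.ksh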
Perform the following step to import the Aggregation Framework shell scripts:
Copy aggplp.ksh, aggrcplp.ksh, and attraggplp.ksh from <STAGING_DIR>/ora/installer/Aggregation_Framework/ to $MMHOME/src directory.
Perform the following procedure for the initial setup of the Aggregation Framework:
Note: The sql files mentioned below can be found under the <STAGING_DIR>/ora/installer/Aggregation_Framework folder.
Under the Retail Insights batch user schema, execute the provided scripts W_RTL_AGGREGATION_DAILY_TMP.sql and Alter_W_RTL_AGGREGATION_DAILY_TMP.sql, in that order, to create the configuration table W_RTL_AGGREGATION_DAILY_TMP.
Under the Retail Insights batch user schema, execute the provided script W_RTL_AGGREGATION_MSG_TMP.sql to create the Staging Log table W_RTL_AGGREGATION_MSG_TMP.
Under the Retail Insights batch user schema, execute the ra_aggregation_daily.sql, ra_aggregation_rec.sql, and RA_ATTR_AGGREGATION_DAILY_PROC.sql scripts to create the PL/SQL stored procedures RA_AGGREGATION_DAILY, RA_AGGREGATION_REC, and RA_ATTR_AGGREGATION_DAILY_PROC.
Create customized aggregation tables under the Retail Insights data mart schema.
Configure the database UTLFILE folder and the execution mode in the ra.env file. The UTLFILE folder location is set up on the application server.
Populate the W_RTL_AGGREGATION_DAILY_TMP configuration table, with one row for each customized aggregation table. The configuration columns are described later in this chapter.
Populate the Retail Insights program control table C_ODI_PARAM to include the program/customized target tables. The SCENARIO_NAME column should be populated with PLP_RETAILAGGREGATIONDAILY for the regular aggregation process, PLP_RETAILAGGREGATIONRECLASS for the reclassification-related process, or PLP_RETAILATTRAGGREGATIONDAILY for the regular attribute aggregation process. For each program/target table, PARAM_NAME must be populated with the value 'TARGET_TABLE_NAME' and PARAM_VALUE with the actual customized table name. The current version does not get ETL_PROC_WID and EXECUTION_ID from this table; end users can use the SP_MAPPING column to populate these columns if they are required. A sketch of such an entry follows this procedure.
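The following is a minimal sketch of a C_ODI_PARAM entry for the regular daily aggregation; the target table name is hypothetical, and any additional C_ODI_PARAM columns required in your environment must be populated as well.

    -- Hypothetical example: register a customized target table with the
    -- regular daily aggregation scenario. Only the columns named in this
    -- guide are shown; C_ODI_PARAM may require other columns as well.
    INSERT INTO C_ODI_PARAM (SCENARIO_NAME, PARAM_NAME, PARAM_VALUE)
    VALUES ('PLP_RETAILAGGREGATIONDAILY',
            'TARGET_TABLE_NAME',
            'W_RTL_SLS_SC_LC_DY_CUST_A');  -- illustrative table name
    COMMIT;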
Perform the following steps for the Aggregation Framework verification:
For the daily batch process, execute the script aggplp.ksh with the name of the customized aggregation table and the execution mode as parameters. For the reclassification-only batch process, execute the script aggrcplp.ksh with the name of the customized aggregation table and the execution mode as parameters.
For the daily batch process, execute the script attraggplp.ksh with the name of the customized attribute aggregation table and the execution mode as parameters.
If the execution mode is not specified on the command line, or the value is not one of "F", "B", or "E", then the value defined in the ra.env file is used.
Verify the result by using the generated SQL file containing the DML statement. If it is not correct, return to step 6 of the initial setup to reconfigure the configuration table.
Add step 7 to the batch scheduler, so that populating the aggregation table becomes part of the daily ETL job.
The Aggregation Framework must be properly set up and configured before the process can be included in the daily ETL process.
To use the framework, end users must create the aggregation table according to the following rules (a sketch of such a table follows the rules):
The customized aggregation table has to be created under the Retail Insights data mart schema.
The user should use the following table naming standard:
For day level aggregation table, the table name must contain _DY_.
For day and location level aggregation table, the table name must contain _LC_DY_.
Note: It is good practice to keep column names the same as the column names in the source table. This reduces the column mapping work under the SP_MAPPING column.
For attribute aggregation, the attribute column names in the aggregate table should be the same as those in the attribute tables. SP_MAPPING on attribute columns is not supported. 'ITEMDIFF' is an exception: for the 'ITEMDIFF' attribute, the attribute column name in the aggregate table should be the differentiator name, that is, the attribute value of FLEX_ATTRIB_10_CHAR from the W_RTL_ITEM_GRP1_D attribute table.
The attribute aggregation framework only supports attributes from W_RTL_ITEM_GRP1_D, W_PRODUCT_ATTR_D, and W_PRODUCT_D; therefore, if the aggregation is at the attribute level, the attribute columns used when creating the aggregate tables should come from one of these tables.
The names of aggregatable columns (using sum or average), and only the names of aggregatable columns, should end with _AMT, _LCL, _GLOBAL1, _GLOBAL2, _GLOBAL3, _QTY, or _COUNT. Otherwise, a column mapping should be provided under the SP_MAPPING column in the configuration table.
As a rule of the framework, the transaction date in the aggregation source table has to be named DT_WID (for a source table at day level), WK_WID (for a source table at week level), or DAY_DT (for a source TMP table).
For all other Retail Insights standard columns, please refer to the Oracle Retail Insights Data Model Guide.
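The following is a minimal sketch of a customized subclass/location/day aggregation table that follows these rules; the table name, the sequence name, and all column names other than the standard suffixes, ROW_WID, INTEGRATION_ID, W_INSERT_DT, and W_UPDATE_DT are illustrative assumptions, not Retail Insights standard names.

    -- Hypothetical customized aggregation table at subclass/location/day
    -- level; the name contains _LC_DY_ per the naming standard.
    CREATE TABLE W_RTL_SLS_SC_LC_DY_CUST_A (
      ROW_WID        NUMBER(10) NOT NULL, -- filled from the sequence named in SEQ_NAME
      SUBCLASS_WID   NUMBER(10),          -- product key at subclass level (illustrative name)
      LOC_WID        NUMBER(10),          -- location key (illustrative name)
      DT_WID         NUMBER(10),          -- day key, kept identical to the source column
      SLS_QTY        NUMBER(18,4),        -- aggregatable: ends with _QTY
      SLS_AMT_LCL    NUMBER(20,4),        -- aggregatable: ends with _LCL
      SLS_AMT        NUMBER(20,4),        -- aggregatable: ends with _AMT
      INTEGRATION_ID VARCHAR2(80),        -- auto-mapped concatenation of the PK columns
      W_INSERT_DT    DATE,                -- auto-mapped to database system time
      W_UPDATE_DT    DATE                 -- auto-mapped to database system time
    );

    -- Sequence to be referenced by the SEQ_NAME configuration column.
    CREATE SEQUENCE W_RTL_SLS_SC_LC_DY_CUST_A_SEQ;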
W_RTL_AGGREGATION_DAILY_TMP is a configuration table under the Retail Insights batch user schema. It holds the aggregation information used by the framework to generate the DML statement. In order to use the Retail Insights Aggregation Framework correctly, the client has to provide the following information (a sample configuration row follows the column descriptions):
SRC_TABLE
This is the source table used by the aggregation process.
For a regular daily batch, the source table has to be a temp table owned by the Retail Insights batch user schema. In most cases, it is generated by a Retail Insights ETL batch program that is mandatory to execute. For the attribute aggregation process, the source temp table should be at the _IT_LC_DY_TMP level.
For a reclassification batch, which is only executed when there is a reclassification, the source table is a fact table at the item/location/day level for a transaction fact, or at the item/day or item/week level for a positional fact. These source fact tables should be owned by the Retail Insights data mart schema and populated by the mandatory Retail Insights batch programs.
TGT_TABLE
This is the customized aggregation table. It is under the Retail Insights data mart schema.
AGGREGATION_TYPE
This column specifies the aggregation type that the framework supports. Currently the framework supports aggregation on the product hierarchy, the time hierarchy (to week), and product season. The valid values are:
SC: from item to subclass based on as-was
CL: from item to class based on as-was
DP: from item to department based on as-was
SC_ASIS: from item to subclass based on as-is
CL_ASIS: from item to class based on as-is
DP_ASIS: from item to department based on as-is
SN: from item to item season based on the transaction date
IT: to aggregate items at item/attribute levels, only for attribute aggregation process.
SC_REC_ASIS: fact re-calculation on the reclassification day from item to subclass based on as-is. When this type is selected, the program re-aggregates fact data from the item level to the subclass level based on as-is. A temporary table is generated under the Retail Insights batch user schema to store the re-calculation result, which can be used for further aggregation to the class or department level. The temporary table name can be found in the DML statement file generated by the program under the UTLFILE folder.
CL_REC_ASIS: fact re-calculation on the reclassification day from item to class based on as-is. The batch program that uses the SC_REC_ASIS aggregation type is a prerequisite for the batch program that uses CL_REC_ASIS, and both should have the same source table name under the SRC_TABLE column.
DP_REC_ASIS: fact re-calculation on the reclassification day from item to department based on as-is. The batch program that uses the SC_REC_ASIS aggregation type is a prerequisite for the batch program that uses DP_REC_ASIS, and both should have the same source table name under the SRC_TABLE column.
Note: The _ASIS and _REC_ASIS aggregation types are not supported by the attribute aggregation process. For the reclassification type of aggregation, due to performance concerns, the subclass/location/day level is mandatory before clients can aggregate to other levels. All other levels use the result from that process to continue higher-level aggregation.
ATTRIBUTE_KEY
This is the attribute key from the attribute table on which the attribute aggregation works. These keys are used to aggregate data at the attribute level. If there are multiple attributes to be aggregated in a single table, all attributes should be declared in the single column with a comma (",") delimiter. The attribute key columns should be prefixed with their respective table names.
Example: W_RTL_ITEM_GRP1_D.BRAND_WID, W_RTL_ITEM_GRP1_D.COLOR_WID
For regular aggregation or reclass aggregation, the attribute key is left blank.
SEQ_NAME
This is the name of the sequence that will be used as ROW_WID on the target table.
AVG_COLUMNS
This is a list of columns that use average logic in the aggregation. The column names should be separated by commas.
PK_COLUMNS
This lists the primary key columns for the customized aggregation table. The column names should be separated by commas.
SP_MAPPING
The framework provides auto column mapping for the following cases:
The column name in the target table is the same as a source column in the source table.
The amount columns _AMT, _AMT_GLOBAL1, _AMT_GLOBAL2, and _AMT_GLOBAL3 are mapped to the source columns as _AMT_LCL/LOC_EXCHANGE_RATE if the configuration column CURRENCY_EXPAND_IND is set to 'Y'.
The columns W_INSERT_DT and W_UPDATE_DT on the target table are mapped to the system time from the database.
The sequence name defined in the configuration table is used for ROW_WID when a ROW_WID column exists in the target table.
If the target column cannot be found in the source table by matching column names, and the target column name also does not exist in the customized column mapping under SP_MAPPING, then the value 0 is used for the mapping and a warning message is written to the message file.
The INTEGRATION_ID column is mapped to the concatenation of the primary key provided in the configuration table. The order of the concatenation is the same as the order of the primary key columns provided in the PK_COLUMNS column of the configuration table. This auto mapping may use the surrogate key instead of the ID from the source system if the surrogate key is used as part of the primary key.
Besides the capability of auto mapping, this framework also provides customized column mapping through the SP_MAPPING column in the configuration table.
The syntax for the customized mapping is column1=value1, where column1 is a column name on the target table and value1 can be either a constant value or a column name on the source table.
If there are multiple customized mappings, '&' should be used between each mapping. For example: column1=value1 & column2=value2.
The SQL aggregation function (sum, average, min) should be considered if the target column in the customized mapping is not part of the primary key specified in the column PK_COLUMNS.
The customized mapping using the SP_MAPPING column only supports regular updates. Once a column mapping is specified, the update on this column will always use TARGET.COLUMN1=SOURCE.COLUMN1 regardless of the configuration value specified in the POSITIONAL_IND column.
SP_MAPPING cannot be applied to the attribute key columns; the attribute key must use the same column name as in the attribute table. An illustrative mapping value follows.
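As an illustration of the syntax above, a hypothetical SP_MAPPING value that maps a target column to a differently named source column, and a second target column to a constant, might look like the following; both column names are illustrative assumptions.

    PROMO_RETAIL_AMT=PROMO_AMT_LCL & LOAD_SOURCE=1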
CURRENCY_EXPAND_IND
This column indicates whether the target table has an amount column in a primary currency or global currency that will be derived from the source table by a calculation. The valid values are 'Y' and 'N'.
PARA_DEGREE
This column sets the parallel degree for the DML process. The default value is 0.
POSITIONAL_IND
This column indicates whether the amount, quantity, or count columns on this table are stored in positional format or in transactional format. The valid values are 'Y' and 'N'. If the value is 'N', the target column is updated by TARGET.COLUMN1=NVL(SOURCE.COLUMN1,0)+NVL(TARGET.COLUMN1,0). If the value is 'Y', the target column is updated by TARGET.COLUMN1=SOURCE.COLUMN1. If there is any exception, the end user can use a customized mapping in the SP_MAPPING column for those exceptional columns.
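Putting the columns above together, the following is a minimal sketch of a configuration row for the hypothetical subclass/location/day table created earlier; the source temp table name and all customized names are illustrative assumptions.

    -- Hypothetical configuration row; only the columns documented above
    -- are shown, and all customized names are illustrative.
    INSERT INTO W_RTL_AGGREGATION_DAILY_TMP
      (SRC_TABLE, TGT_TABLE, AGGREGATION_TYPE, ATTRIBUTE_KEY, SEQ_NAME,
       AVG_COLUMNS, PK_COLUMNS, SP_MAPPING, CURRENCY_EXPAND_IND,
       PARA_DEGREE, POSITIONAL_IND)
    VALUES
      ('W_RTL_SLS_IT_LC_DY_TMP',          -- source temp table (illustrative)
       'W_RTL_SLS_SC_LC_DY_CUST_A',       -- customized target table
       'SC',                              -- item to subclass, as-was
       NULL,                              -- blank for regular aggregation
       'W_RTL_SLS_SC_LC_DY_CUST_A_SEQ',   -- sequence used for ROW_WID
       NULL,                              -- no average columns
       'SUBCLASS_WID,LOC_WID,DT_WID',     -- primary key columns
       NULL,                              -- rely on auto column mapping
       'Y',                               -- derive primary/global currency amounts
       0,                                 -- default parallel degree
       'N');                              -- transactional (additive) fact
    COMMIT;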
Once the configuration is completed and tested, the customized aggregation table can be populated as a daily ETL batch process. The syntax to kick off the process is as follows, with an example after the list:
For the daily batch process of the regular aggregation process, call the Unix script aggplp.ksh TARGET_TABLE_NAME, where TARGET_TABLE_NAME is the name of the customized aggregation table and should already be configured in the W_RTL_AGGREGATION_DAILY_TMP table.
For the reclassification-only process, call the Unix script aggrcplp.ksh TARGET_TABLE_NAME, where TARGET_TABLE_NAME is the name of the customized aggregation table and should already be configured in the W_RTL_AGGREGATION_DAILY_TMP table. For a transactional fact, execution at the subclass/location/day level is the prerequisite for all other levels. For a positional fact, execution at the subclass/day and subclass/week levels is the prerequisite for the corporate/day and corporate/week levels.
For the daily batch process of the attribute aggregation process, call the Unix script attraggplp.ksh TARGET_TABLE_NAME, where TARGET_TABLE_NAME is the name of the customized aggregation table and should already be configured in the W_RTL_AGGREGATION_DAILY_TMP table.
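For example, with the hypothetical table configured above, the daily process could be started as follows; the second parameter is the optional execution mode ("F", "B", or "E"), and per the logging section below, modes "B" and "F" also produce the generated SQL file:

    aggplp.ksh W_RTL_SLS_SC_LC_DY_CUST_A B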
Please refer to the Oracle Retail Insights Data Model Guide for Retail Insights table naming standards.
Calling aggplp.ksh, aggrcplp.ksh, or attraggplp.ksh also causes the framework to insert a record into the Retail Insights batch status control table C_LOAD_DATES, with PLP_RETAILAGGREGATIONDAILY, PLP_RETAILAGGREGATIONRECLASS, or PLP_RETAILATTRAGGREGATIONDAILY as PACKAGE_NAME and the name of the customized aggregation table as TARGET_TABLE_NAME. The client has to either execute etlrefreshgenplp.ksh to remove this record from C_LOAD_DATES or manually delete this status record from C_LOAD_DATES before the same ETL batch process can be executed again against the same aggregation table. This batch control process is consistent with the process used by the mandatory Retail Insights batch programs.
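If the status record is removed manually, a statement along the following lines can be used (shown for the daily scenario and the hypothetical table name used throughout this chapter; verify the column names against your C_LOAD_DATES definition):

    -- Remove the status record so the same aggregation can be re-run.
    DELETE FROM C_LOAD_DATES
     WHERE PACKAGE_NAME = 'PLP_RETAILAGGREGATIONDAILY'
       AND TARGET_TABLE_NAME = 'W_RTL_SLS_SC_LC_DY_CUST_A';
    COMMIT;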
The Retail Insights Aggregation Framework writes batch logging information into the same Retail Insights log file used by the regular Retail Insights batch programs. The end user can also view detailed logging information through the ODI Operator. This is consistent with the logging from the regular Retail Insights batch programs.
Besides the standard Retail Insights logging, the framework also provides a message file and a SQL file under the Oracle utlfile folder. The message file uses [TABLE_NAME] or [TABLE_NAME]_rc as the file name and "msg" as the file name extension. It provides information when the target column cannot be found in the source table and when the customized column mapping cannot be found in the configuration table. The SQL file is available when the execution mode is set to 'B' or 'F' in the ra.env file. It uses [TABLE_NAME] or [TABLE_NAME]_rc as the file name, has "sql" as the file name extension, and contains the DML statement that will be used to populate the customized aggregation table. All these files can help the end user verify the ETL process result during framework setup or during the regular batch process. In case of any failures, the error information is passed to the ODI Operator and the regular batch logging.