7 Performance

Retail Insights is a high-performance data warehouse, capable of moving and storing massive amounts of data and providing efficient access to that data through delivered and custom-built reports. For any BI solution, including Retail Insights, smart decisions about how to implement and run your data warehouse ensure that you get the most out of it. This chapter contains information that will help you get the best performance out of Retail Insights, identifies common factors that can weaken performance, and describes best practices that keep Retail Insights running optimally.

All implementations are unique, and factors that are beneficial for one implementation may not have the same effect for another. It is good practice to test several settings and approaches for the factors and recommendations listed below and to use the ones that work best for your environment. The factors listed in this chapter are the key factors that impact performance, but no absolute values or settings can be provided for implementation purposes because each environment is unique.

Oracle Retail Insights includes ODI for extract, transform, and load (ETL) and Oracle Analytics (OAS) for analytic reporting. The recommendations in this chapter cover both the back-end (ETL) and front-end (Oracle Analytics) components of Retail Insights. In the cloud solution, these features are handled automatically by Oracle.

Key Factors in Performance

Based on the complexity of a report, Oracle Analytics sometimes generates complex SQL, causing the Oracle database to pick a less than optimal execution plan. To avoid this scenario, it is recommended that the "SQL Plan Baseline" functionality of Oracle Database 12c be enabled (it is disabled by default). For more details, refer to the Oracle 12c Performance Tuning Guide.
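As a hedged illustration (confirm the appropriate settings for your database version with your DBA and the Performance Tuning Guide), automatic plan capture can be switched on at the system level as follows:

    -- Capture baselines for repeatable statements; capture is disabled by default.
    ALTER SYSTEM SET optimizer_capture_sql_plan_baselines = TRUE;
    -- Use accepted baselines when optimizing; this is typically enabled by default.
    ALTER SYSTEM SET optimizer_use_sql_plan_baselines = TRUE;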

Purging and Archiving Strategy

With increased use of the Retail Insights application, data volumes grow and may result in slower performance. The performance impact can affect the Retail Insights batch that loads data into the data warehouse tables, Retail Insights reports, and storage. Adopting a purging and archiving strategy helps reduce data volumes, resulting in better performance. Consider the following recommendations while implementing these strategies in a data warehouse:

  • Design your archiving and purging strategy as early as possible in the Retail Insights implementation. This helps in designing the most optimal table partitioning for large tables.

  • Ensure that the data is deleted in the most efficient manner. SQL delete statements may not be the most efficient way of removing unnecessary data from Retail Insights tables; dropping aged partitions, for example, is usually far cheaper, as sketched after this list. Consult with your database administrator to discuss purging and archiving techniques.

  • Purging and archiving of tables must be carefully designed, as it requires a good understanding of the analytic reports business users need and of any regulatory requirements that oblige the company to retain certain data for a set duration. For example, aggregated data may be kept longer than base-level fact data when, for data older than two years, users are interested in summary-level reports rather than detailed (base-level) reports.

  • Automation of the archiving and purging processes ensures that a consistent approach is being followed in maintaining tables with large data volumes and provides consistent report performance to the users.

  • While designing purging programs, make sure that dimensional data is not deleted while fact data that references it is still available or will be loaded.

  • An important consideration during purging is to make sure that Retail Insights seed data (where applicable) is not deleted accidentally.
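As a minimal sketch of an efficient purge, assuming the fact table is range-partitioned by time so that aged data is isolated in its own partitions (the table and partition names below are placeholders):

    -- Placeholder table and partition names. Dropping an aged partition removes
    -- its rows far more cheaply than a row-by-row DELETE, but the data cannot be
    -- recovered afterwards, so archive the partition first if it may be needed.
    ALTER TABLE sales_fact_history DROP PARTITION p_fy2018_w01 UPDATE GLOBAL INDEXES;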

Flexible Aggregates

Retail Insights, by default, provides several aggregate tables. For the complete list, see "Aggregates List". These pre-built aggregate tables are selected based on the following:

  • General usage patterns of the data

  • Reporting needs (As-Is or As-Was or both)

  • General aggregation ratio

The ratio between the data in the base fact table and the data in the potential aggregate table should be considered when deciding whether the fact table should be aggregated. A ratio of 1:5 through 1:10 is a good starting point for considering aggregation, a ratio of 1:10 through 1:20 is a good candidate for aggregation, and a ratio greater than 1:20 means the fact must be aggregated.
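As a rough sketch, the ratio for a candidate aggregate can be estimated directly against the base fact table. The table name and grouping keys below are placeholders; replace them with the actual base fact table and the dimension keys of the aggregation level under consideration.

    -- Placeholder names: base_fact and the three keys stand in for the real
    -- base fact table and the keys of the candidate aggregation level.
    SELECT base_rows,
           agg_rows,
           ROUND(base_rows / agg_rows, 1) AS base_rows_per_agg_row
    FROM   (SELECT COUNT(*) AS base_rows,
                   COUNT(DISTINCT subclass_key || '|' || location_key || '|' || week_key) AS agg_rows
            FROM   base_fact);

A result of 12, for example, corresponds roughly to a 1:12 ratio, which falls in the range that is good to aggregate.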

Before or during implementation, the retailer is expected to identify these scenarios and select the appropriate aggregate tables for the best performance and usability. All pre-packaged aggregate tables have ODI and Oracle Analytics mappings. It is highly recommended not to run Retail Insights with all the available aggregates enabled.

Using all of these aggregations improves report performance, but that improvement should be weighed against reduced ETL batch performance and increased storage requirements.

These aggregates are provided to give the customer the flexibility to pick appropriate levels without having to invest in customizing the product.

Below are the different groupings of aggregations. See "Aggregates List" for additional details.

  • As-Was aggregates

  • As-Is aggregates

  • As-Was Corporate aggregates

  • As-Is Corporate aggregates

  • Season aggregates

Even though there are different flavors of aggregation based on As-Is and As-Was, a few aggregate tables are used for both As-Is and As-Was. That is because the As-Is and As-Was distinction applies only to the Product and Organization hierarchies. If the aggregation for a fact table is along the Time dimension or any dimension other than Product and Organization, then that aggregate table can be used for both As-Is and As-Was. For example, because the W_RTL_SLS_IT_LC_WK_A table remains at Item and Location and aggregates only to Week, it can be used for As-Was, As-Is, or both.

When the aggregation happens on any level of the Product or Organization hierarchy, there are separate aggregates for As-Is and As-Was. For example, the W_RTL_SLS_SC_LC_DY_A and W_RTL_SLS_SC_LC_DY_CUR_A aggregates are at the subclass level of the Product dimension: W_RTL_SLS_SC_LC_DY_A is the As-Was aggregate and W_RTL_SLS_SC_LC_DY_CUR_A is the As-Is aggregate. All As-Is aggregates are suffixed with 'CUR_A', which means 'Current.'
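For example, assuming the aggregate tables reside in the schema you are connected to, the As-Is sales aggregates can be listed from the data dictionary by this suffix:

    -- Lists the As-Is ('CUR_A') sales aggregate tables owned by the current schema.
    SELECT table_name
    FROM   user_tables
    WHERE  table_name LIKE 'W_RTL_SLS%CUR_A'
    ORDER  BY table_name;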

Note:

With the exception of a Corporate aggregate in Sales, Retail Insights out of the box does not have any aggregates across the Organization Dimension.

When the aggregation is only on a level of the Product or Organization dimension, the aggregates are referred to as Corporate aggregates. These aggregates are very useful when reporting is done on any level of the Product and Calendar hierarchies or the Organization and Calendar hierarchies. For the list of this type of aggregate for every fact area, see "Aggregates List". Corporate aggregates are also classified into As-Is and As-Was because they need to be processed separately to capture the current, as opposed to the historical, parent information.

Season aggregates are useful for reporting specific to the Season dimension. All aggregates on Season can be used for both As-Is and As-Was.

For each group of aggregations listed above, there is a mandatory aggregate table that must be used along with any other aggregates selected from that group; these mandatory tables are identified by highlighted text in the FlexAggregates document.

For example, if the business only needs As-Is, the following points need to be considered:

  1. Determine the general usage patterns of the data and the aggregation ratios, and select the list of aggregate tables based on them. The initial selection may not be accurate, but it can be changed over time based on usage and changing data.

  2. Ensure that you disable or freeze, in ODI, all of the As-Was aggregate jobs as well as any As-Is aggregates that were not selected, and disable the same aggregates in Oracle Analytics. See the Oracle Retail Insights Operations Guide for more information.

ETL Programs Performance

Setting ETL Program Multi-threading

Retail Insights base fact load programs can be configured to run using multiple threads. The default number of threads for these programs is set to one and can be configured based on requirements. For additional information on how multi-threading works, see the Program Overview chapter of the Oracle Retail Insights Operations Guide.

  1. Finalize the multi-threading strategy for the base fact extract programs.

  2. The number of threads for each program may vary based on the data volume that the program handles and on resource availability. Different thread counts should be tested by clients during implementation to achieve optimal results from multi-threading.

  3. In the C_ODI_PARAM table, update the value of the PARAM_VALUE column to the desired number of threads. This applies to all records with the value 'LOC_NUM_OF_THREAD' in the PARAM_NAME column and the name of the program that requires multi-threading in the SCENARIO_NAME column. See the example below for a scenario named SIL_Test, where the number of threads is changed from the default of 1 to 2.

    UPDATE C_ODI_PARAM
    SET PARAM_VALUE = 2
    WHERE PARAM_NAME = 'LOC_NUM_OF_THREAD'
    AND SCENARIO_NAME = 'SIL_Test'
    
  4. If the number of threads required is more than 10, you need to modify the DDL for the intermediate temporary tables used by the ODI scenario. The DDL change consists of adding extra partitions to hold the data. The number of partitions on the intermediate temporary table must be the same as or higher than the required number of threads (the LOC_NUM_OF_THREAD value set in the previous step).

  5. The value set in C_ODI_PARAM (in step 3) must be greater than or equal to the maximum value of the ETL_THREAD_VAL column in the staging tables. Otherwise, some records could be missed.

  6. If the RDE SDE programs are not used, it is the client's responsibility to assign the data evenly across partitions in the staging tables based on the partition key column ETL_THREAD_VAL. Consider the following when the data partitioning is done manually (a sketch follows this list):

    • To get the most benefit from multi-threading, the data in the staging tables should be evenly partitioned by the ETL_THREAD_VAL column.

    • Records with the same location (store or warehouse) must have the same ETL_THREAD_VAL; otherwise, a unique constraint could be violated.
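A minimal sketch of such an assignment, assuming a staging table with an ETL_THREAD_VAL column and a location identifier column (the table name sls_staging and the column loc_num are placeholders):

    -- ORA_HASH(loc_num, 7) returns a value from 0 to 7, so every location maps
    -- consistently to one of 8 thread values while locations spread evenly.
    -- The implied thread count (8) must match the LOC_NUM_OF_THREAD value in
    -- C_ODI_PARAM and the partition count of the intermediate temporary tables.
    UPDATE sls_staging
    SET    etl_thread_val = ORA_HASH(loc_num, 7) + 1;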

ODI Configuration

ODI must be configured prior to implementing Retail Insights. See the Oracle Retail Insights Installation Guide for details on configuring ODI in an on-premise environment. In the cloud, ODI is managed by Oracle and this documentation is informational only.

ETL Batch Scheduling
  • Set up the proper dependencies between the applications to ensure resources are fully utilized, which helps the nightly batch finish earlier.

  • Retail Insights load programs (SIL programs) should not wait for all of the extraction programs (SDE programs) to finish before starting; some can start executing as soon as the corresponding staging table is populated. For more information on setting up dependencies, refer to the Oracle Retail Insights Operations Guide.

  • Allocate resources to the most important batch jobs (ones that populate the tables) that support your critical reports (the reports you need first thing in the morning). You can assign job priority in most batch scheduling tools.

  • Ensure that your source applications' batch is optimized. Because of dependencies on source system jobs, Retail Insights jobs run toward the end of the nightly batch and are often the last to start, and therefore the last to finish. Optimizing the source applications' batch helps Retail Insights jobs start earlier.

Additional Considerations
  • Sort the W_RTL_INV_IT_LC_G table data after the data is seeded for the first time to improve ETL performance.

  • In a production environment, fact tables with large data volumes can be created with the NOLOGGING option. This improves ETL performance and can be implemented on a case-by-case basis, as sketched below.
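A hedged illustration on an existing large fact table (the table name is a placeholder); because NOLOGGING affects the recoverability of direct-path operations, review the change with your DBA and your backup strategy first:

    -- Placeholder table name. NOLOGGING reduces redo generation for qualifying
    -- direct-path loads, but the affected data is not protected by media recovery.
    ALTER TABLE w_rtl_large_fact NOLOGGING;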

Report Design

Report design can affect the performance of a report. While creating custom reports, refer to the following guidelines:

  • Report developers should be trained in Oracle BI so that they learn how to design reports in the most efficient manner.

  • Design reports at the highest level possible and allow drill down to more detailed levels when required.

  • Design reports in a manner that multiple users can utilize a single report output rather than multiple users running the same report. A best practice is to run one report and distribute that report to multiple users. For more information on how to distribute reports, refer to the Delivering Content chapter of the Oracle Business Intelligence Enterprise Edition User Guide and the Configuring Oracle BI Scheduler chapter of the Oracle Business Intelligence System Administrator's Guide.

  • Do not design reports to request data at a level lower than the minimum level at which a metric can be reported, and do not drill below these levels. This ensures that reports do not produce misleading or invalid results. For example, reports must not be designed to request planning data at the item level, because planning data is only available at the subclass level and above.

  • Evaluate and purge reports periodically to eliminate any outdated or duplicate reports.

  • Design reports to use the fewest fact areas necessary. This reduces the number of fact table joins and, in turn, the risk of poor report performance. For example, it is a best practice not to design a single report with all sales, inventory, pricing, and cost metrics, as such a report will perform poorly due to joins on large fact tables. In this type of scenario, create separate reports with one or two fact areas each and combine the results after these reports have run successfully.

  • Design reports with the fewest metrics necessary.

  • Schedule reports according to priority. This ensures that critical reports are available when needed. For more information on how to schedule reports, refer to the Configuring Oracle BI Scheduler chapter of the Oracle Business Intelligence System Administrator's Guide.

Additional Factors

Decision support queries sometimes require retrieval of large amounts of data. The Oracle BI Server can save the results of a query in cache files and reuse those results when a similar query is requested later. Using this middle-tier cache means a query needs to run against the database only once for multiple executions of the same request, rather than every time it is run. The query cache allows the Oracle BI Server to satisfy many subsequent query requests without having to access back-end data sources (such as the Oracle database). This reduction in communication costs can dramatically decrease query response time.

To summarize, query caching has the following advantages only when the same report is run repeatedly:

  • Improvement of query performance

  • Less network traffic

  • Reduction in database processing and charge back

  • Reduction in Oracle BI server processing overhead

For more details on caching, refer to the Managing Performance Tuning and Query Caching chapter in the Oracle Analytics System Administrator's Guide.

Partitioning Strategy

Database level table partitioning is very important for ETL batch and report performance. For more information, see Compression and Partitioning.

Database Configuration

Retail Insights is built on Oracle Database 12c, which must be optimized and configured for the retailer's needs. Refer to the Setting up your Data Warehouse System chapter of the Oracle 12c Data Warehouse Guide.

Adequate Hardware Resources

ETL program and report performance are highly dependent on the hardware resources. For more information, see Setup and Configuration.