Oracle® Retail Demand Forecasting Cloud Service User Guide, Release 19.0 (F24922-17)
This chapter describes the functionality of the Data Cleansing for Promo Estimation task, which includes:
Preprocessing Administration, see About Preprocessing Administration
Source Measure Maintenance, see About Source Measure Maintenance
The following table lists the workspaces, steps, and views for the Data Cleansing for Promo Estimation task.
This section describes how the preprocessing functionality is implemented in RDF using the Preprocess Administration Workspace.
Use the Preprocess Administration Workspace to perform this step:
Note: Similar to the Preprocess Administration Workspace is the Source Measure Maintenance Workspace. For functionality differences, see Source Measure Maintenance Functionality.
About Preprocessing
Preprocessing is a module that is used to correct historical data prior to forecast generation, when history does not represent general demand patterns. It automatically adjusts the raw POS (Point of Sale) data so that future demand forecasts do not replicate undesired patterns.
Data Preprocessing is commonly used to:
Correct for lost sales due to stock-outs
Cleanse data for effects of promotions and short-term price changes (optional)
Correct for outliers – unusually high or low values introduced by human error or special events (for example, a hurricane that left a store closed for a week)
Manually scrub data to simulate history and override loaded history
Adjust demand for the occasional 53rd calendar week
Manage demand created during events and holidays that do not occur in the same period every year, for example, Back to School.
Preprocessing runs after the data has been loaded from the host system and prior to forecast generation. Use the Preprocess Administration Workspace to select the techniques used to transform sales into unconstrained demand. Commonly, up to three or four runs are needed to go from raw sales to the data source that is used to generate the forecast. For RDF, a maximum of six runs is allowed for one data source. For example, if there is one baseline and one causal level, up to six preprocessing runs are allowed to create the data source for the baseline forecast, and up to six runs are allowed to create the data source for the causal forecast.
Preprocessing Data in the RDF Workflow
Preprocessing offers a variety of algorithm methods to support the business requirements. The main reason for preprocessing is to transform the raw sales data into a measure that gets as close as possible to unconstrained demand.
The preprocessing step is most often implemented in batch, by invoking a preprocessing special expression. The special expression takes several measures as input. For instance, you must specify the measure to be corrected, the desired algorithm, and the number of periods to be considered. You can also go into more detail and specify filter window lengths or exponential smoothing parameters.
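The exact way the special expression is invoked and parameterized is defined in the RDF configuration; the following Python sketch only illustrates, with hypothetical names, the kind of inputs a single preprocessing run takes.

```python
# Illustrative only: hypothetical names for the inputs of one preprocessing run.
# The real run is a special expression configured in Configuration Tools, not Python.
preprocess_run = {
    "input_measure": "weekly_sales",                         # measure to be corrected
    "method": "Lost Sales Standard Exponential Smoothing",   # desired algorithm
    "preprocess_window": 52,                                  # number of historical periods considered
    "past_weeks": 8,                                          # window length for the past velocity
    "future_weeks": 8,                                        # window length for the future velocity
    "alpha": 0.3,                                             # exponential smoothing parameter
    "output_measure": "corrected_sales",                      # measure that receives the result
}
```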
To build the Preprocess Administration workspace, perform these steps:
From the left sidebar menu, click the Task Module to view the available tasks.
Click the Estimate Historical Demand activity and then click Promo to access the available workspaces.
Click Preprocess Administration. The Preprocess Administration wizard opens.
You can open an existing workspace, but to create a new workspace, click Create New Workspace.
Enter a name for your new workspace in the label text box and click OK.
The Workspace wizard opens. Select the locations you want to work with and click Next.
Select the products you want to work with and click Finish.
The wizard notifies you that your workspace is being prepared. Successful workspaces are available from the Dashboard.
The Preprocess Administration workspace is built.
The available views are:
These views make available the preprocessing parameters for the rounds of preprocessing runs necessary to calculate the data sources for baseline forecasting and promotional forecasting.
In RDF, preprocessing is configured to create the data sources for baseline forecasting as well as causal forecasting. The creation of each of the sources can go through at most six runs of preprocessing. For example, the Causal Data Source is configured with three runs, and the baseline data source with four runs.
The Preprocess Admin Promo view allows you to define the scope of the preprocessing run, as well as filter out item/locations where preprocessing does not make sense because there are not enough historical sales.
The Preprocess Admin Promo view contains the following measures:
Min Number of Weeks in System
This parameter defines the minimum number of periods that must have elapsed since an item was introduced in the system. Usually the introduction time is considered to be the date when the item first sold. This check stops data corrections for items that are very new, where cleansing would be unreliable.
Min Number of Weeks with Sales
This parameter defines the number of weeks with sales that an item/store combination needs to have to qualify for data cleansing. The reasoning behind this check is that for items without enough data, corrections may not be reliable. Once there is enough data, and trends become clearer, corrections can be made.
Std ES Adjustment
This parameter determines the sign of the adjustments; a sketch after this list illustrates the effect of each setting. The values can be:
Positive— The adjustments are positive, meaning the demand is going to increase. This is useful when correcting for periods of out-of-stock, when the demand is likely larger than the actual sales.
Negative— The adjustments are negative, meaning the sales are going to be decreased. This is useful when removing promotion demand from sales to create the baseline demand.
Both— Both positive and negative adjustments are allowed. This is useful when the sales are corrected for outliers and stockouts during the same preprocessing run.
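The following minimal Python sketch shows how such a sign setting might constrain a calculated adjustment; the function and values are illustrative only and are not part of RDF.

```python
def apply_sign_policy(adjustment: float, policy: str) -> float:
    """Keep or discard a calculated adjustment based on the Std ES Adjustment setting.

    Illustrative sketch: "Positive" keeps only upward corrections (for example,
    stockouts), "Negative" keeps only downward corrections (for example,
    depromoting sales), and "Both" keeps every correction.
    """
    if policy == "Positive":
        return max(adjustment, 0.0)
    if policy == "Negative":
        return min(adjustment, 0.0)
    return adjustment  # "Both" allows positive and negative corrections


# A +5 unit stockout correction survives a Positive policy; a -3 unit
# promotional correction is discarded under the same policy.
print(apply_sign_policy(5.0, "Positive"))   # 5.0
print(apply_sign_policy(-3.0, "Positive"))  # 0.0
```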
There are two views with the same set of measures, in which you can enter values for parameters specific to some of the preprocessing methods available in the special expression, to create the data sources for the baseline and causal forecasts. There is a set of parameters for each of the maximum six runs allowed; however, it is unlikely that all six are configured. For example, when generating the causal data source, three runs are configured. The parameters are entered at the class/store intersection.
The Preprocess Method Parameters for Promo view contains the following measures; a sketch after the list illustrates how several of them interact:
Alpha
Exponential smoothing coefficient used to calculate past and future velocities.
Future Weeks
This represents the maximum number of data points to calculate the future velocity, when using the Standard Exponential Smoothing or Lost Sales Standard Exponential Smoothing preprocessing methods.
Last Date
This represents the end date of the preprocessing window; it is typically today's date, but can be any date in the past.
Past Weeks
This represents the maximum number of data points to calculate the past velocity, when using the Standard Exponential Smoothing or Lost Sales Standard Exponential Smoothing preprocessing methods.
Preprocessing Window
Number of historical data points that are preprocessed.
Standard Median Window
Filter window length for the Standard Median preprocessing method.
Partial Outage
A scalar parameter indicating if the period immediately following an out-of-stock period should be adjusted. The default behavior is for the flag to be True.
Stop at Event
This parameter determines which periods are included in the calculation of past/future velocities.
If the flag is set to True, then the algorithm only includes periods before the first event flag or event indicator.
If the flag is False, then all available, non-flagged periods, within the windows defined by Past Weeks and Future Weeks, are used in the calculation of the past and future velocities.
The default setting for the flag is False.
Std ES Adjustment Ovr
This parameter overrides the default behavior set in the Std ES Adjustment measure. The values can be:
Positive— The adjustments are positive, meaning the demand is going to increase. This is useful when correcting for periods of out-of-stock, when the demand is likely larger than the actual sales.
Negative— The adjustments are negative, meaning the sales are going to be decreased. This is useful when removing promotion demand from sales to create the baseline demand.
Both— Both positive and negative adjustments are allowed. This is useful when the sales are corrected for outliers and stockouts during the same preprocessing run.
None— No override of the default setting is necessary.
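To make Alpha, Past Weeks, Future Weeks, and the Preprocessing Window more concrete, here is a minimal Python sketch of a lost-sales style correction. It is a simplified assumption of how past and future velocities might be smoothed and averaged, not the actual RDF algorithm, and it ignores refinements such as Stop at Event and Partial Outage.

```python
def smoothed_velocity(values, alpha):
    """Exponentially smoothed average (velocity) of a list of weekly sales."""
    velocity = values[0]
    for value in values[1:]:
        velocity = alpha * value + (1 - alpha) * velocity
    return velocity


def correct_stockouts(sales, outage_flag, alpha=0.3, past_weeks=8, future_weeks=8):
    """Replace sales in flagged out-of-stock weeks with a demand estimate.

    Simplified sketch: the estimate averages the past and future velocities
    computed from the non-flagged weeks surrounding the outage, and only
    positive adjustments are applied (demand is at least the observed sales).
    """
    corrected = list(sales)
    for week, flagged in enumerate(outage_flag):
        if not flagged:
            continue
        past = [s for s, f in zip(sales[max(0, week - past_weeks):week],
                                  outage_flag[max(0, week - past_weeks):week]) if not f]
        future = [s for s, f in zip(sales[week + 1:week + 1 + future_weeks],
                                    outage_flag[week + 1:week + 1 + future_weeks]) if not f]
        velocities = [smoothed_velocity(window, alpha) for window in (past, future) if window]
        if velocities:
            estimate = sum(velocities) / len(velocities)
            corrected[week] = max(sales[week], estimate)
    return corrected


# Example: week 3 is out of stock, so its sales are raised toward the
# velocity of the surrounding weeks.
print(correct_stockouts([10, 12, 2, 11, 13], [False, False, True, False, False]))
```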
In this view, at the item/store level, you can override values for parameters specific to some of the preprocessing methods available in the special expression. There are two views, corresponding to the preprocessing runs necessary to create the data sources for the baseline and causal forecasts.
After all parameters are set and committed back to the domain, a batch job usually runs the preprocessing steps and prepares the data sources for forecast generation.
The Preprocess Method Parameters Override for Promo view contains the following measures:
Alpha Override
Exponential smoothing coefficient used to calculate past and future velocities.
Future Weeks Override
This represents the maximum number of data points to calculate the future velocity, when using the Standard Exponential Smoothing or Lost Sales Standard Exponential Smoothing preprocessing methods.
Last Date Override
This represents the end date of the preprocessing window; it is typically today's date, but can be any date in the past.
Past Weeks Override
This represents the maximum number of data points to calculate the past velocity, when using the Standard Exponential Smoothing or Lost Sales Standard Exponential Smoothing preprocessing methods.
Preprocessing Window Override
Number of historical data points that are preprocessed.
Standard Median Window Override
Filter window length for the Standard Median preprocessing method.
Partial Outage Override
A scalar parameter indicating if the period immediately following an out-of-stock period should be adjusted. The default behavior is for the flag to be True.
Stop at Event Override
This parameter determines which periods are included in the calculation of past/future velocities.
If the flag is set to True, then the algorithm only includes periods before the first event flag or event indicator.
If the flag is False, then all available, non-flagged periods, within the windows defined by Past Weeks and Future Weeks, are used in the calculation of the past and future velocities.
The default setting for the flag is False.
Depending on your wizard selection for either baseline or causal, the view shows the preprocessing parameters for each relevant run. For example, if baseline is selected, the view displays preprocessing information for the four runs that are configured. If causal is selected, the view shows preprocessing parameters for the three runs.
Baseline View
The view displays the measures necessary to create the data source for baseline forecasting. This involves four rounds of preprocessing that run in batch or online, in this order:
Correcting for stockouts
Correcting for outliers
Depromoting sales
Smoothing sales
Causal View
This view displays the measures necessary to create the data source for causal forecasting. This involves three rounds of preprocessing that run in this order:
Correcting for stockouts
Correcting for outliers
Deseasonalizing the measure to create the Causal Data Source
The Preprocess Panel for Promo view contains the following measures; a sketch after the list shows how together they describe a sequence of runs:
First Time-Phased Parameter Causal
This measure stores the first time-phased measure that is required for some preprocessing methods. For instance, for the STD ES LS method, this measure would store the measure name of the outage flag. Or for the STD ES method, it could store the name of the outlier flag.
Preprocess Method
Name of the preprocessing method to be used for each run. This method is selected in the Configuration Tools.
Output Data Measure
Indicates the measure that stores the result of the last configured preprocessing run. For instance, for the Preprocess Panel for Baseline, the output comes from run 4.
Run Label
A label denoting the purpose of the preprocessing run, for example, Correct Outliers, or Depromote Sales.
Run Preprocess Flag
Boolean measure indicating if this run should be enabled or skipped.
Second Time-Phased Parameter Causal
This measure stores the second time-phased measure that is required for some preprocessing methods. For instance, for the Forecast Sigma method, this measure would store the measure name of the confidence intervals. Or, for the Override method, it could store the measure name of the outage flag.
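The Preprocess Panel measures together describe a small pipeline: each enabled run names a label, a method, and optional time-phased parameter measures, and the output of the last configured run becomes the forecasting data source. The Python sketch below shows that chaining idea with hypothetical names and methods; it is not the actual batch configuration.

```python
# Hypothetical run definitions mirroring the Preprocess Panel measures for baseline.
baseline_runs = [
    {"label": "Correct Stockouts", "method": "STD ES LS",  "enabled": True,
     "time_phased_1": "outage_flag"},
    {"label": "Correct Outliers",  "method": "STD ES",     "enabled": True,
     "time_phased_1": "outlier_flag"},
    {"label": "Depromote Sales",   "method": "STD ES",     "enabled": True,
     "time_phased_1": "promo_flag"},
    {"label": "Smooth Sales",      "method": "STD MEDIAN", "enabled": False},
]


def run_pipeline(raw_sales, runs, methods):
    """Chain the enabled preprocessing runs.

    `methods` maps a method name to a function taking (series, run) and
    returning the corrected series; the final output plays the role of the
    Output Data Measure, that is, the data source for forecast generation.
    """
    series = raw_sales
    for run in runs:
        if run["enabled"]:  # Run Preprocess Flag
            series = methods[run["method"]](series, run)
    return series
```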
This section describes how the preprocessing functionality is implemented in RDF using the Source Measure Maintenance Workspace.
The functionality in the Source Measure Maintenance workspace is a superset of the functionality in the Preprocess Administration workspace. The purpose of, and the differences between, the two are described in Source Measure Maintenance Functionality.
Use the Source Measure Maintenance Workspace to perform these steps:
The Preprocess Administration Workspace and the Source Measure Maintenance Workspace have a large set of common content. The main difference is that the Source Measure Maintenance Workspace has the calendar hierarchy, on top of the product and location hierarchies, while the Preprocess Administration Workspace has only product and location. The additional hierarchy allows the review of the time-phased preprocessing measures, as well as the calculated forecasting data sources.
Due to their additional dimension of week, these measures add to the size of the workspace, and also make workspace operations slower. For instance, workspace build, refresh, commit, and so on, take longer than in the otherwise similar Preprocess Administration Workspace.
Preprocess Administration Workspace
The Preprocess Administration Workspace is at the product/location intersection, so it can be built with many positions without experiencing poor performance. The purpose is to set preprocessing parameters, which are inputs to the special expression that is run in batch.
Source Measure Maintenance Workspace
The Source Measure Maintenance Workspace, described in this chapter, is at the product/location/calendar intersection, and is much more data intensive. The purpose is to set preprocessing parameters and run the data filtering online, with the ability to review the results without having to wait for an overnight batch. If the results are not as expected, or you want to experiment with different settings, you can change the parameters and rerun the custom menus. To achieve this, it is expected that only a small subset of the available product/locations is included in the workspace.
To build the Source Measure Maintenance workspace, perform these steps:
From the left sidebar menu, click the Task Module to view the available tasks.
Click the Estimate Historical Demand activity and then click Promo to access the available workspaces.
Click Source Measure Maintenance. The Source Measure Maintenance wizard opens.
You can open an existing workspace, but to create a new workspace, click Create New Workspace.
Enter a name for your new workspace in the label text box and click OK.
The Workspace wizard opens. Select the products you want to work with and click Next.
Note: It is important to include all products that are members of the Merchandise dimensions in the forecast levels to be analyzed. For example, if you select to view a forecast level that is defined at subclass/store/week, you must include all items that are members of the particular subclass to be analyzed. It is recommended that Position Query functionality or selection from aggregate levels in the Merchandise hierarchy is employed if the task supports an AutoTask build.
Select the locations you want to work with and click Next.
Note: It is important to include all locations that are members of the location dimensions in the forecast levels to be analyzed. For example, if you select to view a forecast level that is defined at item/chain/week, you should include all locations that are members of the particular chain to be analyzed. It is recommended that Position Query functionality or selection from aggregate levels in the location hierarchy is employed if the task supports an AutoTask build.
Select the weeks of the forecast you wish to review and click Finish.
The wizard notifies you that your workspace is being prepared. Successful workspaces are available from the Dashboard.
The Source Measure Maintenance workspace is built and includes these steps:
The available views are:
These views make available the preprocessing parameters for the rounds of preprocessing runs necessary to calculate the data sources for baseline forecasting and promotional forecasting.
In RDF, preprocessing is configured to create the data sources for baseline forecasting as well as causal forecasting. The creation of each of the sources can go through at most six runs of preprocessing. For example, the Causal Data Source is configured with three runs, and the baseline data source with four runs.
This step contains the Preprocess Admin view, which allows you to define the scope of the preprocessing run, as well as filter out item/locations where preprocessing does not make sense because there are not enough historical sales.
Min Number of Weeks in System
This parameter defines the minimum number of periods that must have elapsed since an item was introduced in the system. Usually the introduction time is considered to be the date when the item first sold. This check stops data corrections for items that are very new, where cleansing would be unreliable.
Min Number of Weeks with Sales
This parameter defines the number of weeks with sales that an item/store combination needs to have to qualify for data cleansing. The reasoning behind this check is that for items without enough data, corrections may not be reliable. Once there is enough data, and trends become clearer, corrections can be made.
There are two views with the same set of measures, in which you can enter values for parameters specific to some of the preprocessing methods available in the special expression, to create the data sources for the baseline and causal forecasts. There is a set of parameters for each of the maximum six runs allowed; however, it is unlikely that all six are configured. For example, when generating the causal data source, three runs are configured. The parameters are entered at the class/store intersection.
The Preprocess Method Parameters for Promo view contains the following measures:
Alpha
Exponential smoothing coefficient used to calculate past and future velocities.
Future Weeks
This represents the maximum number of data points to calculate the future velocity, when using the Standard Exponential Smoothing or Lost Sales Standard Exponential Smoothing preprocessing methods.
Last Date
This represents the end date of the preprocessing window; it is typically today's date, but can be any date in the past.
Past Weeks
This represents the maximum number of data points to calculate the past velocity, when using the Standard Exponential Smoothing or Lost Sales Standard Exponential Smoothing preprocessing methods.
Preprocessing Window
Number of historical data points that are preprocessed.
Standard Median Window
Filter window length for the Standard Median preprocessing method.
Partial Outage
A scalar parameter indicating if the period immediately following an out-of-stock period should be adjusted. The default behavior is for the flag to be True.
Stop at Event
This parameter determines which periods are included in the calculation of past/future velocities.
If the flag is set to True, then the algorithm only includes periods before the first event flag or event indicator.
If the flag is False, then all available, non-flagged periods, within the windows defined by Past Weeks and Future Weeks, are used in the calculation of the past and future velocities.
The default setting for the flag is False.
In this view, at the item/store level, you can override values for parameters specific to some of the preprocessing methods available in the special expression. There are two views, corresponding to the preprocessing runs necessary to create the data sources for the baseline and causal forecasts. After all parameters are set and committed back to the domain, a batch job usually runs the preprocessing steps and prepares the data sources for forecast generation.
The Preprocess Method Parameters Override for Promo view contains the following measures:
Alpha Override
Exponential smoothing coefficient used to calculate past and future velocities.
Future Weeks Override
This represents the maximum number of data points to calculate the future velocity, when using the Standard Exponential Smoothing or Lost Sales Standard Exponential Smoothing preprocessing methods.
Last Date Override
This represents the end date of the preprocessing window; it is typically today's date, but can be any date in the past.
Past Weeks Override
This represents the maximum number of data points to calculate the past velocity, when using the Standard Exponential Smoothing or Lost Sales Standard Exponential Smoothing preprocessing methods.
Preprocessing Window Override
Number of historical data points that are preprocessed.
Standard Median Window Override
Filter window length for the Standard Median preprocessing method.
Partial Outage Override
A scalar parameter indicating if the period immediately following an out-of-stock period should be adjusted. The default behavior is for the flag to be True.
Stop at Event Override
This parameter determines which periods are included in the calculation of past/future velocities.
If the flag is set to True, then the algorithm only includes periods before the first event flag or event indicator.
If the flag is False, then all available, non-flagged periods, within the windows defined by Past Weeks and Future Weeks, are used in the calculation of the past and future velocities.
The default setting for the flag is False.
Depending on your wizard selection for either baseline or causal, the view shows the preprocessing parameters for each relevant run. For example, if baseline is selected, the view displays preprocessing information for the four runs that are configured. If causal is selected, the view shows preprocessing parameters for the three runs.
Baseline View
The view displays the measures necessary to create the data source for baseline forecasting. This involves four rounds of preprocessing that run in batch or online, in this order:
Correcting for stockouts
Correcting for outliers
Depromoting sales
Smoothing sales
Causal View
The causal view displays the measures necessary to create the data source for causal forecasting. This involves three rounds of preprocessing that are run in batch:
Correcting for stockouts
Correcting for outliers
Deseasonalizing the measure to create the Causal Data Source
The Preprocess Panel for Promo view contains the following measures:
First Time-Phased Parameter Causal
This measure stores the first time-phased measure that is required for some preprocessing methods. For instance, for the STD ES LS method, this measure would store the measure name of the outage flag. Or for the STD ES method, it could store the name of the outlier flag.
Preprocess Method
Name of the preprocessing method to be used for each run. This method is selected in the Configuration Tools.
Output Data Measure
Indicates the measure that stores the result of the last configured preprocessing run. For instance, for the Preprocess Panel for Baseline, the output comes from run 4.
Run Label
A label denoting the purpose of the preprocessing run, for example, Correct Outliers, or Depromote Sales.
Run Preprocess Flag
Boolean measure indicating if this run should be enabled or skipped.
Second Time-Phased Parameter Causal
This measure stores the second time-phased measure that is required for some preprocessing methods. For instance, for the Forecast Sigma method, this measure would store the measure name of the confidence intervals. Or, for the Override method, it could store the measure name of the outage flag.
The main purpose of this step is to display the time-phased measures that represent input and output to the preprocessing stages, which are run in batch based on the settings selected in Preprocess Admin.
This step includes the Source Measure Maintenance for Promo View.
This view displays measures that represent input and output of the preprocessing runs, in table format.
The Source Maintenance view contains the following measures:
User Adjustment
In this measure, you can enter values that are going to be added to the preprocessing adjustments to create the data sources.
The logic is: data source = weekly sales + preprocessing adjustments + user adjustment
This measure is read/write.
Weekly Sales
This measure stores the raw sales loaded in RDF. This is the input to the first run of preprocessing. This measure is read only.
Data Source
This measure represents the output of the preprocessed raw sales, as well as incorporates the user adjustments according to the formula:
data source = weekly sales + preprocessing adjustments + user adjustments (see the worked example after this list of measures).
Out of Stock Indicator
This measure is either loaded or calculated by the rules in the custom menu. It is used during the preprocessing run that corrects sales for lost sales.
Outliers Indicator
This measure is either loaded or calculated by the rules in the custom menu. It is used during the preprocessing run that corrects the sales for outliers.
Promotion Indicator
This measure is usually calculated as the logical OR of all available Boolean promotional variables. It is used during the preprocessing run that removes promotional sales.
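For illustration, the following Python sketch works through the data source formula and the promotion indicator calculation with made-up weekly values; the measure names are hypothetical.

```python
# Data Source = Weekly Sales + preprocessing adjustments + User Adjustment
weekly_sales              = [10, 0, 12, 40]   # loaded raw sales
preprocessing_adjustments = [0, 11, 0, -25]   # stockout fix in week 2, depromotion in week 4
user_adjustment           = [0, 0, 3, 0]      # manual tweak in week 3

data_source = [s + p + u for s, p, u in
               zip(weekly_sales, preprocessing_adjustments, user_adjustment)]
print(data_source)  # [10, 11, 15, 15]

# Promotion Indicator = logical OR of all Boolean promotional variables per week
circular_ad   = [False, False, False, True]
temporary_cut = [False, False, True, True]
promotion_indicator = [a or b for a, b in zip(circular_ad, temporary_cut)]
print(promotion_indicator)  # [False, False, True, True]
```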