Oracle® Retail Demand Forecasting User Guide for the RPAS Fusion Client Release 16.0 E91109-03
Preprocessing is a filtering module that automatically adjusts historical data to correct data points that do not represent the general demand pattern. Essentially, it smooths out spikes and dips in historical sales data, replacing stock-out data and data from short-term events, such as promotions and temporary price changes, with data points that more accurately represent typical sales for that period. By adjusting the historical sales, Preprocessing provides cleaner data to the RDF Causal Engine, which in turn produces a more reliable baseline forecast.
Note: There are no workbooks associated with Preprocessing - it is available as a configuration option.
Common Preprocessing corrections are:
Out of stock - Interfaced from RMS, weekly or daily
Outliers - Indicator not required, depends on method
Short term events - Promotions, temporary price changes
For example, Figure A-1 illustrates how Preprocessing adjusts for stock-outs.
In Figure A-1, RMS sends historical sales data to the Preprocessing module of RDF. In that sales data, RMS has flagged out-of-stock instances with indicators (the gray portion of the first data set). Preprocessing takes note of that out-of-stock indicator and adjusts the sales for that time period to reflect a more typical sales quantity, taking into account trending and seasonality. Note in Figure A-1 that Preprocessing has removed the dip in sales in the second data set and has replaced it with a new data point.
Note: To run any preprocessing method, there must be at least three periods with non-zero data in the preprocessing window. If there are fewer than three periods with non-zero data, the time series is skipped.
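The minimum-history rule in the note above can be sketched as a simple check. This is an illustration only: the function name `has_enough_history` and the `min_nonzero` argument are hypothetical and are not RDF identifiers.

```python
def has_enough_history(window_values, min_nonzero=3):
    """Return True when the preprocessing window holds at least
    `min_nonzero` periods with non-zero data (hypothetical helper)."""
    nonzero_periods = sum(1 for value in window_values if value != 0)
    return nonzero_periods >= min_nonzero


# Two non-zero periods: the series would be skipped.
print(has_enough_history([0, 5, 0, 0, 7]))
# Three non-zero periods: the series would be preprocessed.
print(has_enough_history([0, 5, 3, 0, 7]))
```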
Preprocessing uses several methods to massage historical data. The following sections detail these methods:
Standard Median calculates baselines on long time ranges.
Input: None
Optional parameter: Window length
When data points for the full window are not available, Preprocessing pads the beginning and end of the time series with the first and the last data points, respectively, so that there are values for the full window.
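The padding behavior described above can be sketched as follows. This is a minimal illustration, assuming a centered window with an odd length; the function name `standard_median` and the default window of 5 are assumptions, not documented RDF values.

```python
from statistics import median


def standard_median(series, window=5):
    """Centered running median (illustrative sketch).

    The series is padded at the start with its first value and at the
    end with its last value, so every period has a full window.
    Assumes an odd `window` length.
    """
    if not series:
        return []
    half = window // 2
    padded = [series[0]] * half + list(series) + [series[-1]] * half
    return [median(padded[i:i + window]) for i in range(len(series))]


# The spike of 50 in the third period is replaced by a typical value.
print(standard_median([10, 12, 50, 11, 13], window=3))
```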
Retail Median calculates baselines on long time ranges and reduces the side effects of a single median filter by making five standard median filter passes.
Input: none
Optional parameter: window length
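The idea of repeated median passes can be sketched as below. This is an assumption-laden illustration: it simply applies the same padded median filter five times with one window length, whereas the actual Retail Median passes may differ from one another. The names `median_pass` and `retail_median` are hypothetical.

```python
from statistics import median


def median_pass(series, window):
    """One centered median pass with first/last-value padding (odd window)."""
    half = window // 2
    padded = [series[0]] * half + list(series) + [series[-1]] * half
    return [median(padded[i:i + window]) for i in range(len(series))]


def retail_median(series, window=3, passes=5):
    """Apply the median pass repeatedly (five times by default).

    Assumption: every pass uses the same window length.
    """
    if not series:
        return []
    result = list(series)
    for _ in range(passes):
        result = median_pass(result, window)
    return result


print(retail_median([10, 12, 50, 11, 13], window=3))
```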
Standard Exponential Smoothing removes spikes (such as those caused by promotions, temporary price changes, and so on) and fills gaps (such as those caused by out-of-stock periods or unusual events like a fire or hurricane).
Input: An Event Indicator that indicates which periods should be preprocessed.
Optional Parameters:
The following table details the optional parameters for Standard Exponential Smoothing.
Optional Parameters | Description |
---|---|
ES (Exponential Smoothing) | The alpha parameter that determines the weight put on observations of periods included in the calculations. |
Number of future periods (nfut) | The number of periods after an outage period that are considered in the calculation of the future velocity. If an event flag or an event indicator is on during one of these periods, that period is excluded from the calculation. |
Number of past periods (npast) | The number of periods before an outage period that are considered in the calculation of the past velocity. If an event flag or an event indicator is on during one of these periods, that period is excluded from the calculation. Note: If the first period in the preprocessing window is flagged, the past velocity is calculated using earlier periods outside the preprocessing window. |
Event flag | This parameter indicates if a period should be excluded from the calculation of past/future velocities. |
Stop at event flag | This parameter determines which periods are included in the calculation of past/future velocities. If the flag is set to True, the algorithm only includes periods before the first event flag or event indicator. If the flag is False, all available non-flagged periods within the windows defined by nfut and npast are used in the calculation of the past and future velocities. The default setting for the flag is False. |
When event flags exist within the future and past velocity windows, rather than consider the entire window, Preprocessing only considers unflagged data points after the last event flag in the history window to compute the past velocity. It does the same for the future window, using the unflagged data points prior to the first event flag in the future window to compute the future velocity. Consecutive events are smoothed using the same velocities. A data point becomes flagged, and hence not part of the future/past velocity calculation, if either the event indicator or the optional event flag is on.
If future velocities cannot be calculated, then the past velocities, if they exist, are used as future and past velocities, and vice versa. When neither of the velocities can be calculated, there is no adjustment.
If the velocity window contains all zero values, then the calculated velocity is zero. A velocity of zero is a legitimate value if it occurs within the selling window. A velocity of zero is not acceptable if it is calculated based on values outside of the selling window.
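The window selection and fallback rules above can be sketched as follows. This is a sketch under stated assumptions, not the RDF algorithm: the source does not give the velocity formula, so the exponentially weighted average in `velocity` (with weights decaying away from the outage, and 0 < alpha <= 1) is an assumption, and all function names are hypothetical. The selling-window check on zero velocities and the final adjustment of the flagged periods are not covered.

```python
def window_points(values, flagged, indices, stop_at_event):
    """Collect unflagged values, walking outward from the outage."""
    points = []
    for i in indices:
        if flagged[i]:
            if stop_at_event:
                break      # ignore everything beyond the first flagged period
            continue       # otherwise skip only the flagged period
        points.append(values[i])
    return points


def velocity(points, alpha):
    """Assumed form: exponentially weighted average, nearest point first.
    Returns None when there is nothing to average."""
    if not points:
        return None
    weights = [alpha * (1 - alpha) ** k for k in range(len(points))]
    return sum(w * p for w, p in zip(weights, points)) / sum(weights)


def velocities(values, flagged, start, end, npast, nfut,
               alpha=0.2, stop_at_event=False):
    """Past and future velocity for an outage spanning start..end (inclusive)."""
    past_idx = range(start - 1, max(start - 1 - npast, -1), -1)
    fut_idx = range(end + 1, min(end + 1 + nfut, len(values)))
    past = velocity(window_points(values, flagged, past_idx, stop_at_event), alpha)
    fut = velocity(window_points(values, flagged, fut_idx, stop_at_event), alpha)
    # Fallback: a missing velocity is replaced by the other one.
    if past is None:
        past = fut
    if fut is None:
        fut = past
    return past, fut  # (None, None) means no adjustment is made
```

For example, an outage at the very end of the history has no future window, so the past velocity is returned for both values.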
Lost Sales Standard Exponential Smoothing calculates baselines on long time ranges. Lost Sales Standard Exponential Smoothing makes positive adjustments to the flagged periods and to the period immediately following the flagged period.
Input: An Out-of-stock indicator that indicates which periods should be preprocessed.
Optional parameters:
The following table details the optional parameters for Lost Sales Standard Exponential Smoothing.
Optional Parameters | Description |
---|---|
ES (Exponential Smoothing) | The alpha parameter that determines the weight put on observations of periods included in the calculations. |
Number of future periods (nfut) | The number of periods after an outage period that are considered in the calculation of the future velocity. If an event flag or an event indicator is on during one of these periods, that period is excluded from the calculation. |
Number of past periods (npast) | The number of periods before an outage period that are considered in the calculation of the past velocity. If an event flag or an event indicator is on during one of these periods, that period is excluded from the calculation. |
Event flag | This parameter indicates if a period should be excluded from the calculation of past/future velocities. |
Partial outage flag | A scalar parameter indicating if the period immediately following an out-of-stock period should be adjusted. The default behavior is for the flag to be True. |
Stop at event flag | This parameter determines which periods are included in the calculation of past/future velocities. If the flag is set to True, the algorithm only includes periods before the first event flag or event indicator. If the flag is False, all available non-flagged periods within the windows defined by nfut and npast are used in the calculation of the past and future velocities. The default setting for the flag is False. |
When event flags exist within the future and past velocity windows, rather than consider the entire window, Preprocessing only considers unflagged data points after the last event flag in the history window to compute the past velocity. It does a similar process for the future window by using the unflagged data points prior to the first event flag in the future window to compute the future velocity. Consecutive events are smoothed using the same velocities. A data point becomes flagged, and hence not part of the future/past velocity calculation, if either the event indicator or the optional event flag is on.
If future velocities cannot be calculated, then the past velocities, if they exist, are used as future and past velocities, and vice versa. When neither of the velocities can be calculated, there is no adjustment.
If the velocity window contains all zero values, then the calculated velocity is zero. A velocity of zero is a legitimate value if it occurs within the selling window. A velocity of zero is not acceptable if it is calculated based on values outside of the selling window. Note that by default, the periods being adjusted are the periods flagged by an out-of-stock indicator and the period immediately following any such period. If the optional scalar parameter POA (Partial Outage Allowed) is set to False, then this extra period will not be adjusted, and only the out-of-stock periods will be adjusted.
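The choice of periods to adjust, and the fact that Lost Sales adjustments are only positive, can be sketched as follows. The function names are hypothetical, and `baseline` is a stand-in for whatever smoothed estimate the method computes from the velocities; it is passed in here as an assumption rather than derived.

```python
def periods_to_adjust(out_of_stock, partial_outage=True):
    """Indices to adjust: every flagged period and, when the partial
    outage flag is True, the period immediately after each one."""
    selected = set()
    for t, flag in enumerate(out_of_stock):
        if flag:
            selected.add(t)
            if partial_outage and t + 1 < len(out_of_stock):
                selected.add(t + 1)
    return sorted(selected)


def apply_positive_adjustment(sales, baseline, periods):
    """Raise sales to the baseline in the selected periods; never lower them."""
    adjusted = list(sales)
    for t in periods:
        adjusted[t] = max(sales[t], baseline[t])
    return adjusted


oos = [False, True, True, False, False]
periods = periods_to_adjust(oos)                      # includes period 3
print(apply_positive_adjustment([10, 0, 2, 6, 12], [10] * 5, periods))
```

With the partial outage flag set to False, only the two flagged periods are selected and the period after the outage keeps its original sales.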
Override fills gaps in data when a reference measure exists.
Input: reference measure (R(t)) to copy data from
Optional parameter: outage/mask (M(i)), adjustment ratio (a)
Formula: Overrides LSOVER with the Src adjusted by the adjustment ratio according to the mask:
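The formula itself is not reproduced in this text, so the following is only one plausible reading of the description: where the mask is on, LSOVER takes the reference value scaled by the adjustment ratio, and elsewhere it keeps the source value. The function name, the default ratio of 1.0, and the treatment of a missing mask (all periods overridden) are assumptions.

```python
def override(src, reference, mask=None, ratio=1.0):
    """Assumed reading of Override: ratio * reference where the mask is
    on, the source value elsewhere."""
    if mask is None:
        mask = [True] * len(src)   # assumption: no mask means every period
    return [ratio * r if m else s for s, r, m in zip(src, reference, mask)]


# Only the second period is masked, so only it is overridden.
print(override([5, 0, 7], [6, 6, 6], mask=[False, True, False], ratio=0.5))
```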
Increment updates gaps or outliers in data when a reference measure exists.
Input: reference measure (R(t)) to copy data from
Optional parameter: outage/mask (M(i)), adjustment ratio (a)
Increments the Src with the reference adjusted by the adjustment ratio according to the mask:
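As with Override, the formula is not reproduced in this text. A plausible reading of the description is that, where the mask is on, the source value is increased by the reference scaled by the adjustment ratio. The function name, the default ratio, and the handling of a missing mask are assumptions.

```python
def increment(src, reference, mask=None, ratio=1.0):
    """Assumed reading of Increment: source + ratio * reference where
    the mask is on, the source value elsewhere."""
    if mask is None:
        mask = [True] * len(src)   # assumption: no mask means every period
    return [s + ratio * r if m else s for s, r, m in zip(src, reference, mask)]


# Only the second period is masked, so only it is incremented.
print(increment([5, 1, 7], [6, 6, 6], mask=[False, True, False], ratio=0.5))
```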
Forecast Sigma removes spikes in recent data when no indicators are available.
Inputs: forecast and confidence intervals
Optional parameters:
number of stddev for upper bound
number of stddev for lower bound
forecast lower bound
minimum history required for filtering
Formula: If the difference between the sales and the forecast is larger than a threshold, the override value is brought within bounds around the forecast.
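One way to read this rule is as clipping sales into a band around the forecast. The sketch below assumes the confidence interval is supplied as a per-period standard deviation and that the band is the forecast plus or minus a number of standard deviations. The function name and defaults are hypothetical, and the "forecast lower bound" parameter is left out because its exact role is not described here.

```python
def forecast_sigma(sales, forecast, stddev,
                   n_upper=3.0, n_lower=3.0, min_history=0):
    """Clip each sales value into [forecast - n_lower * stddev,
    forecast + n_upper * stddev] (assumed form of the bounds)."""
    if len(sales) < min_history:
        return list(sales)         # not enough history: leave data unfiltered
    result = []
    for s, f, sd in zip(sales, forecast, stddev):
        upper = f + n_upper * sd
        lower = f - n_lower * sd
        result.append(min(max(s, lower), upper))
    return result


# A spike of 100 and a dip of 10 are pulled back to the band 40..60.
print(forecast_sigma([100, 10, 52], [50, 50, 50], [5, 5, 5],
                     n_upper=2, n_lower=2))
```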
Like Forecast Sigma, Forecast Sigma Event removes spikes in recent data when no indicators are available but also takes outage as input.
Inputs: outage, forecast, confidence intervals
Optional parameters:
number of stddev for upper bound
number of stddev for lower bound
forecast lower bound
minimum history required for filtering
Formula:
If outage is on:
LSOVER = forecast
Otherwise, if the difference between the sales and the forecast is larger than a threshold, the override value is brought within bounds around the forecast.
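The two branches above can be sketched as follows. The outage branch follows the stated formula (LSOVER = forecast); the clipping branch assumes the same standard-deviation band used to illustrate Forecast Sigma, which is an assumption rather than the documented bound. The function name and defaults are hypothetical.

```python
def forecast_sigma_event(sales, forecast, stddev, outage,
                         n_upper=3.0, n_lower=3.0):
    """Outage periods take the forecast; other periods are clipped into
    an assumed band of forecast +/- n * stddev."""
    result = []
    for s, f, sd, out in zip(sales, forecast, stddev, outage):
        if out:
            result.append(f)                      # LSOVER = forecast
        else:
            lower = f - n_lower * sd
            upper = f + n_upper * sd
            result.append(min(max(s, lower), upper))
    return result


print(forecast_sigma_event([0, 100, 52], [50, 50, 50], [5, 5, 5],
                           [True, False, False], n_upper=2, n_lower=2))
```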
DePrice removes the pricing effects.
Inputs: price, maximum price
Optional parameters: none
Formula:
Smoothed = original * (price / maxprice)^2
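The DePrice formula can be applied period by period as in the sketch below. The function name is hypothetical, and treating the maximum price as a single value for the whole series is an assumption.

```python
def deprice(original, price, max_price):
    """Scale each period's sales by (price / max_price) squared."""
    return [o * (p / max_price) ** 2 for o, p in zip(original, price)]


# Sales at half the maximum price are scaled to a quarter;
# sales at the maximum price are unchanged.
print(deprice([100, 80], [5.0, 10.0], 10.0))
```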
This Preprocessing method clears the Preprocessing adjustments from previous runs and also clears the lsover measure.
LSOVER(t) = 0
LS(t) = 0
This Preprocessing method does not filter the source data. The preprocessing adjustments are cleared and lsover is set to the source data.
LS(t) = 0
LSOVER(t) = SRC(t)
When using Preprocessing to correct for stock-outs, the system expects out-of-stock indicators from a merchandising system like RMS. The system can be set up for automatic adjustment of sales history to correct for stock-outs as well as for manual user overrides under exception cases.
When set to automatically adjust sales history to correct stock-outs, Preprocessing takes into account trending and seasonality and adjusts the sales that were flagged by the out-of-stock indicator to reflect a more typical sales quantity.
When using Preprocessing to correct for outliers, the system expects outlier indicators. These are typically loaded.
Preprocessing adjusts promotional data in much the same way as it adjusts stock-outs. Typically, historical data shows a higher rate of sales during promotional periods. If these spikes were left in the historical sales data and loaded into the RDF Causal Engine, the baseline forecast created from this data would reflect similar spikes in future sales.