This chapter explains the proport mechanism.
This chapter covers the following topics:
The proport mechanism computes and stores information about the average demand per day for each item-location combination. Demantra uses this information whenever it needs to split higher-level data across the relevant lowest-level members.
For example, if one item-location combination had four times as many sales as another, the former combination should receive four times as much of the forecast.
In general, Demantra splits data whenever necessary, including the following occasions:
When the Analytical Engine generates a forecast at an aggregated level.
When data is imported at an aggregated level.
When users perform chaining at an aggregated level.
This chapter describes how matrix proportions are calculated.
Demantra provides three general ways to specify the relative proportions of different combinations:
Kind of proportions | Details | When used |
---|---|---|
Matrix proportions or stored proportions | Proportions that Demantra calculates and stores for later use. The calculation is based upon the demand, but also considers recent average demand, month-to-month variations, and so on. Various parameters and combination-specific flags control exactly how proport works. These proportions are averages are and are not as good as actual proportions. |
Option when importing data Automatically used when forecast must be created at higher level |
Actual proportions | Use the proportions of the Demand series. | Option when importing data |
Proportions of a reference series | Use the proportions of a reference series, typically: Demand (suitable for a historical series) Final Plan (suitable for a forecast series) |
When data is edited at an aggregated level |
SALES_DATA based proportion | Proportion that Demantra uses for future forecast splits. This proportion is used when the SKUs have insufficient historical demand data, and the engine forecasts several SKUs aggregated together with older SKUs providing history for the SKUs in use. In nodes where all participants are marked for SALES_DATA based proportion, the engine splits the forecast using the lowest level value of the series called Future Proportions. | When the future forecast is split based on configurations set by the users. |
You can tune the proport mechanism as follows;
Tuning how the proport mechanism smooths data variations form month to month. You can tune these settings globally or separately for individual combinations.
Specifying how the proport mechanism handles null data and other data issues.
Specifying which combinations the proport mechanism should consider when it runs.
The following sections describe how proport handles these steps.
For each combination, Demantra computes a rolling average of the most recent demand over some span of time. This rolling average (glob_prop) depends on the following:
After calculating the rolling average demand for each combination, Demantra calculates the average demand per day (P1, P2, ..., P12) averaged over a month's time. This calculation consists of three steps:
Calculating the average daily demand for each month of the year.
Adjusting the level of these averages to account for any overall trend. This calculation uses the rolling average demand.
Smoothing these averages to account for month-to-month variations. This calculation also uses the rolling average demand.
Note: In weekly and daily systems, the proport mechanism scales the monthly proportions (P1, ..., P12) by dividing by the number of days in the month, as appropriate.
For each combination, Demantra calculates the following average demand per day averaged over a month's time, for each month of the year. (This data is stored in mdp_matrix):
P1 | Average demand per day for the month of January |
P2 | Average demand per day for the month of February |
and so on |
Depending on your business, you may want to smooth out the month-to-month variations. The delta field in mdp_matrix specifies a weight for a given item-location combination. Demantra uses this weight to even out these variations as in the following example:
P1 = glob_prop * delta + (old P1) * (1 - delta)
The delta field must be a floating-point number, anywhere between 0 and 1, inclusive. The larger it is, the more you smooth out the day-to-day variations.
The def_delta parameter specifies the default value for the delta parameter for any new combinations.
These smoothed proportions are stored in mdp_matrix follows, overwriting the old P1, P2, ... fields:
P1 | Smoothed, level-adjusted average daily demand for the month of January |
P2 | Smoothed, level-adjusted average daily demand for the month of February |
and so on |
Starting with the average daily demands for each item-location combination for each month of the year, Demantra considers the change in level over the past year and adjusts the level of the proportions accordingly.
For a simple example, consider the following historical daily demand, over the last year:
Jan | Feb | Mar | Apr | May | Jun | Jul | Aug | Sep | Oct | Nov | Dec |
---|---|---|---|---|---|---|---|---|---|---|---|
30 | 32 | 31 | 34 | 35 | 37 | 38 | 36 | 39 | 41 | 39 | 43 |
A slight upwards trend is fairly obvious, especially when this demand is graphed.
To keep matters simple, suppose that we have only one year's worth of data. If we just used demand from last January (that is, 30) for the next January, we would underestimate the demand, because the overall level of demand has increased over the last year. Therefore, Demantra calculates the level-adjusted daily demand for January as follows:
P1 = (smoothed P1) * (rolling average) * 12 / (sum of all average demand)
Here smoothed P1 is the average demand per day for January as calculated previously and then smoothed as described in “Smoothing Out Variations”. These level-adjusted, “normalized” proportions are stored in mdp_matrix, overwriting the old P1, P2, ... fields:
P1 | Level-adjusted, smoothed average daily demand for the month of January |
P2 | Level-adjusted, smoothed average daily demand for the month of February |
and so on |
You can control how the proport mechanism handles various data issues.
You must specify how to treat missing dates, which can have a large effect on the averages. If there are no sales for an item-location combination for a given month, that can either mean there truly were zero sales or it can indicate a problem with the data. You specify the proport_missing parameter as follows.
If this parameter is equal to 0, dates with null data are treated as dates with zero sales. That is, suppose that you have three months worth of data as follows: 30, null, 60. If proport_missing equals 0, the average of these three months is calculated as 30 (or [30+0+60]/3).
If this parameter is equal to 1, dates with null data are ignored. Using the old example, if proport_missing equals 1, the average of these three months is calculated as 45 (or [30+60]/2). This is mathematically equivalent to assuming that the missing month has average sales (45).
Similarly, suppose you have weekly sales data, but you do not have data for all the weeks in a given month. If proport_missing equals 0, the weeks with null sales are treated as having zero sales. If proport_missing equals 1, the weeks that have null sales are considered as having average sales.
For data with many missing observations, it is likely that the null sales actually represent no sales; in this case, it is suitable to specify proport_missing as 0.
For data with only a few missing observations, it may be more likely that the missing observations represent data problems. In this case, it would be better to specify proport_missing as 1 and ignore the missing observations.
For any given item-location combination, the sales may not include data for every month of the year. For example, for a given item-location combination, suppose that you have 24 months worth of data, but that there were no sales in November or December—for any year. This means that you have ten distinct months represented in the history.
The proport_threshold parameter specifies the minimum number of distinct months that must be present in the sales data for any given item-location combination. For example, if you have data for three Januaries, that counts as one observation for January.
Then:
If the history does contain enough distinct months, the averages are calculated as normal for the months that have non-null data. You must specify what values to use for the other months; see “Specifying How to Handle Missing Time Buckets”.
If the history does not contain enough distinct months, Demantra checks the value of the proport_missing parameter and then does the following:
If proport_missing equals 0, Demantra sets the averages equal to the glob_prop * delta.
If proport_missing equals 1, Demantra sets all averages equal to the rolling average (glob_prop).
In the preceding example, suppose that you have monthly data and suppose that proport_threshold is 11. In this case, this combination does not have data for enough distinct months, and all monthly proportions are equal to glob_prop. In contrast, suppose that proport_threshold equals 8 instead. In this case, the monthly averages are calculated as normal for the months with non-null data.
You use the proport_spread parameter to specify what value to use for any bucket that has null data.
Suppose that we have the following data for a given item-location combination:
Nov 2002 | Dec 2002 | Jan 2003 | Feb 2003 | Mar 2003 | Apr 2003 | May 2003 | Jun 2003 | Jul 2003 | Aug 2003 |
---|---|---|---|---|---|---|---|---|---|
7 no data before this month |
0 | 1 | 5 | 9 | 3 | 2 | null | 1 | no data yet |
Before we examine the proport_spread parameter, it is worthwhile to rearrange this information and identify which months are missing:
P1 | P2 | P3 | P4 | P5 | P6 | P7 | P8 | P9 | P10 | P11 | P12 |
---|---|---|---|---|---|---|---|---|---|---|---|
1 | 5 | 9 | 3 | 2 | missing | 1 | missing | missing | missing | 7 | 0 |
The proport_spread parameter can equal any of the following values:
If proport_spread is 0, missing months receive 0 proportions. In this case, Demantra calculates the monthly averages as follows:
P1 | P2 | P3 | P4 | P5 | P6 | P7 | P8 | P9 | P10 | P11 | P12 |
---|---|---|---|---|---|---|---|---|---|---|---|
1 | 5 | 9 | 3 | 2 | 0 | 1 | 0 | 0 | 0 | 7 | 0 |
If proport_spread is 1, Demantra checks the value of the proport_missing parameter and then does the following:
If proport_missing equals 0, then missing months receive glob_prop*delta. In this case, Demantra calculates the monthly averages as follows:
P1 | P2 | P3 | P4 | P5 | P6 | P7 | P8 | P9 | P10 | P11 | P12 |
---|---|---|---|---|---|---|---|---|---|---|---|
1 | 5 | 9 | 3 | 2 | glob_prop*delta | 1 | glob_prop*delta | glob_prop*delta | glob_prop*delta | 7 | 0 |
If proport_missing equals 1, then missing months receive the rolling average (glob_prop). In this case, Demantra calculates the monthly averages as follows:
P1 | P2 | P3 | P4 | P5 | P6 | P7 | P8 | P9 | P10 | P11 | P12 |
---|---|---|---|---|---|---|---|---|---|---|---|
1 | 5 | 9 | 3 | 2 | glob_prop | 1 | glob_prop | glob_prop | glob_prop | 7 | 0 |
If proport_spread is 2, Demantra considers whether a missing month could have been included within the history. (This setting has an effect only if you have data for less than a full year as in our example.)
First, Demantra uses 0 for missing months that could have been included within the partial year.
For missing months that could not have been included, Demantra checks the value of the proport_missing parameter and then does the following:
If proport_missing equals 0, then missing months receive glob_prop*delta.
If proport_missing equals 1, then missing months receive the rolling average (glob_prop).
In our example, the history started in November 2002 and continues through July 2003. That span of time does not include the months of August, September, and October, so those missing months receive glob_prop; the missing month of June, on the other hand, receives 0. In this case, Demantra calculates the monthly averages as follows (assuming that proport_missing does not equal 1)
P1 | P2 | P3 | P4 | P5 | P6 | P7 | P8 | P9 | P10 | P11 | P12 |
---|---|---|---|---|---|---|---|---|---|---|---|
1 | 5 | 9 | 3 | 2 | 0 | 1 | glob_prop | glob_prop | glob_prop | 7 | 0 |
Note: If you set proport_missing to consider omitted values as null, there is no reason to set the Run_full_matrix_proport parameter to 1.
By default, the proport mechanism loads only combinations with non-null values, and it recalculates proportions based only on the loaded combinations. In some cases, this is correct; it may not be desirable to recompute proportions frequently if your data is intermittent.
In other cases, however, a null value really does mean zero sales, and the split should be recalculated accordingly. Assume that proport is considering three months of history data (hist_glob_prop equals 3). Consider this example:
Combination | January | February | March | April |
---|---|---|---|---|
1 (Product A at store A) | 500 | 500 | 400 | (null) |
2 (Product A at store B) | 50 | 60 | 40 | 70 |
In this case, if you calculated proportions in March, you would calculate the split between combination 1 and combination 2 as (500+500+400)/(50+60+40), which is appropriate.
However, if you calculated proportions in April, you would not load new member data for combination 1. In this case, the system would calculate the split between combination 1 and combination 2 as (500+500+400)/(60+40+70), which means that combination 1 would receive most of the split, even though there were no sales of this product in this store in April. This is probably not what you want.
The Run_full_matrix_proport parameter specifies whether to run the proport mechanism on all the item-location combinations.
If no (0), run proport only on the combinations that have newly loaded sales data or that have been flagged (prop_changes=1) in the database.
If yes (1), the proport mechanism calculates proportions for all nodes at loading time and assumes that null values represent zero. This takes longer (possibly much longer), but avoids the miscalculation outlined previously.
If equal to 2, the proport mechanism calculates proportions for all combinations that have new_member=1.
The proport mechanism then recomputes the rolling average, individual monthly averages, and individual monthly proportions for each of those combinations, as described in “Calculating the Rolling Average Demand”.
The proport mechanism considers only real combinations. That is, it ignores combinations for which is_fictive equals 1.
The proport mechanism calculates the prediction status of each combination, in addition to calculating the proportions.
Proport supports all engine profiles. The proport procedure can be passed an engine profile id (the engine profile ID is a system-generated value generated when the user creates the engine profile).
Users should query the DB table engine_profiles to determine the engine profile Id of the profile they want to use. They can then either set this value for the 'ProportDefaultEngineProfile' system parameter, as described below, or directly call the proport procedure with the engine parameter: "exec proport (null, null, engine_id);". This call will execute proport with all the parameters associated with the engine profile being passed. If a proport relevant parameter is not found in the specific engine profile, then a value from the Base engine profile will be used.
In instances where proport is called from another process, including data load or data updates, the profile called will be based on the system parameter ProportDefaultEngineProfile. This parameter should only be modified if the proport parameter settings in the engine profile differs from the base profile. Otherwise, if proport settings for your engine profiles do not vary from the Base profile, you should leave that parameter at its current setting.
Proport searches the corresponding parameter table for parameters currently used to drive proportion and prediction_status calculations. These parameters include but are not limited to:
Start_date
Quantity_form
Proport_threshold
Def_delta
Last_date
Last_date_backup
HistoryLength
Timeunit
Dying_time
Mature_age
Hist_glob_prop
Any parameters not found in the profile specific parameter table will be retrieved from the base profile parameter table.
In instances where proport is called from another process, including data load or data updates, the profile called will be based on the system parameter ProportDefaultEngineProfile. This parameter should be modified if the proport engine profile differs from the base profile.
When determining which combinations require a forecast, general level-based combinations behave differently than sales_data-based combinations. General Level (GL) combinations have an effectivity start and end date. Combinations are only forecast for dates when they are active.
When the proport procedure is called, it is passed a profile_id to be used. Proport will first search for definitions on the init_params table matching the passed profile. Any engine parameters that are not found in this profile use the parameter values defined in init_params_0.
In cases where the data and combination tables are not specified as SALES_DATA and MDP_MATRIX, proport will generate GL-relevant proportions and prediction_status values on the specified tables.
If a supersession is detected by the use of the EngKeyDef_supersession parameter, then the proportion is split as follows:
Fit Split (Historical Forecast)
Forecast Split (Future Forecast)
For both, the split is done according to the allocation rules from the forecast level down to the lowest GL level, defined by the lowest level of the forecast tree. Once the forecast is allocated down to the lowest GL level, any underlying combinations receive equal split.
Note: When forecasting above the supersession level, each supersession group receives its part of the forecast based on historical volume proportions, similar to the split used for non-supersession groups.
When allocating the historical forecast to the lowest GL level, the split is done in the usual manner, using the historical demand values.
For example:
End of history is defined for 12/1/2009
If fit forecast generated is 4000 per month, the forecast is allocated as follows based on historical values for each combination and month.
CTO_ID | 1 | 2 | 3 | 4 | 5 |
Date | History | ||||
1/1/2009 | 1500 | ||||
2/1/2009 | 700 | ||||
3/1/2009 | 1200 | ||||
4/1/2009 | 800 | ||||
5/1/2009 | 300 | 1000 | |||
6/1/2009 | 500 | 1000 | |||
7/1/2009 | 800 | 1200 | |||
8/1/2009 | 1200 | 800 | |||
9/1/2009 | 900 | 1100 | |||
10/1/2009 | 400 | 1600 | |||
11/1/2009 | 200 | 800 | |||
12/1/2009 | 1000 | 1000 |
In the situation where there are multiple members under the GL level, the allocation of fit among those members should be based on an equal split.
Splitting the forecast is driven by proportions. General level forecasting can be more complicated because of the requirement to respect effectivity start and end dates as well as product revisions logic. In addition, all members belonging to the same revision are defined by parameter EngKeyDef_Supersession. If this parameter is defined at the same aggregation level as the EngKeyDefPK parameter, then no supersessions are being used. If the EngKeyDef_Supersession parameter is defined above the EngKeyDefPK parameter, all nodes belonging to the same EngKeyDef_Supersession node are part of the supersession.
Note: When enabled, split by series takes precedence over the split rules used below.
When allocating a forecast generated at a supersession level, the forecast is allocated to dates that are defined as active for each member of the superssion. In this scenario, when forecasting for a specific date to underlying combinations, two options are available:
Allocate forecast amongst all active revisions
Allocate forecast only to the latest active revisions
When allocating the forecast amongst all active revisions, the forecast should be split equally among the GL members whose activity dates specify that the member is active during this period.
Example: Using All Active Revisions
End of history defined as 12/1/2009
Allocation in the supersession is set to All Active Revisions
Amount of history used to calculate proportions is 12 months
A specific forecast node has five underlying combinations:
Latest Revision | Item | Effectivity Start | Effectivity End |
---|---|---|---|
AA | 1 | 1/1/2008 | 6/1/2009 |
AA | 2 | 1/1/2009 | 12/1/2009 |
AA | 3 | 5/1/2009 | 6/1/2010 |
AA | 4 | 1/1/2010 | 10/1/2010 |
AA | 5 | 6/1/2010 | 12/1/2011 |
The history is as follows:
Item | ||||
---|---|---|---|---|
1 | 2 | 3 | 4 | 5 |
History | ||||
500 | 1500 | |||
300 | 700 | |||
800 | 1200 | |||
1200 | 800 | |||
700 | 300 | 1000 | ||
500 | 500 | 1000 | ||
800 | 1200 | |||
1200 | 800 | |||
900 | 1100 | |||
400 | 1600 | |||
200 | 800 | |||
1000 | 1000 |
A forecast of 4000 is generated for each period. The future one-year forecast is allocated equally among the revisions active at each date as follows:
Proportions | ||||||
---|---|---|---|---|---|---|
Item | 1 | 2 | 3 | 4 | 5 | |
Date | Active Revisions | |||||
1/1/2010 | 3,4 | 0 | 0 | 4000/2 | 4000/2 | |
2/1/2010 | 3,4 | 0 | 0 | 4000/2 | 4000/2 | |
3/1/2010 | 3,4 | 0 | 0 | 4000/2 | 4000/2 | |
4/1/2010 | 3,4 | 0 | 0 | 4000/2 | 4000/2 | |
5/1/2010 | 3,4 | 0 | 0 | 4000/2 | 4000/2 | |
6/1/2010 | 3,4,5 | 0 | 0 | 4000/3 | 4000/3 | 4000/3 |
7/1/2010 | 4,5 | 0 | 0 | 0 | 4000/2 | 4000/2 |
8/1/2010 | 4,5 | 0 | 0 | 0 | 4000/2 | 4000/2 |
9/1/2010 | 4,5 | 0 | 0 | 0 | 4000/2 | 4000/2 |
10/1/2010 | 4,5 | 0 | 0 | 0 | 4000/2 | 4000/2 |
11/1/2010 | 5 | 0 | 0 | 0 | 0 | 4000 |
12/1/2010 | 5 | 0 | 0 | 0 | 0 | 4000 |
When the option of latest revision has been chosen, the forecast generated at a supersession level is only allocated to the latest revision active during that period. The latest revision is defined as the active GL member who has the latest effectivity start date.
Example: Using Latest Revision
End of history defined as 12/1/2009
GLPropSupsersessionMethod set to Latest Revision
A specific forecast node has five underlying combinations:
Latest Revision | Item | Efectivity Start | Effectivity End |
---|---|---|---|
AA | 1 | 1/1/2008 | 6/1/2009 |
AA | 2 | 1/1/2009 | 12/1/2009 |
AA | 3 | 5/1/2009 | 6/1/2010 |
AA | 4 | 1/1/2010 | 10/1/2010 |
AA | 5 | 6/1/2010 | 12/1/2011 |
The history is as follows:
Item | 1 | 2 | 3 | 4 | 5 |
---|---|---|---|---|---|
Date | History | ||||
1/1/2009 | 500 | 1500 | |||
2/1/2009 | 300 | 700 | |||
3/1/2009 | 800 | 1200 | |||
4/1/2009 | 1200 | 800 | |||
5/1/2009 | 700 | 300 | 1000 | ||
6/1/2009 | 500 | 500 | 1000 | ||
7/1/2009 | 800 | 1200 | |||
8/1/2009 | 1200 | 800 | |||
9/1/2009 | 900 | 1100 | |||
10/1/2009 | 400 | 1600 | |||
11/1/2009 | 200 | 800 | |||
12/1/2009 | 1000 | 1000 |
A forecast of 4000 is generated for each period. The future one-year forecast is allocated as follows:
Proportions | ||||||
---|---|---|---|---|---|---|
Item | 1 | 2 | 3 | 4 | 5 | |
Date | Latest Revision | |||||
1/1/2010 | 4 | 0 | 0 | 0 | 4000 | |
2/1/2010 | 4 | 0 | 0 | 0 | 4000 | |
3/1/2010 | 4 | 0 | 0 | 0 | 4000 | |
4/1/2010 | 4 | 0 | 0 | 0 | 4000 | |
5/1/2010 | 4 | 0 | 0 | 0 | 4000 | |
6/1/2010 | 5 | 0 | 0 | 0 | 4000 | |
7/1/2010 | 5 | 0 | 0 | 0 | 4000 | |
8/1/2010 | 5 | 0 | 0 | 0 | 4000 | |
9/1/2010 | 5 | 0 | 0 | 0 | 4000 | |
10/1/2010 | 5 | 0 | 0 | 0 | 4000 | |
11/1/2010 | 5 | 0 | 0 | 0 | 4000 | |
12/1/2010 | 5 | 0 | 0 | 0 | 4000 |
Note: In cases where supersessions exist, Oracle recommends that EngKeyDef_Supersession be set at a level aggregating the different supersession nodes together.