Advanced Predictions: Support for Regression Algorithms
Advanced Predictions now supports regression algorithms.
Applies To: FreeForm, Planning
You can choose from these algorithms:
- Linear Regression:
- Simple linear relationships.
- This method is simple and easy to understand. For example, "I want to forecast monthly sales based primarily on marketing spend and pricing. The relationship is straightforward: higher marketing spend drives higher sales, and higher prices reduce sales."
- This method is best when there are one to five drivers with clear linear relationships.
- Lasso Regression:
- Linear with many drivers.
- This method is useful when you have many drivers. It automatically identifies and selects the most impactful drivers, and removes weak drivers. For example, "I have eight potential cost drivers for our operating expenses (labor, materials, utilities, maintenance, and so on), but I'm not sure which ones actually drive the variance. I want the system to automatically identify the key drivers."
- This method is best when you have many drivers and aren't sure which drivers matter.
- Ridge Regression:
- Linear with correlated drivers.
- This method retains all drivers while weighting their relative importance, and handles correlation well. For example, "I want to forecast fee revenue using assets under management, market index performance, and trading volume. These metrics are correlated with each other but all are important drivers of our revenue."
- This method is best when drivers are correlated but all matter.
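The difference between the three algorithms can be sketched in a few lines of plain Python. This is an illustration only, not the product's implementation: the `fit` helper, the penalty strengths (`l1`, `l2`), and the data are all hypothetical. The three drivers are deliberately uncorrelated so that the result can be checked by hand, which means the sketch shows driver selection (Lasso) and shrinkage (Ridge) but not Ridge's behavior with correlated drivers.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero; return exactly 0.0 inside the threshold band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def center(values):
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def fit(columns, target, l1=0.0, l2=0.0, sweeps=100):
    """Coordinate descent on centered data, minimizing
    (1/2n) * sum(residual^2) + l1 * sum(|w|) + (l2/2) * sum(w^2).
    l1 = l2 = 0 gives ordinary least squares, l1 > 0 a lasso-style fit,
    and l2 > 0 a ridge-style fit."""
    n = len(target)
    cols = [center(c) for c in columns]
    residual = center(target)
    weights = [0.0] * len(cols)
    for _ in range(sweeps):
        for j, col in enumerate(cols):
            # Correlation of driver j with the residual, adding back its own effect.
            rho = sum(c * (r + weights[j] * c) for c, r in zip(col, residual)) / n
            scale = sum(c * c for c in col) / n
            new_weight = soft_threshold(rho, l1) / (scale + l2)
            delta = new_weight - weights[j]
            if delta:
                residual = [r - delta * c for r, c in zip(residual, col)]
            weights[j] = new_weight
    return weights


# Hypothetical data: two strong drivers and one weak driver.
driver_a = [1, 1, 1, 1, -1, -1, -1, -1]
driver_b = [1, 1, -1, -1, 1, 1, -1, -1]
driver_c = [1, -1, 1, -1, 1, -1, 1, -1]
target = [4 * a + 2 * b + 0.1 * c for a, b, c in zip(driver_a, driver_b, driver_c)]
drivers = [driver_a, driver_b, driver_c]

ols_w = fit(drivers, target)
lasso_w = fit(drivers, target, l1=0.3)
ridge_w = fit(drivers, target, l2=1.0)

print("linear:", [round(w, 3) for w in ols_w])    # [4.0, 2.0, 0.1]
print("lasso: ", [round(w, 3) for w in lasso_w])  # [3.7, 1.7, 0.0]
print("ridge: ", [round(w, 3) for w in ridge_w])  # [2.0, 1.0, 0.05]
```

In this sketch the lasso-style fit sets the weak driver's weight to exactly zero, while the ridge-style fit keeps all three drivers and shrinks each weight.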
Regression model performance is evaluated using the existing Forecast Error Metrics:
- sMAPE—Symmetric Mean Absolute Percentage Error
- MAPE—Mean Absolute Percentage Error
- RMSE—Root Mean Squared Error
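For reference, these metrics are commonly defined as follows, where A_t is the actual value, F_t is the predicted value, and n is the number of data points compared. These are textbook forms. sMAPE in particular has several variants, and the exact variant used by Advanced Predictions is not stated here, so treat the first formula as an assumption.

```latex
\begin{aligned}
\mathrm{sMAPE} &= \frac{100}{n}\sum_{t=1}^{n}\frac{2\,\lvert F_t - A_t\rvert}{\lvert A_t\rvert + \lvert F_t\rvert} \\
\mathrm{MAPE}  &= \frac{100}{n}\sum_{t=1}^{n}\left\lvert\frac{A_t - F_t}{A_t}\right\rvert \\
\mathrm{RMSE}  &= \sqrt{\frac{1}{n}\sum_{t=1}^{n}\left(A_t - F_t\right)^2}
\end{aligned}
```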
For regression models, the reported error measure is calculated on a holdout period created by a 90:10 split. A portion of the historical data is set aside, excluded from model training, and used instead to validate model performance:
- Historical data is split into training data (used to build the model) and holdout data (used for validation).
- The model is trained on the training dataset.
- Predictions are generated for the holdout period.
- Predictions are compared with actual values to measure accuracy. The accuracy and fitted values are calculated on the holdout period.
For example, if you have five years of monthly historical data (60 data points):
- These 60 data points are split 90:10. 90% of the data points (54) are used for training the model and 10% of the data points (6) are used as test data points.
- The test model is trained on the 54 training data points and generates predictions for the 6 test data points.
- The predictions and test data are compared and the following error measures are calculated: sMAPE, MAPE, and RMSE.
- Accuracy is calculated as 100 – sMAPE.
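The 60-point example can be sketched in plain Python as follows. Several things here are assumptions rather than documented behavior: the holdout is taken as the most recent 10% of the data, the model is a one-driver least-squares line, sMAPE uses the common 2|F - A| / (|A| + |F|) form, and the spend and sales figures are made up. The helper names are hypothetical.

```python
import math


def split_holdout(rows, train_fraction=0.9):
    """Assumed chronological split: earliest rows train, latest rows hold out."""
    n_train = round(len(rows) * train_fraction)
    return rows[:n_train], rows[n_train:]


def fit_line(xs, ys):
    """Ordinary least squares for a single driver; returns (slope, intercept)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def smape(actual, predicted):
    # Assumed variant; the product's exact formula may differ.
    terms = (2 * abs(p - a) / (abs(a) + abs(p)) for a, p in zip(actual, predicted))
    return 100 / len(actual) * sum(terms)


def mape(actual, predicted):
    return 100 / len(actual) * sum(abs((a - p) / a) for a, p in zip(actual, predicted))


def rmse(actual, predicted):
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Hypothetical history: 60 months of marketing spend (driver) and sales (target).
spend = [10 + (i % 12) for i in range(60)]
sales = [50 + 3 * s + ((i * 7) % 5 - 2) for i, s in enumerate(spend)]
rows = list(zip(spend, sales))

train, holdout = split_holdout(rows)  # 54 training points, 6 holdout points
slope, intercept = fit_line([x for x, _ in train], [y for _, y in train])

actual = [y for _, y in holdout]
predicted = [intercept + slope * x for x, _ in holdout]

smape_value = smape(actual, predicted)
mape_value = mape(actual, predicted)
rmse_value = rmse(actual, predicted)
accuracy = 100 - smape_value

print(f"train={len(train)} holdout={len(holdout)}")
print(f"sMAPE={smape_value:.2f} MAPE={mape_value:.2f} RMSE={rmse_value:.2f}")
print(f"accuracy={accuracy:.2f}")
```

The model never sees the six holdout points during fitting, so the reported metrics reflect how well it predicts data it was not trained on.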
Business Benefit: Regression algorithms are useful in some scenarios where time series forecasting is not applicable. For example, use regression algorithms in these cases:
- Specific drivers matter more than time patterns.
- You have clear causal relationships (marketing drives sales, price affects demand).
- You want to understand "what happens if we change X."
- You need to explain driver impacts to stakeholders.
- Time patterns are weak or irregular.
Tips and considerations
Regression algorithms work with feature engineering and feature selection.
Key resources
- Administering Planning
- Administering FreeForm