Before you Begin

This 20-minute tutorial shows you how to build a machine learning (ML) model in R, one of the most widely used tools for building ML models, and how to save the model as a PMML file.

Background

With Bring Your Own ML, EPM administrators can import a fully trained ML model and deploy it to a Planning business process. Planners can then leverage robust, ML-based forecasting that uses advanced predictive modeling techniques to generate more accurate forecasts.

Data scientists gather and prepare historical data related to a business problem, train the algorithm, and generate a PMML file (Predictive Model Markup Language, a standard language used to represent predictive models) using a third-party tool. These predictive analytic and machine learning models use statistical techniques or ML algorithms to learn patterns hidden in large volumes of historical data. They then use the knowledge acquired during training to predict the existence of known patterns in new data.

EPM administrators can then import and configure the fully trained ML model, which generates two Groovy rules. Administrators attach the rule to a form or dashboard, or schedule a job to generate prediction results on a regular basis. This puts the benefits of machine learning and the power of data science into the hands of business users, enhancing the planning and budgeting process and leading to better business decisions.

For example, you can predict product volume for an entity, using key drivers such as average sales price, planned spend on promotions and advertising, historical volumes, and estimated industry volumes.

In this tutorial, you create an ML model in R and save it as a PMML file.

What Do You Need?

You must have RStudio or R 4.2.1 (or higher) installed and running on your computer. This tutorial uses RStudio Pro 2022.07. Verify that the pmml and r2pmml packages are installed.
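One way to verify the packages is to ask R whether each one can be loaded. The following is a minimal sketch using base R only; the variable names are illustrative, and the install command is left commented out so that nothing is installed without your consent. Note that r2pmml also relies on a Java runtime being available on your computer.

```r
# Packages this tutorial relies on for the PMML export.
required_pkgs <- c("pmml", "r2pmml")

# requireNamespace() returns TRUE when a package is installed and loadable.
installed <- vapply(required_pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- required_pkgs[!installed]

if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
  # Uncomment to install from CRAN:
  # install.packages(missing_pkgs)
} else {
  message("All required packages are installed.")
}
```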

  • Download this csv file that contains a sample dataset.
  • Download this R script file that contains commands to create a simple R model and export to PMML.
  • Save both files in a folder named ml on your C drive (C:/ml). This folder will be the working directory.

    Obtaining and reviewing the training data for the ML model

    The first step in the process is to prepare and review the historical data used to train the ML model. This tutorial uses the same dataset as the Importing ML Models tutorial. The training data is provided in a CSV file (“volume_forecasting_training_jan_2020.csv”). The dataset uses four input drivers (features) to predict volume, which is the target variable, that is, the variable that we want to predict.

    1. Open the Volume_forecasting_training_Jan.2020.csv file and review the data.
    2. Close the data file.
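If you prefer to review the data in R rather than in a spreadsheet, a sketch along these lines may help. The helper name review_training_data is hypothetical, and the path assumes the file name quoted above and the C:/ml folder from this tutorial; adjust both to match your download. The helper makes no assumptions about column names.

```r
# Hypothetical helper: load a CSV file and print a quick overview of it.
review_training_data <- function(path) {
  if (!file.exists(path)) {
    message("File not found: ", path)
    return(invisible(NULL))
  }
  data <- read.csv(path)
  str(data)                    # column names and data types
  print(summary(data))         # ranges and distributions per column
  print(colSums(is.na(data)))  # count of missing values per column
  invisible(data)
}

# Assumed location and file name; change these to match your setup.
train_preview <- review_training_data("C:/ml/volume_forecasting_training_jan_2020.csv")
```

Checking data types and missing values before training helps you catch columns that were read as text when numbers were expected.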

    Reviewing and executing the R script

    Once historical data is finalized for ML training, the next step is to create and execute an R script to build the ML model. For the purpose of this tutorial, an R script has been created for you (“volume_forecasting_ml.R”). The R script does the following:

  • Loads required libraries “r2pmml” and “pmml” (Packages in R that are needed to save the ML model as a PMML file)
  • Imports the historical training data into R
  • Preprocesses the data (converts a variable into factor data type)
  • Builds a multiple linear regression model to predict “volume” using the “lm” function
  • Views a summary of the regression model including its coefficients and R squared
  • Saves the model as a PMML file using two different options (using the pmml and r2pmml packages)
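
The import, preprocessing, modeling, and summary steps listed above can be sketched as follows. This is not the provided script: it uses synthetic data so that it runs without the CSV file, and the column names (Product, AvgSalesPrice, PromoSpend, IndustryVolume, Volume) are assumptions made for illustration only.

```r
# Synthetic stand-in for the training data; all column names are hypothetical.
set.seed(42)
n <- 200
train <- data.frame(
  Product = sample(c("P1", "P2", "P3"), n, replace = TRUE),
  AvgSalesPrice = runif(n, 10, 50),
  PromoSpend = runif(n, 0, 1000),
  IndustryVolume = runif(n, 5000, 20000),
  stringsAsFactors = FALSE
)
train$Volume <- 500 - 4 * train$AvgSalesPrice + 0.3 * train$PromoSpend +
  0.02 * train$IndustryVolume + rnorm(n, sd = 25)

# Preprocess: treat Product as a categorical variable (factor).
train$Product <- as.factor(train$Product)

# Fit a linear regression of Volume on all four drivers.
model <- lm(Volume ~ Product + AvgSalesPrice + PromoSpend + IndustryVolume,
            data = train)

# Inspect the fitted coefficients and the R squared value.
model_summary <- summary(model)
print(coef(model))
print(model_summary$r.squared)
```

The sketch passes the data frame through the data argument of lm instead of attaching it, which keeps column lookups explicit; the provided script may handle this differently.
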
    1. In RStudio Pro, select File, then Open File. In the Open file dialog box, select the volume_forecasting_ml.R script.
    2. With the script open, place the cursor on line 1, then click Run. The console shows that the r2pmml library was loaded, and the cursor moves to the next line in the script.
    3. Click Run to execute line 2. The pmml library is loaded.
    4. Click Run to execute Line 4 (Line 3 is a comment). The working directory is set to C:/ml.
    5. Click Run three times to execute lines 6, 7, and 8. These commands import the dataset, convert the Product column to a factor, and attach the data.
    6. Optional: Click train in the Environment pane to view the dataset.
    7. Click Run twice to execute lines 10 and 11. These commands create the model and show a summary of it.
    8. Click Run twice to save the model as a PMML file, once with the pmml package and once with the r2pmml package.
    9. You can now import the PMML file into the Planning business process, as outlined in the Importing ML Models tutorial.
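
For reference, the two export options used in the save step might look like the sketch below. It is not the provided script: the model is fitted on a small synthetic dataset with hypothetical column names, the output file names are illustrative, and files are written to a temporary folder rather than C:/ml. It assumes a pmml release that provides save_pmml, and it skips an option when the package (or, for r2pmml, a Java runtime) is not available.

```r
# Small synthetic model so the sketch runs on its own; names are hypothetical.
set.seed(1)
demo <- data.frame(
  Product = factor(sample(c("P1", "P2"), 50, replace = TRUE)),
  AvgSalesPrice = runif(50, 10, 50)
)
demo$Volume <- 300 - 3 * demo$AvgSalesPrice + rnorm(50, sd = 5)
model <- lm(Volume ~ Product + AvgSalesPrice, data = demo)

out_dir <- tempdir()
written <- character(0)

# Option 1: pmml package. Convert the model to PMML, then save it to a file.
if (requireNamespace("pmml", quietly = TRUE)) {
  path <- file.path(out_dir, "volume_model_pmml.pmml")
  ok <- tryCatch({
    pmml::save_pmml(pmml::pmml(model), path)
    TRUE
  }, error = function(e) FALSE)
  if (ok && file.exists(path)) written <- c(written, path)
}

# Option 2: r2pmml package. Convert and write in one call (needs Java).
if (requireNamespace("r2pmml", quietly = TRUE)) {
  path <- file.path(out_dir, "volume_model_r2pmml.pmml")
  ok <- tryCatch({
    r2pmml::r2pmml(model, path)
    TRUE
  }, error = function(e) FALSE)
  if (ok && file.exists(path)) written <- c(written, path)
}

print(written)
```

Either file is a plain XML document, so you can open it in a text editor to confirm that the model's inputs and coefficients were written before importing it.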

    Learn More