Before you Begin
This 20-minute tutorial shows you how to build a machine learning (ML) model using R, one of the most commonly used tools for building ML models, and how to save the model as a PMML file.
Background
With Bring Your Own ML, EPM administrators can import a fully trained ML model and deploy it to a Planning business process. Planners can then leverage robust, ML-based forecasting that uses advanced predictive modeling techniques to generate more accurate forecasts.
Data scientists gather and prepare historical data related to a business problem, train the algorithm, and use a third-party tool to generate a PMML file (Predictive Model Markup Language, a standard language for representing predictive models). These predictive analytic models and machine learning models use statistical techniques or ML algorithms to learn patterns hidden in large volumes of historical data. Predictive analytic models then use the knowledge acquired during training to detect known patterns in new data.
EPM administrators can then import and configure the fully trained ML model, which generates two Groovy rules. Administrators attach the rule to a form or dashboard, or schedule a job to generate prediction results on a regular basis. This puts the benefits of machine learning and the power of data science into the hands of business users, enhancing the planning and budgeting process and leading to better business decisions.
For example, you can predict product volume for an entity, using key drivers such as average sales price, planned spend on promotions and advertising, historical volumes, and estimated industry volumes.
In this tutorial, you create an ML model in R and save it as a PMML file.
What Do You Need?
You must have RStudio or R 4.2.1 (or higher) installed and running on your computer. This tutorial uses RStudio Pro 2022.07. Verify that the pmml and r2pmml packages are installed.
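One way to verify the packages is from the R console. The following is a minimal sketch, assuming the lowercase CRAN package names pmml and r2pmml; note that r2pmml also relies on a Java runtime being available on your computer.

```r
# Packages this tutorial relies on (CRAN names are lowercase).
required <- c("pmml", "r2pmml")

# requireNamespace() returns TRUE when a package can be loaded,
# without attaching it to the search path.
installed <- vapply(required, requireNamespace, logical(1), quietly = TRUE)
print(installed)

not_installed <- required[!installed]
if (length(not_installed) > 0) {
  message("Missing packages: ", paste(not_installed, collapse = ", "))
  # Uncomment to install the missing packages from CRAN:
  # install.packages(not_installed)
} else {
  message("All required packages are installed.")
}
```

If a package is reported as missing, install it before you continue with the tutorial.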
Obtaining and reviewing the training data for the ML model
The first step in the process is to prepare and review the historical data that will be used to train the ML model. This tutorial uses the same dataset as the Importing ML Models tutorial. The training data is provided in a CSV file (“volume_forecasting_training_jan_2020.csv”). The dataset uses four input drivers (features) to predict volume, which is the target variable (that is, the variable that we want to predict).
- Open the volume_forecasting_training_jan_2020.csv file and review the data.
- Close the data file.
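If you prefer to review the data in R rather than in a spreadsheet, the following sketch shows one possible approach. The helper name review_training_data is hypothetical (it is not part of the tutorial files), and the path is an assumption; adjust it to wherever you saved the CSV file.

```r
# Hypothetical helper: prints a quick overview of a training CSV.
# Returns the data frame invisibly, or NULL if the file is not found.
review_training_data <- function(path) {
  if (!file.exists(path)) {
    message("File not found: ", path)
    return(invisible(NULL))
  }
  train <- read.csv(path)
  str(train)                     # column names and data types
  print(summary(train))          # ranges and quartiles for each column
  print(head(train))             # first few rows
  print(colSums(is.na(train)))   # count of missing values per column
  invisible(train)
}

# File name as given in this tutorial; change the path if needed.
review_training_data("volume_forecasting_training_jan_2020.csv")
```

Checking column types and missing values at this point helps catch data problems before the model is trained.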
Reviewing and executing the R script
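Once the historical data is finalized for ML training, the next step is to create and execute an R script that builds the ML model. For this tutorial, an R script has been created for you (“volume_forecasting_ml.R”). The script loads the r2pmml and pmml libraries, sets the working directory, imports and prepares the training data, creates the model, and saves it as a PMML file. To run the script line by line: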
- In RStudio Pro, select File, then Open File. In the Open file dialog box, select the volume_forecasting_ml.R script.
- With the script open, move the cursor to line 1, then click Run. The console shows that the r2pmml library was loaded, and the cursor moves to the next line in the script.
- Click Run to execute line 2. The pmml library is loaded.
- Click Run to execute line 4 (line 3 is a comment). The working directory is set to C:/ml.
- Click Run three times to execute lines 6, 7, and 8. These commands import the dataset, convert the type of the Product column, and attach the data.
- Optional: Click train to view the dataset.
- Click Run twice to execute lines 10 and 11. These commands create the model and show a summary.
- Click Run twice to save the model as a PMML file (using the pmml and r2pmml packages).
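The contents of volume_forecasting_ml.R are not reproduced in this tutorial. The following sketch only illustrates the kind of steps described above and is not the actual script. It makes several assumptions: the model is a linear regression fitted with lm(), the column names are hypothetical, and a small synthetic data frame stands in for the CSV file so that the sketch runs on its own. It also passes the data frame through the data argument instead of attaching it, and writes to a temporary folder instead of C:/ml.

```r
# Stand-in training data (synthetic values, hypothetical column names).
# With the tutorial file you would instead import the CSV, for example:
#   train <- read.csv("volume_forecasting_training_jan_2020.csv")
set.seed(42)
n <- 60
train <- data.frame(
  Product       = rep(c("P1", "P2", "P3"), length.out = n),
  AvgSalesPrice = runif(n, 5, 15),
  PromoSpend    = runif(n, 0, 1000),
  IndustryVol   = runif(n, 1e4, 5e4)
)
train$Volume <- 500 - 20 * train$AvgSalesPrice + 0.3 * train$PromoSpend +
  0.01 * train$IndustryVol + rnorm(n, sd = 25)

# Treat Product as a categorical variable (the step that modifies the
# Product column).
train$Product <- as.factor(train$Product)

# Create the model and show a summary. Using lm() is an assumption.
model <- lm(Volume ~ Product + AvgSalesPrice + PromoSpend + IndustryVol,
            data = train)
print(summary(model))

# Save the model as a PMML file (output file name is hypothetical).
pmml_file <- file.path(tempdir(), "volume_forecasting_model.pmml")
if (requireNamespace("pmml", quietly = TRUE)) {
  # pmml() converts the fitted model to an XML document; saveXML() writes it.
  XML::saveXML(pmml::pmml(model), file = pmml_file)
  message("PMML written to ", pmml_file)
} else {
  message("Package 'pmml' is not installed; skipping export.")
}
# Alternative export with r2pmml (requires Java):
#   r2pmml::r2pmml(model, pmml_file)
```

After the export, you can open the resulting file in a text editor to confirm that it is an XML document with a PMML root element.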
The PMML file can now be imported into the Planning business process as outlined in the Importing ML Models tutorial.
Learn More
Building ML models and PMML files
F60498-01
July 2022
Copyright © 2022, Oracle and/or its affiliates.