Creating a Model Using R Scripted Technique
- Select the Add icon from the Model Management toolbar. The Model Definition window is displayed. The Add button is disabled if you have selected any Model ID in the grid.
Figure 9-3 Model Definition page
- Enter the Model Definition Details. The common fields are described in the following table. The grid below the Model Details section displays the various tabs available for the selected technique. To update the required information, see the following sections:
Table 9-1 Model Definition - Field and its Description
Field: Model Name
Description: Specify a model name for the model definition. Model Name is case sensitive and does not allow duplication; names that differ only in case count as duplicates. For example, the model name "Linear Regression" is not allowed if a model with the name "linear regression" exists. Ensure that the name contains no special characters such as `, {, }, ", ', ~, <, >, /, or \, and no multiple spaces.

Field: Model Description
Description: Enter a description of the model.

Field: Do you like to script the model?
Description: Select the checkbox to script the model in the Model Script pane.

Field: Model Objective
Description: Select the Model Objective from the drop-down list. You can also click the Add icon to create a Model Objective.

Field: Technique
Description: This field is disabled if you have selected to script the model.
- Click to open the Technique Selection window. The pre-packaged techniques and the user-defined (registered and authorized) R techniques are listed in the Techniques pane.
- Click the + icon to expand the technique heading groups.
- Select the required technique and click the Forward Arrow icon.
- Click OK. The selected Technique details are displayed in the Model Definition window.
If you have selected an R technique, click the Book icon to view the script.

Field: Dataset
Description: By default, the dataset of the Sandbox is displayed. You can change the dataset if necessary. Dataset selection is mandatory for models based on R scripted techniques if variables are declared in the R script.
- Click to open the Dataset Selection window. The available datasets are listed in the Datasets pane.
- Select a dataset and click to view the details of the selected dataset.
  Click to create a new dataset. For more details on creating a dataset, refer to the Creating Data Set section in Oracle Financial Services Analytical Applications Infrastructure User Guide. You can create a dataset using any of the tables that are part of the production information domain. However, if you create a dataset with a table that is not part of the Sandbox and create a model using that dataset, you must deploy the model to the production Infodom and execute it there.
- Select the required dataset on which the model is to be based and click the Forward Arrow icon. Ensure that the selected dataset is loaded with data; otherwise, model execution fails. You can select multiple datasets for models executed using the Standard R Engine. If multiple datasets are used, use at least one variable from each dataset.
- Click OK.
Note:
Datasets based on Derived Entities are not supported.

Field: Language
Description: This field is not displayed for technique-based models. Select the scripting language from the drop-down list. The options are:
- R
- Input Data Type

Field: Type
Description: This field is not displayed for technique-based models. Select the type of engine from the drop-down list. The options are:
- Standard R Engine
- ORE Engine. This option is not displayed in Hive-based Infodoms.

Field: Input Data Type
Description: This field is displayed only in Hive-based Infodoms, for models based on R scripted techniques or if you select to script the model. Select the input data type. The options are ORE Frame, Data Frame, and HDFS File.

Fields marked with a red asterisk (*) are mandatory.
- Click Save to save the model definition details after all the necessary details are updated.
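The Model Name rules in Table 9-1 can be checked before a definition is saved. The following Python sketch is only an illustration of those rules, not the product's own validation logic. The function name `validate_model_name` and the `existing_names` argument are hypothetical, and the sketch assumes that "multiple spaces" means two or more consecutive spaces.

```python
import re

# Special characters that Table 9-1 lists as not allowed in a Model Name.
FORBIDDEN_CHARS = set('`{}"\'~<>/\\')


def validate_model_name(name, existing_names):
    """Return a list of rule violations; an empty list means the name passes.

    Hypothetical helper: `existing_names` stands in for the model names
    already defined, which the application would look up itself.
    """
    problems = []

    # Rule: no special characters from the documented list.
    bad = sorted(c for c in set(name) if c in FORBIDDEN_CHARS)
    if bad:
        problems.append("special characters not allowed: " + " ".join(bad))

    # Rule: no multiple spaces (assumed to mean consecutive spaces).
    if re.search(r" {2,}", name):
        problems.append("multiple consecutive spaces are not allowed")

    # Rule: no duplicates. Following the table's example, a name that differs
    # from an existing one only in case is treated as a duplicate.
    if name.casefold() in {n.casefold() for n in existing_names}:
        problems.append("a model with this name already exists")

    return problems
```

For example, `validate_model_name("Linear Regression", ["linear regression"])` reports one violation (the duplicate), while a name such as "Credit Risk PD" with no existing models reports none.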
- Click Preview Data to view the data of the selected dataset. It displays the primary keys and the attributes/columns of the tables in the selected dataset. Only those columns which are mapped to the variables in the script are displayed.
Note:
In the case of the Hive-based Sandbox information domain, previewing data takes a long time and only 100 records are displayed.
- Click Execute. The Execution Status grid displays the model execution log dynamically.
Note:
For R-based models, the execution may fail if the dataset contains internal joins. Executing a model using the Standard R Engine with the new Cloudera jars fails when model queries exceed a certain limit. The workaround is to append UseNativeQuery=1 to the JDBC URL of the Hive schemas in which model definitions and executions happen. For example, jdbc:hive2://192.168.1.0:1000/default;UseNativeQuery=1
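The workaround above amounts to adding one `;key=value` setting to the end of the Hive JDBC URL. The following Python sketch shows one way to apply it consistently when several schema URLs have to be updated. The function name `with_native_query` is hypothetical. The sketch assumes the usual `jdbc:hive2://host:port/database` layout with settings appended as `;key=value` pairs and no `?` or `#` sections, and it assumes that the setting name can be matched without regard to case.

```python
def with_native_query(jdbc_url: str) -> str:
    """Append UseNativeQuery=1 to a Hive JDBC URL unless the key is present.

    Hypothetical helper. An existing UseNativeQuery setting is left
    untouched, whatever its value.
    """
    # Everything after the first ';' is treated as ';key=value' settings.
    settings = jdbc_url.split(";")[1:]
    keys = {s.split("=", 1)[0].strip().lower() for s in settings if s}

    if "usenativequery" in keys:
        return jdbc_url

    # Drop a trailing ';' so the result does not contain an empty setting.
    return jdbc_url.rstrip(";") + ";UseNativeQuery=1"


print(with_native_query("jdbc:hive2://192.168.1.0:1000/default"))
```

The host, port, and database in the call are the ones from the example in the note. Calling the function on a URL that already carries the setting returns it unchanged, so it is safe to run more than once.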