Apply a Predictive or Registered Oracle Machine Learning Model to a Data Set

Use the data flow editor to score a predictive model on any data set, or score a registered Oracle machine learning model on a data set in its corresponding database.

Running the model outputs a new data set with columns containing predicted values that you can use for analysis and visualization.
When you run a predictive model, the data is moved into and processed by Oracle Analytics. When you run a registered Oracle machine learning model, data isn't moved from the database into Oracle Analytics. Instead, the model resides and is processed in the database, and the output data set is stored there.

Use this information to understand the data flow editor and Apply Model step options:

  • Only registered models are displayed and available for review and analysis. Unregistered models aren't displayed.

  • The available output columns are specific to the model type. For example, for numeric prediction, output columns include PredictedValue and PredictedConfidence, and for clustering, output columns include the clusterId.

  • The available parameters are specific to the model type. For example, if you use a clustering model for scoring, maximum null values is a parameter that you can provide for the scoring process. This parameter is used in the missing value imputation.

  • The model and the mapped input data types must match when you're working with an Oracle machine learning model. See View a Registered Model's Details.
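
The data type requirement in the last item can be pictured as a simple pre-check. The following is a minimal sketch in Python, not an Oracle Analytics API: the function name, the dictionaries, and the type labels are all hypothetical and stand in for the information you'd read from the registered model's details and from the data set's columns.

```python
# Hypothetical sketch: none of these names come from Oracle Analytics.

def find_type_mismatches(model_inputs, mapped_columns):
    """Return (input_name, expected_type, actual_type) for each mismatch.

    model_inputs: dict of model input name -> expected data type
    mapped_columns: dict of model input name -> (column name, column data type)
    """
    mismatches = []
    for input_name, expected_type in model_inputs.items():
        mapped = mapped_columns.get(input_name)
        if mapped is None:
            # No column was mapped to this model input at all.
            mismatches.append((input_name, expected_type, None))
            continue
        _column_name, actual_type = mapped
        if actual_type.upper() != expected_type.upper():
            mismatches.append((input_name, expected_type, actual_type))
    return mismatches


# Assumed example values, for illustration only.
model_inputs = {"AGE": "NUMBER", "REGION": "VARCHAR2"}
mapped_columns = {
    "AGE": ("customer_age", "NUMBER"),
    "REGION": ("region_code", "NUMBER"),  # wrong type for this input
}
mismatches = find_type_mismatches(model_inputs, mapped_columns)
print(mismatches)
```

In this sketch, an empty result would mean every model input is mapped to a column of the expected type.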

  1. On the Home page, click Create, and then click Data Flow.
  2. Select the data set that you want to apply the model to. Click Add.
  3. In the Data Flow editor, click Add a step (+).
  4. From the Data Flow Steps pane, double-click Apply Model, and then select the model to use.
  5. In Apply Model, go to the Inputs section, and then select a column as the input.
  6. In Apply Model, go to the Outputs section, and then select the columns that you want created with the data set, and update the Column Name fields as needed.
  7. In the data flow editor, click Add a step (+) and select Save Data.
  8. Enter a name. In the Save data to field, specify the location for saving the output data.
    If you're working with an Oracle machine learning model, then the data set's connection information defaults to the input data set's connection.
  9. Set data preferences as needed in the Treat As and Default Aggregation fields.
    When you save data, the Apply Model step appends the output columns that you selected to the data set.
  10. Click Save, enter a name and description for the data flow, and then click OK.
  11. Click Run Data Flow to create the data set.
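
Conceptually, the data flow built in these steps takes each row of the input data set, scores it, and appends only the output columns that you selected, under the column names that you entered. The following Python sketch illustrates that idea under stated assumptions: `apply_model`, `toy_predict`, and the sample rows are hypothetical, the prediction formula is made up, and none of it reflects how Oracle Analytics or the database implements scoring.

```python
# Hypothetical sketch of "apply model, then save": not an Oracle Analytics API.

def apply_model(rows, predict, selected_outputs):
    """Return new rows with the selected output columns appended.

    rows: list of dicts (the input data set)
    predict: callable that takes a row and returns a dict of model outputs
    selected_outputs: dict of model output name -> column name to create
    """
    scored = []
    for row in rows:
        outputs = predict(row)
        new_row = dict(row)  # keep the original columns unchanged
        for output_name, column_name in selected_outputs.items():
            new_row[column_name] = outputs[output_name]
        scored.append(new_row)
    return scored


def toy_predict(row):
    # Stand-in for a numeric prediction model; the formula is invented.
    return {"PredictedValue": row["units"] * 2.0, "PredictedConfidence": 0.9}


rows = [{"store": "A", "units": 10}, {"store": "B", "units": 4}]

# Select only PredictedValue and rename it, as in the Outputs section.
scored = apply_model(rows, toy_predict, {"PredictedValue": "ForecastUnits"})
print(scored)
```

Because PredictedConfidence isn't selected in this sketch, it doesn't appear in the resulting rows.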