Based on the problem that needs to be solved, an advanced data analyst chooses an appropriate algorithm to train a predictive model and then evaluates the model's results.
Arriving at an accurate model is an iterative process, and an advanced data analyst can try different models, compare their results, and fine-tune parameters based on trial and error. A data analyst can use the finalized, accurate predictive model to predict trends in other data sets, or add the model to projects.
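As a rough illustration of that trial-and-error loop, the following Python sketch trains two candidate models on the same sample, scores each on held-out rows, and keeps the one with the lower error. It is not Oracle Analytics code: the toy data, the two candidate models, and every function name here are hypothetical and chosen only to show the idea.

```python
from statistics import mean

# Hypothetical toy data: (years_of_experience, income_in_thousands).
train = [(1, 32), (2, 39), (3, 47), (4, 52), (5, 61), (6, 68)]
holdout = [(7, 75), (8, 83)]  # rows kept aside to evaluate each candidate


def fit_mean(rows):
    """Baseline candidate: always predict the average training target."""
    avg = mean(y for _, y in rows)
    return lambda x: avg


def fit_line(rows):
    """Second candidate: least-squares line y = intercept + slope * x."""
    xs = [x for x, _ in rows]
    ys = [y for _, y in rows]
    x_bar, y_bar = mean(xs), mean(ys)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in rows) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    intercept = y_bar - slope * x_bar
    return lambda x: intercept + slope * x


def mean_abs_error(model, rows):
    """Average absolute difference between predictions and actual values."""
    return mean(abs(model(x) - y) for x, y in rows)


# Train every candidate on the same data and compare held-out error.
candidates = {"mean baseline": fit_mean, "least-squares line": fit_line}
scores = {name: mean_abs_error(fit(train), holdout) for name, fit in candidates.items()}
best = min(scores, key=scores.get)
print(best, scores)
```

Scoring on rows that were not used for training is what makes the comparison meaningful; a model that only looks good on its own training data may not predict well on other data sets.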
Oracle Analytics provides algorithms for numeric prediction, multi-classification, binary classification, and clustering. For information about how to choose an algorithm, see How Do I Choose a Training Model Algorithm?
The algorithms aren't available until you install Oracle machine learning into your local Oracle Analytics Desktop directory. See How do I install Machine Learning for Desktop?
- On the Home page, click Create and select Data Flow.
- Select the data set that you want to use to train the model. Click Add. Typically you'll select a data set that was prepared specifically for training the model and contains a sample of the data that you want to predict. The accuracy of a model depends on how representative the training data is.
- In the data flow editor, click Add a step (+). After adding a data set, you can either use all columns in the data set to build the model or select only the relevant columns. Choosing the relevant columns requires an understanding of the data set. Ignore columns that you know won't influence the outcome behavior or that contain redundant information. You can choose only relevant columns by adding the Select Columns step. If you're not sure about the relevant columns, then use all columns.
- Navigate to the bottom of the list and click the train model type that you want to apply to the data set.
- Select an algorithm and click OK.
- If you're working with a supervised model like prediction or classification, then click Target and select the column that you're trying to predict. For example, if you're creating a model to predict a person's income, then select the Income column. If you're working with an unsupervised model like clustering, then no target column is required.
- Change the default settings for your model to fine-tune and improve the accuracy of the predicted outcome. The available settings depend on the model you're working with.
- Click the Save Model step and provide a name and description. This will be the name of the generated predictive model.
- Click Save, enter a name and description of the data flow, and click OK to save the data flow.
- Click Run Data Flow to create the predictive model based on the input data set and model settings that you provided.
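The steps above are performed in the Oracle Analytics user interface, but the same sequence (select relevant columns, pick a target, train, then save a named model) can be sketched conceptually in plain Python. This is only an analogy under stated assumptions: the column names, the sample rows, the single-feature least-squares model, and all function names are hypothetical and do not correspond to any Oracle Analytics API.

```python
from statistics import mean

# Hypothetical training data; CustomerID is an identifier that
# shouldn't influence the outcome, so it is dropped below.
rows = [
    {"CustomerID": 101, "YearsEmployed": 2, "Income": 40},
    {"CustomerID": 102, "YearsEmployed": 4, "Income": 50},
    {"CustomerID": 103, "YearsEmployed": 6, "Income": 60},
    {"CustomerID": 104, "YearsEmployed": 8, "Income": 70},
]


def select_columns(rows, columns):
    """Keep only the listed columns (analogous to a Select Columns step)."""
    return [{c: r[c] for c in columns} for r in rows]


def train_model(rows, target, feature):
    """Fit target = intercept + slope * feature by least squares."""
    xs = [r[feature] for r in rows]
    ys = [r[target] for r in rows]
    x_bar, y_bar = mean(xs), mean(ys)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    return {"feature": feature, "target": target,
            "slope": slope, "intercept": y_bar - slope * x_bar}


def run_data_flow(rows, columns, target, feature, name, description):
    """Chain the steps: select columns, train, then label the saved model."""
    prepared = select_columns(rows, columns)
    model = train_model(prepared, target, feature)
    model.update(name=name, description=description)
    return model


def predict(model, row):
    """Apply a saved model to a new row."""
    return model["intercept"] + model["slope"] * row[model["feature"]]


model = run_data_flow(
    rows,
    columns=["YearsEmployed", "Income"],
    target="Income",
    feature="YearsEmployed",
    name="Income Predictor",
    description="Toy numeric prediction model",
)
print(model["name"], predict(model, {"YearsEmployed": 10}))
```

In this sketch the model is just a dictionary of fitted parameters plus the name and description supplied at the save step, which mirrors how the saved predictive model is later identified and applied to other data sets.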