Create a Machine Learning Model from Scratch

You can create a machine learning model from scratch and use it either to identify similar records or to predict outcomes for accounts, contacts, opportunities, and sales leads.

Creating a model from scratch involves additional setup in the Basic Information step. Otherwise, the process is the same as creating a similar accounts model from the one provided by Oracle.

  1. Navigate to the Machine Learning Models page.
  2. On the Machine Learning Models page, click Create.
  3. In the Basic Information step, enter the following:

    • Model name
    • Name of the use case the model covers
    • Optional description
  4. Select an object. Options are:
    • Account
    • Contact
    • Opportunity
    • Sales Lead
  5. Select the model type, either Identify Similar Records or Predict Outcome.
  6. If you selected Predict Outcome, select the attribute you want to predict. For example, to predict whether an opportunity will be won or lost, select status category.
  7. Select the records that you want to include in the data set.
  8. Click Next.
  9. On the Attribute Selection page, add the attributes you want to use in the model.
    Note: You can now add attributes from related records to generate more accurate predictions and better insights, because the model can use relevant predictors from those records.
    For example, you can build an opportunity prediction model that includes attributes from the related account record. The following related records are supported:
    • Opportunity account
    • Opportunity primary contact
    • Lead account
    • Lead primary contact
    • Custom object relationship for all parent objects
    • Custom object relationship for models built for custom parent objects
  10. Click Prepare Data to validate whether enough data exists to create a model from the attributes you selected.

    The Status column on the Machine Learning Models page shows the Running status while the validation process is running.

  11. Click Refresh to check for status changes.

  12. A status of Error means that you don't have enough data for some attributes you selected for your model. At least 30 percent of the records must have a value for the attribute. Here's what to do:
    1. Click the name link of your model.
    2. Click the Attribute Selection step to see a list of the errors.
    3. Delete the attributes that you can't use.
    4. Click Prepare Data again.
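The 30 percent rule can be checked ahead of time on exported data. The following is a minimal sketch, not the application's own validation: the record layout, the field names, and the treatment of empty strings as missing values are all assumptions made for illustration.

```python
from typing import Dict, List, Optional

# From the step above: at least 30 percent of records must have a value.
MIN_FILL_RATE = 0.30


def fill_rates(records: List[Dict[str, Optional[object]]],
               attributes: List[str]) -> Dict[str, float]:
    """Return the fraction of records with a non-empty value for each attribute."""
    total = len(records)
    rates = {}
    for attr in attributes:
        # Assumption: None and empty strings both count as "no value".
        filled = sum(1 for rec in records if rec.get(attr) not in (None, ""))
        rates[attr] = filled / total if total else 0.0
    return rates


def attributes_below_threshold(records, attributes, threshold=MIN_FILL_RATE):
    """List the attributes whose fill rate is under the threshold."""
    rates = fill_rates(records, attributes)
    return [attr for attr, rate in rates.items() if rate < threshold]


# Hypothetical export of four account records with illustrative field names.
sample = [
    {"Industry": "Retail", "AnnualRevenue": 120000, "Region": None},
    {"Industry": "Banking", "AnnualRevenue": None, "Region": None},
    {"Industry": "", "AnnualRevenue": 90000, "Region": None},
    {"Industry": "Retail", "AnnualRevenue": 45000, "Region": "EMEA"},
]
low = attributes_below_threshold(sample, ["Industry", "AnnualRevenue", "Region"])
print(low)
```

In this sample, only Region falls under the threshold (one record in four has a value), so it would be the attribute to delete before preparing the data again.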
  13. If your model is in the Prepared status, you have enough data to analyze and tweak your model further.
  14. Click the Features step.
  15. Click Actions > Edit for any of the attributes to fine-tune your model by categorizing the values.
  16. On the Calculation Type page, you can provide one or more categories for the model to consider. Categories affect the way that the model learns and how it creates clusters of records. For example, if you find that your model has too many unique values for a particular attribute, you can group them here. The calculation types available depend on the attribute:
    • Date range: Calculates the number of days between two specified date attributes.
    • Date range bucket: Creates categories of number of days between two specified date attributes.
    • Age date bucket: Use this calculation type for date-time attributes such as Creation Date. You can define age groups, such as 0 to 1 year old, 1 to 2 years old, and 2 to 3 years old, instead of a date. Your model will consider age buckets when identifying the similarity or the relationship between this feature and the prediction outcome.
    • Number bucket: Use this calculation type for numeric attributes like Potential Revenue and Organization Size. For example, instead of searching by a deal amount, you can create number buckets of equal ranges, such as 0 to 100,000 and 100,000 to 200,000. Your model will consider number buckets when identifying the similarity or the relationship between this feature and the prediction outcome.
    • Category: You can use this calculation type to create categories based on attribute conditions. For example, you can categorize accounts by world regions, by grouping countries as Latin America, North America, Asia Pacific, and Europe. Or you can categorize accounts as Large, Medium, or Small, using Opportunity Revenue or Organization Size.
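To make the date and number calculation types concrete, here is a rough sketch of what each one computes. It is an illustration only: the function names, the 365-day year, and the choice to place a boundary value such as 100,000 in the upper bucket are assumptions, not documented behavior of the application.

```python
from datetime import date


def date_range_days(start: date, end: date) -> int:
    """Date range: number of days between two date attributes."""
    return (end - start).days


def age_bucket(created: date, today: date) -> str:
    """Age date bucket: whole-year age groups such as '1 to 2 years old'."""
    # Assumption: a year is approximated as 365 days, ignoring leap days.
    years = (today - created).days // 365
    if years == 0:
        return "0 to 1 year old"
    return f"{years} to {years + 1} years old"


def number_bucket(value: float, width: int = 100_000) -> str:
    """Number bucket: equal-width ranges such as '100,000 to 200,000'."""
    # Assumption: a value on a boundary belongs to the upper bucket.
    lower = int(value // width) * width
    return f"{lower:,} to {lower + width:,}"


print(date_range_days(date(2024, 1, 1), date(2024, 1, 31)))
print(age_bucket(date(2022, 6, 1), date(2024, 6, 1)))
print(number_bucket(150_000))
```

Bucketing replaces many distinct raw values with a handful of groups, which is what lets the model treat a 150,000 deal and a 160,000 deal as the same kind of record.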
  17. Here's how to classify the countries where you do business by geographical regions, for example:
    1. Click Actions > Edit for Country.
    2. From the Calculation Type list, select Category.
    3. In the Category Value field, enter North America.
    4. From the Operator list, select Equals.
    5. In the Value field, search and enter one of the countries in North America.
    6. Click Add Another Rule (the plus sign) and add a second country.
    7. Repeat the process until you've added all the countries in the category.
    8. Click Add Category to add additional categories.
    9. Click Done.
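Conceptually, the rules entered in these substeps form a lookup from country to region. The sketch below mirrors that idea outside the application. The rule set, the country lists, and the "Uncategorized" fallback are hypothetical; how the application treats values that match no rule is not stated here.

```python
from typing import Dict, List

# Hypothetical rule set: each category value maps to the values matched with Equals.
REGION_RULES: Dict[str, List[str]] = {
    "North America": ["United States", "Canada"],
    "Europe": ["France", "Germany", "Spain"],
    "Asia Pacific": ["Japan", "Australia"],
}


def categorize(value: str, rules: Dict[str, List[str]],
               default: str = "Uncategorized") -> str:
    """Return the first category whose rule list contains the value."""
    for category, members in rules.items():
        if value in members:
            return category
    # Assumption: unmatched values fall into a catch-all group.
    return default


countries = ["Canada", "Germany", "Japan", "Brazil"]
regions = [categorize(c, REGION_RULES) for c in countries]
print(regions)
```

Brazil matches no rule in this sample, which shows why each category needs a rule for every country you do business in.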

  18. When you're done adding calculation types, click Submit to run the model.
  19. Click Refresh to refresh the status.
  20. When the model status changes from Running to Ready, click Actions > Edit. The Actions menu is the hamburger icon on the right side of the page.
  21. Click the Review step to review similar accounts the model finds for any account you select.
  22. From the Account field, select an account to display the similar accounts predicted by the model.
  23. Click Analyze to review information about your model on the Analysis Report page.
  24. The report includes tips on how you can improve the model. For example, the report can tell you that some fields have too few unique values and that others have too many.

    If a field has too few values, you'll need to either import more data or eliminate that attribute from the model. If there are too many values, you'll need to group them into categories on the Features step.

    Here's some other useful information:

    • Algorithm Selected: The algorithm that was run for your model. You can't change the algorithm, so you can ignore this field.
    • Model Accuracy: Shows the accuracy of your model as a percentage. A good model accuracy is above 75 percent. If the value is less, change the attributes to improve the accuracy.
    • Prediction Accuracy Report: Shows per-class prediction accuracy. A good model has the desired accuracy for each predicted class. For example, an opportunity outcome prediction model should have the desired accuracy in predicting both won records and lost records.
    • Number of Clusters: Number of account groupings in your model. The more clusters you have, the fewer similar accounts you get. About 20 clusters is a good number.
    • The Data Analysis tab shows, for example, the number of distinct values for each attribute and the percentage of empty values.
    • The Model Analysis tab shows the number of clusters and a pie chart of their distribution.
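The difference between overall model accuracy and per-class accuracy is easiest to see with numbers. The sketch below is an illustration under assumptions: the labels and predictions are made up, and per-class accuracy is taken to mean the share of records of each actual class that were predicted correctly, which may not match how the report calculates it.

```python
from collections import Counter
from typing import Dict, List

# From the list above: a good model accuracy is above 75 percent.
GOOD_ACCURACY = 0.75


def overall_accuracy(actual: List[str], predicted: List[str]) -> float:
    """Share of all records whose predicted class equals the actual class."""
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual) if actual else 0.0


def per_class_accuracy(actual: List[str], predicted: List[str]) -> Dict[str, float]:
    """Share of records in each actual class that were predicted correctly."""
    totals = Counter(actual)
    hits = Counter(a for a, p in zip(actual, predicted) if a == p)
    return {label: hits[label] / totals[label] for label in totals}


# Hypothetical outcomes: eight won and two lost opportunities,
# scored by a model that predicts "Won" for every record.
actual = ["Won"] * 8 + ["Lost"] * 2
predicted = ["Won"] * 10

overall = overall_accuracy(actual, predicted)
by_class = per_class_accuracy(actual, predicted)
print(overall > GOOD_ACCURACY, by_class)
```

Here the overall accuracy is 80 percent, yet none of the lost records are predicted correctly. This is why the per-class figures are worth checking alongside the headline number.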
  25. Click Next.
  26. Specify how to store the prediction outcome for the predict outcome model type:
    • Scoring recurrence: Set the frequency for making predictions and storing them, depending on how often you want to refresh (score again) the prediction.
    • Prediction data set: Add data set filters to identify the prediction data set. For example, make a prediction for all open opportunities or leads.
  27. On the Deploy page, enter the date and time that you want to start running the model, and the recurrence:
    1. Date and Time: Enter the date and time that you want to start the schedule from.
    2. Recurrence: Set the frequency for rebuilding your model, depending on how often your data changes.
      • Daily
      • Weekly
      • Monthly
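To get a feel for how a start date and a recurrence combine into a rebuild schedule, the following sketch computes the next run for each option. It is only an approximation for planning purposes: the function name is hypothetical, and clamping a monthly run to the last day of a shorter month is an assumption, not documented scheduler behavior.

```python
import calendar
from datetime import datetime, timedelta


def next_run(start: datetime, recurrence: str) -> datetime:
    """Return the run that follows `start` for a Daily, Weekly, or Monthly schedule."""
    if recurrence == "Daily":
        return start + timedelta(days=1)
    if recurrence == "Weekly":
        return start + timedelta(weeks=1)
    if recurrence == "Monthly":
        # Roll over to January of the next year after December.
        year = start.year + (start.month // 12)
        month = start.month % 12 + 1
        # Assumption: clamp to the last day when the next month is shorter.
        day = min(start.day, calendar.monthrange(year, month)[1])
        return start.replace(year=year, month=month, day=day)
    raise ValueError(f"Unsupported recurrence: {recurrence}")


start = datetime(2024, 1, 31, 2, 0)
for option in ("Daily", "Weekly", "Monthly"):
    print(option, next_run(start, option))
```

A shorter recurrence keeps the model closer to your current data at the cost of more frequent rebuilds.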
  28. Click Deploy.

    Your model is now active.

    Note: Only one model can be active at a time. If you already have an active model, you must confirm that you want to replace it.