Creating a Model Deployment with Autoscaling

Learn how to create a model deployment with compute and load balancer autoscaling configured.

For more advanced autoscaling options and metrics, consider using a custom scaling metric type on the model deployment.

Ensure that you have added the policy required for autoscaling to work.

You can create and run model deployments using the Console, the OCI CLI, or the Data Science API.

    1. Use the Console to sign in to a tenancy with the necessary policies.
    2. Open the navigation menu and click Analytics & AI. Under Machine Learning, click Data Science.
    3. Select the compartment that contains the project with the model deployments.

      All projects in the compartment are listed.

    4. Click the name of the project.

      The project details page opens and lists the notebook sessions.

    5. Under Resources, click Model deployments.

      A tabular list of model deployments in the project is displayed.

    6. Click Create Model Deployment.
    7. Follow the steps in Creating a Model Deployment to configure the model deployment.
    8. Under Autoscaling configuration, select Enable autoscaling.
      Several lists and fields are displayed to let you configure the autoscaling.
    9. (Optional) Update the values in each list or field as appropriate for the configuration.
    10. (Optional) To autoscale the load balancer, click Show advanced options.
      Set the maximum bandwidth to a value greater than the minimum bandwidth and no more than twice the minimum bandwidth.
    11. Click Create.
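
The bandwidth rule in step 10 can be checked before you submit the form. The following is a minimal sketch in Python; the function name is hypothetical and the check only restates the rule given above (maximum greater than minimum, and at most twice the minimum). It is not part of any OCI tool.

```python
def is_valid_bandwidth_range(minimum_mbps, maximum_mbps):
    """Return True when minimum < maximum <= 2 * minimum (the rule in step 10)."""
    return minimum_mbps < maximum_mbps <= 2 * minimum_mbps


# Example values: a minimum of 10 Mbps allows a maximum from 11 to 20 Mbps.
for maximum in (10, 11, 20, 21):
    print(maximum, is_valid_bandwidth_range(10, maximum))
```

With a minimum of 10, the values 11 and 20 satisfy the rule, while 10 (not greater than the minimum) and 21 (more than twice the minimum) do not.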
  • Use the oci data-science model-deployment create command and required parameters to create a model deployment:

    oci data-science model-deployment create --required-param-name variable-name ... [OPTIONS]

    For example, create a deployment with:

    oci data-science model-deployment create \
      --compartment-id <MODEL_DEPLOYMENT_COMPARTMENT_OCID> \
      --model-deployment-configuration-details file://<MODEL_DEPLOYMENT_CONFIGURATION_FILE> \
      --project-id <PROJECT_OCID> \
      --display-name <MODEL_DEPLOYMENT_NAME>

    Use this model deployment JSON configuration file with this example:

    {
      "deploymentType": "SINGLE_MODEL",
      "modelConfigurationDetails": {
        "modelId": "ocid1.datasciencemodel....",
        "instanceConfiguration": {
          "instanceShapeName": "VM.Standard.E4.Flex",
          "modelDeploymentInstanceShapeConfigDetails": {
            "ocpus": 1,
            "memoryInGBs": 16
          }
        },
        "scalingPolicy": {
          "policyType": "AUTOSCALING",
          "coolDownInSeconds": 600,
          "isEnabled": true,
          "autoScalingPolicies": [
            {
              "autoScalingPolicyType": "THRESHOLD",
              "initialInstanceCount": 1,
              "maximumInstanceCount": 2,
              "minimumInstanceCount": 1,
              "rules": [
                {
                  "metricExpressionRuleType": "PREDEFINED_EXPRESSION",
                  "metricType": "CPU_UTILIZATION",
                  "scaleInConfiguration": {
                    "scalingConfigurationType": "THRESHOLD",
                    "threshold": "10"
                  },
                  "scaleOutConfiguration": {
                    "scalingConfigurationType": "THRESHOLD",
                    "threshold": "65"
                  }
                }
              ]
            }
          ]
        },
        "bandwidthMbps": 10,
        "maximumBandwidthMbps": 20
      }
    }

    For a complete list of parameters and values for CLI commands, see the CLI Command Reference.
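
If you prefer to generate the configuration file rather than edit it by hand, the following Python sketch builds a dictionary with the same shape as the JSON example above. The function name, its parameters, and the two sanity checks are assumptions made for illustration, not documented service rules; the model OCID is left as a placeholder.

```python
import json


def build_autoscaling_config(
    model_id,
    scale_in_threshold=10,
    scale_out_threshold=65,
    minimum_instances=1,
    maximum_instances=2,
    bandwidth_mbps=10,
    maximum_bandwidth_mbps=20,
):
    """Return a dict shaped like the JSON configuration file shown above."""
    # Sanity checks: assumptions of this sketch, not documented service rules.
    if scale_in_threshold >= scale_out_threshold:
        raise ValueError("scale-in threshold should be below the scale-out threshold")
    if minimum_instances > maximum_instances:
        raise ValueError("minimum instance count exceeds the maximum instance count")

    rule = {
        "metricExpressionRuleType": "PREDEFINED_EXPRESSION",
        "metricType": "CPU_UTILIZATION",
        # Thresholds are written as strings to mirror the example file.
        "scaleInConfiguration": {
            "scalingConfigurationType": "THRESHOLD",
            "threshold": str(scale_in_threshold),
        },
        "scaleOutConfiguration": {
            "scalingConfigurationType": "THRESHOLD",
            "threshold": str(scale_out_threshold),
        },
    }
    policy = {
        "autoScalingPolicyType": "THRESHOLD",
        # Assumption: start at the minimum count, as the example file does.
        "initialInstanceCount": minimum_instances,
        "maximumInstanceCount": maximum_instances,
        "minimumInstanceCount": minimum_instances,
        "rules": [rule],
    }
    return {
        "deploymentType": "SINGLE_MODEL",
        "modelConfigurationDetails": {
            "modelId": model_id,
            "instanceConfiguration": {
                "instanceShapeName": "VM.Standard.E4.Flex",
                "modelDeploymentInstanceShapeConfigDetails": {
                    "ocpus": 1,
                    "memoryInGBs": 16,
                },
            },
            "scalingPolicy": {
                "policyType": "AUTOSCALING",
                "coolDownInSeconds": 600,
                "isEnabled": True,
                "autoScalingPolicies": [policy],
            },
            "bandwidthMbps": bandwidth_mbps,
            "maximumBandwidthMbps": maximum_bandwidth_mbps,
        },
    }


# "<MODEL_OCID>" is a placeholder; substitute the OCID of your model.
config = build_autoscaling_config("<MODEL_OCID>")
config_json = json.dumps(config, indent=2)
print(config_json)
```

Saving the printed JSON to a file gives you something to pass to --model-deployment-configuration-details with the file:// prefix used in the CLI example.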

  • Use the CreateModelDeployment operation to create a model deployment.
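
As a rough illustration of what the operation expects, the Python sketch below assembles a request body from the same four inputs as the CLI example. The camelCase field names are assumed to mirror the CLI parameters and should be verified against the API reference; the function name is hypothetical, and the sketch does not sign or send the request, which in practice an OCI SDK or the CLI would handle.

```python
import json


def build_create_request_body(compartment_id, project_id, display_name, configuration_details):
    """Assemble a dict intended to be serialized as the request body."""
    # Both identifiers are passed as required parameters in the CLI example.
    for name, value in (("compartment_id", compartment_id), ("project_id", project_id)):
        if not value:
            raise ValueError(f"{name} is required")
    return {
        "compartmentId": compartment_id,
        "projectId": project_id,
        "displayName": display_name,
        "modelDeploymentConfigurationDetails": configuration_details,
    }


# Abbreviated configuration; a real request would carry the full JSON configuration.
configuration_details = {"deploymentType": "SINGLE_MODEL"}

body = build_create_request_body(
    "<MODEL_DEPLOYMENT_COMPARTMENT_OCID>",
    "<PROJECT_OCID>",
    "<MODEL_DEPLOYMENT_NAME>",
    configuration_details,
)
print(json.dumps(body, indent=2))
```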