Creating a Model Deployment with Autoscaling
Learn how to create a model deployment with compute and load balancer autoscaling configured.
For more advanced options and metrics, consider using a custom scaling metric type to set up autoscaling on the model deployment.
Ensure that you have added the policy required for autoscaling to work.
You can create and run model deployments using the Console, the OCI CLI, or the Data Science API.
Use the oci data-science model-deployment create command and required parameters to create a model deployment:
oci data-science model-deployment create --required-param-name variable-name ... [OPTIONS]
For example, create a deployment with:
oci data-science model-deployment create \
  --compartment-id <MODEL_DEPLOYMENT_COMPARTMENT_OCID> \
  --model-deployment-configuration-details file://<MODEL_DEPLOYMENT_CONFIGURATION_FILE> \
  --project-id <PROJECT_OCID> \
  --display-name <MODEL_DEPLOYMENT_NAME>
Use this model deployment JSON configuration file with this example:
{
  "deploymentType": "SINGLE_MODEL",
  "modelConfigurationDetails": {
    "modelId": "ocid1.datasciencemodel....",
    "instanceConfiguration": {
      "instanceShapeName": "VM.Standard.E4.Flex",
      "modelDeploymentInstanceShapeConfigDetails": {
        "ocpus": 1,
        "memoryInGBs": 16
      }
    },
    "scalingPolicy": {
      "policyType": "AUTOSCALING",
      "coolDownInSeconds": 600,
      "isEnabled": true,
      "autoScalingPolicies": [
        {
          "autoScalingPolicyType": "THRESHOLD",
          "initialInstanceCount": 1,
          "maximumInstanceCount": 2,
          "minimumInstanceCount": 1,
          "rules": [
            {
              "metricExpressionRuleType": "PREDEFINED_EXPRESSION",
              "metricType": "CPU_UTILIZATION",
              "scaleInConfiguration": {
                "scalingConfigurationType": "THRESHOLD",
                "threshold": "10"
              },
              "scaleOutConfiguration": {
                "scalingConfigurationType": "THRESHOLD",
                "threshold": "65"
              }
            }
          ]
        }
      ]
    },
    "bandwidthMbps": 10,
    "maximumBandwidthMbps": 20
  }
}
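The configuration above can also be generated from a script, which makes it easier to check the autoscaling bounds before calling the CLI. The following is a minimal Python sketch that uses only the standard library. The function names build_autoscaling_config and build_cli_command are hypothetical, while the field names and CLI flags are copied from the example above. The sanity checks (minimum <= initial <= maximum instance count, scale-in threshold below scale-out threshold) are assumptions about what a sensible policy looks like, not documented service validation rules.

```python
import json
import shlex


def build_autoscaling_config(model_id, min_instances=1, max_instances=2,
                             initial_instances=1, scale_in_cpu=10,
                             scale_out_cpu=65, cool_down_seconds=600):
    """Return a dict that mirrors the JSON configuration example."""
    # Assumed sanity checks; the service may enforce different rules.
    if not min_instances <= initial_instances <= max_instances:
        raise ValueError("expected minimum <= initial <= maximum instance count")
    if scale_in_cpu >= scale_out_cpu:
        raise ValueError("scale-in threshold should be below scale-out threshold")

    def threshold(value):
        # The example file expresses thresholds as strings, so do the same here.
        return {"scalingConfigurationType": "THRESHOLD", "threshold": str(value)}

    rule = {
        "metricExpressionRuleType": "PREDEFINED_EXPRESSION",
        "metricType": "CPU_UTILIZATION",
        "scaleInConfiguration": threshold(scale_in_cpu),
        "scaleOutConfiguration": threshold(scale_out_cpu),
    }
    policy = {
        "autoScalingPolicyType": "THRESHOLD",
        "initialInstanceCount": initial_instances,
        "maximumInstanceCount": max_instances,
        "minimumInstanceCount": min_instances,
        "rules": [rule],
    }
    return {
        "deploymentType": "SINGLE_MODEL",
        "modelConfigurationDetails": {
            "modelId": model_id,
            "instanceConfiguration": {
                "instanceShapeName": "VM.Standard.E4.Flex",
                "modelDeploymentInstanceShapeConfigDetails": {
                    "ocpus": 1,
                    "memoryInGBs": 16,
                },
            },
            "scalingPolicy": {
                "policyType": "AUTOSCALING",
                "coolDownInSeconds": cool_down_seconds,
                "isEnabled": True,
                "autoScalingPolicies": [policy],
            },
            "bandwidthMbps": 10,
            "maximumBandwidthMbps": 20,
        },
    }


def build_cli_command(compartment_id, project_id, display_name, config_path):
    """Return the CLI invocation from the example as an argument list."""
    return [
        "oci", "data-science", "model-deployment", "create",
        "--compartment-id", compartment_id,
        "--model-deployment-configuration-details", f"file://{config_path}",
        "--project-id", project_id,
        "--display-name", display_name,
    ]


if __name__ == "__main__":
    # Placeholders only; substitute real OCIDs and names before use.
    config = build_autoscaling_config("<MODEL_OCID>")
    print(json.dumps(config, indent=2))
    print(shlex.join(build_cli_command(
        "<MODEL_DEPLOYMENT_COMPARTMENT_OCID>", "<PROJECT_OCID>",
        "<MODEL_DEPLOYMENT_NAME>", "<MODEL_DEPLOYMENT_CONFIGURATION_FILE>")))
```

Save the printed JSON to a file and pass its path to --model-deployment-configuration-details. Returning the command as an argument list rather than a single string avoids shell quoting problems if it is later passed to subprocess.run.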
For a complete list of parameters and values for CLI commands, see the CLI Command Reference.
Use the CreateModelDeployment operation to create a model deployment.
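As a rough illustration, the request body for this operation carries the same values as the CLI flags shown earlier. The Python sketch below assembles such a body as a plain dictionary. It assumes that the camelCase field names compartmentId, projectId, displayName, and modelDeploymentConfigurationDetails correspond to the CLI flags, and the helper name build_create_request_body is hypothetical. It does not sign or send the request; check the API reference for the authoritative schema.

```python
import json


def build_create_request_body(compartment_id, project_id, display_name,
                              configuration_details):
    """Assemble a CreateModelDeployment request body as a dict.

    Field names are assumed camelCase equivalents of the CLI flags.
    """
    body = {
        "compartmentId": compartment_id,
        "projectId": project_id,
        "displayName": display_name,
        "modelDeploymentConfigurationDetails": configuration_details,
    }
    # Fail early on empty values instead of letting the service reject them.
    missing = [name for name, value in body.items() if not value]
    if missing:
        raise ValueError("missing required values: " + ", ".join(missing))
    return body


if __name__ == "__main__":
    # Placeholder values; the configuration dict is abbreviated here.
    body = build_create_request_body(
        "<MODEL_DEPLOYMENT_COMPARTMENT_OCID>",
        "<PROJECT_OCID>",
        "<MODEL_DEPLOYMENT_NAME>",
        {"deploymentType": "SINGLE_MODEL"},
    )
    print(json.dumps(body, indent=2))
```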