Model Deployment

The ads.model.deployment SDK allows you to deploy models using the Oracle Cloud Infrastructure (OCI) Data Science service. This SDK is built on top of the oci Python SDK to simplify data science workflows.

Example Notebook: Model Deployment with ADS

The focus of this notebook is to demonstrate how to deploy a model using the data-science-friendly ads.model.deployment SDK. The notebook shows you how to list, update, and delete model deployments. It also demonstrates how to make predictions using the deployed model and how to obtain the logs.


Placeholder text for required values is surrounded by angle brackets that must be removed when you add the indicated content. For example, when adding a database name, database_name = "<database_name>" would become database_name = "production".

Datasets are provided as a convenience. Datasets are considered Third Party Content and are not considered Materials under your agreement with Oracle applicable to the Services. The Dataset multiclass_fk_10k is distributed under the UPL license.

import ads
import json
import logging
import oci
import os
import random
import shutil
import string
import tempfile
import uuid
import warnings

from ads.catalog.model import ModelCatalog
from ads.common.model import ADSModel
from ads.dataset.factory import DatasetFactory
from oci.data_science import models
from ads.model.deployment import ModelDeployer, ModelDeploymentProperties
from sklearn.ensemble import RandomForestClassifier

logging.basicConfig(format='%(levelname)s:%(message)s', level=logging.ERROR)


Model deployments are a managed resource in the OCI Data Science service that allows you to deploy machine learning models as HTTP endpoints. Deploying machine learning models as web applications (HTTP API endpoints) that serve predictions in real time is the most common way that models are productized. HTTP endpoints are flexible and can serve requests for model predictions.

You train a model and store it in the model catalog. Then you deploy the model using the model deployment resource.

After a model is saved to the model catalog, it becomes available for deployment as a model deployment resource. The service supports models running in a Python runtime environment, and their dependencies can be packaged in a conda environment. This notebook demonstrates how to deploy a model using the data-science-friendly ads.model.deployment SDK.

Model deployment requires that you specify an inference conda environment in the runtime.yaml model artifact file. This inference conda environment contains all of your model dependencies, and is installed in the model server container. You can specify either a service or customer-managed conda environment.

Model Deployment Components

Model deployments rely on these key components to deploy a model as an HTTP endpoint:

Load Balancer

A pool of VM instances host the model server, the conda environment, and the model itself. A copy of the model server is made to each compute instance.

A copy of the inference conda environment and the selected model artifact are also copied to each instance in the pool. Two copies of the model are loaded into memory for each OCPU of each VM instance in the pool. For example, if you select a VM.Standard2.4 instance to run the model server, then 4 OCPUs x 2 = 8 copies of the model are loaded into memory. Multiple copies of the model help to handle concurrent requests that are made to the model endpoint by distributing those requests among the model replicas in the VM memory. Ensure that you select a VM shape with a large enough memory footprint to account for those model replicas in memory. For most machine learning models with sizes in MBs or the low GBs, this is typically not an issue.

The load balancer distributes requests made to the model endpoint among the instances in the pool. We recommend that you use smaller VM shapes to host the model with a larger number of instances as opposed to selecting fewer though larger VMs.
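The memory sizing rule above can be expressed as a quick back-of-the-envelope calculation. Note that model_replicas is a hypothetical helper for illustration, not part of the SDK:

```python
def model_replicas(ocpus_per_vm: int, instance_count: int) -> int:
    """In-memory model replicas: two copies per OCPU, per VM instance."""
    return 2 * ocpus_per_vm * instance_count

# One VM.Standard2.4 instance (4 OCPUs) -> 8 replicas in memory.
print(model_replicas(4, 1))  # 8
```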

Model Catalog

Model deployment requires a model artifact that is stored in the model catalog and that the model is in an active state. Model deployment exposes the predict() function defined in the file of the model artifact.
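The scoring contract can be sketched as follows. DummyModel and the payload handling here are illustrative assumptions for a minimal sketch, not the file that ADS generates:

```python
import pickle

class DummyModel:
    """Stand-in for the estimator pickled inside the model artifact."""
    def predict(self, rows):
        return [sum(row) % 3 for row in rows]

def load_model(path="model.pkl"):
    """Deserialize the estimator shipped with the artifact."""
    with open(path, "rb") as f:
        return pickle.load(f)

def predict(data, model):
    """Endpoint contract: a column-oriented dict in, predictions out."""
    features = sorted(data)              # e.g. ['F1', 'F2', ...]
    row_ids = sorted(data[features[0]])  # e.g. [0, 1, 2]
    rows = [[data[f][r] for f in features] for r in row_ids]
    return {"prediction": model.predict(rows)}
```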

Conda Environment

A conda environment encapsulates all the third-party Python dependencies (like Numpy, Dask, or XGBoost) that your model requires. Model deployment pulls a copy of the inference conda environment defined in the runtime.yaml file of the model artifact to deploy the model and its dependencies. The relevant information about the model deployment environment is under the MODEL_DEPLOYMENT parameter in the runtime.yaml file.

In this example, the runtime.yaml file instructs the model deployment to pull the published conda environment from the Object Storage path defined by INFERENCE_ENV_PATH. It then installs it on all instances of the pool hosting the model server, and the model itself.
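As a sketch, the MODEL_DEPLOYMENT section of a runtime.yaml for a published conda environment might look like the following; the path, slug, and Python version are placeholders you would replace with your own values:

```yaml
MODEL_DEPLOYMENT:
  INFERENCE_CONDA_ENV:
    INFERENCE_ENV_PATH: oci://<bucket-name>@<namespace>/<prefix>/<env>
    INFERENCE_ENV_SLUG: <env-slug>
    INFERENCE_ENV_TYPE: published
    INFERENCE_PYTHON_VERSION: '3.7'
```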

If you don’t want to use a customer-managed conda environment, you can instead use a service conda environment by setting INFERENCE_ENV_TYPE = 'data_science' and using the Object Storage path listed in the Environment Explorer tool or in the included conda environments.

INFERENCE_ENV_PATH: oci://<bucket-name>@<namespace>/<prefix>/<env>

These parameters are automatically captured when a model is saved using ADS in a notebook session. If you want to save a model to the catalog and deploy it using the oci SDK, CLI, or the Console, you have to provide a runtime.yaml file as part of your model artifact that includes those parameters.

If MODEL_DEPLOYMENT is missing from the runtime.yaml file, then a default conda environment is installed in the model server and used to load your model. The default conda environment is the Classic CPU Notebook Session Kernel (version 1.0).

For all model artifacts saved in the catalog without a runtime.yaml file, model deployments also use the default Classic CPU Notebook Session Kernel (version 1.0) conda environment for model deployment. A model deployment can also pull a Data Science conda environment or a conda environment that you create or modify then publish.

Model deployment logging can be integrated with the OCI Logging service. This option allows you to emit logs from a model and then inspect these logs.


A model deployment requires a model. In this example, a toy random forest model is created in the model catalog. This is the model that is deployed. The notebook also sets up a connection to the logger so that logs can be emitted.

Model Creation

The next cell uses the oracle_classification_dataset1_150K.csv dataset to create a random forest model, and stores it in the model catalog.

compartment_id = os.environ['NB_SESSION_COMPARTMENT_OCID']
project_id = os.environ['PROJECT_OCID']

# Load the dataset
ds_path = os.path.join("/", "opt", "notebooks", "ads-examples", "oracle_data",
                       "oracle_classification_dataset1_150K.csv")
ds =, target="Labels")
ds = ds.drop(['F1', 'F2', 'F3', 'F4', 'F5'], axis=1)

# Build the model and convert it to an ADSModel object
train, test = ds.train_test_split(test_size=0.15)
clf = RandomForestClassifier(n_estimators=10).fit(train.X.values, train.y.values)
model = ADSModel.from_estimator(clf)

# Prepare the model artifacts
artifact_path = tempfile.mkdtemp()
artifact_model = model.prepare(artifact_path, force_overwrite=True,
                                     data_sample=test, data_science_env=True)

# Store the model in the Model Catalog
mc_model =, compartment_id=compartment_id,
                               display_name="RF Classifier",
                               description="A sample Random Forest classifier",
                               ignore_pending_changes=True)
model_id =
print(f"Model OCID: {model_id}")
Model OCID: ocid1.datasciencemodel.oc1.iad.amaaaaaav66vvniaylwcfs7wai6dbxndhsqcj3xyd6nsi27mlcjwzaymbaaa

Setup Logging

The Logging service is a highly scalable and managed service for logging. Logging provides logs from OCI resources such as model deployments. For model deployment, there is support for logging the access to the model deployment and the predictions that are made.

Log groups are logical containers for logs. Generally, you have a log group that contains logs for access and predictions on a deployed model. The access and prediction logs are custom logs.

The next cell creates a log group along with the custom access and prediction logs.

# Generate a random log group and log name
log_group_name = "ModelDeployment-Demo-" + str(uuid.uuid4())
access_log_name = "ModelDeployment-Demo-Access_Log-" + str(uuid.uuid4())
predict_log_name = "ModelDeployment-Demo-Predict_Log-" + str(uuid.uuid4())

# Create a log group
response = !oci logging log-group create --compartment-id=$compartment_id --display-name=$log_group_name --wait-for-state SUCCEEDED --wait-for-state FAILED
if len(response) > 1:
    data = json.loads("".join(response[1:]))
    log_group_ocid = data.get("data").get("resources")[0].get("identifier")
    print(f"Log group OCID: {log_group_ocid}")
else:
    raise Exception(response)

# Create an access log in the log group
response = !oci logging log create --log-group-id=$log_group_ocid --log-type=CUSTOM --display-name=$access_log_name --wait-for-state SUCCEEDED --wait-for-state FAILED
if len(response) > 1:
    data = json.loads("".join(response[1:]))
    access_log_ocid = data.get("data").get("resources")[0].get("identifier")
    print(f"Access log OCID: {access_log_ocid}")
else:
    raise Exception(response)

# Create a predict log in the log group
response = !oci logging log create --log-group-id=$log_group_ocid --log-type=CUSTOM --display-name=$predict_log_name --wait-for-state SUCCEEDED --wait-for-state FAILED
if len(response) > 1:
    data = json.loads("".join(response[1:]))
    predict_log_ocid = data.get("data").get("resources")[0].get("identifier")
    print(f"Predict log OCID: {predict_log_ocid}")
else:
    raise Exception(response)
Log group OCID: ocid1.loggroup.oc1.iad.amaaaaaav66vvniafls4ob7thr6ad6wb4ngruqgz7i4snblu57okk7cgjmpa
Access log OCID: ocid1.log.oc1.iad.amaaaaaav66vvniafnknpig4h5666o5fnco57fsclvb3dil2mxayvr2f5bja
Predict log OCID: ocid1.log.oc1.iad.amaaaaaav66vvniagny6fqadznearicmj6hw3skzuyh2tnduah3yb4bu57aa

Deploy a Model

Next, you use the ads.model.deployment SDK, which is designed for data science workflows, to deploy a model.

The ads.model.deployment SDK provides three classes for model deployment to simplify the process:

  • ads.model.deployment.ModelDeploymentProperties: Stores the properties mapping to OCI model deployment models.

  • ads.model.deployment.ModelDeployer: Creates a new deployment, lists existing deployments, and gets/updates/deletes an existing deployment.

  • ads.model.deployment.ModelDeployment: Encapsulates the information and actions for an existing deployment.


The ModelDeploymentProperties class acts as a container to store model deployment properties. This class requires the model OCID. String properties are set with the with_prop() method. You use it to chain together properties like the display name, project OCID, and compartment OCID. The with_access_log() and with_predict_log() methods define the logging properties. Alternatively, you could use the with_logging_configuration() method to define the predict and access log properties in a single method call. The with_instance_configuration() method defines the instance shape, count, and bandwidth.

# Initialize ModelDeploymentProperties
model_deployment_properties = ModelDeploymentProperties(
    model_id
).with_prop(
    'display_name', "Model Deployment Demo using ADS"
).with_prop(
    "project_id", project_id
).with_prop(
    "compartment_id", compartment_id
).with_access_log(
    log_group_ocid, access_log_ocid
).with_predict_log(
    log_group_ocid, predict_log_ocid
).with_instance_configuration(
    config={"INSTANCE_SHAPE": "VM.Standard2.1", "INSTANCE_COUNT": "1", "bandwidth_mbps": 10}
)


The ModelDeployer class is used to deploy a model. This class fetches the default OCI config settings and initializes the DataScienceClient, DataScienceCompositeClient, and LogSearchClient objects. You can authenticate using API keys or resource principals.

The deploy() method is used to create a model deployment. In addition, it has these parameters that control the behavior of the deployment:

  • max_wait_time: The timeout limit for the deployment process to wait until it is active. Defaults to 1200 seconds.

  • poll_interval: The interval between checks of the deployment status, in seconds. The default is 30 seconds.

  • wait_for_completion: Blocks the process until the deployment has completed. Defaults to True.

There are two ways to use the deploy() method. You can create a ModelDeploymentProperties object and pass that in, or you can define the model deployment properties in the deploy() method.

Deploy a Model with a ``ModelDeploymentProperties`` Object

Once a ModelDeploymentProperties object has been created, model_deployment_properties in this example, a model can be deployed with this code snippet:

deployer = ModelDeployer()
deployment = deployer.deploy(model_deployment_properties)

Deploy a Model without a ``ModelDeploymentProperties`` Object

Depending on your use case, it may be more convenient to skip the creation of a ModelDeploymentProperties object and create the model deployment directly with the deploy() method. You do this by passing keyword arguments instead of a ModelDeploymentProperties object. That is, you specify the model deployment properties as parameters in the deploy() method.

To define the model deployment properties use the following parameters:

  • access_log_group_id: Log group OCID for the access logs. Required if access_log_id is specified.

  • access_log_id: Custom logger OCID for the access logs. Required if access_log_group_id is specified.

  • bandwidth_mbps: The bandwidth limit on the load balancer in Mbps. Optional.

  • compartment_id: Compartment OCID that the model deployment belongs to.

  • defined_tags: A dictionary of defined tags to be attached to the model deployment. Optional.

  • description: A description of the model deployment. Optional.

  • display_name: A name that identifies the model deployment.

  • freeform_tags: A dictionary of freeform tags to be attached to the model deployment. Optional.

  • instance_count: The number of instances to deploy.

  • instance_shape: The instance shape to use. For example, “VM.Standard2.1”

  • model_id: Model OCID that is used in the model deployment.

  • predict_log_group_id: Log group OCID for the predict logs. Required if predict_log_id is specified.

  • predict_log_id: Custom logger OCID for the predict logs. Required if predict_log_group_id is specified.

  • project_id: Project OCID that the model deployment will belong to.

deployer = ModelDeployer()
deployment = deployer.deploy(
    model_id=model_id,
    display_name="Model Deployment Demo using ADS",
    instance_shape="VM.Standard2.1",
    instance_count=1,
    project_id=project_id,
    compartment_id=compartment_id,
    # The following are optional
    access_log_group_id=log_group_ocid,
    access_log_id=access_log_ocid,
    predict_log_group_id=log_group_ocid,
    predict_log_id=predict_log_ocid,
)
deployment_id = deployment.model_deployment_id
print(f"Deployment {deployment_id} is {}")
Deployment ocid1.datasciencemodeldeployment.oc1.iad.amaaaaaav66vvnia5iypp4jq4s2rcdrcyjah2zrhky3y3omwydxtpmuhig4q is ACTIVE

Make Predictions

Predictions can be made by calling the HTTP endpoint for the model deployment. The ModelDeployment object has a url attribute that specifies the endpoint. You could also use the ModelDeployment object with the predict() method. The format of the data that is passed to the HTTP endpoint depends on the setup of the model artifact. The default setup is to pass in a Python dictionary that has been converted to a JSON data structure. The first level defines the feature names. The second level uses an identifier for the observation (for example, a row in the data frame) and the value associated with it. Assume that the model has the features F1, F2, F3, F4, and F5. In this example, the observations are identified by the values 0, 1, and 2, so the data would look like this:

   F1  F2  F3  F4  F5
0  11  12  13  14  15
1  21  22  23  24  25
2  31  32  33  34  35

The JSON would be:

{
   'F1': { 0: 11, 1: 21, 2: 31},
   'F2': { 0: 12, 1: 22, 2: 32},
   'F3': { 0: 13, 1: 23, 2: 33},
   'F4': { 0: 14, 1: 24, 2: 34},
   'F5': { 0: 15, 1: 25, 2: 35}
}

The next cell uses the first three rows of the test data frame and the ModelDeployment object to call the HTTP endpoint. The returned result is the predictions for the three observations.

deployment.predict(test.X.iloc[[0, 1, 2], :].to_dict())
{'prediction': [0, 2, 0]}

Viewing the Logs

The show_logs() method in the ModelDeployment class provides access to the predict and access logs. This method returns a data frame where each row represents a log entry. The parameter log_type accepts predict and access to specify which logs to return. If it is not specified, the access logs are returned. The parameters time_start and time_end restrict the logs to time periods between those entries. The parameter limit limits the number of log entries that are returned.

Note: Logs are not collected in real time. The next cell may result in no output. This is expected behavior. If the cell is run later, it should show output.

deployment.show_logs(log_type="access", limit=10)

The logs() method has the same arguments as show_logs(), but returns the data as a list of dictionaries.

Note: Logs are not collected in real time. The following cell may result in no output. This is expected behavior. If the cell is run later, it should show output.

deployment.logs(log_type="access", limit=1)

Information About a Model Deployment

The ModelDeployment class has a number of attributes that provide information about the deployment. The properties attribute contains information about the model deployment’s properties by exposing a ModelDeploymentProperties object. This object has all of the attributes of the OCI Model Deployment model. The most commonly used properties are:

  • category_log_details: An OCI model object that contains the OCIDs for the access and predict logs.

  • compartment_id: Compartment ID of the model deployment.

  • created_by: OCID of the user that created the model deployment.

  • defined_tags: System defined tags.

  • description: Description of the model deployment.

  • display_name: Name to display.

  • freeform_tags: User-defined tags.

  • model_id: OCID of the deployed model.

  • project_id: OCID of the project the model deployment belongs to.

print(f"Compartment OCID: {}\n" +
      f"Project OCID: {}\n" +
      f"Model OCID: {}\n" +
      f"Deployment Name: {}\n")
Compartment OCID: ocid1.compartment.oc1..aaaaaaaapvb3hearqum6wjvlcpzm5ptfxqa7xfftpth4h72xx46ygavkqteq
Project OCID: ocid1.datascienceproject.oc1.iad.amaaaaaav66vvniaklsknycfb6fso64knuk36egpsg3j5vasn3sveiuvdmna
Model OCID: ocid1.datasciencemodel.oc1.iad.amaaaaaav66vvniaylwcfs7wai6dbxndhsqcj3xyd6nsi27mlcjwzaymbaaa
Deployment Name: Model Deployment Demo using ADS

The model_deployment_id attribute of the ModelDeployment class specifies the OCID of the model deployment.


The url attribute of the ModelDeployment class gives the URL of the model deployment endpoint. You can make HTTP requests to this endpoint to obtain model predictions; see the Make Predictions section and the Invoking a Model Deployment documentation for details.


You can determine the state of the model deployment using the state attribute, which has values such as ‘ACTIVE’, ‘INACTIVE’, and ‘FAILED’.

The list_workflow_logs() method provides a list of dictionaries that describe the steps that were used to deploy the model.

[{"message": "Creating compute resource configuration.",
  "timestamp": "2021-04-21T20:45:27.609000+00:00"},
 {"message": "Creating compute resources.",
  "timestamp": "2021-04-21T20:45:30.237000+00:00"},
 {"message": "Creating load balancer.",
  "timestamp": "2021-04-21T20:45:33.076000+00:00"},
 {"message": "Compute resources are provisioned.",
  "timestamp": "2021-04-21T20:46:46.876000+00:00"},
 {"message": "Load balancer is provisioned.",
  "timestamp": "2021-04-21T20:53:54.764000+00:00"}]

Update a Model Deployment

The update() method of the ModelDeployment class is used to make changes to a deployed model. This method accepts the same parameters as the deploy() method; see the Deploy a Model without a ModelDeploymentProperties Object section for a list of model configurations that can be changed. One difference between the update() and deploy() methods is that all parameters are optional. Therefore, you only need to specify the parameters that you want to change.

A common use case is to change the underlying model that is deployed. To do this you would run a command like:

deployment.update(model_id="<new_model_id>")

In the next cell, the display name of the model is updated.

print(f"Original Display Name: {}")
deployment = deployment.update(display_name="Model Deployment Demo using ADS")
print(f"New Display Name: {}")
Original Display Name: Model Deployment Demo using ADS
New Display Name: Model Deployment Demo using ADS

Listing and Accessing Model Deployments

When a model deployment is created, a ModelDeployment object is returned. However, if the model was deployed previously, you need a way to obtain the ModelDeployment object associated with the deployment of interest. The get_model_deployment() method creates a ModelDeployment object when the model deployment OCID is known. The list_deployments() method returns a list of ModelDeployment objects.

Access a Model Deployment by OCID

If the model deployment OCID is known, then the get_model_deployment() method of the ModelDeployer class can be used to obtain a ModelDeployment object.

The next cell creates a new ModelDeployment object that has access to the model deployment created in this notebook.

existing_deployment = deployer.get_model_deployment(model_deployment_id=deployment_id)
print(f"deployment OCID: {deployment.model_deployment_id}\n" +
      f"existing deployment OCID: {existing_deployment.model_deployment_id}")
deployment OCID: ocid1.datasciencemodeldeployment.oc1.iad.amaaaaaav66vvnia5iypp4jq4s2rcdrcyjah2zrhky3y3omwydxtpmuhig4q
existing deployment OCID: ocid1.datasciencemodeldeployment.oc1.iad.amaaaaaav66vvnia5iypp4jq4s2rcdrcyjah2zrhky3y3omwydxtpmuhig4q

The get_model_deployment_state() method of the ModelDeployer class accepts a model deployment OCID and returns the state as an enum. This is a convenience method for obtaining the model deployment state when the model deployment OCID is known. The next cell returns the state of the model deployed in this notebook.

deployer.get_model_deployment_state(model_deployment_id=deployment_id)

List Model Deployments

The list_deployments() method of the ModelDeployer class returns a list of ModelDeployment objects. The optional parameter compartment_id limits the search to a specific compartment. By default, it uses the same compartment that the notebook is in. The optional parameter status limits the returned ModelDeployment objects to those model deployments that have the specified status. Values for the status parameter would be ‘ACTIVE’, ‘INACTIVE’, or ‘FAILED’.

The next cell obtains a list of active deployments in the compartment specified by compartment_id, and prints the display name.

for active_deployment in deployer.list_deployments(status="ACTIVE", compartment_id=compartment_id):
    print(
Model Deployment Demo using ADS

The show_deployments() method is a helper function that works the same way as the list_deployments() method except it returns a data frame of the results.

deployer.show_deployments(compartment_id=compartment_id, status="ACTIVE")
                                  model_id  deployment_url  current_state
0  ocid1.datasciencemodeldeployment.oc1...                         ACTIVE

Delete a Model Deployment

The ads.model.deployment SDK provides two methods for deleting a model deployment. The ModelDeployer class has a static delete() method, which can also be called on a ModelDeployer object. For example, this notebook uses the ``deployer`` variable, which is a ``ModelDeployer`` object. A model deployment can be deleted with:

ModelDeployer.delete(model_deployment_id=deployment_id)

or equivalently,

deployer.delete(model_deployment_id=deployment_id)

If you have a ModelDeployment object, there is a delete() method that deletes the model that is associated with that object. The optional wait_for_completion parameter accepts a boolean and determines if the process is blocking or not.

The next cell uses a ModelDeployment object to delete the model deployment that was created in this notebook.

deployment = deployment.delete(wait_for_completion=True)

When a model deployment is deleted, the load balancer and the instances associated with it are also deleted. However, if logging was configured, the log group and loggers are not deleted. In addition, the model in the model catalog is not deleted. The next cell removes these resources.

# Delete the log group and logs
logging_client = oci.logging.LoggingManagementClient(config=oci.config.from_file())
logging_client.delete_log(log_group_ocid, access_log_ocid)
logging_client.delete_log(log_group_ocid, predict_log_ocid)
logging_client.delete_log_group(log_group_ocid)

# Delete the model
ModelCatalog(compartment_id=compartment_id).delete_model(model_id)