<font color=gray>Oracle Cloud Infrastructure Data Science Sample Notebook

Copyright (c) 2021 Oracle, Inc.  All rights reserved. <br>
Licensed under the Universal Permissive License v 1.0 as shown at https://oss.oracle.com/licenses/upl.
</font>

# Deploying a PyTorch Model with Model Deployment 

In this tutorial, we prepare and save a PyTorch model artifact using ADS, publish a conda environment, and deploy the model as an HTTP endpoint. 

## Pre-requisites to Running this Notebook 

* We recommend that you run this notebook in a notebook session using the **Data Science Conda Environment "General Machine Learning for CPU (v1.0)"** 
* You need access to the public internet
* **You need to upgrade the current version of the OCI Python SDK** (`oci`)
* You need to install the `transformers` library 

In [None]:
!pip install --upgrade oci
!pip install transformers

In [None]:
import oci
import ads
import json
import logging
import os
import tempfile
import warnings
from os import path
import numpy as np
import pandas as pd
import time
from ads.common.model_export_util import prepare_generic_model
from transformers import AutoTokenizer, AutoModelForSequenceClassification

logging.basicConfig(format='%(levelname)s:%(message)s', level=logging.ERROR)
warnings.filterwarnings('ignore')
ads.set_documentation_mode(False)

Here we download a pre-trained BERT model: 

In [None]:
# Download pretrained model
pretrained_model_name = "lannelin/bert-imdb-1hidden"
tokenizer = AutoTokenizer.from_pretrained(pretrained_model_name)
model = AutoModelForSequenceClassification.from_pretrained(pretrained_model_name)

Since we installed `transformers` in our conda environment, let's first publish the environment before saving the model to the catalog. We will need the same environment (with `transformers`) for model deployment. 

You can publish an environment by first initializing `odsc conda` with the namespace of your tenancy and the name of the Object Storage bucket where you want to store the conda environment. Then run the `odsc conda publish` command in the terminal to copy the environment to the bucket. This command can take a few minutes to complete: 

In [None]:
!odsc conda init -b <your-bucket-name>  -n <your-tenancy-namespace> # replace with your values. 
!odsc conda publish -s mlcpuv1 # change this value if you are running this notebook in a different conda environment. 

Also make sure that you write a policy allowing model deployment to read objects in your bucket: 

```
Allow any-user to read objects in compartment <your-compartment-name>
where ALL { request.principal.type='datasciencemodeldeployment', 
target.bucket.name=<your-bucket-name> }
```

Once the environment is published, we need to provide its path on Object Storage. The path of a published conda environment should be passed to the parameter `inference_conda_env`. 

If you don't know how to find the path of your environment on Object Storage, go back to the "Environment Explorer" tool in the notebook session and click on "Published Environments". The path is written on each environment card (`Object Storage URI`). 
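
Before passing the path on, it can help to check that the placeholder has really been replaced. The sketch below assumes the URI has the form `oci://<bucket>@<namespace>/<object-path>`; `parse_conda_env_uri` is a hypothetical helper written for this notebook, not part of ADS or the OCI SDK, and the example URI is made up.

```python
def parse_conda_env_uri(uri):
    """Split an Object Storage URI into bucket, namespace, and object path.

    Assumes the form oci://<bucket>@<namespace>/<object-path> and raises
    ValueError for anything else (including an unreplaced placeholder).
    """
    prefix = "oci://"
    if not uri.startswith(prefix):
        raise ValueError(f"expected a URI starting with {prefix!r}, got {uri!r}")
    # Everything before the first "/" is "<bucket>@<namespace>".
    location, _, object_path = uri[len(prefix):].partition("/")
    bucket, separator, namespace = location.partition("@")
    if not (bucket and separator and namespace and object_path):
        raise ValueError(f"expected oci://<bucket>@<namespace>/<path>, got {uri!r}")
    return {"bucket": bucket, "namespace": namespace, "object_path": object_path}


# Hypothetical values, for illustration only.
example_uri = "oci://my-bucket@my-namespace/conda_environments/cpu/myenv/1.0/myenv_v1"
print(parse_conda_env_uri(example_uri))
```

Calling `parse_conda_env_uri(inference_conda_env)` right after setting the variable fails early with a readable error if the value is still a placeholder.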

In [None]:
# Specify the inference conda environment.
inference_conda_env = "<your-conda-env-object-storage-path>" # replace with your value. 

# Prepare the model artifact template
path_to_model_artifacts = "pytorch_artifacts"
model_artifact = prepare_generic_model(path_to_model_artifacts,
                                               function_artifacts=False,
                                               force_overwrite=True,
                                               data_science_env=False,
                                               inference_conda_env=inference_conda_env)
model.save_pretrained(path_to_model_artifacts)
tokenizer.save_pretrained(path_to_model_artifacts)

# List the template files
print("Model Artifact Path: {}\n\nModel Artifact Files:".format(
    path_to_model_artifacts))
for file in os.listdir(path_to_model_artifacts):
    if path.isdir(path.join(path_to_model_artifacts, file)):
        for file2 in os.listdir(path.join(path_to_model_artifacts, file)):
            print(path.join(file, file2))
    else:
        print(file)

In [None]:
%%capture
score = '''
import json
import os

from functools import lru_cache
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "pytorch_model.bin"
tokenize_name = 'vocab'

@lru_cache(maxsize=10)
def load_model(model_file_name=model_name):
    """
    Loads model from the serialized format

    Returns
    -------
    model:  a model instance on which predict API can be invoked
    """
    model_dir = os.path.dirname(os.path.realpath(__file__))
    contents = os.listdir(model_dir)
    if model_file_name in contents:
        model = AutoModelForSequenceClassification.from_pretrained(model_dir)
        return model
    else:
        raise Exception('{0} is not found in model directory {1}'.format(model_file_name, model_dir))


def predict(data, model=load_model()):
    """
    Returns prediction given the model and data to predict

    Parameters
    ----------
    model: Model instance returned by load_model API
    data: Data format as expected by the predict API of the core estimator. For example, for scikit-learn models it could be a NumPy array, a list of lists, or a pandas DataFrame. Here it is a list of text strings.

    Returns
    -------
    predictions: Output from scoring server
        Format: {'prediction':output from model.predict method}

    """
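    # The tokenizer files are saved next to score.py in the artifact directory.
    # Load it unconditionally so that `tokenizer` is always defined;
    # from_pretrained raises an error if the files are missing.
    tokenizer_dir = os.path.dirname(os.path.realpath(__file__))
    LABELS = ["negative", "positive"]
    tokenizer = AutoTokenizer.from_pretrained(tokenizer_dir)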
    outputs = []
    for text in data:
        inputs = tokenizer.encode_plus(text, return_tensors='pt')
        output = model(**inputs)[0].squeeze().detach().numpy()
        outputs.append(LABELS[(output.argmax())])
    return {'prediction': outputs}
'''

with open(path.join(path_to_model_artifacts, "score.py"), 'w') as f:
    print(f.write(score))
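
Before saving the artifact to the catalog, it can be useful to exercise `score.py` locally. The following is a minimal sketch, assuming `score.py` sits at the top level of the artifact directory as written above; `load_score_module` is a hypothetical helper, not an ADS function.

```python
import importlib.util
from os import path


def load_score_module(artifact_dir):
    """Import score.py from an artifact directory as a standalone module."""
    score_path = path.join(artifact_dir, "score.py")
    if not path.isfile(score_path):
        raise FileNotFoundError(f"no score.py found in {artifact_dir}")
    spec = importlib.util.spec_from_file_location("score", score_path)
    module = importlib.util.module_from_spec(spec)
    # Executing the module also evaluates predict's default argument,
    # so load_model() runs once at import time.
    spec.loader.exec_module(module)
    return module


# Example use in this notebook (loads the model, so it may take a moment):
# score_module = load_score_module(path_to_model_artifacts)
# print(score_module.predict(["This movie was great!"]))
```

If the import or the `predict` call fails here, it would most likely fail in the deployment too, so this is a cheap check before the slower save and deploy steps.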

In [None]:
project_id = os.environ['PROJECT_OCID'] 
compartment_id = os.environ['NB_SESSION_COMPARTMENT_OCID']

mc_model = model_artifact.save(
    project_id=project_id, compartment_id=compartment_id, 
    display_name="pytorch_model (Model Deployment Test)",
    description="A sample bert pretrained model",
    ignore_pending_changes=True, timeout=6000)

In [None]:
# Print published model information
mc_model

## Deploying the model with Model Deployment

We are ready to deploy `mc_model`. We authenticate with the user principal method (config file plus API key). Alternatively, you can use a resource principal. 

In [None]:
# Getting OCI config information
oci_config = oci.config.from_file("~/.oci/config", "DEFAULT")
# Setting up DataScience instance
data_science = oci.data_science.DataScienceClient(oci_config)
# Setting up data science composite client to unlock wait_for_state operations
data_science_composite = oci.data_science.DataScienceClientCompositeOperations(data_science)
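
As an alternative to the config file, the client can be built with a resource principal signer. This is a sketch, assuming the notebook session is allowed to use resource principals by your tenancy's policies; `make_data_science_client` is a hypothetical wrapper, while `oci.auth.signers.get_resource_principals_signer` and the `signer` argument come from the OCI Python SDK.

```python
def make_data_science_client(use_resource_principal=True,
                             config_file="~/.oci/config", profile="DEFAULT"):
    """Return a DataScienceClient using the chosen authentication method."""
    # Imported inside the function so the helper can be defined before
    # the SDK upgrade at the top of the notebook has been run.
    import oci

    if use_resource_principal:
        # No config file needed: the signer is obtained from the session itself.
        signer = oci.auth.signers.get_resource_principals_signer()
        return oci.data_science.DataScienceClient(config={}, signer=signer)
    # User principal: read the API key details from the config file.
    config = oci.config.from_file(config_file, profile)
    return oci.data_science.DataScienceClient(config)


# Example use (not executed here):
# data_science = make_data_science_client(use_resource_principal=True)
# data_science_composite = oci.data_science.DataScienceClientCompositeOperations(data_science)
```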

The model deployment configuration object: 

In [None]:
# Preparing model deployment data
model_deployment_details = {
    "displayName": "Pytorch model test",
    "projectId": mc_model.project_id,
    "compartmentId": mc_model.compartment_id,
    "modelDeploymentConfigurationDetails": {
        "deploymentType": "SINGLE_MODEL",
        "modelConfigurationDetails": {
            "modelId": mc_model.id,
            "instanceConfiguration": {
                "instanceShapeName": "VM.Standard2.4"
            },
            "scalingPolicy": {
                "policyType": "FIXED_SIZE",
                "instanceCount": 2
            },
            "bandwidthMbps": 10
        }
    },
    "categoryLogDetails": None
}

We are now ready to deploy. This takes a few minutes to complete. 

In [None]:
%%time

model_deployment = data_science_composite.create_model_deployment_and_wait_for_state(model_deployment_details,
                                                                                     wait_for_states=["SUCCEEDED",
                                                                                                      "FAILED"])

The next cell extracts a series of useful diagnostics from the `model_deployment` object about the creation of the model deployment resource: 

In [None]:
print("Grabbing the model deployment ocid...")
model_deployment_data = json.loads(str(model_deployment.data))
model_deployment_id = model_deployment_data['resources'][0]['identifier']
print(f"Model deployment ocid: {model_deployment_id}")

print("Checking for the correct response status code...")
if model_deployment.status == 200:
    print(f"Work request status code returned: {model_deployment.status}")
    print("Checking for non-empty response data...")
    if model_deployment.data:
        print(f"Data returned: {model_deployment.data}")
        print("Grabbing the model deployment work request status...")
        work_request_status = model_deployment_data['status']
        print("Checking for the correct work request status...")
        if work_request_status == "SUCCEEDED":
            print(f"Work request status returned: {work_request_status}")
        else:
            print(
                f"Work request returned an incorrect status of: {work_request_status}")
            print(
                f"Work requests error: {data_science.list_work_request_errors(model_deployment.data.id).data}")
            print(
                f"opc-request-id: {model_deployment.headers['opc-request-id']}")
    else:
        print("Failed to grab model deployment data.")
        print(f"opc-request-id: {model_deployment.headers['opc-request-id']}")
else:
    print(
        f"Model deployment returned an incorrect status of: {model_deployment.status}")
    print(f"opc-request-id: {model_deployment.headers['opc-request-id']}")
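
The nested checks above can also be condensed into one function that returns a verdict and a message. This is a minimal sketch, assuming the work request has already been parsed into a dictionary with `status` and `resources` keys as in the cell above; `summarize_work_request` is a hypothetical helper, not part of the OCI SDK.

```python
def summarize_work_request(http_status, work_request):
    """Return (ok, message) describing the outcome of a work request.

    http_status: HTTP status code of the response (200 is expected).
    work_request: dict parsed from the response data, or None.
    """
    if http_status != 200:
        return False, f"unexpected HTTP status: {http_status}"
    if not work_request:
        return False, "response contained no work request data"
    status = work_request.get("status")
    if status != "SUCCEEDED":
        return False, f"work request finished with status: {status}"
    resources = work_request.get("resources") or []
    if not resources:
        return False, "work request lists no resources"
    return True, f"model deployment ocid: {resources[0]['identifier']}"


# Example use in this notebook (not executed here):
# ok, message = summarize_work_request(model_deployment.status, model_deployment_data)
# print(message)
```

Returning early on each failure keeps every check at the same indentation level, which makes it easier to add further checks later.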