Invoking a Model Deployment
After a model deployment is in an active lifecycleState, the predict endpoint can successfully receive requests made by clients. Invoking a model deployment means that you can pass feature vectors or data samples to the predict endpoint, and the model returns predictions for those data samples.
From your model deployment detail page, click Invoking Your Model. The following details are displayed:
- The model HTTP endpoint.
- Sample code that enables you to invoke the model endpoint using the OCI CLI. Alternatively, you can use the OCI Python or Java SDK to invoke the model with the provided code sample.
- The payload size limit is 10 MB.
- The timeout for invoking a model is 60 seconds for HTTP calls. A client-side sketch of both limits follows this list.
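Both limits are enforced server side. As a minimal, hedged sketch, you can validate the payload size and bound the client-side wait before sending a request; the payload structure and constant names here are illustrative assumptions, not part of the service API:

```python
import json

# Server-side limits for model deployment invocation.
MAX_PAYLOAD_BYTES = 10 * 1024 * 1024   # 10 MB payload size limit
INVOCATION_TIMEOUT_SECONDS = 60        # 60-second timeout on HTTP calls

# Illustrative payload; the structure your score.py expects may differ.
payload = {"data": [[1.0, 2.0, 3.0]]}
body = json.dumps(payload).encode("utf-8")

if len(body) > MAX_PAYLOAD_BYTES:
    raise ValueError(f"Payload is {len(body)} bytes, which exceeds the 10 MB limit")

# When sending the request (see the SDK example below), bound the client-side
# wait to match the server-side limit, for example:
#   requests.post(endpoint, data=body, auth=auth, timeout=INVOCATION_TIMEOUT_SECONDS)
```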
Use the sample code to invoke your model deployment.
Invoking a model deployment calls the predict endpoint of the model deployment URI. This endpoint takes sample data as input, which is processed by the predict() function in the score.py model artifact file. The sample data is usually in JSON format, though it can be in other formats. Processing means that the sample data can be transformed and then passed to the model's inference method. The model can generate predictions that can be processed before being returned to the client.
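As an illustration only, a predict() function in score.py often follows a shape like the sketch below. The helper names, model file name, and payload keys are assumptions for this example; the score.py in your model artifact defines the actual behavior.

```python
# score.py (illustrative sketch; your generated score.py can differ)
import json
import os

import joblib


def load_model():
    """Load the serialized model from the model artifact directory."""
    model_dir = os.path.dirname(os.path.realpath(__file__))
    # "model.joblib" is an assumed file name for this sketch.
    return joblib.load(os.path.join(model_dir, "model.joblib"))


def predict(data, model=load_model()):
    """Transform the sample data, run inference, and shape the response."""
    # Accept either a raw JSON string or an already-parsed dictionary.
    payload = json.loads(data) if isinstance(data, str) else data

    # Transform the sample data into the structure the model expects.
    features = payload["data"]

    # Run the model's inference method; post-process before returning
    # the predictions to the client.
    predictions = model.predict(features)
    return {"prediction": list(predictions)}
```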
The API responses are:

HTTP Status Code | Description |
---|---|
200 | Success. |
404 | Not Found or unauthorized. |
413 | Payload Too Large. The payload size limit is 10 MB. |
429 | Too Many Requests. |
500 | Internal Server Error. There is a 60-second timeout when invoking a model. |
503 | Service Unavailable. The model server is unavailable. |
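A client can branch on these status codes, for example retrying on 429 and surfacing the error body otherwise. The following is a minimal sketch assuming the requests-based invocation shown in the next section; the function name and backoff values are illustrative, not part of the service.

```python
import time

import requests


def invoke_with_retry(endpoint, payload, auth, max_attempts=3):
    """POST to the predict endpoint, retrying only on 429 Too Many Requests."""
    for attempt in range(1, max_attempts + 1):
        response = requests.post(endpoint, json=payload, auth=auth, timeout=60)

        if response.status_code == 200:
            return response.json()

        if response.status_code == 429 and attempt < max_attempts:
            # Back off before retrying; the delay here is an arbitrary example.
            time.sleep(2 ** attempt)
            continue

        # 404, 413, 500, and 503 are not retried in this sketch.
        raise RuntimeError(
            f"Predict call failed with HTTP {response.status_code}: {response.text}"
        )
```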
Invoking with the OCI Python SDK
This example code is a reference to help you invoke your model deployment:
import json

import oci
import requests
from oci.signer import Signer

# Model deployment endpoint. Here we assume that the notebook region is the same
# as the region where the model deployment occurs. Alternatively, go to the details
# page of your model deployment in the OCI Console. Under "Invoking Your Model",
# you can find the HTTP endpoint of your model.
endpoint = <your-model-deployment-uri>

# Your payload:
input_data = <your-json-payload-str>

# Set using_rps to True to authenticate with a resource principal (for example,
# from a notebook session), or to False to use an OCI config file and API key.
using_rps = True

if using_rps:  # using resource principal:
    auth = oci.auth.signers.get_resource_principals_signer()
else:  # using config + key:
    config = oci.config.from_file("~/.oci/config")  # replace with the location of your oci config file
    auth = Signer(
        tenancy=config['tenancy'],
        user=config['user'],
        fingerprint=config['fingerprint'],
        private_key_file_location=config['key_file'],
        pass_phrase=config['pass_phrase'])

# POST request to the model endpoint:
response = requests.post(endpoint, json=input_data, auth=auth)

# Check the response status. Success should be an HTTP 200 status code.
assert response.status_code == 200, "Request made to the model predict endpoint was unsuccessful"

# Print the model predictions, assuming the model returns a JSON object.
print(json.loads(response.content))
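The structure of input_data depends entirely on what the predict() function in your score.py parses. As a hedged example, assuming the payload uses a top-level "data" key, a pandas DataFrame of feature vectors could be turned into a payload like this:

```python
import pandas as pd

# Two illustrative feature vectors; the column names are assumptions for this sketch.
df = pd.DataFrame({"feature_1": [1.0, 2.0], "feature_2": [3.5, 4.5]})

# Passing a dict through the `json=` argument lets requests serialize it for you.
# If you already hold a JSON string, send it with `data=` and a
# "Content-Type: application/json" header instead to avoid double encoding.
input_data = {"data": df.to_dict(orient="records")}
```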
Invoking with the OCI CLI
The OCI CLI is included in the OCI Cloud Shell environment and is preauthenticated. This example invokes a model deployment with the CLI:

oci raw-request --http-method POST --target-uri <model-deployment-url>/predict --request-body '{"data": "data"}'