Managing an Endpoint in Generative AI

To use an OCI Generative AI custom model for inference, you must first create an endpoint for that model. You can also create endpoints for the pretrained models that are available in the playground.

An endpoint is a designated point on a dedicated AI cluster where a model can accept user requests and send back responses, such as the model's generated text. You create an endpoint on a hosting dedicated AI cluster.

Note

After you create an active endpoint for a custom model, the model is listed in the playground's model list.
Each custom model can have more than one endpoint.
Each hosting dedicated AI cluster can host multiple endpoints. To see how many more endpoints a cluster can host, check the number of remaining endpoints on the dedicated AI cluster's detail page. If you no longer need a custom model's endpoint, you can delete that endpoint and reuse the freed capacity on its dedicated AI cluster to host a new endpoint.
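
The relationships described in this note can be sketched with a small in-memory model. This is an illustration only and doesn't call the OCI SDK or any OCI API. The class names, method names, and the max_endpoints value are hypothetical; the real capacity is whatever the dedicated AI cluster's detail page reports.

```python
from dataclasses import dataclass, field


@dataclass
class Endpoint:
    """Hypothetical record tying an endpoint name to the model it serves."""
    name: str
    model_id: str


@dataclass
class HostingCluster:
    """Toy stand-in for a hosting dedicated AI cluster (not an OCI SDK class)."""
    name: str
    max_endpoints: int  # assumed capacity, chosen for the example only
    endpoints: dict = field(default_factory=dict)

    def remaining(self) -> int:
        # Mirrors the "remaining endpoints" figure on the cluster's detail page.
        return self.max_endpoints - len(self.endpoints)

    def create_endpoint(self, name: str, model_id: str) -> Endpoint:
        if name in self.endpoints:
            raise ValueError(f"endpoint {name!r} already exists")
        if self.remaining() == 0:
            raise RuntimeError(f"cluster {self.name!r} has no remaining endpoints")
        endpoint = Endpoint(name, model_id)
        self.endpoints[name] = endpoint
        return endpoint

    def delete_endpoint(self, name: str) -> None:
        # Deleting an endpoint frees a slot that a new endpoint can use.
        del self.endpoints[name]

    def endpoints_for(self, model_id: str) -> list:
        # A single model can be served by more than one endpoint.
        return [e.name for e in self.endpoints.values() if e.model_id == model_id]


cluster = HostingCluster("hosting-cluster-1", max_endpoints=2)
cluster.create_endpoint("ep-a", "custom-model-1")
cluster.create_endpoint("ep-b", "custom-model-1")
print(cluster.endpoints_for("custom-model-1"))  # ['ep-a', 'ep-b']
print(cluster.remaining())                      # 0

cluster.delete_endpoint("ep-a")                 # free a slot on the cluster
cluster.create_endpoint("ep-c", "custom-model-2")
print(cluster.remaining())                      # 0
```

In this sketch, creating a third endpoint while the cluster is full raises an error, and deleting an endpoint first makes room for the new one.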

You can perform the following tasks to create and manage endpoints for custom models:

Creating an endpoint
Listing the endpoints
Viewing a model with an endpoint in the playground
Getting an endpoint's details
Updating an endpoint
Moving an endpoint
Getting an endpoint's metrics
Deleting an endpoint
