Managing an Endpoint in Generative AI

To use an OCI Generative AI custom model for inference, you must first create an endpoint for that model. You can also create endpoints for the pretrained models that are available in the playground.

An endpoint is a designated point on a dedicated AI cluster where a model can accept user requests and send back responses such as the model's generated text. You create an endpoint on a hosting dedicated AI cluster.

Note

  • After you create an active endpoint for a custom model, the model is listed in the playground's model list.
  • Each custom model can have more than one endpoint.
  • Each hosting dedicated AI cluster can host multiple endpoints. To see how many more endpoints a cluster can host, check the number of remaining endpoints on the dedicated AI cluster's detail page. If you no longer need a custom model's endpoint, you can delete that endpoint and reuse the freed capacity on its dedicated AI cluster to host a new endpoint.
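
The relationship described in the notes above can be illustrated with a small, self-contained Python sketch. This is a toy model only, not the OCI SDK: the HostingCluster class, its capacity value, and the endpoint and model names are all hypothetical and chosen for illustration. The actual endpoint limit for a cluster is whatever its detail page reports.

```python
from dataclasses import dataclass, field


@dataclass
class HostingCluster:
    """Toy stand-in for a hosting dedicated AI cluster (hypothetical, not an OCI SDK class)."""

    name: str
    capacity: int  # assumed endpoint limit; the real value is shown on the cluster's detail page
    endpoints: dict = field(default_factory=dict)  # endpoint name -> model name

    @property
    def remaining(self) -> int:
        """Number of endpoints this cluster can still host."""
        return self.capacity - len(self.endpoints)

    def create_endpoint(self, endpoint_name: str, model_name: str) -> None:
        """Register an endpoint for a model, if the cluster has room."""
        if endpoint_name in self.endpoints:
            raise ValueError(f"endpoint {endpoint_name!r} already exists")
        if self.remaining <= 0:
            raise RuntimeError(f"cluster {self.name!r} has no remaining endpoints")
        self.endpoints[endpoint_name] = model_name

    def delete_endpoint(self, endpoint_name: str) -> None:
        """Remove an endpoint, freeing one slot on the cluster."""
        del self.endpoints[endpoint_name]


# Hypothetical cluster that can host two endpoints.
cluster = HostingCluster("hosting-cluster-1", capacity=2)

# One custom model can have more than one endpoint.
cluster.create_endpoint("endpoint-a", "my-custom-model")
cluster.create_endpoint("endpoint-b", "my-custom-model")

# The cluster is now full; deleting an endpoint frees a slot for a new one.
cluster.delete_endpoint("endpoint-a")
cluster.create_endpoint("endpoint-c", "another-custom-model")
```

In this sketch, a third create_endpoint call on the full cluster would raise an error until an existing endpoint is deleted, which mirrors the advice to delete endpoints you no longer need before creating new ones on the same cluster.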

You can perform the following tasks to create and manage endpoints for custom models: