Adding a Model to a Private Endpoint

Learn how to attach an endpoint with a custom or pretrained model to a private endpoint in OCI Generative AI.

You can attach one or more endpoints to a private endpoint.

  • Create Endpoint

    1. On the Private Endpoints list page, select the private endpoint that you want to work with. If you need help finding the list page for private endpoints, see Listing Private Endpoints.
    2. Select Endpoints and then select Create endpoint.

    Endpoint Information

    1. Select a compartment to create the endpoint in. The default compartment is the same as the list page, but you can select any compartment that you have permission to work in.
      Tip

      We recommend that you create the endpoint in the same compartment as the model.
    2. (Optional) Enter a name for the endpoint. Start the name with a letter or underscore, followed by letters, numbers, hyphens, or underscores. The length can be 1 to 255 characters. If you don't enter a name, the system generates a name that you can change later.
      The generated name has the format generativeaiendpoint<timestamp>. Example: generativeaiendpoint20250531235319
    3. (Optional) Enter a description for the endpoint.
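
    The naming rule above can be checked before you submit the form. The following is a minimal sketch, not part of the service: the helper names are hypothetical, the restriction to ASCII letters is an assumption, and the timestamp layout (YYYYMMDDHHMMSS) is inferred only from the example name shown above.

```python
import re
from datetime import datetime

# Rule from the step above: the first character is a letter or underscore,
# the remaining characters are letters, numbers, hyphens, or underscores,
# and the total length is 1 to 255 characters.
# Assumption: "letters" means ASCII letters.
NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_-]{0,254}")


def is_valid_endpoint_name(name: str) -> bool:
    """Return True if the name satisfies the documented naming rule."""
    return NAME_PATTERN.fullmatch(name) is not None


def generated_style_name(now: datetime) -> str:
    """Build a name shaped like the system-generated one.

    Assumption: the timestamp is YYYYMMDDHHMMSS, inferred from the example.
    """
    return "generativeaiendpoint" + now.strftime("%Y%m%d%H%M%S")


if __name__ == "__main__":
    for candidate in ("my-endpoint_1", "1-starts-with-digit", ""):
        print(repr(candidate), is_valid_endpoint_name(candidate))
```

    A check like this only mirrors the rule as written here. The Console remains the authority on whether a name is accepted.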

    Hosting configuration

    1. Select the compartment that hosts the model that you want to add an endpoint to.
    2. Select the model that you want to add an endpoint to. This model can be a custom model or a ready-to-use pretrained foundational model available in the region that you're working in.
    3. If the model that you selected has several versions, select a model version.
      For the ready-to-use pretrained foundational models, this field populates when you select the model.
    4. Select a hosting dedicated AI cluster by performing one of the following actions:
      • Select a Dedicated AI cluster from the list. If you created a cluster a few minutes ago, wait for that cluster to become active. Ensure that the base model that's associated with this cluster matches the base model for the model that you want to add an endpoint to.
      • Select Create new dedicated AI cluster and perform the following steps:
        1. (Optional) Enter a name and description.
        2. Select a Base model that matches the base model of the model that you want to host.
        3. Add 1 model replica to the endpoint. When you create a cluster, you need at least one unit for an endpoint. For an existing cluster, you can use that same unit to host new endpoints. Each instance hosts all the active endpoints. Increasing the instance count on a cluster increases the number of requests per minute (RPM) supported for all active endpoints hosted on the cluster.
        4. Read the commitment unit hours for the hosting dedicated AI cluster and select the checkbox to agree to the commitment.
        5. (Optional) Select Add tag and assign tags to this dedicated AI cluster. See Resource Tags.
        6. Select Create and wait for the cluster to become active.
        7. From the Dedicated AI cluster list, select the dedicated AI cluster that you created.

    Networking resources

    The private endpoint is preselected.

    Guardrails

    1. Select whether to enable the following guardrails.
      • Content moderation
        • Off: Don't apply content moderation. Responses can include explicit content.
        • Block: Help identify content that needs moderation and apply content moderation.
        • Inform: Don't apply content moderation, but aim to inform the user if the model detects content that needs moderation.
      • Prompt injection (PI) protection
        • Off: Don't apply PI protection and allow unrestricted input.
        • Block: Help identify and protect against prompt injection.
        • Inform: Don't apply PI protection, but aim to inform the user if the model detects content that needs PI protection.
      • Personally identifiable information (PII) protection
        • Off: Don't apply PII protection. Instead, output content without data exposure restrictions.
        • Block: Help identify and protect PII, for example, by helping remove personal data from responses.
        • Inform: Don't apply PII protection, but aim to inform the user if the model detects content that needs PII protection.
    2. (Optional) Select Add tag and assign tags to this endpoint. See Resource Tags.
    3. Select Create.
      You're directed to the endpoint details page where you can track the state of the endpoint.
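
    The three guardrails share the same Off, Block, and Inform modes, so it can help to record the intended selection before you fill in the form. The following sketch models only the behavior described in the list above. The enum, the function, and the dictionary keys are hypothetical names for illustration and are not service API fields.

```python
from enum import Enum


class GuardrailMode(Enum):
    """The three modes offered for each guardrail in the Console."""
    OFF = "Off"
    BLOCK = "Block"
    INFORM = "Inform"


def guardrail_behavior(mode: GuardrailMode) -> tuple[bool, bool]:
    """Return (applies_protection, informs_user) for a mode.

    Per the descriptions above: Block applies the protection, Inform
    doesn't apply it but aims to inform the user, and Off does neither.
    """
    return (mode is GuardrailMode.BLOCK, mode is GuardrailMode.INFORM)


# Hypothetical planning record; the keys are illustrative labels only.
selection = {
    "content_moderation": GuardrailMode.BLOCK,
    "prompt_injection_protection": GuardrailMode.INFORM,
    "pii_protection": GuardrailMode.OFF,
}

if __name__ == "__main__":
    for guardrail, mode in selection.items():
        applies, informs = guardrail_behavior(mode)
        print(f"{guardrail}: mode={mode.value} applies={applies} informs={informs}")
```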
  • Use the endpoint create command and required parameters to create an endpoint:

    oci generative-ai endpoint create 
    --model-id <model-OCID>
    --compartment-id <compartment-OCID> 
    --dedicated-ai-cluster-id <hosting-dedicated-AI-cluster-OCID> 
    [OPTIONS]

    For a complete list of parameters and values for CLI commands, see the CLI Command Reference.

    Note

    For pretrained models, instead of an OCID, you can use the model name exactly as listed in the Console's playground. You can also find this OCI model name on the model's detail page in Pretrained Foundational Models in Generative AI.
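
    If you script the command, assembling the arguments as a list avoids shell-quoting mistakes. The following is a minimal sketch that uses only the three parameters shown above. The function name is hypothetical, and anything passed through extra_options is an assumption that you should confirm against the CLI Command Reference.

```python
import shlex


def build_endpoint_create_command(model_id, compartment_id, cluster_id, extra_options=()):
    """Assemble the argument list for `oci generative-ai endpoint create`.

    model_id can be a model OCID or, for pretrained models, the model
    name as described in the note above.
    """
    args = [
        "oci", "generative-ai", "endpoint", "create",
        "--model-id", model_id,
        "--compartment-id", compartment_id,
        "--dedicated-ai-cluster-id", cluster_id,
    ]
    # Optional parameters are appended unchanged; this sketch doesn't validate them.
    args.extend(extra_options)
    return args


# Placeholder values copied from the command synopsis above.
command = build_endpoint_create_command(
    "<model-OCID>",
    "<compartment-OCID>",
    "<hosting-dedicated-AI-cluster-OCID>",
)

if __name__ == "__main__":
    # Print a shell-safe rendering for review instead of running the command.
    print(shlex.join(command))
```

    To run the command from Python, you could pass the list to subprocess.run(command, check=True) after replacing the placeholders with real values.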
  • Run the CreateEndpoint operation to create an endpoint.

    Note

    For pretrained models, instead of an OCID, you can use the model name exactly as listed in the Console's playground. You can also find this OCI model name on the model's detail page in Pretrained Foundational Models in Generative AI.
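
    For orientation, a CreateEndpoint request body might be assembled as in the following sketch. The field names (modelId, compartmentId, dedicatedAiClusterId, displayName) are assumptions based on the usual camelCase form of the CLI parameters, so verify them against the API reference before use. The function name is hypothetical, and request signing and sending are out of scope here.

```python
import json


def build_create_endpoint_body(model, compartment_id, cluster_id, display_name=None):
    """Build a JSON-serializable body for a CreateEndpoint request.

    Assumption: the field names mirror the CLI parameters in camelCase.
    `model` can be a model OCID or, for pretrained models, the model name.
    """
    body = {
        "modelId": model,
        "compartmentId": compartment_id,
        "dedicatedAiClusterId": cluster_id,
    }
    # The name is optional; if omitted, the service generates one.
    if display_name is not None:
        body["displayName"] = display_name
    return body


if __name__ == "__main__":
    body = build_create_endpoint_body(
        "<model-OCID>",
        "<compartment-OCID>",
        "<hosting-dedicated-AI-cluster-OCID>",
    )
    print(json.dumps(body, indent=2))
```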