Creating a Dedicated AI Cluster in Generative AI for Hosting Models

Hosting dedicated AI clusters are compute resources that you use to host endpoints for custom models. Follow these steps to create a dedicated AI cluster of type Hosting.

  1. In the navigation bar of the Console, select a region with Generative AI, for example, US Midwest (Chicago). If you don't know which region to select, see Regions with Generative AI.
  2. Open the navigation menu and click Analytics & AI. Under AI Services, click Generative AI.
  3. In the left navigation, choose a compartment that you have permission to work in.
  4. Ensure that you have permission to use or manage generative-ai-family and object-family resources in this compartment.
  5. Click Dedicated AI clusters.
  6. Click Create dedicated AI cluster.
  7. Select a compartment to create the dedicated AI cluster in. The default compartment is the one you selected in step 3, but you can select any compartment that you have permission to work in.
  8. (Optional) Enter a name and description. If you leave the name blank, the system generates a name that you can change later.

    The generated name has the format generativeaidedicatedaicluster<timestamp>.

    For example: generativeaidedicatedaicluster20240124214815

  9. For Cluster type, click Hosting.
  10. Choose the Base model for the models that you want to host on this cluster:
    • Cohere.command
    • Cohere.command-light
    • Cohere.embed
    • Llama-2-70b-chat

    These models include any base model with a version listed in the playground that's not in a deprecated state.


    When you create a dedicated AI cluster for hosting models for inference, one unit is created by default for the base model that you choose. To increase throughput, increase the number of instances through Instance Count, either when you create the cluster or later when you edit it. For example, creating two instances on this cluster requires two units and doubles the throughput.
  11. Read the commitment unit hours for the hosting dedicated AI cluster and click to agree with the commitment.
  12. (Optional) Click Show advanced options and assign tags to this dedicated AI cluster.
  13. Click Create.

    Dedicated AI clusters take a few minutes to create. After a cluster is in an active state, you can select it to host a model when you create an endpoint for that model.
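
The naming and sizing rules in the steps above can be sketched in a few lines of Python. This is an illustrative sketch only, not part of the Console or any OCI SDK: HostingClusterPlan, plan_hosting_cluster, and default_cluster_name are hypothetical names. The timestamp layout (YYYYMMDDHHMMSS) is inferred from the example name, and the sketch assumes that the "two instances, two units, double throughput" example generalizes to one unit per instance.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

# Base models listed in the procedure above.
BASE_MODELS = (
    "Cohere.command",
    "Cohere.command-light",
    "Cohere.embed",
    "Llama-2-70b-chat",
)

NAME_PREFIX = "generativeaidedicatedaicluster"


def default_cluster_name(now: datetime) -> str:
    """Build a name in the format generativeaidedicatedaicluster<timestamp>.

    Assumption: the timestamp is YYYYMMDDHHMMSS, inferred from the example
    generativeaidedicatedaicluster20240124214815.
    """
    return NAME_PREFIX + now.strftime("%Y%m%d%H%M%S")


@dataclass
class HostingClusterPlan:
    """Hypothetical container for the form values; not an OCI SDK type."""

    name: str
    base_model: str
    instance_count: int

    @property
    def units(self) -> int:
        # Assumption: one unit per instance (two instances need two units).
        return self.instance_count

    @property
    def throughput_multiplier(self) -> int:
        # Relative to a single-instance cluster; assumes linear scaling.
        return self.instance_count


def plan_hosting_cluster(
    base_model: str,
    instance_count: int = 1,
    name: Optional[str] = None,
    now: Optional[datetime] = None,
) -> HostingClusterPlan:
    """Validate the inputs and fill in a generated name if none is given."""
    if base_model not in BASE_MODELS:
        raise ValueError(f"unknown base model: {base_model!r}")
    if instance_count < 1:
        raise ValueError("instance_count must be at least 1")
    if not name:
        # Blank name: fall back to the generated format.
        name = default_cluster_name(now or datetime.now(timezone.utc))
    return HostingClusterPlan(name, base_model, instance_count)


# Example: two instances of Cohere.command with a generated name.
plan = plan_hosting_cluster("Cohere.command", instance_count=2)
print(plan.name, plan.units, plan.throughput_multiplier)
```

Passing a fixed datetime as now makes the generated name reproducible, which is convenient when checking the format against the documented example.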