Conversational Search with OCI Generative AI

OCI Search with OpenSearch supports creating an OCI Generative AI connector.

You can use the connector to access Generative AI features such as Retrieval-Augmented Generation (RAG), text summarization, text generation, conversational search, and semantic search.

Prerequisites

  • To use an OCI Generative AI connector with OCI Search with OpenSearch, you need a cluster configured to use OpenSearch version 2.11. By default, new clusters are configured to use version 2.11. To create a cluster, see Creating an OpenSearch Cluster.

    For existing clusters configured for version 2.3, you can perform an inline upgrade to version 2.11. For more information, see Inline Upgrade for OpenSearch Clusters.

    To upgrade existing clusters configured for version 1.2.3 to 2.11, you need to use the upgrade process described in Upgrading an OpenSearch Cluster.

  • Create a policy to grant access to Generative AI resources. The following policy example includes the required permissions:

    ALLOW ANY-USER to manage generative-ai-family in tenancy WHERE ALL {request.principal.type='opensearchcluster', request.resource.compartment.id='<cluster_compartment_id>'}

    If you're new to policies, see Getting Started with Policies and Common Policies.

  • Use the settings operation of the Cluster APIs to configure the recommended cluster settings that allow you to create a connector. The following example includes the recommended settings:

    PUT _cluster/settings
    {
      "persistent": {
        "plugins": {
          "ml_commons": {
            "only_run_on_ml_node": "false",
            "model_access_control_enabled": "true",
            "native_memory_threshold": "99",
            "rag_pipeline_feature_enabled": "true",
            "memory_feature_enabled": "true",
            "allow_registering_model_via_local_file": "true",
            "allow_registering_model_via_url": "true",
            "model_auto_redeploy.enable":"true",
            "model_auto_redeploy.lifetime_retry_times": 10
          }
        }
      }
    }
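If you prefer to script the cluster configuration rather than paste it into the Dev Tools console, the same settings can be kept as a plain dictionary and serialized when needed. A minimal sketch (the client call at the end is illustrative only; any HTTP client that can sign requests to your cluster works):

```python
import json

# The recommended ml_commons settings from the example above, as a dict.
CLUSTER_SETTINGS = {
    "persistent": {
        "plugins": {
            "ml_commons": {
                "only_run_on_ml_node": "false",
                "model_access_control_enabled": "true",
                "native_memory_threshold": "99",
                "rag_pipeline_feature_enabled": "true",
                "memory_feature_enabled": "true",
                "allow_registering_model_via_local_file": "true",
                "allow_registering_model_via_url": "true",
                "model_auto_redeploy.enable": "true",
                "model_auto_redeploy.lifetime_retry_times": 10,
            }
        }
    }
}

body = json.dumps(CLUSTER_SETTINGS)
# Send with your HTTP client of choice, for example (not executed here):
# client.transport.perform_request("PUT", "/_cluster/settings", body=CLUSTER_SETTINGS)
```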

Register the Model Group

Register a model group using the register operation in the Model Group APIs, as shown in the following example:

POST /_plugins/_ml/model_groups/_register
{
   "name": "public_model_group-emb",
   "description": "This is a public model group"
}

Make note of the model_group_id returned in the response:

{
  "model_group_id": "<model_group_ID>",
  "status": "CREATED"
}

Create the Connector

Create the Generative AI connector as shown in one of the following examples.

Cohere model:

POST _plugins/_ml/connectors/_create
{
     "name": "OpenAI Chat Connector",
     "description": "when did us pass espio",
     "version": 2,
     "protocol": "oci_sigv1",
     "parameters": {
         "endpoint": "inference.generativeai.us-chicago-1.oci.oraclecloud.com",
         "auth_type": "resource_principal"
     },
     "credential": {
     },
     "actions": [
         {
             "action_type": "predict",
             "method": "POST",
             "url": "https://${parameters.endpoint}/20231130/actions/generateText",
             "request_body": "{\"compartmentId\":\"<cluster_compartment_id>\",\"servingMode\":{\"modelId\":\"cohere.command\",\"servingType\":\"ON_DEMAND\"},\"inferenceRequest\":{\"prompt\":\"${parameters.prompt}\",\"maxTokens\":600,\"temperature\":1,\"frequencyPenalty\":0,\"presencePenalty\":0,\"topP\":0.75,\"topK\":0,\"returnLikelihoods\":\"GENERATION\",\"isStream\":false ,\"stopSequences\":[],\"runtimeType\":\"COHERE\"}}"
         }
     ]
 }
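The request_body value is a JSON document embedded as an escaped string, which is easy to get wrong by hand. One way to avoid escaping mistakes is to build the inner document as a dictionary and let json.dumps produce the escaped string. A sketch for the Cohere example (the <cluster_compartment_id> placeholder stays as-is, and ${parameters.prompt} is left for the connector to substitute at predict time):

```python
import json

# Build the Cohere generateText request body as a dict, then serialize it.
inference_request = {
    "prompt": "${parameters.prompt}",  # filled in by ml-commons at predict time
    "maxTokens": 600,
    "temperature": 1,
    "frequencyPenalty": 0,
    "presencePenalty": 0,
    "topP": 0.75,
    "topK": 0,
    "returnLikelihoods": "GENERATION",
    "isStream": False,
    "stopSequences": [],
    "runtimeType": "COHERE",
}
request_body = json.dumps({
    "compartmentId": "<cluster_compartment_id>",  # placeholder, as in the example
    "servingMode": {"modelId": "cohere.command", "servingType": "ON_DEMAND"},
    "inferenceRequest": inference_request,
})
# `request_body` is now the escaped string to place in the connector payload's
# "request_body" field (serializing the full payload escapes it a second time).
```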

Llama model:

POST _plugins/_ml/connectors/_create
{
     "name": "OpenAI Chat Connector",
     "description": "testing genAI connector",
     "version": 2,
     "protocol": "oci_sigv1",
     "parameters": {
         "endpoint": "inference.generativeai.us-chicago-1.oci.oraclecloud.com",
         "auth_type": "resource_principal"
     },
     "credential": {
     },
     "actions": [
         {
             "action_type": "predict",
             "method": "POST",
             "url": "https://${parameters.endpoint}/20231130/actions/generateText",
             "request_body": "{\"compartmentId\":\"<cluster_compartment_id>\",\"servingMode\":{\"modelId\":\"meta.llama-2-70b-chat\",\"servingType\":\"ON_DEMAND\"},\"inferenceRequest\":{\"prompt\":\"${parameters.prompt}\",\"maxTokens\":600,\"temperature\":1,\"frequencyPenalty\":0,\"presencePenalty\":0,\"topP\":0.75,\"topK\":-1,\"isStream\":false,\"numGenerations\":1,\"stop\":[],\"runtimeType\":\"LLAMA\"}}",
            "post_process_function": "def text = params['inferenceResponse']['choices'][0]['text'].replace('\n', '\\\\n');\n return '{\"name\":\"response\",\"dataAsMap\":{\"inferenceResponse\":{\"generatedTexts\":[{\"text\":\"' + text + '\"}]}}}'"
                          
               }
     ]
 }

The connector authenticates using the cluster's resource principal. Specify the cluster's compartment ID in request_body.
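The Llama connector's post_process_function is a Painless script that reshapes the native Llama response into the generatedTexts structure the RAG pipeline expects from the Cohere-style response. Its logic, restated in Python for clarity (field names are taken from the script above; the escaping is simplified to its intent, replacing newlines with a literal \n so the text can be embedded in the JSON string the script returns):

```python
import json

def post_process(inference_response: dict) -> dict:
    """Mirror of the connector's Painless post-process step: pull the first
    choice's text out of the Llama response and re-wrap it in the
    generatedTexts structure."""
    text = inference_response["choices"][0]["text"].replace("\n", "\\n")
    return {
        "name": "response",
        "dataAsMap": {
            "inferenceResponse": {"generatedTexts": [{"text": text}]}
        },
    }

# Example: a two-line Llama completion becomes a single escaped string.
raw = {"choices": [{"text": "Hello\nworld"}]}
print(json.dumps(post_process(raw)))
```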

Make note of the connector_id returned in the response:

{
  "connector_id": "<connector_ID>"
}

Register the Model

Register the remote model using the Generative AI connector with the connector ID and model group ID from the previous steps, as shown in the following example:

POST /_plugins/_ml/models/_register
{
   "name": "oci-genai-embed-test",
   "function_name": "remote",
   "model_group_id": "<model_group_ID>",
   "description": "test semantic",
   "connector_id": "<connector_ID>"
 }

Deploy the Model

Deploy the registered model using the model ID returned in the register response, as shown in the following example:

POST /_plugins/_ml/models/<model_ID>/_deploy
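Once deployed, you can exercise the model through the ml-commons predict API, whose parameters map fills the ${parameters.prompt} placeholder in the connector's request_body. A sketch of the deploy and predict calls as plain request builders (the _predict path and parameters body follow the ml-commons remote-model convention; the client call is an assumption shown only in comments):

```python
# Build the ml-commons requests for deploying and invoking the remote model.
# <model_ID> is the ID returned by the _register call in the previous step.

def deploy_request(model_id: str):
    """(method, path) for the deploy call shown above."""
    return ("POST", f"/_plugins/_ml/models/{model_id}/_deploy")

def predict_request(model_id: str, prompt: str):
    """(method, path, body) for a test prompt; the 'parameters' map fills
    ${parameters.prompt} in the connector's request_body."""
    return ("POST", f"/_plugins/_ml/models/{model_id}/_predict",
            {"parameters": {"prompt": prompt}})

# To execute against a live cluster (not run here), e.g. with opensearch-py:
# from opensearchpy import OpenSearch
# client = OpenSearch(hosts=[{"host": "<cluster_endpoint>", "port": 9200}], use_ssl=True)
# method, path = deploy_request("<model_ID>")
# client.transport.perform_request(method, path)
```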