GenerativeAiInferenceClient

class oci.generative_ai_inference.GenerativeAiInferenceClient(config, **kwargs)

OCI Generative AI is a fully managed service that provides a set of state-of-the-art, customizable large language models (LLMs) that cover a wide range of use cases for text generation, summarization, and text embeddings.
Use the Generative AI service inference API to access your custom model endpoints, or to try the out-of-the-box models to [chat](#/en/generative-ai-inference/latest/ChatResult/Chat), [generate text](#/en/generative-ai-inference/latest/GenerateTextResult/GenerateText), [summarize](#/en/generative-ai-inference/latest/SummarizeTextResult/SummarizeText), and [create text embeddings](#/en/generative-ai-inference/latest/EmbedTextResult/EmbedText).
To use a Generative AI custom model for inference, you must first create an endpoint for that model. Use the [Generative AI service management API](/#/en/generative-ai/latest/) to [create a custom model](#/en/generative-ai/latest/Model/) by fine-tuning an out-of-the-box model, or a previous version of a custom model, using your own data. Fine-tune the custom model on a [fine-tuning dedicated AI cluster](#/en/generative-ai/latest/DedicatedAiCluster/). Then, create a [hosting dedicated AI cluster](#/en/generative-ai/latest/DedicatedAiCluster/) with an [endpoint](#/en/generative-ai/latest/Endpoint/) to host your custom model. For resource management in the Generative AI service, use the [Generative AI service management API](/#/en/generative-ai/latest/).
To learn more about the service, see the [Generative AI documentation](/iaas/Content/generative-ai/home.htm).
Methods

- __init__(config, **kwargs): Creates a new service client.
- chat(chat_details, **kwargs): Creates a response for the given conversation.
- embed_text(embed_text_details, **kwargs): Produces embeddings for the inputs.
- generate_text(generate_text_details, **kwargs): Generates a text response based on the user prompt.
- summarize_text(summarize_text_details, **kwargs): Summarizes the input text.
__init__(config, **kwargs)

Creates a new service client.
Parameters:

- config (dict) – Configuration keys and values as per SDK and Tool Configuration. The from_file() method can be used to load configuration from a file. Alternatively, a dict can be passed. You can validate the dict using validate_config().
- service_endpoint (str) – (optional) The endpoint of the service to call using this client. For example, https://iaas.us-ashburn-1.oraclecloud.com. If this keyword argument is not provided then it will be derived using the region in the config parameter. You should only provide this keyword argument if you have an explicit need to specify a service endpoint.
- timeout (float or tuple(float, float)) – (optional) The connection and read timeouts for the client. The default values are a connection timeout of 10 seconds and a read timeout of 60 seconds. This keyword argument can be provided as a single float, in which case the value provided is used for both the read and connection timeouts, or as a tuple of two floats. If a tuple is provided, the first value is used as the connection timeout and the second value as the read timeout.
- signer (AbstractBaseSigner) – (optional) The signer to use when signing requests made by the service client. The default is to use a Signer based on the values provided in the config parameter. One use case for this parameter is Instance Principals authentication, by passing an instance of InstancePrincipalsSecurityTokenSigner as the value for this keyword argument.
- retry_strategy (obj) – (optional) A retry strategy to apply to all calls made by this service client (i.e. at the client level). There is no retry strategy applied by default. Retry strategies can also be applied at the operation level by passing a retry_strategy keyword argument as part of calling the operation; any value provided at the operation level will override whatever is specified at the client level. This should be one of the strategies available in the retry module. A convenience DEFAULT_RETRY_STRATEGY is also available. The specifics of the default retry strategy are described here.
- circuit_breaker_strategy (obj) – (optional) A circuit breaker strategy to apply to all calls made by this service client (i.e. at the client level). This client uses DEFAULT_CIRCUIT_BREAKER_STRATEGY as the default if no circuit breaker strategy is provided. The specifics of the circuit breaker strategy are described here.
- circuit_breaker_callback (function) – (optional) Callback function to receive any exceptions triggered by the circuit breaker.
- client_level_realm_specific_endpoint_template_enabled (bool) – (optional) A boolean flag to indicate whether or not this client should be created with the realm-specific endpoint template enabled or disabled. By default, this will be set as None.
- allow_control_chars (bool) – (optional) A boolean to indicate whether or not this client should allow control characters in the response object. By default, the client will not allow control characters to be in the response object.
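For example, constructing the client from a standard configuration file might look like the following sketch. The Chicago service endpoint and the optional keyword arguments are illustrative; omit service_endpoint to have it derived from the region in the config.

```python
import oci

# Load and validate configuration from the default location (~/.oci/config, DEFAULT profile).
config = oci.config.from_file()
oci.config.validate_config(config)

# The endpoint below is only an example; use your region's Generative AI inference
# endpoint, or omit service_endpoint to derive it from the config's region.
generative_ai_inference_client = oci.generative_ai_inference.GenerativeAiInferenceClient(
    config,
    service_endpoint="https://inference.generativeai.us-chicago-1.oci.oraclecloud.com",
    retry_strategy=oci.retry.NoneRetryStrategy(),  # optional: disable client-level retries
    timeout=(10, 240),                             # optional: (connection, read) timeouts in seconds
)
```

The examples for the individual operations below reuse this generative_ai_inference_client instance.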
chat(chat_details, **kwargs)

Creates a response for the given conversation.
Parameters:

- chat_details (oci.generative_ai_inference.models.ChatDetails) – (required) Details of the conversation for the model to respond to.
- opc_retry_token (str) – (optional) A token that uniquely identifies a request so it can be retried in case of a timeout or server error without risk of executing that same action again. Retry tokens expire after 24 hours, but can be invalidated before that, in case of conflicting operations. For example, if a resource is deleted and purged from the system, then a retry of the original creation request is rejected.
- opc_request_id (str) – (optional) The client request ID for tracing.
- retry_strategy (obj) – (optional) A retry strategy to apply to this specific operation/call. This will override any retry strategy set at the client level. This should be one of the strategies available in the retry module. This operation uses DEFAULT_RETRY_STRATEGY as the default if no retry strategy is provided. The specifics of the default retry strategy are described here. To have this operation explicitly not perform any retries, pass an instance of NoneRetryStrategy.
- allow_control_chars (bool) – (optional) A boolean to indicate whether or not this request should allow control characters in the response object. By default, the response will not allow control characters in strings.
Returns: A Response object with data of type ChatResult

Return type: Response

Example: Click here to see an example of how to use the chat API.
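As a hedged illustration of the request shape, the sketch below sends a single user message to an on-demand hosted model using CohereChatRequest, reusing the client constructed above. The compartment OCID and model name are placeholders, and the nested model classes and their fields should be confirmed against the ChatDetails model documentation.

```python
from oci.generative_ai_inference import models

chat_details = models.ChatDetails(
    compartment_id="ocid1.compartment.oc1..exampleuniqueID",  # placeholder OCID
    serving_mode=models.OnDemandServingMode(model_id="cohere.command-r-16k"),  # example model name
    chat_request=models.CohereChatRequest(
        message="List three use cases for text embeddings.",
        max_tokens=300,
        temperature=0.3,
    ),
)

response = generative_ai_inference_client.chat(chat_details)
# response.data is a ChatResult; the model's reply is nested in its chat_response payload.
print(response.data)
```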
embed_text(embed_text_details, **kwargs)

Produces embeddings for the inputs.
An embedding is a numeric representation of a piece of text. This text can be a phrase, a sentence, or one or more paragraphs. The Generative AI embedding model transforms each phrase, sentence, or paragraph that you input into an array with 1024 numbers. You can use these embeddings for finding similarity in your input text, such as finding phrases that are similar in context or category. Embeddings are mostly used for semantic searches, where the search function focuses on the meaning of the text that it's searching through rather than finding results based on keywords.
Parameters:

- embed_text_details (oci.generative_ai_inference.models.EmbedTextDetails) – (required) Details for generating the embed response.
- opc_retry_token (str) – (optional) A token that uniquely identifies a request so it can be retried in case of a timeout or server error without risk of executing that same action again. Retry tokens expire after 24 hours, but can be invalidated before that, in case of conflicting operations. For example, if a resource is deleted and purged from the system, then a retry of the original creation request is rejected.
- opc_request_id (str) – (optional) The client request ID for tracing.
- retry_strategy (obj) – (optional) A retry strategy to apply to this specific operation/call. This will override any retry strategy set at the client level. This should be one of the strategies available in the retry module. This operation uses DEFAULT_RETRY_STRATEGY as the default if no retry strategy is provided. The specifics of the default retry strategy are described here. To have this operation explicitly not perform any retries, pass an instance of NoneRetryStrategy.
- allow_control_chars (bool) – (optional) A boolean to indicate whether or not this request should allow control characters in the response object. By default, the response will not allow control characters in strings.
Returns: A Response object with data of type EmbedTextResult

Return type: Response

Example: Click here to see an example of how to use the embed_text API.
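A minimal sketch of embedding a small batch of strings with an on-demand embedding model, again reusing the client from above. The compartment OCID and model name are placeholders, and the EmbedTextDetails field names are assumptions to verify against the model documentation.

```python
from oci.generative_ai_inference import models

embed_text_details = models.EmbedTextDetails(
    compartment_id="ocid1.compartment.oc1..exampleuniqueID",  # placeholder OCID
    serving_mode=models.OnDemandServingMode(model_id="cohere.embed-english-v3.0"),  # example model name
    inputs=[
        "Dedicated AI clusters host fine-tuned custom models.",
        "Embeddings map text to numeric vectors.",
    ],
)

response = generative_ai_inference_client.embed_text(embed_text_details)
# response.data is an EmbedTextResult; one vector is expected per input string.
for vector in response.data.embeddings:
    print(len(vector), vector[:5])
```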
generate_text(generate_text_details, **kwargs)

Generates a text response based on the user prompt.
Parameters:

- generate_text_details (oci.generative_ai_inference.models.GenerateTextDetails) – (required) Details for generating the text response.
- opc_retry_token (str) – (optional) A token that uniquely identifies a request so it can be retried in case of a timeout or server error without risk of executing that same action again. Retry tokens expire after 24 hours, but can be invalidated before that, in case of conflicting operations. For example, if a resource is deleted and purged from the system, then a retry of the original creation request is rejected.
- opc_request_id (str) – (optional) The client request ID for tracing.
- retry_strategy (obj) – (optional) A retry strategy to apply to this specific operation/call. This will override any retry strategy set at the client level. This should be one of the strategies available in the retry module. This operation uses DEFAULT_RETRY_STRATEGY as the default if no retry strategy is provided. The specifics of the default retry strategy are described here. To have this operation explicitly not perform any retries, pass an instance of NoneRetryStrategy.
- allow_control_chars (bool) – (optional) A boolean to indicate whether or not this request should allow control characters in the response object. By default, the response will not allow control characters in strings.
Returns: A Response object with data of type GenerateTextResult

Return type: Response

Example: Click here to see an example of how to use the generate_text API.
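As a hedged sketch of a prompt-based generation request, the example below wraps the prompt in a CohereLlmInferenceRequest and submits it with an on-demand serving mode. The compartment OCID and model name are placeholders, and the exact request classes and fields should be checked against the GenerateTextDetails model documentation.

```python
from oci.generative_ai_inference import models

generate_text_details = models.GenerateTextDetails(
    compartment_id="ocid1.compartment.oc1..exampleuniqueID",  # placeholder OCID
    serving_mode=models.OnDemandServingMode(model_id="cohere.command"),  # example model name
    inference_request=models.CohereLlmInferenceRequest(
        prompt="Write a one-sentence tagline for a travel blog.",
        max_tokens=100,
        temperature=0.7,
    ),
)

response = generative_ai_inference_client.generate_text(generate_text_details)
# response.data is a GenerateTextResult; the generated text sits inside its inference_response.
print(response.data)
```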
summarize_text(summarize_text_details, **kwargs)

Summarizes the input text.
Parameters:

- summarize_text_details (oci.generative_ai_inference.models.SummarizeTextDetails) – (required) Details for summarizing the text.
- opc_retry_token (str) – (optional) A token that uniquely identifies a request so it can be retried in case of a timeout or server error without risk of executing that same action again. Retry tokens expire after 24 hours, but can be invalidated before that, in case of conflicting operations. For example, if a resource is deleted and purged from the system, then a retry of the original creation request is rejected.
- opc_request_id (str) – (optional) The client request ID for tracing.
- retry_strategy (obj) – (optional) A retry strategy to apply to this specific operation/call. This will override any retry strategy set at the client level. This should be one of the strategies available in the retry module. This operation uses DEFAULT_RETRY_STRATEGY as the default if no retry strategy is provided. The specifics of the default retry strategy are described here. To have this operation explicitly not perform any retries, pass an instance of NoneRetryStrategy.
- allow_control_chars (bool) – (optional) A boolean to indicate whether or not this request should allow control characters in the response object. By default, the response will not allow control characters in strings.
Returns: A Response object with data of type SummarizeTextResult

Return type: Response

Example: Click here to see an example of how to use the summarize_text API.
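Finally, a hedged sketch of a summarization request using the same client. The compartment OCID, model name, and input text are placeholders, and the SummarizeTextDetails field names (including the expected summary field on the result) are assumptions to confirm against the model documentation.

```python
from oci.generative_ai_inference import models

summarize_text_details = models.SummarizeTextDetails(
    compartment_id="ocid1.compartment.oc1..exampleuniqueID",  # placeholder OCID
    serving_mode=models.OnDemandServingMode(model_id="cohere.command"),  # example model name
    input="Paste the long passage to be condensed here ...",
)

response = generative_ai_inference_client.summarize_text(summarize_text_details)
# response.data is a SummarizeTextResult; the condensed text is expected in its summary field.
print(response.data)
```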