Cohere Rerank 3.5
The cohere.rerank.v3-5 model takes a query and a list of texts and returns an ordered array in which each text is assigned a relevance score. The relevance score indicates how well each text matches the query, and the model uses it to rank the texts.
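To illustrate the shape of that output, the following sketch sorts a hand-written, hypothetical result by relevance score and maps each entry back to its text. The field names `index` and `relevance_score`, the helper `order_by_relevance`, and the score values are assumptions made for this example. They are not output from the model.

```python
# Hypothetical rerank output: one entry per input text.
# Field names and scores are illustrative assumptions, not real model output.
query = "What is the capital of France?"
texts = [
    "Berlin is the capital of Germany.",
    "Paris is the capital of France.",
    "France is known for its cuisine.",
]
mock_ranks = [
    {"index": 0, "relevance_score": 0.02},
    {"index": 1, "relevance_score": 0.97},
    {"index": 2, "relevance_score": 0.31},
]


def order_by_relevance(texts, ranks):
    """Return (text, score) pairs, highest relevance first."""
    ordered = sorted(ranks, key=lambda r: r["relevance_score"], reverse=True)
    return [(texts[r["index"]], r["relevance_score"]) for r in ordered]


ranked = order_by_relevance(texts, mock_ranks)
for text, score in ranked:
    print(f"{score:.2f}  {text}")
```

Each entry carries the position of the text in the original list, so the caller can recover the text even when the response omits it.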
Regions for this Model
For supported regions, endpoint types (on-demand or dedicated AI clusters), and hosting (OCI Generative AI or external calls) for this model, see the Models by Region page. For details about the regions, see the Generative AI Regions page.
Access this Model
The API links list the endpoints for all supported commercial, sovereign, and government regions.
Key Features
Dedicated AI Cluster for the Model
- The model is available only in dedicated mode. (Not available on-demand.)
- For dedicated mode, create a hosting dedicated AI cluster, create an endpoint on that cluster to host the model, and then call the RerankText API or the corresponding SDK operation.
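As a minimal sketch of the last step, the code below assembles a request body for a dedicated-mode rerank call. The field names (`compartmentId`, `servingMode`, `servingType`, `endpointId`, `input`, `documents`, `topN`) are assumptions about the RerankText request shape and should be verified against the RerankText API documentation. The OCIDs are placeholders, and `build_rerank_request` is a hypothetical helper, not part of any SDK.

```python
import json


def build_rerank_request(compartment_id, endpoint_id, query, documents, top_n=None):
    """Assemble a RerankText-style request body for a dedicated endpoint.

    Field names are assumptions; verify them against the RerankText API docs.
    """
    if not documents:
        raise ValueError("documents must contain at least one text")
    body = {
        "compartmentId": compartment_id,
        # Dedicated mode: the request targets an endpoint on a hosting cluster.
        "servingMode": {"servingType": "DEDICATED", "endpointId": endpoint_id},
        "input": query,
        "documents": list(documents),
    }
    if top_n is not None:
        body["topN"] = top_n  # optional cap on the number of returned results
    return body


request_body = build_rerank_request(
    compartment_id="ocid1.compartment.oc1..example",         # placeholder OCID
    endpoint_id="ocid1.generativeaiendpoint.oc1..example",   # placeholder OCID
    query="How do I rotate an API key?",
    documents=["Key rotation guide.", "Billing overview.", "Networking basics."],
    top_n=2,
)
print(json.dumps(request_body, indent=2))
```

The sketch stops short of sending the request, because signing and transport depend on the SDK or HTTP client in use.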
For the cluster unit size that matches this model, see the following table.
| Base Model | Fine-Tuning Cluster | Hosting Cluster | Pricing Page Information | Request Cluster Limit Increase |
|---|---|---|---|---|
| cohere.rerank.v3-5 | Not available for fine-tuning | Unit size: RERANK_COHERE | | Limit name: dedicated-unit-rerank-cohere-count |
If your tenancy doesn't have enough cluster limits to host the Cohere Rerank 3.5 model on a dedicated AI cluster, request an increase of 1 for the dedicated-unit-rerank-cohere-count limit.
Endpoint Rules for Clusters
- A dedicated AI cluster can hold up to 50 endpoints.
- Use these endpoints to create aliases that all point either to the same base model or to the same version of a custom model, but not both types.
- Several endpoints for the same model make it easy to assign them to different users or purposes.
| Hosting Cluster Unit Size | Endpoint Rules |
|---|---|
| RERANK_COHERE | To increase the call volume supported by a hosting cluster, increase its instance count by editing the dedicated AI cluster. See Updating a Dedicated AI Cluster. For more than 50 endpoints per cluster, request an increase for the endpoint-per-dedicated-unit-count limit. See Creating a Limit Increase Request and Service Limits for Generative AI. |
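The endpoint rules above can be expressed as a small client-side check. This is only a sketch: the `Endpoint` class, its fields, and `check_endpoint_rules` are hypothetical names introduced for illustration, and the service enforces these rules itself regardless of any local validation.

```python
from dataclasses import dataclass

MAX_ENDPOINTS_PER_CLUSTER = 50  # default limit stated in the rules above


@dataclass(frozen=True)
class Endpoint:
    name: str
    model_id: str    # base model name or custom model version identifier
    is_custom: bool  # True for a custom model version, False for a base model


def check_endpoint_rules(endpoints, max_endpoints=MAX_ENDPOINTS_PER_CLUSTER):
    """Return a list of rule violations (empty when the set is valid)."""
    problems = []
    if len(endpoints) > max_endpoints:
        problems.append(
            f"{len(endpoints)} endpoints exceed the limit of {max_endpoints}"
        )
    # All endpoints on a cluster must alias one base model or one custom version.
    targets = {(e.model_id, e.is_custom) for e in endpoints}
    if len(targets) > 1:
        problems.append(
            "endpoints must all point to the same base model "
            "or the same custom model version"
        )
    return problems


ok = [Endpoint(f"team-{i}", "cohere.rerank.v3-5", False) for i in range(3)]
mixed = ok + [Endpoint("custom", "my-custom-model-v1", True)]
too_many = [Endpoint(f"ep-{i}", "cohere.rerank.v3-5", False) for i in range(51)]

for label, group in [("ok", ok), ("mixed", mixed), ("too_many", too_many)]:
    print(label, check_endpoint_rules(group))
```

Pass a larger `max_endpoints` value if a limit increase for endpoint-per-dedicated-unit-count has been granted.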
Cluster Performance Benchmarks
Review the Cohere Rerank 3.5 cluster performance benchmarks for different scenarios.
OCI Release and Retirement Dates
For release and retirement dates and replacement model options, see the following pages based on the mode (on-demand or dedicated):
Rerank Model Parameter
For the Rerank model parameters, see the RerankText API documentation.