Cohere Rerank 4

Cohere Rerank 4 is a reranking model available in two variants: Pro and Fast.

Reranking improves search relevance by reordering an initial set of retrieved results. After a retrieval step returns candidate documents, the reranking model compares the query with each candidate and ranks the results from most relevant to least relevant.
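The retrieve-then-rerank flow described above can be sketched in a few lines. This is only an illustration: `toy_relevance` is a hypothetical word-overlap scorer standing in for the model, and it does not reflect how Cohere Rerank 4 computes relevance.

```python
from typing import Callable, List, Tuple


def toy_relevance(query: str, document: str) -> float:
    """Hypothetical stand-in scorer: fraction of query words found in the document.

    This is NOT the Rerank 4 scoring method; it only keeps the sketch runnable.
    """
    query_words = set(query.lower().split())
    doc_words = set(document.lower().split())
    if not query_words:
        return 0.0
    return len(query_words & doc_words) / len(query_words)


def rerank(
    query: str,
    candidates: List[str],
    score: Callable[[str, str], float] = toy_relevance,
) -> List[Tuple[float, str]]:
    """Score every candidate against the query and order them most relevant first."""
    scored = [(score(query, doc), doc) for doc in candidates]
    # sorted() is stable, so candidates with equal scores keep their retrieval order.
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Candidates in the order a first-stage retriever might return them.
candidates = [
    "Quarterly revenue report for the finance team",
    "How to reset a forgotten account password",
    "Password policy and account lockout rules",
]

for relevance, doc in rerank("reset account password", candidates):
    print(f"{relevance:.2f}  {doc}")
```

In a real pipeline, the scoring step would be a call to the rerank model, and only the top few results would typically be passed on to the next stage.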

Cohere Rerank 4 supports multilingual reranking and semi-structured content, including JSON, tables, and code-like content.

What’s New in Rerank 4

Compared with Cohere Rerank 3.5, Rerank 4 adds a larger context window, improved reranking quality, self-learning support, and two variants optimized for different workload requirements.

Increased context window

Rerank 4 supports a 32,000-token context window. The larger window improves handling of long documents and larger candidate inputs, which is useful for dense enterprise content such as reports, contracts, manuals, and technical documentation.

Improved reranking quality

Rerank 4 improves result ordering for enterprise retrieval workloads. It provides stronger relevance ranking for business, finance, technical, and other domain-specific content, which can improve downstream retrieval-augmented generation workflows by surfacing more relevant context.

Self-learning support

Rerank 4 introduces self-learning support, which helps adapt reranking behavior to domain-specific data, terminology, and relevance preferences without requiring annotated training data.

Pro and Fast variants

Rerank 4 is available in two variants:

  • Pro is optimized for higher-precision reranking and more complex retrieval tasks.
  • Fast is optimized for lower-latency, higher-throughput workloads.

Multilingual and semi-structured data support

Rerank 4 supports reranking for English and non-English content across more than 100 languages. It also supports semi-structured content, including JSON, tables, and code-like content.
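One way to prepare semi-structured records for reranking is to serialize each record to a single string. The sketch below assumes that candidate documents are submitted as a list of strings, which should be confirmed against the RerankText API documentation. The helper name `records_to_documents` and the sample record are hypothetical.

```python
import json
from typing import Any, Dict, List


def records_to_documents(records: List[Dict[str, Any]]) -> List[str]:
    """Serialize each record to one JSON string with a stable key order.

    ensure_ascii=False keeps non-English text readable instead of escaping it.
    """
    return [json.dumps(record, sort_keys=True, ensure_ascii=False) for record in records]


# Hypothetical product records with mixed-language content.
records = [
    {"title": "Rückgabe", "sku": 7},
    {"title": "Returns policy", "sku": 8},
]

for document in records_to_documents(records):
    print(document)
```

Sorting the keys makes the serialized form deterministic, so the same record always produces the same document string.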

Regions for this Model

Important

For supported regions, endpoint types (on-demand or dedicated AI clusters), and hosting (OCI Generative AI or external calls) for this model, see the Models by Region page. For details about the regions, see the Generative AI Regions page.

Model Variants

Cohere Rerank 4 includes the following model variants:

Model | OCI Model Name | Description
Cohere Rerank 4 Pro | cohere.rerank-v4.0-pro | Multilingual reranking model for English and non-English text and semi-structured JSON data. Best suited for quality-focused and complex reranking workloads.
Cohere Rerank 4 Fast | cohere.rerank-v4.0-fast | Lightweight multilingual reranking model for English and non-English text and semi-structured JSON data. Best suited for lower-latency and higher-throughput workloads.

On-Demand Mode

Some Cohere Rerank 4 variants are available on-demand in supported regions. On-demand mode doesn't require a dedicated AI cluster.

See Models by Region to check which model variants are available on-demand and in which regions.

Model Name | OCI Model Name | Pricing Page Product Name
Cohere Rerank 4 Pro | cohere.rerank-v4.0-pro | Rerank 4 Pro
Cohere Rerank 4 Fast | cohere.rerank-v4.0-fast | Rerank 4 Fast

Pricing is per 1,000 search units. See the Pricing Page.

Learn about On-Demand Mode.

Dedicated AI Cluster for the Model

Some Cohere Rerank 4 variants are available through dedicated AI clusters in supported regions. These models aren't available for fine-tuning.

For dedicated mode, create an endpoint on a hosting dedicated AI cluster.

Model: Cohere Rerank 4 Pro (cohere.rerank-v4.0-pro)
Hardware Unit Size: COHERE_A100_80G_X1
Available Regions:
  • US East (Ashburn)
  • US West (Phoenix)
Request Cluster Limit Increase:
  • Limit Name: dedicated-unit-a100-80g-count
  • For Hosting, Request Limit Increase by: 1

Model: Cohere Rerank 4 Pro (cohere.rerank-v4.0-pro)
Hardware Unit Size: COHERE_H100_X1
Available Regions:
  • Brazil East (Sao Paulo)
  • Germany Central (Frankfurt)
  • India South (Hyderabad)
  • Japan Central (Osaka)
  • UK South (London)
  • US Midwest (Chicago)
Request Cluster Limit Increase:
  • Limit Name: dedicated-unit-h100-count
  • For Hosting, Request Limit Increase by: 1

Model: Cohere Rerank 4 Fast (cohere.rerank-v4.0-fast)
Hardware Unit Size: COHERE_A100_80G_X1
Available Regions:
  • US West (Phoenix)
Request Cluster Limit Increase:
  • Limit Name: dedicated-unit-a100-80g-count
  • For Hosting, Request Limit Increase by: 1

Model: Cohere Rerank 4 Fast (cohere.rerank-v4.0-fast)
Hardware Unit Size: COHERE_H100_X1
Available Regions:
  • Brazil East (Sao Paulo)
  • Germany Central (Frankfurt)
  • India South (Hyderabad)
  • Japan Central (Osaka)
  • UK South (London)
  • US East (Ashburn)
  • US Midwest (Chicago)
Request Cluster Limit Increase:
  • Limit Name: dedicated-unit-h100-count
  • For Hosting, Request Limit Increase by: 1

For pricing, see the Cost estimator and the Pricing Page.

Tip

If the tenancy doesn't have enough limits to host these models on a dedicated AI cluster, request a limit increase for the hardware shape used in the target region. For example, to host the models in US West (Phoenix), request an increase of 1 for dedicated-unit-a100-80g-count.
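To work out which limit to request for a given model and region, the hosting details in this section can be encoded as a small lookup. This is a sketch that only mirrors the values listed on this page. The names `HostingRow`, `HOSTING_ROWS`, and `find_hosting_limit` are hypothetical, and the Models by Region page remains the authoritative source.

```python
from typing import NamedTuple, Optional, Tuple


class HostingRow(NamedTuple):
    model: str
    unit_shape: str
    regions: Tuple[str, ...]
    limit_name: str
    increase: int


# Region names shared by both H100 rows on this page.
_H100_COMMON = (
    "Brazil East (Sao Paulo)", "Germany Central (Frankfurt)",
    "India South (Hyderabad)", "Japan Central (Osaka)",
    "UK South (London)", "US Midwest (Chicago)",
)

# Values copied from the dedicated AI cluster details on this page.
HOSTING_ROWS = [
    HostingRow("cohere.rerank-v4.0-pro", "COHERE_A100_80G_X1",
               ("US East (Ashburn)", "US West (Phoenix)"),
               "dedicated-unit-a100-80g-count", 1),
    HostingRow("cohere.rerank-v4.0-pro", "COHERE_H100_X1",
               _H100_COMMON, "dedicated-unit-h100-count", 1),
    HostingRow("cohere.rerank-v4.0-fast", "COHERE_A100_80G_X1",
               ("US West (Phoenix)",),
               "dedicated-unit-a100-80g-count", 1),
    HostingRow("cohere.rerank-v4.0-fast", "COHERE_H100_X1",
               _H100_COMMON + ("US East (Ashburn)",),
               "dedicated-unit-h100-count", 1),
]


def find_hosting_limit(model: str, region: str) -> Optional[HostingRow]:
    """Return the hosting row for a model in a region, or None if not listed."""
    for row in HOSTING_ROWS:
        if row.model == model and region in row.regions:
            return row
    return None


row = find_hosting_limit("cohere.rerank-v4.0-fast", "US East (Ashburn)")
if row is not None:
    print(f"Request +{row.increase} for {row.limit_name} ({row.unit_shape})")
```

Note that the same model can map to different hardware shapes, and therefore different limit names, depending on the region.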

Access this Model

To use a Cohere Rerank 4 model, call the RerankText API from a supported region.

Endpoint
https://inference.generativeai.{region}.oci.oraclecloud.com
API operation
POST /20231130/actions/rerankText
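The full request URL is the regional endpoint followed by the operation path. The sketch below only builds the string. The helper name `rerank_url` is hypothetical, and `us-chicago-1` is used purely as an example region identifier; check Models by Region for the regions that actually serve the model.

```python
RERANK_PATH = "/20231130/actions/rerankText"


def rerank_url(region: str) -> str:
    """Build the RerankText URL for a region identifier such as 'us-chicago-1'."""
    region = region.strip()
    if not region:
        raise ValueError("region must be a non-empty region identifier")
    return f"https://inference.generativeai.{region}.oci.oraclecloud.com{RERANK_PATH}"


print(rerank_url("us-chicago-1"))
```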

In RerankTextDetails, for servingMode, set the servingType attribute based on how you want to access the model:

  • Use ON_DEMAND for an on-demand model in a supported region.
  • Use DEDICATED for a model hosted on a dedicated AI cluster endpoint.

For availability and setup details, see the preceding On-Demand Mode and Dedicated AI Cluster for the Model sections.
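The two serving modes can be illustrated by building the request body as a plain dictionary. Only `servingMode`, `servingType`, `ON_DEMAND`, and `DEDICATED` come from this page. The other field names (`compartmentId`, `input`, `documents`, `topN`, `modelId`, `endpointId`) and the use of the OCI model name as `modelId` are assumptions that should be verified against the RerankText API documentation. The OCID values are placeholders, and the sketch does not sign or send the request.

```python
import json
from typing import Any, Dict, List, Optional


def build_rerank_body(
    query: str,
    documents: List[str],
    compartment_id: str,
    *,
    model_id: Optional[str] = None,
    endpoint_id: Optional[str] = None,
    top_n: Optional[int] = None,
) -> Dict[str, Any]:
    """Build a RerankTextDetails-style body (field names partly assumed)."""
    # Exactly one of model_id (on-demand) or endpoint_id (dedicated) is required.
    if (model_id is None) == (endpoint_id is None):
        raise ValueError("Pass exactly one of model_id or endpoint_id.")

    if model_id is not None:
        serving_mode = {"servingType": "ON_DEMAND", "modelId": model_id}
    else:
        serving_mode = {"servingType": "DEDICATED", "endpointId": endpoint_id}

    body: Dict[str, Any] = {
        "compartmentId": compartment_id,  # assumed field name
        "servingMode": serving_mode,
        "input": query,                   # assumed field name
        "documents": list(documents),     # assumed field name
    }
    if top_n is not None:
        body["topN"] = top_n              # assumed field name
    return body


on_demand = build_rerank_body(
    "reset account password",
    ["How to reset a forgotten account password", "Quarterly revenue report"],
    "ocid1.compartment.oc1..example",     # placeholder OCID
    model_id="cohere.rerank-v4.0-fast",
    top_n=1,
)
print(json.dumps(on_demand, indent=2))
```

For a model hosted on a dedicated AI cluster, pass `endpoint_id` instead of `model_id`. Actual calls must also be authenticated, which an OCI SDK or CLI would normally handle.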

Rerank Model Parameters

For the Rerank model parameters, see the RerankText API documentation.