Memory Restrictions

When index creation is offloaded to a remote container, memory requirements apply to host memory as well as to GPU memory on the container side.

As with building an index in the database using CPUs, index creation fails if the index memory footprint is larger than the available memory in the vector pool.

Additionally, for index creation on the GPU to succeed, the peak GPU memory consumed during CAGRA index creation must not exceed the available GPU memory. If the container side runs out of memory, an ORA-52342 error is raised. For information about peak memory usage in cuVS, see the cuVS documentation.

The peak memory usage of the CAGRA graph, in bytes, can be estimated using the following formula:

n_vectors * ((20 * NEIGHBORS_parameter_value) + 4)

For example, the formula can be applied to the following vector index, created on a data set of 1 million vectors:

CREATE VECTOR INDEX gist_idx ON gist(embedding)
  ORGANIZATION INMEMORY NEIGHBOR GRAPH
  PARAMETERS (
    TYPE                       HNSW,
    NEIGHBORS                  32,
    EFCONSTRUCTION             200,
    OFFLOAD_CREDENTIAL_NAME    privateai,
    OFFLOAD_URL                'https://your_hostname:8443'
  ) PARALLEL 2;

With these values, the peak footprint is estimated as:

1M * ((20 * 32) + 4) = 644,000,000 bytes, or approximately 614 MiB
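
The arithmetic above can be reproduced with a short script. This is a minimal sketch, not part of the product: the function name estimate_cagra_peak_bytes is hypothetical, and it assumes the formula yields bytes, which is consistent with the 614 figure when the result is divided by 2^20.

```python
def estimate_cagra_peak_bytes(n_vectors: int, neighbors: int) -> int:
    """Guideline estimate of peak CAGRA graph memory.

    Implements n_vectors * ((20 * NEIGHBORS) + 4) from the text.
    Assumption: the result is in bytes.
    """
    if n_vectors < 0 or neighbors <= 0:
        raise ValueError("n_vectors must be >= 0 and neighbors must be > 0")
    return n_vectors * ((20 * neighbors) + 4)


# Values from the example index: 1 million vectors, NEIGHBORS 32.
peak_bytes = estimate_cagra_peak_bytes(1_000_000, 32)
peak_mib = peak_bytes / 2**20  # convert bytes to binary megabytes

print(peak_bytes)        # 644000000
print(round(peak_mib))   # 614
```

The estimate grows linearly with both the number of vectors and the NEIGHBORS value, so doubling either roughly doubles the estimated peak.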

The calculated estimate is intended only as a guideline. If the estimated peak is close to the available GPU memory, index creation can still fail, because some device memory may be reserved internally for other purposes.

If the peak estimate exceeds the GPU memory budget, lower the NEIGHBORS parameter value to reduce memory overhead, although this may reduce index accuracy. Oracle also recommends avoiding concurrent requests to the same GPU to ensure optimal performance and better memory availability.
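
To illustrate how lowering NEIGHBORS relates to a GPU memory budget, the estimate formula can be inverted to find the largest NEIGHBORS value whose estimated peak still fits. This is a sketch under stated assumptions: the function name max_neighbors_within_budget and the reserved_bytes parameter are hypothetical, the formula is assumed to yield bytes, and the amount of memory actually reserved on a given GPU is not specified here, so any value passed for it is a guess.

```python
def max_neighbors_within_budget(
    n_vectors: int, gpu_budget_bytes: int, reserved_bytes: int = 0
) -> int:
    """Largest NEIGHBORS value whose estimated peak fits the budget.

    Inverts n_vectors * ((20 * NEIGHBORS) + 4) <= usable bytes.
    reserved_bytes is a hypothetical allowance for device memory
    that is held back for other purposes. Returns 0 if nothing fits.
    """
    if n_vectors <= 0:
        raise ValueError("n_vectors must be > 0")
    usable = gpu_budget_bytes - reserved_bytes
    per_vector = usable // n_vectors   # whole bytes available per vector
    neighbors = (per_vector - 4) // 20  # solve 20 * k + 4 <= per_vector
    return max(neighbors, 0)


# With exactly the estimated peak available, NEIGHBORS 32 just fits.
print(max_neighbors_within_budget(1_000_000, 644_000_000))  # 32

# Holding back 64,000,000 bytes lowers the largest fitting value.
print(max_neighbors_within_budget(1_000_000, 644_000_000, 64_000_000))  # 28
```

Because the result is only as good as the estimate, it is a starting point for choosing NEIGHBORS rather than a guarantee that index creation will succeed.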
