Memory Restrictions
When index creation is offloaded to a remote container, memory requirements apply to host memory as well as to GPU memory on the container side.
As with building an index in the database using CPUs, index creation fails if the index memory footprint is larger than the memory available in the vector pool.
Additionally, for index creation on the GPU to succeed, the peak GPU memory
consumption during CAGRA index creation must not exceed the available GPU
memory. If the container side runs out of memory, an
ORA-52342 error is raised. For information about the peak memory
usage required by cuVS, see the cuVS documentation.
The peak memory usage of the CAGRA graph can be estimated using the following formula:
n_vectors * ((20 * NEIGHBORS_parameter_value) + 4)
This formula can be applied, for example, to the following vector index created against a data set of 1 million vectors:
CREATE VECTOR INDEX gist_idx ON gist(embedding)
ORGANIZATION INMEMORY NEIGHBOR GRAPH
PARAMETERS (
TYPE HNSW,
NEIGHBORS 32,
EFCONSTRUCTION 200,
OFFLOAD_CREDENTIAL_NAME privateai,
OFFLOAD_URL 'https://your_hostname:8443'
) PARALLEL 2;
With these values, the peak footprint is estimated as:
1M * ((20 * 32) + 4) = 644,000,000 bytes, or approximately 614MB
Note that the calculated estimate is intended as a guideline. If the peak memory usage is near the limit of available memory, the available memory may still be insufficient if some amount of device memory is internally reserved for other purposes.
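The worked estimate above can be reproduced with a short Python sketch. It assumes the formula yields a value in bytes, which is consistent with the 614MB figure when 1MB is taken as 2^20 bytes. The function name is hypothetical and is not part of any Oracle or cuVS API.

```python
def estimate_cagra_peak_bytes(n_vectors: int, neighbors: int) -> int:
    """Estimate peak CAGRA graph memory using the documented formula.

    Assumption: the result is interpreted as bytes.
    """
    return n_vectors * ((20 * neighbors) + 4)


# Values from the example index: 1 million vectors, NEIGHBORS 32.
peak_bytes = estimate_cagra_peak_bytes(1_000_000, 32)
peak_mib = peak_bytes / 2**20  # convert bytes to units of 2^20 bytes

print(f"{peak_bytes} bytes, approximately {peak_mib:.0f} MB")
```

Like the formula itself, the result is only a guideline and does not account for device memory reserved for other purposes.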
If the peak estimate exceeds the GPU memory budget, the
NEIGHBORS parameter value can be lowered to reduce memory overhead,
although this may reduce index accuracy. Oracle also recommends avoiding concurrent
requests to the same GPU to ensure optimal performance and better memory
availability.
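As a rough planning aid, the estimation formula can be inverted to find the largest NEIGHBORS value whose estimated peak fits a given GPU memory budget. The following Python sketch assumes that the formula yields bytes. The function name and the reserved_bytes parameter are hypothetical, and the reserved amount is a value you choose, not something reported by Oracle or cuVS.

```python
def max_neighbors_for_budget(n_vectors: int,
                             gpu_budget_bytes: int,
                             reserved_bytes: int = 0) -> int:
    """Largest NEIGHBORS value whose estimated peak fits the budget.

    Solves n_vectors * ((20 * neighbors) + 4) <= usable bytes for neighbors.
    Returns 0 if no positive value fits.
    """
    if n_vectors <= 0:
        raise ValueError("n_vectors must be positive")
    # Set aside memory assumed to be unavailable to the index build.
    usable = gpu_budget_bytes - reserved_bytes
    # Bytes available per vector, rounded down.
    per_vector = usable // n_vectors
    return max(0, (per_vector - 4) // 20)


# Hypothetical scenario: 100 million vectors on a GPU with 16 * 2^30 bytes
# of memory, with 2 * 2^30 bytes set aside as headroom.
print(max_neighbors_for_budget(100_000_000, 16 * 2**30, 2 * 2**30))
```

A value that fits the estimate is not guaranteed to succeed in practice, and lowering NEIGHBORS may reduce index accuracy, so treat the result as a starting point for testing.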