HNSW RAC Duplication

Learn how Hierarchical Navigable Small World (HNSW) indexes are populated during index creation, index repopulation, or instance startup in an Oracle Real Application Clusters (Oracle RAC) or a non-RAC environment.

HNSW Index Creation in RAC Environment

Currently, an HNSW index in an Oracle RAC environment is created in centralized mode. The diagram below depicts the workflow for HNSW index creation in an Oracle RAC environment:

  1. The RAC instance that creates the HNSW index is responsible for creating the HNSW ROWID-to-VID mapping table on disk. As explained in Optimizer Plans for HNSW Vector Indexes, the optimizer needs this table to run certain optimization plans.

  2. By default, once the first HNSW graph is created, all other RAC instances are notified to start creating their own in-memory HNSW graph concurrently. This operation is called an HNSW duplication operation. Duplication is achieved in two steps: the primary RAC instance creates a disk checkpoint of the HNSW graph that it created, and then all the secondary RAC instances load the graph from that disk checkpoint. Creating the HNSW index in centralized mode ensures that all nodes share an identical graph structure.

Note:

Instances that do not have enough vector memory cannot participate in the parallel RAC-wide HNSW index population.
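
The two-step duplication described above can be illustrated with a small simulation. This is only a sketch of the workflow, not Oracle's implementation: the Instance class, the build_toy_graph and create_index functions, the ring-shaped toy graph, and the memory units are all hypothetical, and an in-memory byte string stands in for the disk checkpoint.

```python
import pickle
from dataclasses import dataclass
from typing import Optional


@dataclass
class Instance:
    name: str
    vector_memory: int            # hypothetical capacity, counted in vectors
    graph: Optional[dict] = None  # toy stand-in for the in-memory HNSW graph


def build_toy_graph(num_vectors: int) -> dict:
    # Stand-in for real HNSW construction: link each VID to its ring neighbors.
    return {vid: [(vid - 1) % num_vectors, (vid + 1) % num_vectors]
            for vid in range(num_vectors)}


def create_index(primary: Instance, secondaries: list, num_vectors: int):
    # Step 1: the creating (primary) instance builds the graph.
    primary.graph = build_toy_graph(num_vectors)
    # Step 2a: the primary writes a full checkpoint (bytes stand in for disk).
    checkpoint = pickle.dumps(primary.graph)
    # Step 2b: each secondary with enough vector memory loads the checkpoint.
    skipped = []
    for inst in secondaries:
        if inst.vector_memory < num_vectors:
            skipped.append(inst.name)  # cannot take part in the population
            continue
        inst.graph = pickle.loads(checkpoint)
    return checkpoint, skipped


primary = Instance("inst1", vector_memory=100)
secondaries = [Instance("inst2", vector_memory=100),
               Instance("inst3", vector_memory=10)]
checkpoint, skipped = create_index(primary, secondaries, num_vectors=50)
print(secondaries[0].graph == primary.graph, skipped)
```

Because every secondary loads the same checkpoint rather than building its own graph, the loaded graphs are identical to the primary's, which is the property that centralized mode is meant to guarantee.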

In addition to the duplication mechanism that creates the HNSW index across the RAC nodes, full repopulation is used to perform a graph refresh duplication. See Transaction Support for HNSW Indexes for more information about why and when an HNSW index full repopulation operation is triggered. The graph refresh duplication process is identical to the one used during index creation: the primary instance initiates the graph refresh and creates a new full disk checkpoint, and the other instances reload the graph from the latest checkpoint.

Note:

Snapshots are not created in an Oracle RAC environment. Currently, full repopulation is the only method used for graph refresh duplication.

HNSW Index Reload at Instance Restart or New Node Joining Cluster

Because HNSW indexes reside in memory, the in-memory graphs representing your HNSW indexes are lost when an instance shuts down. Upon restart, or when a new node joins the cluster, Oracle triggers a reload mechanism to quickly restore the HNSW graph in memory. This reload mechanism is enabled by default for both Oracle RAC and non-RAC single-instance environments. As explained in HNSW Index Creation in RAC Environment, index creation produces the in-memory HNSW graph and the ROWID-to-VID mapping table on disk. If checkpointing is enabled, a full checkpoint is also created on disk. How the reload proceeds depends on whether a full HNSW graph checkpoint exists on disk. Read HNSW Graph Persistence to understand more about checkpoints.
  • If a recent full checkpoint exists: The instance loads the HNSW graph directly from the checkpoint, ensuring rapid recovery.

  • If no valid checkpoint is available: The graph is re-created from scratch, and the new graph will not be identical to the previous graph.

The reload behavior is governed by the VECTOR_INDEX_NEIGHBOR_GRAPH_RELOAD instance parameter, which is evaluated when the instance restarts or joins the cluster. If the parameter is set to RESTART (the default) and a full checkpoint exists that is not deemed too old compared to the current instance SCN, then the starting instance uses that checkpoint to create its HNSW graph in memory. If either of these two conditions is not met, then the starting instance uses the duplication mechanism to create the HNSW graph in memory from scratch. If the parameter is set to OFF, then the HNSW graph is not reloaded.
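
The decision logic in the preceding paragraph can be summarized as a small function. This is a minimal sketch for illustration only, not Oracle's implementation: the ReloadAction and choose_reload_action names are hypothetical, and so is the max_scn_lag threshold, because the text does not specify how a checkpoint is judged "too old" relative to the current SCN.

```python
from enum import Enum
from typing import Optional


class ReloadAction(Enum):
    LOAD_FROM_CHECKPOINT = "load the graph from the full checkpoint"
    REBUILD_FROM_SCRATCH = "create the graph from scratch via duplication"
    NO_RELOAD = "do not reload the graph"


def choose_reload_action(reload_param: str,
                         checkpoint_scn: Optional[int],
                         current_scn: int,
                         max_scn_lag: int) -> ReloadAction:
    """Pick the reload path for a restarting or newly joined instance.

    reload_param   -- value of VECTOR_INDEX_NEIGHBOR_GRAPH_RELOAD
    checkpoint_scn -- SCN of the full checkpoint, or None if none exists
    max_scn_lag    -- hypothetical staleness threshold (an assumption)
    """
    setting = reload_param.upper()
    if setting == "OFF":
        return ReloadAction.NO_RELOAD
    if setting != "RESTART":
        raise ValueError(f"unexpected parameter value: {reload_param!r}")
    # Condition 1: a full checkpoint exists.
    # Condition 2: it is not too old compared to the current instance SCN.
    if checkpoint_scn is not None and current_scn - checkpoint_scn <= max_scn_lag:
        return ReloadAction.LOAD_FROM_CHECKPOINT
    # Either condition failed: fall back to building from scratch.
    return ReloadAction.REBUILD_FROM_SCRATCH


print(choose_reload_action("RESTART", checkpoint_scn=900,
                           current_scn=1000, max_scn_lag=500).name)
```

A missing checkpoint and a stale checkpoint lead to the same outcome here, which mirrors the text: the checkpoint is used only when both conditions hold.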