HNSW RAC Duplication
Learn how Hierarchical Navigable Small World (HNSW) indexes are populated during index creation, index repopulation, or instance startup in an Oracle Real Application Clusters (Oracle RAC) or a non-RAC environment.
HNSW Index Creation in RAC Environment
Currently, an HNSW index in an Oracle RAC environment is created in centralized mode. The diagram below depicts the workflow for HNSW index creation in an Oracle RAC environment:
- The RAC instance that creates the HNSW index is responsible for creating the HNSW ROWID-to-VID mapping table on disk. As explained in Optimizer Plans for HNSW Vector Indexes, the optimizer needs this table to run certain optimization plans.
- By default, once the first HNSW graph is created, all other RAC instances are notified to start creating their own HNSW in-memory graphs concurrently. This operation is called an HNSW duplication operation. Duplication is achieved in two steps: the primary RAC instance creates a disk checkpoint of the HNSW graph that it created, and then all the secondary RAC instances load the graph from that disk checkpoint. Creating the HNSW index in centralized mode ensures that all nodes share an identical graph structure.
Note:
Instances that do not have enough vector memory cannot participate in the parallel RAC-wide HNSW index population.
In addition to the duplication mechanism that creates the HNSW index across the RAC nodes, full repopulation is used to perform a graph refresh duplication. See Transaction Support for HNSW Indexes for more information about why and when an HNSW index full repopulation is triggered. The graph refresh duplication process is identical to the one used during index creation: the primary instance initiates the graph refresh and creates a new full disk checkpoint, and the other instances reload the graph from that latest checkpoint.
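The two-step duplication described above can be illustrated with a small in-memory model. This is only a sketch under stated assumptions: `Instance`, `Checkpoint`, `build_graph`, `duplicate`, and the memory units are hypothetical names invented for illustration, not Oracle APIs, and the "graph" is a toy ring rather than a real HNSW structure.

```python
import copy
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

Graph = Dict[int, List[int]]


@dataclass
class Checkpoint:
    """Hypothetical stand-in for a full disk checkpoint of the graph."""
    version: int
    edges: Graph


@dataclass
class Instance:
    """Hypothetical model of one RAC instance (not an Oracle API)."""
    name: str
    vector_memory: int            # available vector memory, arbitrary units
    graph: Optional[Graph] = None
    graph_version: int = 0


def build_graph(num_vectors: int) -> Graph:
    # Placeholder for real HNSW construction: link each vertex to its
    # two ring neighbours so the example stays deterministic.
    return {v: [(v - 1) % num_vectors, (v + 1) % num_vectors]
            for v in range(num_vectors)}


def duplicate(primary: Instance, secondaries: List[Instance],
              num_vectors: int, graph_size: int,
              version: int) -> Tuple[Checkpoint, List[str]]:
    # Step 1: the primary builds the graph and writes a full checkpoint.
    primary.graph = build_graph(num_vectors)
    primary.graph_version = version
    checkpoint = Checkpoint(version, copy.deepcopy(primary.graph))

    # Step 2: every secondary with enough vector memory loads the checkpoint.
    skipped = []
    for inst in secondaries:
        if inst.vector_memory < graph_size:
            skipped.append(inst.name)   # cannot participate in population
            continue
        inst.graph = copy.deepcopy(checkpoint.edges)
        inst.graph_version = checkpoint.version
    return checkpoint, skipped


primary = Instance("rac1", vector_memory=100)
others = [Instance("rac2", vector_memory=100),
          Instance("rac3", vector_memory=10)]

# Index creation, then a full repopulation: same two steps, new checkpoint.
_, skipped = duplicate(primary, others, num_vectors=4, graph_size=50, version=1)
duplicate(primary, others, num_vectors=6, graph_size=50, version=2)
```

In this model, `rac2` ends up with a copy of the primary's graph after each call, while `rac3` is skipped because its assumed vector memory is smaller than the assumed graph size. The second call mirrors a graph refresh: the same two steps run again against a newer checkpoint.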
Note:
Snapshots are not created in an Oracle RAC environment. Currently, full repopulation is the only method used for graph refresh duplication.
HNSW Index Reload at Instance Restart or New Node Joining Cluster
- If a recent full checkpoint exists: The instance loads the HNSW graph directly from the checkpoint, ensuring rapid recovery.
- If no valid checkpoint is available: The graph is recreated from scratch, and the new graph will not be identical to the previous graph.
The reload behavior is governed by the VECTOR_INDEX_NEIGHBOR_GRAPH_RELOAD parameter. If the VECTOR_INDEX_NEIGHBOR_GRAPH_RELOAD instance parameter is set to RESTART (default setting) at the time the instance joins the cluster or when restarting, and if a full checkpoint exists and is not deemed too old compared to the current instance SCN, then it is used by the starting instance to create its HNSW graph in memory. If these two conditions are not met, then the starting instance uses the duplication mechanism to create the HNSW graph in memory from scratch. If the VECTOR_INDEX_NEIGHBOR_GRAPH_RELOAD instance parameter is set to OFF at the time the instance joins the cluster or when restarting, then the HNSW graph is not reloaded.
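The decision rules above can be summarized as a small function. This is a minimal sketch, not Oracle's implementation: the names `ReloadMode`, `Action`, and `choose_reload_action` are hypothetical, and `max_scn_gap` is an assumed stand-in for whatever internal criterion decides that a checkpoint is "too old", which the text does not specify.

```python
from enum import Enum
from typing import Optional


class ReloadMode(Enum):
    """Values of VECTOR_INDEX_NEIGHBOR_GRAPH_RELOAD named in the text."""
    RESTART = "RESTART"
    OFF = "OFF"


class Action(Enum):
    LOAD_FROM_CHECKPOINT = "load graph from full checkpoint"
    REBUILD_VIA_DUPLICATION = "recreate graph from scratch via duplication"
    NO_RELOAD = "graph is not reloaded"


def choose_reload_action(mode: ReloadMode,
                         checkpoint_scn: Optional[int],
                         current_scn: int,
                         max_scn_gap: int) -> Action:
    """Pick the startup behavior for an instance joining or restarting.

    checkpoint_scn is None when no full checkpoint exists.
    max_scn_gap is a hypothetical threshold; the real staleness
    criterion is internal and not documented here.
    """
    if mode is ReloadMode.OFF:
        return Action.NO_RELOAD

    # RESTART: both conditions must hold to use the checkpoint.
    checkpoint_exists = checkpoint_scn is not None
    if checkpoint_exists and current_scn - checkpoint_scn <= max_scn_gap:
        return Action.LOAD_FROM_CHECKPOINT

    # Otherwise fall back to building the graph from scratch.
    return Action.REBUILD_VIA_DUPLICATION


fresh = choose_reload_action(ReloadMode.RESTART, 990, 1000, max_scn_gap=50)
stale = choose_reload_action(ReloadMode.RESTART, 100, 1000, max_scn_gap=50)
missing = choose_reload_action(ReloadMode.RESTART, None, 1000, max_scn_gap=50)
disabled = choose_reload_action(ReloadMode.OFF, 990, 1000, max_scn_gap=50)
```

With these assumed inputs, only `fresh` selects the checkpoint. `stale` and `missing` both fall back to recreating the graph, and `disabled` skips the reload regardless of checkpoint state.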