HNSW RAC Duplication

Learn how Hierarchical Navigable Small World (HNSW) indexes are populated during index creation, index repopulation, or instance startup in an Oracle Real Application Clusters (Oracle RAC) or a non-RAC environment.

HNSW Index Creation in RAC Environment

Currently, an HNSW index in an Oracle RAC environment is created in centralized mode. The diagram below depicts the workflow for HNSW index creation in an Oracle RAC environment:

  1. The RAC instance that creates the HNSW index is responsible for creating the HNSW ROWID-to-VID mapping table on disk. As explained in Optimizer Plans for HNSW Vector Indexes, the optimizer needs this table to run certain optimization plans.

  2. By default, once the first HNSW slice graph is created, all other RAC instances are notified to start creating their own in-memory HNSW slice graphs concurrently. This operation is called an HNSW duplication operation. Duplication is achieved in two steps: the primary RAC instance creates a disk checkpoint of the HNSW slice graph that it created, and then all the secondary RAC instances use that disk checkpoint to load the HNSW slice graph. Creating the HNSW index in centralized mode ensures that all instances share an identical graph structure.

    HNSW slice graphs constructed on the RAC instances fall into one of the following two categories, depending on whether the base table is partitioned. In this context, a slice refers to the subset of vector data used to build an index. A slice can represent either the entire table or an individual partition or subpartition of a table. Each HNSW graph is constructed over a single slice:
    • HNSW index table-slice graph
    • HNSW index partition-slice graph
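The relationship between slices and graphs can be illustrated with a minimal Python sketch. It is not Oracle code: the function name `slices_for`, the `(partition_name, vector_id)` row shape, and the `"TABLE"` slice label are all hypothetical, chosen only to show that a non-partitioned table yields one slice while a partitioned table yields one slice per partition.

```python
from collections import defaultdict


def slices_for(rows, partitioned):
    """Group vector ids into slices (hypothetical helper, for illustration only).

    rows: iterable of (partition_name, vector_id) pairs.
    Returns a dict mapping slice name -> list of vector ids.
    One HNSW graph would then be built over each slice.
    """
    if not partitioned:
        # Table-slice: the whole table is a single slice.
        return {"TABLE": [vid for _, vid in rows]}

    # Partition-slice: one slice per partition (or subpartition).
    slices = defaultdict(list)
    for part, vid in rows:
        slices[part].append(vid)
    return dict(slices)


rows = [("P1", 10), ("P2", 20), ("P1", 11)]
table_slices = slices_for(rows, partitioned=False)
partition_slices = slices_for(rows, partitioned=True)
```

Under these assumptions, `table_slices` holds a single entry covering all three vectors, and `partition_slices` holds one entry for `P1` and one for `P2`.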

Note:

Instances that do not have enough vector memory cannot participate in the parallel RAC-wide HNSW index population.
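The two-step duplication flow and the memory restriction in the note above can be sketched in Python as follows. This is a simplified model, not Oracle's implementation: the `Instance` class, the `vector_memory` and `required_memory` values, the JSON "checkpoint", and the toy `build_slice_graph` function are all assumptions made for illustration.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class Instance:
    name: str
    vector_memory: int            # hypothetical: free vector memory, arbitrary units
    graph: Optional[dict] = None  # in-memory slice graph; None until populated


def build_slice_graph(vectors):
    """Stand-in for real HNSW construction: link each vector id to its successor."""
    ids = sorted(vectors)
    return {str(v): [str(ids[(i + 1) % len(ids)])] for i, v in enumerate(ids)}


def duplicate(primary, secondaries, vectors, required_memory):
    # Step 1: the primary builds the slice graph and writes a checkpoint.
    # A JSON string stands in for the on-disk checkpoint here.
    primary.graph = build_slice_graph(vectors)
    checkpoint = json.dumps(primary.graph, sort_keys=True)

    # Step 2: each secondary with enough vector memory loads the graph
    # from that checkpoint; the others do not participate.
    skipped = []
    for inst in secondaries:
        if inst.vector_memory < required_memory:
            skipped.append(inst.name)
            continue
        inst.graph = json.loads(checkpoint)
    return skipped


primary = Instance("inst1", 100)
secondaries = [Instance("inst2", 100), Instance("inst3", 10)]
skipped = duplicate(primary, secondaries, vectors={3, 1, 2}, required_memory=50)
```

Because every participating secondary loads from the same checkpoint rather than rebuilding independently, its graph is identical to the primary's, which is the property that centralized mode is described as providing.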

In addition to the duplication mechanism that creates the HNSW index across the RAC instances, full repopulation is used to perform an HNSW slice graph refresh duplication. See HNSW Index Architecture: Transaction Support and Persistence for more information about why and when an HNSW index full repopulation operation is triggered. The HNSW slice graph refresh duplication process is identical to the one used during index creation: the primary instance initiates the HNSW slice graph refresh and creates a new full disk checkpoint, and the other instances reload the HNSW slice graph from the latest checkpoint.

HNSW Index Reload at Instance Restart or New Node Joining Cluster

Because HNSW indexes reside in memory, the in-memory HNSW slice graphs representing your HNSW indexes are lost if an instance shuts down. Upon restart or when a new node joins the cluster, Oracle triggers a reload mechanism to quickly restore the HNSW slice graphs in memory. This reload mechanism is enabled by default for both Oracle RAC and non-RAC single-instance environments. As explained in HNSW Index Creation in RAC Environment, the in-memory HNSW slice graph and the on-disk ROWID-to-VID mapping table are created at index creation time. If enabled, a full checkpoint is also created on disk. How the reload proceeds depends on whether a full HNSW slice graph checkpoint exists on disk. Read HNSW Index Architecture: Transaction Support and Persistence to understand more about checkpoints.
  • If a recent full checkpoint exists: The instance loads the HNSW slice graph directly from the checkpoint, ensuring rapid recovery.

  • If no valid checkpoint is available: The HNSW slice graph is recreated from scratch, and the new HNSW slice graph will not be identical to the older HNSW slice graph.

The reload behavior is governed by the VECTOR_INDEX_NEIGHBOR_GRAPH_RELOAD instance parameter, which is evaluated when the instance restarts or joins the cluster. If the parameter is set to RESTART (the default setting), and a full checkpoint exists and is not deemed too old compared to the current instance SCN, then the starting instance uses that checkpoint to create its HNSW slice graph in memory. If either of these two checkpoint conditions is not met, then the starting instance uses the duplication mechanism to create the HNSW slice graph in memory from scratch. If the parameter is set to OFF, then the HNSW slice graph is not reloaded.
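The decision described above can be summarized as a small Python sketch. It models only the logic stated in this section and is not Oracle code. The function name `choose_reload_action` is hypothetical, and so is the `max_scn_gap` threshold: the text says only that a checkpoint must not be "deemed too old compared to the current instance SCN", without stating how that is measured.

```python
from enum import Enum


class ReloadAction(Enum):
    LOAD_FROM_CHECKPOINT = "load slice graph from full checkpoint"
    REBUILD_VIA_DUPLICATION = "recreate slice graph from scratch"
    NO_RELOAD = "do not reload slice graph"


def choose_reload_action(reload_param, checkpoint_scn, current_scn, max_scn_gap):
    """Pick the reload path for a starting or joining instance.

    reload_param:   value of VECTOR_INDEX_NEIGHBOR_GRAPH_RELOAD ("RESTART" or "OFF")
    checkpoint_scn: SCN of the full checkpoint, or None if no checkpoint exists
    current_scn:    current instance SCN
    max_scn_gap:    hypothetical staleness threshold (not a documented setting)
    """
    mode = reload_param.upper()
    if mode == "OFF":
        return ReloadAction.NO_RELOAD
    if mode != "RESTART":
        raise ValueError(f"unexpected parameter value: {reload_param!r}")

    # Both conditions must hold: a full checkpoint exists and it is recent enough.
    if checkpoint_scn is not None and current_scn - checkpoint_scn <= max_scn_gap:
        return ReloadAction.LOAD_FROM_CHECKPOINT
    return ReloadAction.REBUILD_VIA_DUPLICATION


recent = choose_reload_action("RESTART", 900, 1000, max_scn_gap=500)
missing = choose_reload_action("RESTART", None, 1000, max_scn_gap=500)
stale = choose_reload_action("RESTART", 100, 1000, max_scn_gap=500)
disabled = choose_reload_action("OFF", 900, 1000, max_scn_gap=500)
```

In this model, only the `recent` case loads from the checkpoint; the `missing` and `stale` cases fall back to rebuilding, and the `disabled` case performs no reload.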