Hierarchical Navigable Small World Index Syntax and Parameters

Syntax and examples for creating Hierarchical Navigable Small World vector indexes.

Syntax

CREATE VECTOR INDEX vector_index_name 
ON table_name (vector_column)
[GLOBAL] ORGANIZATION INMEMORY [NEIGHBOR] GRAPH
[WITH] [DISTANCE [CUSTOM [<schema>.][<package>.]] metric_name]
[WITH TARGET ACCURACY percentage_value]
[QUANTIZATION SCALAR COMPRESSION RATIO {2|4|8}]
[PARAMETERS (TYPE HNSW , 
    {NEIGHBORS max_closest_vectors_connected | M max_closest_vectors_connected}, 
    EFCONSTRUCTION max_candidates_to_consider,
    RESCORE FACTOR rescore_factor,
    ALGORITHM UNIFORM_QUANTIZATION)]
[PARALLEL degree_of_parallelism]

HNSW Parameters

NEIGHBORS and M are equivalent: each sets the maximum number of neighbors a vector can have on any layer. The last vertex has additional flexibility and can have up to 2M neighbors.

EFCONSTRUCTION represents the maximum number of closest vector candidates considered at each step of the search during insertion.
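As a minimal sketch of how these two parameters appear in a statement, the following uses M in place of NEIGHBORS, which the syntax above allows. The table docs, the column doc_vector, and the index name are hypothetical, and the values are illustrative rather than recommended settings.

```sql
-- Hypothetical table, column, and index names; parameter values are illustrative only.
-- M is used here as the documented equivalent of NEIGHBORS.
CREATE VECTOR INDEX docs_hnsw_idx ON docs (doc_vector)
ORGANIZATION INMEMORY NEIGHBOR GRAPH
DISTANCE COSINE
WITH TARGET ACCURACY 95
PARAMETERS (TYPE HNSW, M 16, EFCONSTRUCTION 200);
```

In general, larger values for either parameter tend to build a denser graph at the cost of more memory and a longer build, so they are usually tuned together against the accuracy target.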

RESCORE FACTOR is an optional integer parameter that defines the rescoring multiplier for semantic search queries. It has a default value of 1 and is applied only when the index is quantized (when QUANTIZATION is specified).

The valid ranges for HNSW vector index parameters are:

Usage Notes

You can optionally use the QUANTIZATION keyword together with the COMPRESSION RATIO parameter in order to apply scalar quantization to a new vector index.

The COMPRESSION RATIO specifies the level of compression for a vector column when creating a scalar-quantized index. Higher compression ratios reduce storage requirements but quantize more coarsely, so each stored dimension carries less precision. The possible values for COMPRESSION RATIO are 2, 4, and 8.

The following table shows the resulting data type stored in the index for each compression ratio, depending on the original vector column format.

Compression Ratio   FLOAT64 Input   FLOAT32 Input   FLOAT16 Input
2                   Float32         Float16         Int8
4                   Float16         Int8            Not Supported
8                   Int8            Not Supported   Not Supported

Note that "not supported" means that scalar quantization is not available for the given combination of compression ratio and vector column format.
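To make the table concrete, here is a hedged sketch for a FLOAT32 column. Per the table above, COMPRESSION RATIO 4 on FLOAT32 input would store Int8 values in the index. The table docs, the column doc_vector, the dimension count 768, and the parameter values are all assumptions chosen for illustration; the PARAMETERS clause simply mirrors the form of the quantized example under Examples.

```sql
-- Hypothetical table: a 768-dimension FLOAT32 vector column (dimension chosen arbitrarily).
CREATE TABLE docs (
  id         NUMBER PRIMARY KEY,
  doc_vector VECTOR(768, FLOAT32)
);

-- FLOAT32 input with COMPRESSION RATIO 4: the table above lists Int8 as the stored type.
-- COMPRESSION RATIO 8 would fall in a "Not Supported" cell for FLOAT32 input.
CREATE VECTOR INDEX docs_quantized_idx ON docs (doc_vector)
ORGANIZATION INMEMORY NEIGHBOR GRAPH
DISTANCE COSINE
WITH TARGET ACCURACY 90
QUANTIZATION SCALAR COMPRESSION RATIO 4
PARAMETERS (TYPE HNSW, NEIGHBORS 32, EFCONSTRUCTION 200,
            RESCORE FACTOR 2, ALGORITHM UNIFORM_QUANTIZATION);
```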

Hierarchical Navigable Small World CREATE INDEX ONLINE

In situations where a base table needs to be available for ongoing updates and cannot be locked during index creation, you can create your HNSW (Hierarchical Navigable Small World) index online with the ONLINE clause of CREATE VECTOR INDEX. This way an application with heavy DML need not stop updating the base table for indexing.

Syntax

CREATE VECTOR INDEX index_name ON table_name(vector_column)
  ORGANIZATION INMEMORY NEIGHBOR GRAPH
  [DISTANCE EUCLIDEAN]
  [WITH TARGET ACCURACY 95]
  PARAMETERS (type HNSW, neighbors 32, efConstruction 500)
  ONLINE;

Things to consider for online index creation
  • All vectors indexed need to have the same dimension and storage type throughout the index creation. Any changes or inconsistencies during index creation result in errors (for example, ORA-51902, ORA-51934).
  • If the base table experiences a very high rate of DML during index build, then index creation time may increase.
  • Online index creation with included columns (columns in addition to the vector column) is not currently supported.
  • Online distributed HNSW index creation is not currently supported.
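Because the first consideration above requires a single dimension count and storage type, it may help to inspect the column before starting an online build. The query below is one possible pre-check, assuming the galaxies table and embedding column used in the examples in this section. It only reflects the data at the time it runs and cannot guard against inconsistent rows inserted later during the build.

```sql
-- Possible pre-check: list each distinct (dimension count, storage format) pair.
-- More than one row in the result suggests the column holds mixed vectors.
SELECT VECTOR_DIMENSION_COUNT(embedding)  AS dims,
       VECTOR_DIMENSION_FORMAT(embedding) AS storage_format,
       COUNT(*)                           AS row_count
FROM   galaxies
GROUP  BY VECTOR_DIMENSION_COUNT(embedding),
          VECTOR_DIMENSION_FORMAT(embedding);
```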

Examples

CREATE VECTOR INDEX galaxies_hnsw_idx ON galaxies (embedding) 
ORGANIZATION INMEMORY NEIGHBOR GRAPH
DISTANCE COSINE
WITH TARGET ACCURACY 95;

CREATE VECTOR INDEX galaxies_hnsw_idx ON galaxies (embedding) 
ORGANIZATION INMEMORY NEIGHBOR GRAPH
DISTANCE COSINE
WITH TARGET ACCURACY 90 PARAMETERS (type HNSW, neighbors 40, efconstruction 500)
PARALLEL 8;

CREATE VECTOR INDEX galaxies_quantized_idx ON galaxies (embedding)
ORGANIZATION INMEMORY NEIGHBOR GRAPH
DISTANCE EUCLIDEAN
WITH TARGET ACCURACY 90
QUANTIZATION SCALAR COMPRESSION RATIO 2
PARAMETERS (type HNSW, neighbors 32, efConstruction 200,
             rescore factor 2, algorithm uniform_quantization);

CREATE VECTOR INDEX galaxies_hnsw_idx ON galaxies(embedding) 
ORGANIZATION INMEMORY NEIGHBOR GRAPH
DISTANCE EUCLIDEAN
WITH TARGET ACCURACY 95
PARAMETERS (type HNSW, neighbors 32, efConstruction 500)
ONLINE;
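
Once an index such as galaxies_hnsw_idx exists, an approximate similarity query against it might look like the following sketch. The name column and the :query_vector bind variable are hypothetical. The sketch uses EUCLIDEAN to match the index created just above, on the assumption that the query metric generally needs to match the index metric for the index to be considered.

```sql
-- Hypothetical query: "name" and :query_vector are assumptions, not part of the examples above.
-- APPROX requests an approximate (index-assisted) search rather than an exact one.
SELECT name
FROM   galaxies
ORDER  BY VECTOR_DISTANCE(embedding, :query_vector, EUCLIDEAN)
FETCH  APPROX FIRST 5 ROWS ONLY;
```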

For detailed information, see CREATE VECTOR INDEX in Oracle AI Database SQL Language Reference.