Scalar Quantized HNSW Indexes

Scalar quantized HNSW indexes can be used to reduce memory requirements and accelerate similarity search.

In the context of vector databases, scalar quantization is a method of compressing or approximating high-dimensional data vectors by independently quantizing each dimension (or scalar component) of the vector. This approach is used to reduce the storage requirements of a vector database while preserving the essential structure of the data for tasks such as similarity search. Scalar quantization can be particularly beneficial when simplicity and low storage requirements are high priorities.

High-dimensional vectors stored as floating-point numbers require significant storage space. Scalar quantization saves memory by compressing the vectors. For instance, a 128-dimensional vector stored at 8 bits (INT8) per dimension requires only 128 bytes of memory, while 32-bit floating-point storage (Float32) would require 512 bytes and 64-bit double-precision storage (Float64) would require 1024 bytes. Because of this reduced storage requirement, scalar quantization can enable vector databases to handle very large datasets, even ones comprising billions of vectors, that would otherwise be impractical due to storage constraints.
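The storage figures above follow directly from multiplying the number of dimensions by the bytes used per dimension. The short Python sketch below reproduces that arithmetic; the function name `vector_bytes` is hypothetical and used only for illustration, and the calculation covers raw vector data only, not any index or metadata overhead.

```python
def vector_bytes(dimensions: int, bits_per_dimension: int) -> int:
    """Return the raw storage size, in bytes, of one vector.

    Illustrative helper (not an Oracle API): counts only the vector
    components, ignoring index structures and metadata.
    """
    return dimensions * bits_per_dimension // 8


if __name__ == "__main__":
    # Compare the three formats mentioned in the text for a 128-dimensional vector.
    for name, bits in [("INT8", 8), ("Float32", 32), ("Float64", 64)]:
        print(f"{name}: {vector_bytes(128, bits)} bytes per vector")
```

At this scale, one billion 128-dimensional vectors would need roughly 128 GB of raw vector data at INT8, compared with roughly 512 GB at Float32.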

Another benefit of scalar quantization is improved search speed. Approximate nearest neighbor (ANN) search methods can leverage quantized data for faster comparisons, because operations on integers or binary codes are typically faster than floating-point calculations.

Oracle AI Database implements uniform scalar quantization, meaning that the input range is divided into equal-size intervals. Each interval is represented by a single quantization level, usually its midpoint. The range of each dimension is divided into 2^b equal intervals, where b is the number of bits (for example, 8 bits for INT8). Each float value is mapped to the nearest quantization level.
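The mapping described above can be sketched in a few lines of Python. This is a minimal illustration of uniform scalar quantization in general, not Oracle's internal implementation: the function names `quantize` and `dequantize`, the explicit per-dimension range (`lo`, `hi`), and the clamping of out-of-range values are all assumptions made for the example.

```python
def quantize(value: float, lo: float, hi: float, bits: int = 8) -> int:
    """Map a float in [lo, hi] to the index of its interval (0 .. 2**bits - 1).

    Assumption: lo and hi are the known range of this dimension.
    """
    levels = 2 ** bits
    width = (hi - lo) / levels          # size of each equal interval
    index = int((value - lo) / width)   # interval that contains the value
    return min(max(index, 0), levels - 1)  # clamp values at or beyond the range


def dequantize(code: int, lo: float, hi: float, bits: int = 8) -> float:
    """Return the quantization level (interval midpoint) for a code."""
    width = (hi - lo) / 2 ** bits
    return lo + (code + 0.5) * width


if __name__ == "__main__":
    lo, hi = -1.0, 1.0
    for x in (-1.0, -0.42, 0.0, 0.3, 1.0):
        code = quantize(x, lo, hi)
        approx = dequantize(code, lo, hi)
        print(f"{x:+.4f} -> code {code:3d} -> {approx:+.6f}")
```

Because each interval is represented by its midpoint, the reconstruction error for any in-range value is at most half the interval width, that is, (hi - lo) / 2^(b+1). With 8 bits over the range [-1, 1], that bound is about 0.0039.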

Note:

With scalar quantization, the amount of memory allocated in the vector pool is determined based on the quantization scale, not on the actual data type size.

Scalar quantization is supported with HNSW indexes and can be specified during index creation using the CREATE VECTOR INDEX DDL. For syntax and additional information, see Hierarchical Navigable Small World Index Syntax and Parameters and Oracle AI Database SQL Language Reference.

When you run a similarity search against scalar quantized vectors, an optional rescore factor parameter of the SELECT statement can be used to refine your top-K results during query execution. For information and syntax, see Oracle AI Database SQL Language Reference.