Scalar Quantized HNSW Indexes
Scalar quantized HNSW indexes can be used to reduce memory requirements and accelerate similarity search.
In the context of vector databases, scalar quantization is a method of compressing or approximating high-dimensional data vectors by independently quantizing each dimension (or scalar component) of the vector. This approach is used to reduce the storage requirements of a vector database while preserving the essential structure of the data for tasks such as similarity search. Scalar quantization can be particularly beneficial when simplicity and low storage requirements are high priorities.
High-dimensional vectors stored as floating-point numbers require
significant storage space. Scalar quantization saves memory by compressing the
vectors. For instance, a 128-dimensional vector stored in 8 bits
(INT8) per dimension requires only 128 bytes of memory, while
32-bit floating-point storage (Float32) would require 512 bytes and
64-bit double storage (Float64) would require 1024 bytes. Because
of this reduced storage requirement, scalar quantization can enable vector databases
to handle very large datasets, even those containing billions of vectors, that would
otherwise be impractical due to storage constraints.
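The arithmetic above can be checked with a short Python sketch. The names `BYTES_PER_DIM` and `vector_bytes` are illustrative only, and the figures cover the raw vector payload, not any index or header overhead.

```python
# Bytes per dimension for each storage format mentioned above.
BYTES_PER_DIM = {"INT8": 1, "FLOAT32": 4, "FLOAT64": 8}

def vector_bytes(dimensions: int, fmt: str) -> int:
    """Raw payload size of one vector, ignoring index and header overhead."""
    return dimensions * BYTES_PER_DIM[fmt]

# Compare the per-vector size and the total for one billion 128-dimensional vectors.
for fmt in BYTES_PER_DIM:
    per_vector = vector_bytes(128, fmt)
    total_gib = per_vector * 1_000_000_000 / 2**30
    print(f"{fmt}: {per_vector} bytes per vector, {total_gib:.1f} GiB per billion vectors")
```

At a billion vectors, the fourfold difference between INT8 and Float32 is a difference of several hundred GiB of payload.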
Another benefit of scalar quantization is improved search speed. Approximate nearest neighbor (ANN) search methods can leverage quantized data for faster comparisons, as operations on integers or binary codes are typically faster than floating-point calculations.
With Oracle AI Database, uniform scalar quantization is the implemented compression method, meaning the input range is divided into equal-size intervals. Each interval is represented by a single quantization level, usually its midpoint. The range of each dimension is divided into 2^b equal intervals, where b is the number of bits (for example, 8 bits for int8). Each float value is mapped to the nearest quantization level.
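The following Python sketch illustrates uniform scalar quantization as described above. It is a generic illustration, not Oracle's implementation: the function names, the per-dimension range `[lo, hi]` supplied by the caller, and the use of unsigned codes 0 to 2^b - 1 are all assumptions made for the example.

```python
def quantize(x: float, lo: float, hi: float, bits: int = 8) -> int:
    """Map x to the index of the equal-width interval of [lo, hi] that contains it."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    levels = 2 ** bits                 # 2^b intervals
    width = (hi - lo) / levels         # every interval has the same width
    clamped = min(max(x, lo), hi)      # values outside the range go to the edge
    return min(int((clamped - lo) / width), levels - 1)

def dequantize(code: int, lo: float, hi: float, bits: int = 8) -> float:
    """Return the midpoint of the interval identified by code."""
    width = (hi - lo) / (2 ** bits)
    return lo + (code + 0.5) * width

# Assumed example: one dimension whose values lie in [-1.0, 1.0].
vector = [-1.0, -0.25, 0.0, 0.3, 1.0]
codes = [quantize(v, -1.0, 1.0) for v in vector]
restored = [dequantize(c, -1.0, 1.0) for c in codes]
print(codes)
print(restored)
```

Because each value is replaced by the midpoint of its interval, the reconstruction error per dimension is at most half the interval width, here (2 / 256) / 2.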
Note:
With scalar quantization, the amount of memory allocated in the vector pool is determined based on the quantization scale, not on the actual data type size.
Scalar quantization is supported with HNSW indexes and can be specified
during index creation using the CREATE VECTOR INDEX DDL. For syntax
and additional information, see Hierarchical Navigable Small World Index Syntax and Parameters and Oracle AI Database SQL
Language Reference.
When performing inference on scalar quantized vectors, a rescore
factor parameter is available as an option of the
SELECT statement and can be used to refine your top-K results
during query execution. For information and syntax, see Oracle AI Database SQL
Language Reference.
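As a rough illustration of how a rescore factor can refine top-K results in general, the Python sketch below oversamples candidates using quantized codes and then re-ranks them with the original floating-point vectors. This is an assumption about the general technique, not a description of Oracle's implementation or syntax: the names `search_with_rescore` and `rescore_factor`, the fixed range [-1.0, 1.0], and the brute-force scan standing in for the HNSW graph are all hypothetical.

```python
def quantize_vec(vec, lo, hi, levels=256):
    """Quantize each dimension into one of `levels` equal-width intervals."""
    width = (hi - lo) / levels
    return [min(max(int((x - lo) / width), 0), levels - 1) for x in vec]

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def search_with_rescore(query, vectors, k, rescore_factor, lo=-1.0, hi=1.0):
    """Return indexes of the k nearest vectors, using oversampling plus rescoring."""
    codes = [quantize_vec(v, lo, hi) for v in vectors]
    q_code = quantize_vec(query, lo, hi)
    # Stage 1: approximate ranking on integer codes, keeping k * rescore_factor candidates.
    n_candidates = min(len(vectors), k * rescore_factor)
    candidates = sorted(range(len(vectors)),
                        key=lambda i: sq_dist(q_code, codes[i]))[:n_candidates]
    # Stage 2: recompute exact distances on the original floats for those candidates only.
    candidates.sort(key=lambda i: sq_dist(query, vectors[i]))
    return candidates[:k]

vectors = [[0.0, 0.0], [0.5, 0.5], [0.9, 0.9], [-0.5, -0.5]]
print(search_with_rescore([0.45, 0.45], vectors, k=1, rescore_factor=4))
```

In this sketch, a larger factor widens the candidate set, which costs more exact distance computations but reduces the chance that quantization error pushes a true neighbor out of the final top K.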