38 Support for Vector Data Type in OCI

This section describes the Oracle Call Interface (OCI) API enhancements for fetching and modifying vector column values. It lists the C data types that can be used to bind or define vector columns, and explains how to access vector metadata.

38.1 OCIVector Descriptor

This section describes how the OCIVector descriptor is used to represent a VECTOR in OCI (Oracle Call Interface).

The OCIVector descriptor is identified by the descriptor type OCI_DTYPE_VECTOR. It is allocated with OCIDescriptorAlloc() and freed with OCIDescriptorFree(). An OCIVector descriptor can be used to read VECTOR content from the server (OCIDefine*()) and to write VECTOR content to the server (OCIBind*()). The descriptor can also be populated with the OCIVectorFromText or OCIVectorFromArray function, and the vector data it holds can be retrieved with the OCIVectorToText or OCIVectorToArray function.

OCI Descriptor    Description          OCI Descriptor Type Constant
OCIVector *       VECTOR descriptor    OCI_DTYPE_VECTOR

38.2 Attributes of OCIVector Descriptor

This section describes the attributes of OCIVector descriptor.

OCI_ATTR_VECTOR_DATA_FORMAT

Mode
read-only
Description

This attribute refers to the format in which each dimension of the vector is stored.

The following input formats are supported:
  • IEEE FLOAT16
  • IEEE FLOAT32
  • IEEE FLOAT64
  • INT8

The following are the corresponding values of the attribute:

  • VECTOR_FORMAT_FLOAT16 (1)
  • VECTOR_FORMAT_FLOAT32 (2)
  • VECTOR_FORMAT_FLOAT64 (3)
  • VECTOR_FORMAT_INT8 (4)
Attribute Data Type
ub1

OCI_ATTR_VECTOR_DIMENSION

Mode
read-only
Description

Dimension of the vector.

Attribute Data Type
ub4

OCI_ATTR_VECTOR_PROPERTY

Mode
read-only
Description
Additional details of the vector. The 0x01 bit represents the OCI_ATTR_VECTOR_COL_PROPERTY_IS_FLEX flag, which indicates whether the vector has flexible dimensions.
Attribute Data Type
ub4
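
The flag test described for OCI_ATTR_VECTOR_PROPERTY can be sketched in plain C as follows. This is a minimal illustration, not OCI code: VEC_PROPERTY_IS_FLEX and vector_is_flex are hypothetical names that mirror the 0x01 bit documented above, and the property value is assumed to have already been read with OCIAttrGet.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-in for the 0x01 bit described above. A real program would use
   OCI_ATTR_VECTOR_COL_PROPERTY_IS_FLEX from the OCI headers instead. */
#define VEC_PROPERTY_IS_FLEX 0x01u

/* Returns true when the flexible-dimension bit is set in a property value
   obtained from OCI_ATTR_VECTOR_PROPERTY. Other bits are ignored. */
static bool vector_is_flex(uint32_t property)
{
    return (property & VEC_PROPERTY_IS_FLEX) != 0;
}
```

Testing the bit with a mask, rather than comparing the whole value with 1, keeps the check valid if further property bits are defined later.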

OCI_ATTR_VECTOR_CHARSET_ID

Mode
read-only
Description
Identity of the character set.
Attribute Data Type
ub2

OCI_ATTR_VECTOR_ERROR_POSITION

Mode
read-only
Description
Position of the error in the input vector data.
Attribute Data Type
ub4

38.3 External VECTOR Data Type and OCI

This section introduces the Oracle Call Interface (OCI) data type constant for the SQL data type VECTOR. VECTOR is a data type recognized by Oracle Database.

Starting with Oracle Database Release 23ai, the OCI drivers provide support for the VECTOR data type. In OCI, the VECTOR data type is represented using the constant SQLT_VEC. In other words, a client can write (OCI binds) and read (OCI defines) VECTOR descriptors using SQLT_VEC as the data type constant.

External Data Type    Program Variable    SQL Type Constant
VECTOR                OCIVector *         SQLT_VEC

38.4 Bind or Define Support for VECTOR SQL Data Type

To support bind and define for the VECTOR data type, OCI provides the external data type constant SQLT_VEC (OCIVector *). For a bind, use OCIVectorFromArray to load a native array of double, float, or integer values into the OCIVector descriptor. After a define, the OCIVector descriptor holds the canonical representation of the vector; use the OCIVectorToArray function to obtain an array in native representation.

Oracle Database supports implicit conversion between the VECTOR data type and char, nchar, varchar2, nvarchar2, CLOB, and NCLOB, in both directions. Therefore, users can also use these external data types. Oracle Database performs the implicit conversions for binds, and the client performs them for defines. Older clients that cannot specify VECTOR as a bind or define type can only use string binds or defines that correspond to char, nchar, varchar2, nvarchar2, CLOB, and NCLOB.

The describe information for tables with vector columns is exposed through OCIDescribeAny().

A new external type SQLT_VEC has been introduced for PL/SQL out binds. This external data type is overloaded to do bind or define for the columns of the VECTOR data type.

38.5 OCI Vector Support Functions

OCI vector support functions are provided for converting native C array to or from the OCIVector descriptor. Additionally, char, nchar, varchar2, nvarchar2, CLOB, and NCLOB external data types can be used in binding or defining of the vector columns.

Table 38-1 OCI Vector Support Functions

Function              Purpose
OCIVectorFromText     Converts a text representation of a vector to the vector format.
OCIVectorFromArray    Converts an array representation of a vector to the vector format.
OCIVectorToText       Converts a vector to the text format.
OCIVectorToArray      Converts a vector to an array format.

38.5.1 OCIVectorFromText

Converts a text representation of a vector to the vector format.

Purpose

Converts a text representation of a vector to the vector format. The function takes an OCIVector descriptor, an error handle, the vector format, the vector dimension, the vector text, the text length, and a mode flag, and stores the converted vector in the descriptor.

Syntax

sword OCIVectorFromText(OCIVector *vectord, OCIError *errhp, ub1 vformat, ub4 vdim,
                        const OraText *vtext, ub4 vtextlen, ub4 mode);

Parameters

Parameter                      Purpose
OCIVector *vectord (IN/OUT)    Specifies the OCIVector descriptor associated with the vector. On return, it stores the vector representation of the input text.
OCIError *errhp                Specifies the error handle used to report errors associated with the conversion.
ub1 vformat                    Specifies the format of the elements present in the text (OCI_ATTR_VECTOR_FORMAT_* values).
ub4 vdim                       Specifies the number of dimensions in the vector text.
const OraText *vtext           Specifies the input text to be converted to a vector.
ub4 vtextlen                   Specifies the length of the input text.
ub4 mode                       Specifies flags reserved for future use. Currently, the default flag is OCI_DEFAULT.

Returns

OCI_SUCCESS, if it runs successfully.

OCI_ERROR, if an error is returned.
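
Because OCIVectorFromText takes the dimension count (vdim) separately from the text, the caller needs to know how many elements the text contains. The following is a minimal sketch of one way to count them, assuming the bracketed, comma-separated form used by the INSERT example in this chapter (for example, [1.1, 2.2, 3.3]). The function name count_vector_text_dims is hypothetical and is not part of OCI; the sketch does not validate that each element is a number.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper (not an OCI function): counts the elements in text of
   the form "[a, b, c]". Returns 0 for "[]", for a missing bracket, or for an
   empty element such as "[1,,2]". Element contents are not validated. */
static unsigned long count_vector_text_dims(const char *text, size_t len)
{
    size_t i = 0;
    unsigned long commas = 0;
    int in_element = 0;   /* non-zero once the current element has a character */

    /* Skip leading white space and require an opening bracket. */
    while (i < len && isspace((unsigned char)text[i]))
        i++;
    if (i == len || text[i] != '[')
        return 0;

    for (i++; i < len; i++) {
        unsigned char c = (unsigned char)text[i];
        if (c == ']')
            return in_element ? commas + 1 : 0;
        if (c == ',') {
            if (!in_element)
                return 0;          /* empty element */
            commas++;
            in_element = 0;
        } else if (!isspace(c)) {
            in_element = 1;
        }
    }
    return 0;                      /* no closing bracket */
}
```

The result could then be passed as vdim, with the same text and its length passed as vtext and vtextlen.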

38.5.2 OCIVectorFromArray

Converts an array representation of a vector to the vector format.

Purpose

The OCIVectorFromArray function takes an OCIVector descriptor, an error handle, the vector format, the vector dimension, the vector array, and a mode flag, and converts the input array to a vector stored in the descriptor.

Syntax

sword OCIVectorFromArray(OCIVector *vectord, OCIError *errhp, ub1 vformat,
                         ub4 vdim, void *vecarray, ub4 mode);

Table 38-2 Parameters

Parameter                      Purpose
OCIVector *vectord (IN/OUT)    Specifies the OCIVector descriptor that stores the vector representation of the input array.
OCIError *errhp                Specifies the error handle used to report errors associated with the conversion.
ub1 vformat                    Specifies the format of the elements present in the array (OCI_ATTR_VECTOR_FORMAT_* values). The values are as follows:
                                 • OCI_ATTR_VECTOR_FORMAT_FLEX 0
                                 • OCI_ATTR_VECTOR_FORMAT_FLOAT16 1
                                 • OCI_ATTR_VECTOR_FORMAT_FLOAT32 2
                                 • OCI_ATTR_VECTOR_FORMAT_FLOAT64 3
                                 • OCI_ATTR_VECTOR_FORMAT_INT8 4
ub4 vdim                       Specifies the number of elements in the array.
void *vecarray                 Specifies the array to be converted to a vector.
ub4 mode                       Specifies flags reserved for future use. Currently, the default flag is OCI_DEFAULT.

Returns

OCI_SUCCESS, if it runs successfully.

OCI_ERROR, if an error is returned.

38.5.3 OCIVectorToText

Converts a vector to the text format.

Purpose

The OCIVectorToText function takes an OCIVector descriptor, an error handle, a text pointer, a pointer to the text length, and a mode flag, and converts the vector stored in the descriptor to text format.

Syntax

sword OCIVectorToText(OCIVector *vectord, OCIError *errhp, text *vtext,
                      ub4 *vtextlen, ub4 mode);

Table 38-3 Parameters

Parameter             Purpose
OCIVector *vectord    Specifies the OCIVector descriptor that stores the vector representation of the data.
OCIError *errhp       Specifies the error handle used to report errors associated with the conversion.
OraText *vtext        Specifies the pointer to the buffer that receives the text representation of the vector. The caller of this function must allocate this memory.
ub4 *vtextlen         Specifies the pointer to the length of the text buffer that is passed to the function.
ub4 mode              Specifies flags reserved for future use. Currently, the default flag is OCI_DEFAULT.

Note:

If the buffer is too small, error 51810 is returned and the data is truncated.

Error 51810: "Insufficient buffer size for VECTOR to CHAR or VARCHAR conversion."

This error is returned because the VECTOR column cannot be converted to the specified CHAR or VARCHAR format due to insufficient buffer size. To resolve it, ensure that the specified CHAR or VARCHAR buffer is large enough to store the converted VECTOR column value. Use LENGTH(FROM_VECTOR(<vector>)) to determine the appropriate buffer size.

Returns

  • OCI_SUCCESS, if it runs successfully.
  • OCI_ERROR, if an error is returned. OCI_ERROR is returned if the input buffer does not have enough allocated memory to hold the vector.

38.5.4 OCIVectorToArray

Converts a vector to an array format.

Purpose

The OCIVectorToArray function takes an OCIVector descriptor, an error handle, the vector format, a pointer to the vector dimension, a pointer to an array, and a mode flag, and converts the vector stored in the descriptor to an array format.

Syntax

sword OCIVectorToArray(OCIVector *vectord, OCIError *errhp, ub1 vformat,
                      ub4 *vdim, void *vecarray, ub4 mode);

Table 38-4 Parameters

Parameter             Purpose
OCIVector *vectord    Specifies the OCIVector descriptor that stores the vector representation of the data.
OCIError *errhp       Specifies the error handle used to report errors associated with the conversion.
ub1 vformat           Specifies the format of the vector stored in the descriptor (OCI_ATTR_VECTOR_FORMAT_* values).
ub4 *vdim             On input, specifies the pointer to the number of dimensions that the array buffer can hold. On output, it is set to the actual number of dimensions in the vector.
void *vecarray        Specifies the pointer to the vector array. The caller of this function must allocate this memory to accommodate the vector: the number of dimensions (specified using the vdim parameter) multiplied by the element size of the vector format (specified using the vformat parameter).
ub4 mode              Specifies flags reserved for future use. Currently, the default flag is OCI_DEFAULT.

Returns

  • OCI_SUCCESS, if it runs successfully.
  • OCI_ERROR, if an error is returned. OCI_ERROR is returned if the input buffer does not have enough allocated memory to hold the vector.
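
The buffer-size rule for vecarray (number of dimensions multiplied by the element size of the format) can be sketched as follows. This is an illustration in plain C rather than OCI code. The VEC_FMT_* constants are hypothetical local mirrors of the OCI_ATTR_VECTOR_FORMAT_* values listed for OCIVectorFromArray, and the 2-byte size for FLOAT16 is an assumption based on the IEEE binary16 width, because this section does not state which C type OCI uses for that format.

```c
#include <stddef.h>
#include <stdint.h>

/* Local mirrors of the documented OCI_ATTR_VECTOR_FORMAT_* values
   (hypothetical names; a real program would use the OCI constants). */
enum {
    VEC_FMT_FLEX    = 0,
    VEC_FMT_FLOAT16 = 1,
    VEC_FMT_FLOAT32 = 2,
    VEC_FMT_FLOAT64 = 3,
    VEC_FMT_INT8    = 4
};

/* Size in bytes of one element for a given format, or 0 when the format has
   no fixed element size (FLEX) or is unknown. */
static size_t vector_elem_size(unsigned format)
{
    switch (format) {
    case VEC_FMT_FLOAT16: return 2;              /* assumption: IEEE binary16 */
    case VEC_FMT_FLOAT32: return sizeof(float);
    case VEC_FMT_FLOAT64: return sizeof(double);
    case VEC_FMT_INT8:    return sizeof(int8_t);
    default:              return 0;
    }
}

/* Bytes the caller would allocate for an array of `dim` elements. */
static size_t vector_array_bytes(unsigned format, uint32_t dim)
{
    return vector_elem_size(format) * (size_t)dim;
}
```

For example, a three-dimensional FLOAT32 vector needs 3 * sizeof(float) bytes. A result of 0 signals that the format must be resolved to a concrete value before allocating.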

38.6 Binding and Defining OCIVector *

OCI users can allocate a descriptor of type OCIVector*, assign an array of integer, float, or double values to it, and then use it to write to database table columns whose SQL data type is VECTOR. The data type specified for the bind or define must be SQLT_VEC.

The OCI application can also bind and define using the following SQL data types:
  • SQLT_CHR: Character string
  • SQLT_CLOB: Character LOB
  • SQLT_STR: Null-terminated string
  • SQLT_LNG: Long character string
  • SQLT_LVC: Longer long string
  • SQLT_AFC: ANSI fixed character string
  • SQLT_AVC: ANSI variable character string
  • SQLT_VCS: Variable character string
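
When a vector is bound through one of these string types, the application supplies it in text form. The following is a minimal sketch of building such text from a float array, assuming the bracketed, comma-separated form used by the INSERT example in this chapter. The helper vector_to_text is hypothetical and is not an OCI function; it uses the C "%g" conversion, and whether every string that "%g" can produce is accepted by the server is not confirmed here.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not an OCI function): writes "[v0, v1, ...]" into buf.
   Returns the number of characters written, excluding the terminating NUL,
   or -1 if buf is too small to hold the complete text. */
static int vector_to_text(const float *v, size_t dim, char *buf, size_t buflen)
{
    size_t used = 0;
    size_t i;

    if (buf == NULL || buflen < 3)        /* room for "[]" and the NUL */
        return -1;
    buf[used++] = '[';

    for (i = 0; i < dim; i++) {
        int n = snprintf(buf + used, buflen - used, "%s%g",
                         i ? ", " : "", (double)v[i]);
        /* snprintf reports the length it needed; treat truncation as failure. */
        if (n < 0 || (size_t)n >= buflen - used)
            return -1;
        used += (size_t)n;
    }

    if (buflen - used < 2)                /* room for ']' and the NUL */
        return -1;
    buf[used++] = ']';
    buf[used] = '\0';
    return (int)used;
}
```

The resulting buffer could then be bound as, for example, SQLT_STR. Returning -1 on truncation avoids sending a partial vector to the server.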

38.7 OCIDescribeAny Enhancements

This section describes the enhancements included in OCIDescribeAny to provide support for describe on tables or views with the new VECTOR column type.

OCI provides the ability to explicitly describe a database object (for example: table) to obtain its metadata. OCI also implicitly receives metadata of the columns being selected as part of the response to the query execution. In both explicit and implicit describe cases, the column metadata is accessed through a parameter handle of type OCIParam * or the Vector descriptor. The newly introduced OCIParam handle attributes OCI_ATTR_VECTOR_DIMENSION, OCI_ATTR_VECTOR_DATA_FORMAT, and OCI_ATTR_VECTOR_PROPERTY are used to access the vector dimension, vector data format, and vector property respectively.

The following call returns the vector dimension for the given column parameter handle, or returns 0 if the column has a flexible dimension rather than a fixed one. vector_dimension_len is not populated because vector_dimension has the fixed size of a ub4.

OCIAttrGet((dvoid*) colParamHandle, (ub4) OCI_DTYPE_PARAM,
           (dvoid*) &vector_dimension, (ub4 *) &vector_dimension_len,
           (ub4) OCI_ATTR_VECTOR_DIMENSION,
           (OCIError *) errhp);

The following call returns the data format of the vector for the given column parameter handle, or returns 0 if the vector can contain any data format. vector_data_format_len is not populated because vector_data_format has the fixed size of a ub1.

OCIAttrGet((dvoid*) colParamHandle, (ub4) OCI_DTYPE_PARAM,
           (dvoid*) &vector_data_format, (ub4 *) &vector_data_format_len,
           (ub4) OCI_ATTR_VECTOR_DATA_FORMAT,
           (OCIError *) errhp);

The following call returns the property of the vector for the given column parameter handle, or returns 0 if the column has no properties associated with it. Currently, the only property implemented is OCI_ATTR_VECTOR_COL_PROPERTY_IS_FLEX.

OCIAttrGet((dvoid*) colParamHandle, (ub4) OCI_DTYPE_PARAM,
           (dvoid*) &vector_property, (ub4 *) &vector_property_len,
           (ub4) OCI_ATTR_VECTOR_PROPERTY,
           (OCIError *) errhp);

The attributes OCI_ATTR_VECTOR_DIMENSION, OCI_ATTR_VECTOR_DATA_FORMAT, and OCI_ATTR_VECTOR_PROPERTY are honored for base table columns, view columns, and all elements in a SELECT list that have a vector associated with them.

You can also call OCIAttrGet on the vector descriptor to obtain the attributes, as shown in the following code snippets:
  • OCIAttrGet((dvoid *)vectorDescriptor, OCI_DTYPE_VECTOR, (void *)&vectdim, (ub4*) 0, OCI_ATTR_VECTOR_DIMENSION, (OCIError *)errhp);
  • OCIAttrGet((dvoid *)vectorDescriptor, OCI_DTYPE_VECTOR, (void *)&vectformat, (ub4*) 0, OCI_ATTR_VECTOR_DATA_FORMAT, (OCIError *)errhp);
  • OCIAttrGet((dvoid *)vectorDescriptor, OCI_DTYPE_VECTOR, (void *)&vectprop, (ub4*) 0, OCI_ATTR_VECTOR_PROPERTY, (OCIError *)errhp);

38.8 Example Code Snippets for Vectors

SELECT Statement

OCIStmt   *ociStmt = (OCIStmt *)NULL;
OCIDefine *defnhp1 = NULL;
OCIVector *vecp = NULL;
const OraText *selstmt = (const OraText *)"SELECT embedding FROM test ORDER BY id";
sb2 ind1 = 0;

/* OCIStmtPrepare2 returns the statement handle, so no OCIHandleAlloc is needed.
   The length comes from strlen (<string.h>); sizeof on a pointer gives the pointer size. */
OCIStmtPrepare2(ociSvcCtx, &ociStmt, ociError, selstmt, (ub4)strlen((const char *)selstmt),
                NULL, 0, (ub4)OCI_NTV_SYNTAX, (ub4)OCI_DEFAULT);
/* Allocate the vector descriptor and define it as the first select-list item. */
OCIDescriptorAlloc(ociEnv, (void **)&vecp, OCI_DTYPE_VECTOR, 0, 0);
OCIDefineByPos(ociStmt, &defnhp1, ociError, 1, &vecp, 0, SQLT_VEC, &ind1, NULL, NULL, OCI_DEFAULT);
OCIStmtExecute(ociSvcCtx, ociStmt, ociError, 0, 0, NULL, NULL, OCI_DEFAULT);
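
INSERT Statement with Literals

OCIStmt *ociStmt = (OCIStmt *)NULL;
char     insStmt[128];
int      id = 1;   /* illustrative value substituted for each %d */

/* Build the statement text from the format template (snprintf is in <stdio.h>). */
int stmtlen = snprintf(insStmt, sizeof(insStmt),
                       "INSERT INTO test VALUES(%d, '[%d.1, %d.2, %d.3]')", id, id, id, id);
OCIStmtPrepare2(ociSvcCtx, &ociStmt, ociError, (const OraText *)insStmt, (ub4)stmtlen,
                NULL, 0, (ub4)OCI_NTV_SYNTAX, (ub4)OCI_DEFAULT);
/* iters must be at least 1 for a non-SELECT statement. */
OCIStmtExecute(ociSvcCtx, ociStmt, ociError, 1, 0, NULL, NULL, OCI_DEFAULT);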

INSERT Statement with BIND

OCIStmt *ociStmt = (OCIStmt *)NULL;
OCIBind *ociBind1 = (OCIBind *)NULL;
OCIBind *ociBind2 = (OCIBind *)NULL;
const OraText *insstmtbnd = (const OraText *)"INSERT INTO test VALUES(:1, :2)";
signed int bnd1 = 500;
OCIVector *vecp = NULL;
sb2 ind1 = 0;
sb2 ind2 = 0;

/* The length comes from strlen (<string.h>); sizeof on a pointer gives the pointer size. */
OCIStmtPrepare2(ociSvcCtx, &ociStmt, ociError, insstmtbnd, (ub4)strlen((const char *)insstmtbnd),
                NULL, 0, (ub4)OCI_NTV_SYNTAX, (ub4)OCI_DEFAULT);
OCIDescriptorAlloc(ociEnv, (void **)&vecp, OCI_DTYPE_VECTOR, 0, 0);
/* Bind the id as an integer and the vector descriptor as SQLT_VEC. */
OCIBindByPos(ociStmt, &ociBind1, ociError, 1, &bnd1, sizeof(signed int), SQLT_INT,
             &ind1, NULL, NULL, 0, NULL, OCI_DEFAULT);
OCIBindByPos(ociStmt, &ociBind2, ociError, 2, &vecp, 0, SQLT_VEC,
             &ind2, NULL, NULL, 0, NULL, OCI_DEFAULT);

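OCIVector API

ub1     vformat = OCI_ATTR_VECTOR_FORMAT_FLOAT32;
ub4     vdim = 3;
float   vfarr[3];
ub4     indx;
OraText vtext[256];                    /* caller-allocated buffer; the size is illustrative */
ub4     vtextl = (ub4)sizeof(vtext);

/* Convert the vector currently held in the descriptor to text. */
OCIVectorToText(vecp, ociError, &vtext[0], &vtextl, OCI_DEFAULT);
/* Fill a native float array, then load it into the descriptor. */
for (indx = 0; indx < vdim; indx++)
     vfarr[indx] = (float)(indx + indx * 3.1427);
OCIVectorFromArray(vecp, ociError, vformat, vdim,
                   (void *)&vfarr[0], OCI_DEFAULT);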