SUM
The SUM function takes a vector expression as input and
returns the sum as a vector with format FLOAT64.
Purpose
SUM is mainly used, in Natural Language Processing
(NLP), to compute a representation of a sentence or a document. It is common to sum
the word embeddings of all words. The resulting vector represents the entire text,
allowing models to work with sentences instead of just words.
The following diagram illustrates the vector representation, in a 2D space, of word embeddings. If you take S1, "The astronomer observed the stars", and S2, "The stargazer watched the celestial bodies", you can observe that the corresponding vector sum of each constituting word of each sentence (S1 and S2 in the diagram) are very close, for example, in terms of cosine similarity. This means that the two sentences are very similar in meaning.
The result of SUM with a vector expression is the equivalent of
consecutively performing vector addition operations on all non-NULL
inputs. The returned vector has the same number of dimensions as the input and has
the format FLOAT64. When the expression has a flexible number of
dimensions, all inputs must have the same number of dimensions within each aggregate
group.
The SUM function with vector expressions as input can be
used as a single set aggregate or in the GROUP BY clause. Using
ROLLUP is also supported. The SUM function
accepts vector expressions as input for aggregate operations but cannot currently be
applied to analytic operations.
NULL vectors are ignored and are not counted when
calculating the sum vector. If all inputs within an aggregate group are
NULL, the result is NULL for that group. If
the result overflows the FLOAT64 maximum value, an error is raised,
regardless of the format of the input vector type.
With vector inputs, using DISTINCT,
CUBE, and GROUPING SETS is not supported.
Also, BINARY and SPARSE vectors cannot be supplied
as input.
For the full definition and implementation of the SUM
function, see Oracle AI Database SQL
Language Reference.
CREATE TABLE sum_t (v VECTOR, k1 NUMBER, k2 VARCHAR2(100));
INSERT INTO sum_t VALUES ('[2, 4, 6]', 2, 'even');
INSERT INTO sum_t VALUES ('[8, 10, 12]', 2, 'even');
INSERT INTO sum_t VALUES ('[1, 3, 5]', 3, 'odd');
INSERT INTO sum_t VALUES ('[7, 9, 11]', 3, 'odd');
SELECT SUM(v) v_sum FROM sum_t;
V_SUM
---------------------------------------------------
[1.8E+001,2.6E+001,3.4E+001]
SELECT SUM(v) v_sum, k1 FROM sum_t GROUP BY k1;
V_SUM K1
--------------------------------------- ----------
[1.0E+001,1.4E+001,1.8E+001] 2
[8.0E+000,1.2E+001,1.6E+001] 3
SELECT SUM(v) v_sum FROM sum_t GROUP BY ROLLUP(k1, k2);
V_SUM
---------------------------------------------------
[1.0E+001,1.4E+001,1.8E+001]
[1.0E+001,1.4E+001,1.8E+001]
[8.0E+000,1.2E+001,1.6E+001]
[8.0E+000,1.2E+001,1.6E+001]
[1.8E+001,2.6E+001,3.4E+001]
CREATE TABLE sum_diff_dim_t (v VECTOR, k1 NUMBER, k2 VARCHAR2(100));
INSERT INTO sum_diff_dim_t VALUES ('[2, 4, 6]', 2, 'even');
INSERT INTO sum_diff_dim_t VALUES ('[8, 10, 12]', 2, 'even');
INSERT INTO sum_diff_dim_t VALUES ('[1, 3, 5, 7]', 3, 'odd');
INSERT INTO sum_diff_dim_t VALUES ('[9, 11, 13, 15]', 3, 'odd');
SELECT SUM(v) v_sum, k2 FROM sum_diff_dim_t GROUP BY k2;
V_SUM K2
--------------------------------------- ----------
[1.0E+001,1.4E+001,1.8E+001] even
[1.0E+001,1.4E+001,1.8E+001,2.2E+001] odd
SELECT SUM(v) v_sum FROM sum_diff_dim_t;
ERROR:
ORA-51808: SUM(vector) requires all vectors to have the same dimension count.
Encountered (3, 4).
Parent topic: Aggregate Functions

