8 Functions: Colt Aggregate

Oracle CQL provides a set of built-in aggregate functions based on the Colt open source libraries for high performance scientific and technical computing.

For more information, see Section 1.1.9, "Functions".

8.1 Introduction to Oracle CQL Built-In Aggregate Colt Functions

Table 8-1 lists the built-in aggregate Colt functions that Oracle CQL provides.

Table 8-1 Oracle CQL Built-in Aggregate Colt-Based Functions

Colt Package	Function
`cern.jet.stat.Descriptive` A set of basic descriptive statistics functions.	`autocorrelation` `correlation` `covariance` `geometricmean` `geometricmean1` `harmonicmean` `kurtosis` `lag1` `mean` `meandeviation` `median` `moment` `pooledmean` `pooledvariance` `product` `quantile` `quantileinverse` `rankinterpolated` `rms` `samplekurtosis` `samplekurtosisstandarderror` `sampleskew` `sampleskewstandarderror` `samplevariance` `skew` `standarddeviation` `standarderror` `sumofinversions` `sumoflogarithms` `sumofpowerdeviations` `sumofpowers` `sumofsquareddeviations` `sumofsquares` `trimmedmean` `variance` `weightedmean` `winsorizedmean`

Note:

Built-in function names are case sensitive and you must use them in the case shown (in lower case).

Note:

In stream input examples, lines beginning with h (such as h 3800) are heartbeat input tuples. These inform Oracle CEP that no further input will have a timestamp less than the heartbeat value.

In relation output examples, the first tuple output is:

-9223372036854775808:+

This value is Long.MIN_VALUE and represents the most negative timestamp possible.

8.1.1 Oracle CQL Colt Aggregate Function Signatures and Tuple Arguments

Note that the signatures of the Oracle CQL Colt aggregate functions do not match the signatures of the corresponding Colt aggregate functions.

Consider the following Colt aggregate function:

double autocorrelation(DoubleArrayList data, int lag, double mean, double variance)

In this signature, data is the collection over which the aggregate is calculated, lag is the lag, and mean and variance are two further aggregates, themselves calculated over data, that are required to calculate autoCorrelation.

In Oracle CEP, data never arrives as a collection. The Oracle CQL function receives its input as a stream of tuples.

Suppose the stream is defined as S:(double val, integer lag). On each input tuple, the Oracle CQL autocorrelation function computes two intermediate aggregates, mean and variance, and one final aggregate, autocorrelation.

Because the function expects a stream of tuples that carry only a double data value and an integer lag value, the signature of the Oracle CQL autocorrelation function is:

double autocorrelation (double data, int lag)
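
The tuple-at-a-time evaluation described above can be sketched in plain Java, without the Colt library. The class name, the onTuple method, the use of the population variance, and the choice to return NaN until more tuples than the lag have arrived are all assumptions made for illustration; this is not Oracle CEP's implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: an aggregate that is fed one (data, lag) tuple at a time.
final class StreamingAutocorrelationSketch {
    private final List<Double> values = new ArrayList<>();

    // Consumes one tuple and returns the aggregate over all tuples seen so far.
    double onTuple(double data, int lag) {
        values.add(data);
        int n = values.size();
        if (lag < 0 || lag >= n) {
            return Double.NaN; // assumption: undefined until n > lag
        }
        // Intermediate aggregates: mean and population variance.
        double sum = 0.0;
        double sumOfSquares = 0.0;
        for (double v : values) {
            sum += v;
            sumOfSquares += v * v;
        }
        double mean = sum / n;
        double variance = (sumOfSquares - mean * sum) / n;
        // Final aggregate: mean lagged product of deviations, scaled by the variance.
        double run = 0.0;
        for (int i = lag; i < n; i++) {
            run += (values.get(i) - mean) * (values.get(i - lag) - mean);
        }
        return (run / (n - lag)) / variance;
    }

    public static void main(String[] args) {
        StreamingAutocorrelationSketch agg = new StreamingAutocorrelationSketch();
        for (double d : new double[] {1.0, 2.0, 3.0, 4.0, 5.0}) {
            System.out.println(agg.onTuple(d, 1));
        }
    }
}
```

The caller supplies only the data value and the lag; the mean and variance that the Colt signature takes as parameters are derived inside the aggregate.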

autocorrelation

Syntax

Purpose

autocorrelation is based on cern.jet.stat.Descriptive.autoCorrelation(DoubleArrayList data, int lag, double mean, double variance). It returns the autocorrelation of a data sequence of the input arguments as a double.

Note:

This function has semantics different from "lag1".

This function takes the following tuple arguments:

double1: data value.
int1: lag.

Examples

Consider the query qColtAggr1 in Example 8-1. Given the data stream SColtAggrFunc with schema (c1 integer, c2 float, c3 double, c4 bigint) in Example 8-2, the query returns the relation in Example 8-3.

Example 8-1 autocorrelation Function Query

<query id="qColtAggr1"><![CDATA[ 
     select autocorrelation(c3, c1) from SColtAggrFunc
]]></query>

Example 8-2 autocorrelation Function Stream Input

Timestamp   Tuple
  10        1, 0.5, 40.0, 8
1000        4, 0.7, 30.0, 6
1200        3, 0.89, 20.0, 12
2000        8, 0.4, 10.0, 4
h 8000
h 200000000

Example 8-3 autocorrelation Function Relation Output

Timestamp   Tuple Kind  Tuple
-9223372036854775808:+

correlation

Syntax

Purpose

correlation is based on cern.jet.stat.Descriptive.correlation(DoubleArrayList data1, double standardDev1, DoubleArrayList data2, double standardDev2). It returns the correlation of two data sequences of the input arguments as a double.

This function takes the following tuple arguments:

double1: data value 1.
double2: data value 2.

Examples

Consider the query qColtAggr2 in Example 8-4. Given the data stream SColtAggrFunc with schema (c1 integer, c2 float, c3 double, c4 bigint) in Example 8-5, the query returns the relation in Example 8-6.

Example 8-4 correlation Function Query

<query id="qColtAggr2"><![CDATA[ 
     select correlation(c3, c3) from SColtAggrFunc
]]></query>

Example 8-5 correlation Function Stream Input

Timestamp   Tuple
  10        1, 0.5, 40.0, 8
1000        4, 0.7, 30.0, 6
1200        3, 0.89, 20.0, 12
2000        8, 0.4, 10.0, 4
h 8000
h 200000000

Example 8-6 correlation Function Relation Output

Timestamp   Tuple Kind  Tuple
-9223372036854775808:+
  10:       -
  10:       +           NaN
1000:       -           NaN
1000:       +           2.0
1200:       -           2.0
1200:       +           1.5
2000:       -           1.5
2000:       +           1.333333333333333

covariance

Syntax

Purpose

covariance is based on cern.jet.stat.Descriptive.covariance(DoubleArrayList data1, DoubleArrayList data2). It returns the covariance of two data sequences (see Figure 8-1) of the input arguments as a double.

Figure 8-1 cern.jet.stat.Descriptive.covariance

This function takes the following tuple arguments:

double1: data value 1.
double2: data value 2.

Examples

Consider the query qColtAggr3 in Example 8-7. Given the data stream SColtAggrFunc with schema (c1 integer, c2 float, c3 double, c4 bigint) in Example 8-8, the query returns the relation in Example 8-9.

Example 8-7 covariance Function Query

<query id="qColtAggr3"><![CDATA[ 
     select covariance(c3, c3) from SColtAggrFunc
]]></query>

Example 8-8 covariance Function Stream Input

Timestamp   Tuple
  10        1, 0.5, 40.0, 8
1000        4, 0.7, 30.0, 6
1200        3, 0.89, 20.0, 12
2000        8, 0.4, 10.0, 4
h 8000
h 200000000

Example 8-9 covariance Function Relation Output

Timestamp   Tuple Kind  Tuple
-9223372036854775808:+
  10:       -
  10:       +           NaN
1000:       -           NaN
1000:       +           50.0
1200:       -           50.0
1200:       +           100.0
2000:       -           100.0
2000:       +           166.66666666666666
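
The values in Example 8-9 are consistent with the sample covariance, that is, the sum of the products of deviations divided by (n - 1). The following plain-Java sketch recomputes them over growing prefixes of the c3 column. The class and method names are hypothetical, and the formula is an assumption based on the example output rather than a copy of the Colt or Oracle CEP source.

```java
import java.util.Arrays;

// Hypothetical helper: sample covariance of two equally long sequences.
final class CovarianceSketch {
    static double covariance(double[] x, double[] y) {
        if (x.length != y.length) {
            throw new IllegalArgumentException("sequences must have the same length");
        }
        int n = x.length;
        double meanX = 0.0;
        double meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;
        double sum = 0.0;
        for (int i = 0; i < n; i++) {
            sum += (x[i] - meanX) * (y[i] - meanY);
        }
        return sum / (n - 1); // a single value gives 0.0 / 0, which is NaN
    }

    public static void main(String[] args) {
        double[] c3 = {40.0, 30.0, 20.0, 10.0};
        for (int n = 1; n <= c3.length; n++) {
            double[] prefix = Arrays.copyOf(c3, n);
            System.out.println(covariance(prefix, prefix));
        }
    }
}
```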

geometricmean

Syntax

Purpose

geometricmean is based on cern.jet.stat.Descriptive.geometricMean(DoubleArrayList data). It returns the geometric mean of a data sequence (see Figure 8-2) of the input argument as a double.

Figure 8-2 cern.jet.stat.Descriptive.geometricMean(DoubleArrayList data)

This function takes the following tuple arguments:

double1: data value.

Note that a geometric mean is meaningful only when every data value is greater than zero.

Examples

Consider the query qColtAggr6 in Example 8-10. Given the data stream SColtAggrFunc with schema (c1 integer, c2 float, c3 double, c4 bigint) in Example 8-11, the query returns the relation in Example 8-12.

Example 8-10 geometricmean Function Query

<query id="qColtAggr6"><![CDATA[ 
    select geometricmean(c3) from SColtAggrFunc
]]></query>

Example 8-11 geometricmean Function Stream Input

Timestamp   Tuple
  10        1, 0.5, 40.0, 8
1000        4, 0.7, 30.0, 6
1200        3, 0.89, 20.0, 12
2000        8, 0.4, 10.0, 4
h 8000
h 200000000

Example 8-12 geometricmean Function Relation Output

Timestamp   Tuple Kind  Tuple
-9223372036854775808:+
  10:       -
  10:       +           40.0
1000:       -           40.0
1000:       +           34.64101615137755
1200:       -           34.64101615137755
1200:       +           28.844991406148168
2000:       -           28.844991406148168
2000:       +           22.133638394006436
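
As a cross-check of Example 8-12, the geometric mean can be computed as the exponential of the mean of the logarithms. The sketch below is plain Java with hypothetical names and does not use the Colt library; it also shows why non-positive inputs are not meaningful.

```java
import java.util.Arrays;

// Hypothetical helper: geometric mean as exp(mean(log(x))).
final class GeometricMeanSketch {
    static double geometricMean(double[] data) {
        double sumOfLogarithms = 0.0;
        for (double v : data) {
            sumOfLogarithms += Math.log(v); // log of a negative value is NaN
        }
        return Math.exp(sumOfLogarithms / data.length);
    }

    public static void main(String[] args) {
        double[] c3 = {40.0, 30.0, 20.0, 10.0};
        for (int n = 1; n <= c3.length; n++) {
            System.out.println(geometricMean(Arrays.copyOf(c3, n)));
        }
        // A zero collapses the result to 0.0; a negative value makes it NaN.
        System.out.println(geometricMean(new double[] {0.0, 4.0}));
        System.out.println(geometricMean(new double[] {-1.0, 4.0}));
    }
}
```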

geometricmean1

Syntax

Purpose

geometricmean1 is based on cern.jet.stat.Descriptive.geometricMean(int size, double sumOfLogarithms). It returns the geometric mean of a data sequence (see Figure 8-3) of the input arguments as a double.

Figure 8-3 cern.jet.stat.Descriptive.geometricMean(int size, double sumOfLogarithms)

This function takes the following tuple arguments:

double1: data value.

Examples

Consider the query qColtAggr7 in Example 8-13. Given the data stream SColtAggrFunc with schema (c1 integer, c2 float, c3 double, c4 bigint) in Example 8-14, the query returns the relation in Example 8-15.

Example 8-13 geometricmean1 Function Query

<query id="qColtAggr7"><![CDATA[ 
    select geometricmean1(c3) from SColtAggrFunc
]]></query>

Example 8-14 geometricmean1 Function Stream Input

Timestamp   Tuple
  10        1, 0.5, 40.0, 8
1000        4, 0.7, 30.0, 6
1200        3, 0.89, 20.0, 12
2000        8, 0.4, 10.0, 4
h 8000
h 200000000

Example 8-15 geometricmean1 Function Relation Output

Timestamp   Tuple Kind  Tuple
-9223372036854775808:+
  10:       -
  10:       +           Infinity
1000:       -           Infinity
1000:       +           Infinity
1200:       -           Infinity
1200:       +           Infinity
2000:       -           Infinity
2000:       +           Infinity

harmonicmean

Syntax

Purpose

harmonicmean is based on cern.jet.stat.Descriptive.harmonicMean(int size, double sumOfInversions). It returns the harmonic mean of a data sequence as a double.

This function takes the following tuple arguments:

double1: data value.

Examples

Consider the query qColtAggr8 in Example 8-16. Given the data stream SColtAggrFunc with schema (c1 integer, c2 float, c3 double, c4 bigint) in Example 8-17, the query returns the relation in Example 8-18.

Example 8-16 harmonicmean Function Query

<query id="qColtAggr8"><![CDATA[ 
    select harmonicmean(c3) from SColtAggrFunc
]]></query>

Example 8-17 harmonicmean Function Stream Input

Timestamp   Tuple
  10        1, 0.5, 40.0, 8
1000        4, 0.7, 30.0, 6
1200        3, 0.89, 20.0, 12
2000        8, 0.4, 10.0, 4
h 8000
h 200000000

Example 8-18 harmonicmean Function Relation Output

Timestamp   Tuple Kind  Tuple
-9223372036854775808:+
  10:       -
  10:       +           0.0
1000:       -           0.0
1000:       +           0.0
1200:       -           0.0
1200:       +           0.0
2000:       -           0.0
2000:       +           0.0

kurtosis

Syntax

Purpose

kurtosis is based on cern.jet.stat.Descriptive.kurtosis(DoubleArrayList data, double mean, double standardDeviation). It returns the kurtosis or excess (see Figure 8-4) of a data sequence as a double.

Figure 8-4 cern.jet.stat.Descriptive.kurtosis(DoubleArrayList data, double mean, double standardDeviation)

This function takes the following tuple arguments:

double1: data value.

Examples

Consider the query qColtAggr12 in Example 8-19. Given the data stream SColtAggrFunc with schema (c1 integer, c2 float, c3 double, c4 bigint) in Example 8-20, the query returns the relation in Example 8-21.

Example 8-19 kurtosis Function Query

<query id="qColtAggr12"><![CDATA[ 
    select kurtosis(c3) from SColtAggrFunc
]]></query>

Example 8-20 kurtosis Function Stream Input

Timestamp   Tuple
  10        1, 0.5, 40.0, 8
1000        4, 0.7, 30.0, 6
1200        3, 0.89, 20.0, 12
2000        8, 0.4, 10.0, 4
h 8000
h 200000000

Example 8-21 kurtosis Function Relation Output

Timestamp   Tuple Kind  Tuple
-9223372036854775808:+
  10:       -
  10:       +           NaN
1000:       -           NaN
1000:       +           -2.0
1200:       -           -2.0
1200:       +           -1.5000000000000002
2000:       -           -1.5000000000000002
2000:       +           -1.3600000000000003
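
The values in Example 8-21 are consistent with the excess kurtosis computed as the fourth central moment divided by the squared population variance, minus 3. The following plain-Java sketch reproduces them; the names are hypothetical and the formula is an assumption inferred from the example output.

```java
import java.util.Arrays;

// Hypothetical helper: excess kurtosis m4 / (m2 * m2) - 3 with population moments.
final class KurtosisSketch {
    static double kurtosis(double[] data) {
        int n = data.length;
        double mean = 0.0;
        for (double v : data) {
            mean += v;
        }
        mean /= n;
        double m2 = 0.0; // second central moment (population variance)
        double m4 = 0.0; // fourth central moment
        for (double v : data) {
            double d = v - mean;
            m2 += d * d;
            m4 += d * d * d * d;
        }
        m2 /= n;
        m4 /= n;
        return m4 / (m2 * m2) - 3.0; // a single value gives 0.0 / 0.0, which is NaN
    }

    public static void main(String[] args) {
        double[] c3 = {40.0, 30.0, 20.0, 10.0};
        for (int n = 1; n <= c3.length; n++) {
            System.out.println(kurtosis(Arrays.copyOf(c3, n)));
        }
    }
}
```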

lag1

Syntax

Purpose

lag1 is based on cern.jet.stat.Descriptive.lag1(DoubleArrayList data, double mean). It returns the lag-1 autocorrelation of a dataset as a double.

Note:

This function has semantics different from "autocorrelation".

This function takes the following tuple arguments:

double1: data value.

Examples

Consider the query qColtAggr14 in Example 8-22. Given the data stream SColtAggrFunc with schema (c1 integer, c2 float, c3 double, c4 bigint) in Example 8-23, the query returns the relation in Example 8-24.

Example 8-22 lag1 Function Query

<query id="qColtAggr14"><![CDATA[ 
    select lag1(c3) from SColtAggrFunc
]]></query>

Example 8-23 lag1 Function Stream Input

Timestamp   Tuple
  10        1, 0.5, 40.0, 8
1000        4, 0.7, 30.0, 6
1200        3, 0.89, 20.0, 12
2000        8, 0.4, 10.0, 4
h 8000
h 200000000

Example 8-24 lag1 Function Relation Output

Timestamp   Tuple Kind  Tuple
-9223372036854775808:+
  10:       -
  10:       +           NaN
1000:       -           NaN
1000:       +           -0.5
1200:       -           -0.5
1200:       +           0.0
2000:       -           0.0
2000:       +           0.25
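
The values in Example 8-24 are consistent with a lag-1 autocorrelation computed as the sum of products of adjacent deviations divided by the sum of squared deviations. The sketch below is plain Java with hypothetical names; the formula is an assumption inferred from the example output, not a copy of the library code.

```java
import java.util.Arrays;

// Hypothetical helper: lag-1 autocorrelation of a sequence.
final class Lag1Sketch {
    static double lag1(double[] data) {
        int n = data.length;
        double mean = 0.0;
        for (double v : data) {
            mean += v;
        }
        mean /= n;
        double numerator = 0.0;   // sum of (x[i] - mean) * (x[i-1] - mean)
        double denominator = 0.0; // sum of (x[i] - mean)^2
        for (int i = 0; i < n; i++) {
            double d = data[i] - mean;
            denominator += d * d;
            if (i > 0) {
                numerator += d * (data[i - 1] - mean);
            }
        }
        return numerator / denominator; // a single value gives 0.0 / 0.0, which is NaN
    }

    public static void main(String[] args) {
        double[] c3 = {40.0, 30.0, 20.0, 10.0};
        for (int n = 1; n <= c3.length; n++) {
            System.out.println(lag1(Arrays.copyOf(c3, n)));
        }
    }
}
```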

mean

Syntax

Purpose

mean is based on cern.jet.stat.Descriptive.mean(DoubleArrayList data). It returns the arithmetic mean of a data sequence (see Figure 8-5) as a double.

Figure 8-5 cern.jet.stat.Descriptive.mean(DoubleArrayList data)

This function takes the following tuple arguments:

double1: data value.

Examples

Consider the query qColtAggr16 in Example 8-25. Given the data stream SColtAggrFunc with schema (c1 integer, c2 float, c3 double, c4 bigint) in Example 8-26, the query returns the relation in Example 8-27.

Example 8-25 mean Function Query

<query id="qColtAggr16"><![CDATA[ 
    select mean(c3) from SColtAggrFunc
]]></query>

Example 8-26 mean Function Stream Input

Timestamp   Tuple
  10        1, 0.5, 40.0, 8
1000        4, 0.7, 30.0, 6
1200        3, 0.89, 20.0, 12
2000        8, 0.4, 10.0, 4
h 8000
h 200000000

Example 8-27 mean Function Relation Output

Timestamp   Tuple Kind  Tuple
-9223372036854775808:+
  10:       -
  10:       +           40.0
1000:       -           40.0
1000:       +           35.0
1200:       -           35.0
1200:       +           30.0
2000:       -           30.0
2000:       +           25.0

meandeviation

Syntax

Purpose

meandeviation is based on cern.jet.stat.Descriptive.meanDeviation(DoubleArrayList data, double mean). It returns the mean deviation of a dataset (see Figure 8-6) as a double.

Figure 8-6 cern.jet.stat.Descriptive.meanDeviation(DoubleArrayList data, double mean)

This function takes the following tuple arguments:

double1: data value.

Examples

Consider the query qColtAggr17 in Example 8-28. Given the data stream SColtAggrFunc with schema (c1 integer, c2 float, c3 double, c4 bigint) in Example 8-29, the query returns the relation in Example 8-30.

Example 8-28 meandeviation Function Query

<query id="qColtAggr17"><![CDATA[ 
    select meandeviation(c3) from SColtAggrFunc
]]></query>

Example 8-29 meandeviation Function Stream Input

Timestamp   Tuple
  10        1, 0.5, 40.0, 8
1000        4, 0.7, 30.0, 6
1200        3, 0.89, 20.0, 12
2000        8, 0.4, 10.0, 4
h 8000
h 200000000

Example 8-30 meandeviation Function Relation Output

Timestamp   Tuple Kind  Tuple
-9223372036854775808:+
  10:       -
  10:       +           0.0
1000:       -           0.0
1000:       +           5.0
1200:       -           5.0
1200:       +           6.666666666666667
2000:       -           6.666666666666667
2000:       +           10.0
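
As a cross-check of Example 8-30, the mean deviation is the average absolute distance of each value from the mean. A minimal plain-Java sketch with hypothetical names:

```java
import java.util.Arrays;

// Hypothetical helper: mean absolute deviation from the arithmetic mean.
final class MeanDeviationSketch {
    static double meanDeviation(double[] data) {
        int n = data.length;
        double mean = 0.0;
        for (double v : data) {
            mean += v;
        }
        mean /= n;
        double sum = 0.0;
        for (double v : data) {
            sum += Math.abs(v - mean);
        }
        return sum / n;
    }

    public static void main(String[] args) {
        double[] c3 = {40.0, 30.0, 20.0, 10.0};
        for (int n = 1; n <= c3.length; n++) {
            System.out.println(meanDeviation(Arrays.copyOf(c3, n)));
        }
    }
}
```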

median

Syntax

Purpose

median is based on cern.jet.stat.Descriptive.median(DoubleArrayList sortedData). It returns the median of a sorted data sequence as a double.

This function takes the following tuple arguments:

double1: data value.

Examples

Consider the query qColtAggr18 in Example 8-31. Given the data stream SColtAggrFunc with schema (c1 integer, c2 float, c3 double, c4 bigint) in Example 8-32, the query returns the relation in Example 8-33.

Example 8-31 median Function Query

<query id="qColtAggr18"><![CDATA[ 
    select median(c3) from SColtAggrFunc
]]></query>

Example 8-32 median Function Stream Input

Timestamp   Tuple
  10        1, 0.5, 40.0, 8
1000        4, 0.7, 30.0, 6
1200        3, 0.89, 20.0, 12
2000        8, 0.4, 10.0, 4
h 8000
h 200000000

Example 8-33 median Function Relation Output

Timestamp   Tuple Kind  Tuple
-9223372036854775808:+
  10:       -
  10:       +           40.0
1000:       -           40.0
1000:       +           35.0
1200:       -           35.0
1200:       +           30.0
2000:       -           30.0
2000:       +           25.0

moment

Syntax

Purpose

moment is based on cern.jet.stat.Descriptive.moment(DoubleArrayList data, int k, double c). It returns the moment of the k-th order with constant c of a data sequence (see Figure 8-7) as a double.

Figure 8-7 cern.jet.stat.Descriptive.moment(DoubleArrayList data, int k, double c)

This function takes the following tuple arguments:

double1: data value.
int1: k.
double2: c.

Examples

Consider the query qColtAggr21 in Example 8-34. Given the data stream SColtAggrFunc with schema (c1 integer, c2 float, c3 double, c4 bigint) in Example 8-35, the query returns the relation in Example 8-36.

Example 8-34 moment Function Query

<query id="qColtAggr21"><![CDATA[ 
    select moment(c3, c1, c3) from SColtAggrFunc
]]></query>

Example 8-35 moment Function Stream Input

Timestamp   Tuple
  10        1, 0.5, 40.0, 8
1000        4, 0.7, 30.0, 6
1200        3, 0.89, 20.0, 12
2000        8, 0.4, 10.0, 4
h 8000
h 200000000

Example 8-36 moment Function Relation Output

Timestamp   Tuple Kind  Tuple
-9223372036854775808:+
  10:       -
  10:       +           0.0
1000:       -           0.0
1000:       +           5000.0
1200:       -           5000.0
1200:       +           3000.0
2000:       -           3000.0
2000:       +           1.7045E11
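
The values in Example 8-36 are consistent with the k-th moment about c, that is, the average of (x - c)^k over all data values seen so far, with k and c taken from the most recent tuple. That reading of how k and c are chosen is an inference from the example output, not a documented rule. A plain-Java sketch with hypothetical names:

```java
// Hypothetical helper: k-th moment of the data about the constant c.
final class MomentSketch {
    static double moment(double[] data, int k, double c) {
        double sum = 0.0;
        for (double v : data) {
            sum += Math.pow(v - c, k);
        }
        return sum / data.length;
    }

    public static void main(String[] args) {
        // Each call uses the c3 values seen so far and the k (c1) and c (c3)
        // of the latest tuple in the example stream.
        System.out.println(moment(new double[] {40.0}, 1, 40.0));
        System.out.println(moment(new double[] {40.0, 30.0}, 4, 30.0));
        System.out.println(moment(new double[] {40.0, 30.0, 20.0}, 3, 20.0));
        System.out.println(moment(new double[] {40.0, 30.0, 20.0, 10.0}, 8, 10.0));
    }
}
```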

pooledmean

Syntax

Purpose

pooledmean is based on cern.jet.stat.Descriptive.pooledMean(int size1, double mean1, int size2, double mean2). It returns the pooled mean of two data sequences (see Figure 8-8) as a double.

Figure 8-8 cern.jet.stat.Descriptive.pooledMean(int size1, double mean1, int size2, double mean2)

This function takes the following tuple arguments:

double1: mean 1.
double2: mean 2.

Examples

Consider the query qColtAggr22 in Example 8-37. Given the data stream SColtAggrFunc with schema (c1 integer, c2 float, c3 double, c4 bigint) in Example 8-38, the query returns the relation in Example 8-39.

Example 8-37 pooledmean Function Query

<query id="qColtAggr22"><![CDATA[ 
    select pooledmean(c3, c3) from SColtAggrFunc
]]></query>

Example 8-38 pooledmean Function Stream Input

Timestamp   Tuple
  10        1, 0.5, 40.0, 8
1000        4, 0.7, 30.0, 6
1200        3, 0.89, 20.0, 12
2000        8, 0.4, 10.0, 4
h 8000
h 200000000

Example 8-39 pooledmean Function Relation Output

Timestamp   Tuple Kind  Tuple
-9223372036854775808:+
  10:       -
  10:       +           40.0
1000:       -           40.0
1000:       +           35.0
1200:       -           35.0
1200:       +           30.0
2000:       -           30.0
2000:       +           25.0

pooledvariance

Syntax

Purpose

pooledvariance is based on cern.jet.stat.Descriptive.pooledVariance(int size1, double variance1, int size2, double variance2). It returns the pooled variance of two data sequences (see Figure 8-9) as a double.

Figure 8-9 cern.jet.stat.Descriptive.pooledVariance(int size1, double variance1, int size2, double variance2)

This function takes the following tuple arguments:

double1: variance 1.
double2: variance 2.

Examples

Consider the query qColtAggr23 in Example 8-40. Given the data stream SColtAggrFunc with schema (c1 integer, c2 float, c3 double, c4 bigint) in Example 8-41, the query returns the relation in Example 8-42.

Example 8-40 pooledvariance Function Query

<query id="qColtAggr23"><![CDATA[ 
    select pooledvariance(c3, c3) from SColtAggrFunc
]]></query>

Example 8-41 pooledvariance Function Stream Input

Timestamp   Tuple
  10        1, 0.5, 40.0, 8
1000        4, 0.7, 30.0, 6
1200        3, 0.89, 20.0, 12
2000        8, 0.4, 10.0, 4
h 8000
h 200000000

Example 8-42 pooledvariance Function Relation Output

Timestamp   Tuple Kind  Tuple
-9223372036854775808:+
  10:       -
  10:       +           0.0
1000:       -           0.0
1000:       +           25.0
1200:       -           25.0
1200:       +           66.66666666666667
2000:       -           66.66666666666667
2000:       +           125.0

product

Syntax

Purpose

product is based on cern.jet.stat.Descriptive.product(DoubleArrayList data). It returns the product of a data sequence (see Figure 8-10) as a double.

Figure 8-10 cern.jet.stat.Descriptive.product(DoubleArrayList data)

This function takes the following tuple arguments:

double1: data value.

Examples

Consider the query qColtAggr24 in Example 8-43. Given the data stream SColtAggrFunc with schema (c1 integer, c2 float, c3 double, c4 bigint) in Example 8-44, the query returns the relation in Example 8-45.

Example 8-43 product Function Query

<query id="qColtAggr24"><![CDATA[ 
    select product(c3) from SColtAggrFunc
]]></query>

Example 8-44 product Function Stream Input

Timestamp   Tuple
  10        1, 0.5, 40.0, 8
1000        4, 0.7, 30.0, 6
1200        3, 0.89, 20.0, 12
2000        8, 0.4, 10.0, 4
h 8000
h 200000000

Example 8-45 product Function Relation Output

Timestamp   Tuple Kind  Tuple
-9223372036854775808:+
  10:       -
  10:       +           40.0
1000:       -           40.0
1000:       +           1200.0
1200:       -           1200.0
1200:       +           24000.0
2000:       -           24000.0
2000:       +           240000.0

quantile

Syntax

Purpose

quantile is based on cern.jet.stat.Descriptive.quantile(DoubleArrayList sortedData, double phi). It returns the phi-quantile as a double; that is, an element elem such that a fraction phi of the data elements are less than elem.

This function takes the following tuple arguments:

double1: data value.
double2: phi, the fraction of elements, expressed as a value between 0 and 1 rather than as a percentage from 0 to 100; must satisfy 0 <= phi <= 1.

Examples

Consider the query qColtAggr26 in Example 8-46. Given the data stream SColtAggrFunc with schema (c1 integer, c2 float, c3 double, c4 bigint) in Example 8-47, the query returns the relation in Example 8-48.

Example 8-46 quantile Function Query

<query id="qColtAggr26"><![CDATA[ 
    select quantile(c3, c2) from SColtAggrFunc
]]></query>

Example 8-47 quantile Function Stream Input

Timestamp   Tuple
  10        1, 0.5, 40.0, 8
1000        4, 0.7, 30.0, 6
1200        3, 0.89, 20.0, 12
2000        8, 0.4, 10.0, 4
h 8000
h 200000000

Example 8-48 quantile Function Relation Output

Timestamp   Tuple Kind  Tuple
-9223372036854775808:+
  10:         -
  10:         +         40.0
1000:         -         40.0
1000:         +         36.99999988079071
1200:         -         36.99999988079071
1200:         +         37.799999713897705
2000:         -         37.799999713897705
2000:         +         22.000000178813934
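
The values in Example 8-48 are consistent with a linearly interpolated quantile over the sorted c3 values, using the phi of the most recent tuple. Because c2 is a float, phi is widened to double before use, which explains results such as 36.99999988079071 rather than 37.0. The sketch below is plain Java with hypothetical names; both the interpolation rule and the choice of phi are assumptions inferred from the example output.

```java
import java.util.Arrays;

// Hypothetical helper: linearly interpolated phi-quantile, 0 <= phi <= 1.
final class QuantileSketch {
    static double quantile(double[] data, double phi) {
        if (data.length == 0) {
            throw new IllegalArgumentException("data must not be empty");
        }
        if (phi < 0.0 || phi > 1.0) {
            throw new IllegalArgumentException("phi must satisfy 0 <= phi <= 1");
        }
        double[] sorted = data.clone();
        Arrays.sort(sorted);
        double index = phi * (sorted.length - 1); // fractional position in the sorted data
        int lower = (int) index;
        double delta = index - lower;
        if (lower == sorted.length - 1) {
            return sorted[lower];
        }
        return sorted[lower] + delta * (sorted[lower + 1] - sorted[lower]);
    }

    public static void main(String[] args) {
        // The float literals mimic the float column c2 being widened to double.
        System.out.println(quantile(new double[] {40.0}, 0.5f));
        System.out.println(quantile(new double[] {40.0, 30.0}, 0.7f));
        System.out.println(quantile(new double[] {40.0, 30.0, 20.0}, 0.89f));
        System.out.println(quantile(new double[] {40.0, 30.0, 20.0, 10.0}, 0.4f));
    }
}
```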

quantileinverse

Syntax

Purpose

quantileinverse is based on cern.jet.stat.Descriptive.quantileInverse(DoubleArrayList sortedList, double element). It returns the fraction phi of elements <= element (0.0 <= phi <= 1.0) as a double. This function uses linear interpolation if the element is not contained in the list but lies between two contained elements.

This function takes the following tuple arguments:

double1: data.
double2: element.

Examples

Consider the query qColtAggr27 in Example 8-49. Given the data stream SColtAggrFunc with schema (c1 integer, c2 float, c3 double, c4 bigint) in Example 8-50, the query returns the relation in Example 8-51.

Example 8-49 quantileinverse Function Query

<query id="qColtAggr27"><![CDATA[ 
    select quantileinverse(c3, c3) from SColtAggrFunc
]]></query>

Example 8-50 quantileinverse Function Stream Input

Timestamp   Tuple
  10        1, 0.5, 40.0, 8
1000        4, 0.7, 30.0, 6
1200        3, 0.89, 20.0, 12
2000        8, 0.4, 10.0, 4
h 8000
h 200000000

Example 8-51 quantileinverse Function Relation Output

Timestamp   Tuple Kind  Tuple
-9223372036854775808:+
  10:       -
  10:       +           1.0
1000:       -           1.0
1000:       +           0.5
1200:       -           0.5
1200:       +           0.3333333333333333
2000:       -           0.3333333333333333
2000:       +           0.25

rankinterpolated

Syntax

Purpose

rankinterpolated is based on cern.jet.stat.Descriptive.rankInterpolated(DoubleArrayList sortedList, double element). It returns the linearly interpolated number of elements in a list that are less than or equal to a given element as a double.

The rank is the number of elements <= element. Ranks are of the form {0, 1, 2,..., sortedList.size()}. If no element is <= element, then the rank is zero. If the element lies between two contained elements, then linear interpolation is used and a non-integer value is returned.

This function takes the following tuple arguments:

double1: data value.
double2: element.

Examples

Consider the query qColtAggr29 in Example 8-52. Given the data stream SColtAggrFunc with schema (c1 integer, c2 float, c3 double, c4 bigint) in Example 8-53, the query returns the relation in Example 8-54.

Example 8-52 rankinterpolated Function Query

<query id="qColtAggr29"><![CDATA[ 
    select rankinterpolated(c3, c3) from SColtAggrFunc
]]></query>

Example 8-53 rankinterpolated Function Stream Input

Timestamp   Tuple
  10        1, 0.5, 40.0, 8
1000        4, 0.7, 30.0, 6
1200        3, 0.89, 20.0, 12
2000        8, 0.4, 10.0, 4
h 8000
h 200000000

Example 8-54 rankinterpolated Function Relation Output

Timestamp   Tuple Kind  Tuple
-9223372036854775808:+
  10:       -
  10:       +           1.0
1000:       -           1.0
1000:       +           1.0
1200:       -           1.0
1200:       +           1.0
2000:       -           1.0
2000:       +           1.0
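
The rank rule described above, and its relationship to quantileinverse, can be sketched in plain Java as follows. The names are hypothetical, and treating the quantile inverse as the interpolated rank divided by the list size is an assumption that agrees with Examples 8-51 and 8-54, where the element is always the smallest value seen so far.

```java
import java.util.Arrays;

// Hypothetical helpers: interpolated rank and the corresponding fraction.
final class RankSketch {
    // The input array must already be sorted in ascending order.
    static double rankInterpolated(double[] sorted, double element) {
        int index = Arrays.binarySearch(sorted, element);
        if (index >= 0) {
            // Element found: skip right over any duplicates and count them all.
            int to = index + 1;
            while (to < sorted.length && sorted[to] == element) {
                to++;
            }
            return to;
        }
        int insertionPoint = -index - 1;
        if (insertionPoint == 0 || insertionPoint == sorted.length) {
            return insertionPoint; // below the minimum or above the maximum
        }
        double from = sorted[insertionPoint - 1];
        double to = sorted[insertionPoint];
        return insertionPoint + (element - from) / (to - from); // linear interpolation
    }

    static double quantileInverse(double[] sorted, double element) {
        return rankInterpolated(sorted, element) / sorted.length;
    }

    public static void main(String[] args) {
        double[] sorted = {10.0, 20.0, 30.0, 40.0};
        System.out.println(rankInterpolated(sorted, 10.0));
        System.out.println(rankInterpolated(sorted, 25.0));
        System.out.println(quantileInverse(sorted, 10.0));
    }
}
```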

rms

Syntax

Purpose

rms is based on cern.jet.stat.Descriptive.rms(int size, double sumOfSquares). It returns the Root-Mean-Square (RMS) of a data sequence (see Figure 8-11) as a double.

Figure 8-11 cern.jet.stat.Descriptive.rms(int size, double sumOfSquares)

This function takes the following tuple arguments:

double1: data value.

Examples

Consider the query qColtAggr30 in Example 8-55. Given the data stream SColtAggrFunc with schema (c1 integer, c2 float, c3 double, c4 bigint) in Example 8-56, the query returns the relation in Example 8-57.

Example 8-55 rms Function Query

<query id="qColtAggr30"><![CDATA[ 
    select rms(c3) from SColtAggrFunc
]]></query>

Example 8-56 rms Function Stream Input

Timestamp   Tuple
  10        1, 0.5, 40.0, 8
1000        4, 0.7, 30.0, 6
1200        3, 0.89, 20.0, 12
2000        8, 0.4, 10.0, 4
h 8000
h 200000000

Example 8-57 rms Function Relation Output

Timestamp   Tuple Kind  Tuple
-9223372036854775808:+
  10:       - 
  10:       +           40.0
1000:       -           40.0
1000:       +           35.35533905932738
1200:       -           35.35533905932738
1200:       +           31.09126351029605
2000:       -           31.09126351029605
2000:       +           27.386127875258307
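
Because the underlying function needs only the size and the sum of squares, the RMS lends itself to incremental evaluation over a stream. The following plain-Java sketch keeps those two running values and reproduces Example 8-57; the class and method names are hypothetical and this is not Oracle CEP's implementation.

```java
// Hypothetical sketch: running root-mean-square over a stream of values.
final class RunningRmsSketch {
    private int size;
    private double sumOfSquares;

    // Consumes one data value and returns the RMS of all values seen so far.
    double onTuple(double value) {
        size++;
        sumOfSquares += value * value;
        return Math.sqrt(sumOfSquares / size);
    }

    public static void main(String[] args) {
        RunningRmsSketch rms = new RunningRmsSketch();
        for (double v : new double[] {40.0, 30.0, 20.0, 10.0}) {
            System.out.println(rms.onTuple(v));
        }
    }
}
```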

samplekurtosis

Syntax

Purpose

samplekurtosis is based on cern.jet.stat.Descriptive.sampleKurtosis(DoubleArrayList data, double mean, double sampleVariance). It returns the sample kurtosis (excess) of a data sequence as a double.

This function takes the following tuple arguments:

double1: data value.

Examples

Consider the query qColtAggr31 in Example 8-58. Given the data stream SColtAggrFunc with schema (c1 integer, c2 float, c3 double, c4 bigint) in Example 8-59, the query returns the relation in Example 8-60.

Example 8-58 samplekurtosis Function Query

<query id="qColtAggr31"><![CDATA[ 
     select samplekurtosis(c3) from SColtAggrFunc
]]></query>

Example 8-59 samplekurtosis Function Stream Input

Timestamp   Tuple
  10        1, 0.5, 40.0, 8
1000        4, 0.7, 30.0, 6
1200        3, 0.89, 20.0, 12
2000        8, 0.4, 10.0, 4
h 8000
h 200000000

Example 8-60 samplekurtosis Function Relation Output

Timestamp   Tuple Kind  Tuple
-9223372036854775808:+
  10:       - 
  10:       +           NaN
1000:       -           NaN
1000:       +           NaN
1200:       -           NaN
1200:       +           NaN
2000:       -           NaN
2000:       +           -1.1999999999999993

samplekurtosisstandarderror

Syntax

Purpose

samplekurtosisstandarderror is based on cern.jet.stat.Descriptive.sampleKurtosisStandardError(int size). It returns the standard error of the sample kurtosis as a double.

This function takes the following tuple arguments:

int1: data value.

Examples

Consider the query qColtAggr33 in Example 8-61. Given the data stream SColtAggrFunc with schema (c1 integer, c2 float, c3 double, c4 bigint) in Example 8-62, the query returns the relation in Example 8-63.

Example 8-61 samplekurtosisstandarderror Function Query

<query id="qColtAggr33"><![CDATA[ 
     select samplekurtosisstandarderror(c1) from SColtAggrFunc
]]></query>

Example 8-62 samplekurtosisstandarderror Function Stream Input

Timestamp   Tuple
  10        1, 0.5, 40.0, 8
1000        4, 0.7, 30.0, 6
1200        3, 0.89, 20.0, 12
2000        8, 0.4, 10.0, 4
h 8000
h 200000000

Example 8-63 samplekurtosisstandarderror Function Relation Output

Timestamp   Tuple Kind  Tuple
-9223372036854775808:+
  10:       - 
  10:       +           0.0
1000:       -           0.0
1000:       +           Infinity
1200:       -           Infinity
1200:       +           Infinity
2000:       -           Infinity
2000:       +           2.6186146828319083

sampleskew

Syntax

Purpose

sampleskew is based on cern.jet.stat.Descriptive.sampleSkew(DoubleArrayList data, double mean, double sampleVariance). It returns the sample skew of a data sequence as a double.

This function takes the following tuple arguments:

double1: data value.

Examples

Consider the query qColtAggr34 in Example 8-64. Given the data stream SColtAggrFunc with schema (c1 integer, c2 float, c3 double, c4 bigint) in Example 8-65, the query returns the relation in Example 8-66.

Example 8-64 sampleskew Function Query

<query id="qColtAggr34"><![CDATA[ 
    select sampleskew(c3) from SColtAggrFunc
]]></query>

Example 8-65 sampleskew Function Stream Input

Timestamp   Tuple
  10        1, 0.5, 40.0, 8
1000        4, 0.7, 30.0, 6
1200        3, 0.89, 20.0, 12
2000        8, 0.4, 10.0, 4
h 8000
h 200000000

Example 8-66 sampleskew Function Relation Output

Timestamp   Tuple Kind  Tuple
-9223372036854775808:+
  10:       - 
  10:       +           NaN
1000:       -           NaN
1000:       +           NaN
1200:       -           NaN
1200:       +           0.0
2000:       -           0.0
2000:       +           0.0

sampleskewstandarderror

Syntax

Purpose

sampleskewstandarderror is based on cern.jet.stat.Descriptive.sampleSkewStandardError(int size). It returns the standard error of the sample skew as a double.

This function takes the following tuple arguments:

double1: data value.

Examples

Consider the query qColtAggr36 in Example 8-67. Given the data stream SColtAggrFunc with schema (c1 integer, c2 float, c3 double, c4 bigint) in Example 8-68, the query returns the relation in Example 8-69.

Example 8-67 sampleskewstandarderror Function Query

<query id="qColtAggr36"><![CDATA[ 
    select sampleskewstandarderror(c1) from SColtAggrFunc
]]></query>

Example 8-68 sampleskewstandarderror Function Stream Input

Timestamp   Tuple
  10        1, 0.5, 40.0, 8
1000        4, 0.7, 30.0, 6
1200        3, 0.89, 20.0, 12
2000        8, 0.4, 10.0, 4
h 8000
h 200000000

Example 8-69 sampleskewstandarderror Function Relation Output

Timestamp   Tuple Kind  Tuple
-9223372036854775808:+
  10:       - 
  10:       +           -0.0
1000:       -           -0.0
1000:       +           Infinity
1200:       -           Infinity
1200:       +           1.224744871391589
2000:       -           1.224744871391589
2000:       +           1.01418510567422
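
Because the underlying Colt method takes only a size, the result depends on how many tuples have arrived, not on the c1 values themselves. The Python sketch below illustrates this with the formula sqrt(6n(n - 1) / ((n - 2)(n + 1)(n + 3))), which is assumed here to be the one Colt uses; it agrees with the values in Example 8-69. The function name is hypothetical, and the explicit zero-denominator branch stands in for Java floating-point division, which yields Infinity instead of raising an error.

```python
import math

def sample_skew_standard_error(size):
    """Standard error of the sample skew for a window of `size` tuples."""
    n = size
    denominator = (n - 2) * (n + 1) * (n + 3)
    if denominator == 0:
        return math.inf  # n == 2: Java would evaluate a positive value / 0 as Infinity
    return math.sqrt(6.0 * n * (n - 1) / denominator)

for size in range(1, 5):
    print(size, sample_skew_standard_error(size))
```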

samplevariance

Syntax

Purpose

samplevariance is based on cern.jet.stat.Descriptive.sampleVariance(DoubleArrayList data, double mean). It returns the sample variance of a data sequence (see Figure 8-12) as a double.

Figure 8-12 cern.jet.stat.Descriptive.sampleVariance(DoubleArrayList data, double mean)

This function takes the following tuple arguments:

double1: data value.

For more information, see:

Examples

Consider the query qColtAggr38 in Example 8-70. Given the data stream SColtAggrFunc with schema (c1 integer, c2 float, c3 double, c4 bigint) in Example 8-71, the query returns the relation in Example 8-72.

Example 8-70 samplevariance Function Query

<query id="qColtAggr38"><![CDATA[ 
    select samplevariance(c3) from SColtAggrFunc
]]></query>

Example 8-71 samplevariance Function Stream Input

Timestamp   Tuple
  10        1, 0.5, 40.0, 8
1000        4, 0.7, 30.0, 6
1200        3, 0.89, 20.0, 12
2000        8, 0.4, 10.0, 4
h 8000
h 200000000

Example 8-72 samplevariance Function Relation Output

Timestamp   Tuple Kind  Tuple
-9223372036854775808:+
  10:       - 
  10:       +           NaN
1000:       -           NaN
1000:       +           50.0
1200:       -           50.0
1200:       +           100.0
2000:       -           100.0
2000:       +           166.66666666666666
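
The values in Example 8-72 are consistent with the usual sample variance, Sum((x - mean)^2) / (n - 1). The following Python sketch (not CQL) recomputes them from the c3 column; the function name is hypothetical, and the NaN for a single value is returned explicitly because Python raises on division by zero where Java produces NaN.

```python
import math

def sample_variance(values):
    """Sample variance with the n - 1 divisor; NaN for fewer than two values."""
    n = len(values)
    if n < 2:
        return math.nan
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)

c3 = [40.0, 30.0, 20.0, 10.0]  # c3 column of Example 8-71
for size in range(1, len(c3) + 1):
    print(size, sample_variance(c3[:size]))
```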

skew

Syntax

Purpose

skew is based on cern.jet.stat.Descriptive.skew(DoubleArrayList data, double mean, double standardDeviation). It returns the skew of a data sequence (see Figure 8-13) as a double.

Figure 8-13 cern.jet.stat.Descriptive.skew(DoubleArrayList data, double mean, double standardDeviation)

This function takes the following tuple arguments:

double1: data value.

For more information, see:

Examples

Consider the query qColtAggr41 in Example 8-73. Given the data stream SColtAggrFunc with schema (c1 integer, c2 float, c3 double, c4 bigint) in Example 8-74, the query returns the relation in Example 8-75.

Example 8-73 skew Function Query

<query id="qColtAggr41"><![CDATA[ 
    select skew(c3) from SColtAggrFunc
]]></query>

Example 8-74 skew Function Stream Input

Timestamp   Tuple
  10        1, 0.5, 40.0, 8
1000        4, 0.7, 30.0, 6
1200        3, 0.89, 20.0, 12
2000        8, 0.4, 10.0, 4
h 8000
h 200000000

Example 8-75 skew Function Relation Output

Timestamp   Tuple Kind  Tuple
-9223372036854775808:+
  10:       - 
  10:       +           NaN
1000:       -           NaN
1000:       +           0.0
1200:       -           0.0
1200:       +           0.0
2000:       -           0.0
2000:       +           0.0
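
For comparison with sampleskew, here is a minimal Python sketch of the uncorrected skew: the third central moment divided by the cube of the population standard deviation. This definition is an assumption about what the Colt method computes, and the function name is hypothetical. It is consistent with Example 8-75, where a single value gives NaN (zero standard deviation) and the evenly spaced c3 values give 0.0.

```python
import math

def skew(values):
    """Third central moment divided by the cubed population standard deviation."""
    n = len(values)
    if n == 0:
        return math.nan
    mean = sum(values) / n
    variance = sum((x - mean) ** 2 for x in values) / n
    sd = math.sqrt(variance)
    if sd == 0.0:
        return math.nan  # 0/0 in Java floating point
    moment3 = sum((x - mean) ** 3 for x in values) / n
    return moment3 / sd ** 3

c3 = [40.0, 30.0, 20.0, 10.0]  # c3 column of Example 8-74
for size in range(1, len(c3) + 1):
    print(size, skew(c3[:size]))
```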

standarddeviation

Syntax

Purpose

standarddeviation is based on cern.jet.stat.Descriptive.standardDeviation(double variance). It returns the standard deviation from a variance as a double.

This function takes the following tuple arguments:

double1: data value.

For more information, see

Examples

Consider the query qColtAggr44 in Example 8-76. Given the data stream SColtAggrFunc with schema (c1 integer, c2 float, c3 double, c4 bigint) in Example 8-77, the query returns the relation in Example 8-78.

Example 8-76 standarddeviation Function Query

<query id="qColtAggr44"><![CDATA[ 
    select standarddeviation(c3) from SColtAggrFunc
]]></query>

Example 8-77 standarddeviation Function Stream Input

Timestamp   Tuple
  10        1, 0.5, 40.0, 8
1000        4, 0.7, 30.0, 6
1200        3, 0.89, 20.0, 12
2000        8, 0.4, 10.0, 4
h 8000
h 200000000

Example 8-78 standarddeviation Function Relation Output

Timestamp   Tuple Kind  Tuple
-9223372036854775808:+
  10:       - 
  10:       +           0.0
1000:       -           0.0
1000:       +           5.0
1200:       -           5.0
1200:       +           8.16496580927726
2000:       -           8.16496580927726
2000:       +           11.180339887498949
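
The relation output format can be read as a sequence of retractions and insertions: at each timestamp the previous aggregate value is removed (-) and the new one is added (+). The Python sketch below imitates that behavior for an aggregate over all tuples seen so far, using a population standard deviation that matches Example 8-78. The helper names are hypothetical, and the initial row at the minimum timestamp is not modeled.

```python
import math

def population_sd(values):
    """Square root of the population variance (divisor n)."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / n)

def relation_output(events, aggregate):
    """Return (timestamp, kind, value) rows; '-' retracts the old value, '+' adds the new one."""
    rows = []
    seen = []
    previous = None  # nothing to retract before the first tuple
    for timestamp, value in events:
        seen.append(value)
        current = aggregate(seen)
        rows.append((timestamp, "-", previous))
        rows.append((timestamp, "+", current))
        previous = current
    return rows

# (timestamp, c3) pairs from Example 8-77
events = [(10, 40.0), (1000, 30.0), (1200, 20.0), (2000, 10.0)]
for row in relation_output(events, population_sd):
    print(row)
```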

standarderror

Syntax

Purpose

standarderror is based on cern.jet.stat.Descriptive.standardError(int size, double variance). It returns the standard error of a data sequence (see Figure 8-14) as a double.

Figure 8-14 cern.jet.stat.Descriptive.standardError(int size, double variance)

This function takes the following tuple arguments:

double1: data value.

For more information, see

Examples

Consider the query qColtAggr45 in Example 8-79. Given the data stream SColtAggrFunc with schema (c1 integer, c2 float, c3 double, c4 bigint) in Example 8-80, the query returns the relation in Example 8-81.

Example 8-79 standarderror Function Query

<query id="qColtAggr45"><![CDATA[ 
    select standarderror(c3) from SColtAggrFunc
]]></query>

Example 8-80 standarderror Function Stream Input

Timestamp   Tuple
  10        1, 0.5, 40.0, 8
1000        4, 0.7, 30.0, 6
1200        3, 0.89, 20.0, 12
2000        8, 0.4, 10.0, 4
h 8000
h 200000000

Example 8-81 standarderror Function Relation Output

Timestamp   Tuple Kind  Tuple
-9223372036854775808:+
  10:       - 
  10:       +           0.0
1000:       -           0.0
1000:       +           3.5355339059327378
1200:       -           3.5355339059327378
1200:       +           4.714045207910317
2000:       -           4.714045207910317
2000:       +           5.5901699437494745
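
The signature standardError(int size, double variance) suggests the familiar formula sqrt(variance / size). The Python sketch below applies it to the population variance of c3 after each tuple (0, 25, 200/3, 125); these inputs are worked out by hand here and are not part of the original example. The results agree with Example 8-81.

```python
import math

def standard_error(size, variance):
    """Standard error of the mean from a window size and its variance."""
    return math.sqrt(variance / size)

# (size, population variance of c3) after each tuple in Example 8-80
steps = [(1, 0.0), (2, 25.0), (3, 200.0 / 3.0), (4, 125.0)]
for size, variance in steps:
    print(size, standard_error(size, variance))
```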

sumofinversions

Syntax

Purpose

sumofinversions is based on cern.jet.stat.Descriptive.sumOfInversions(DoubleArrayList data, int from, int to). It returns the sum of inversions of a data sequence (see Figure 8-15) as a double.

Figure 8-15 cern.jet.stat.Descriptive.sumOfInversions(DoubleArrayList data, int from, int to)

This function takes the following tuple arguments:

double1: data value.

For more information, see:

Examples

Consider the query qColtAggr48 in Example 8-82. Given the data stream SColtAggrFunc with schema (c1 integer, c2 float, c3 double, c4 bigint) in Example 8-83, the query returns the relation in Example 8-84.

Example 8-82 sumofinversions Function Query

<query id="qColtAggr48"><![CDATA[ 
    select sumofinversions(c3) from SColtAggrFunc
]]></query>

Example 8-83 sumofinversions Function Stream Input

Timestamp   Tuple
  10        1, 0.5, 40.0, 8
1000        4, 0.7, 30.0, 6
1200        3, 0.89, 20.0, 12
2000        8, 0.4, 10.0, 4
h 8000
h 200000000

Example 8-84 sumofinversions Function Relation Output

Timestamp   Tuple Kind  Tuple
-9223372036854775808:+
  10:       - 
  10:       +           Infinity
1000:       -           Infinity
1000:       +           Infinity
1200:       -           Infinity
1200:       +           Infinity
2000:       -           Infinity
2000:       +           Infinity

sumoflogarithms

Syntax

Purpose

sumoflogarithms is based on cern.jet.stat.Descriptive.sumOfLogarithms(DoubleArrayList data, int from, int to). It returns the sum of logarithms of a data sequence (see Figure 8-16) as a double.

Figure 8-16 cern.jet.stat.Descriptive.sumOfLogarithms(DoubleArrayList data, int from, int to)

This function takes the following tuple arguments:

double1: data value.

For more information, see:

Examples

Consider the query qColtAggr49 in Example 8-85. Given the data stream SColtAggrFunc with schema (c1 integer, c2 float, c3 double, c4 bigint) in Example 8-86, the query returns the relation in Example 8-87.

Example 8-85 sumoflogarithms Function Query

<query id="qColtAggr49"><![CDATA[ 
    select sumoflogarithms(c3) from SColtAggrFunc
]]></query>

Example 8-86 sumoflogarithms Function Stream Input

Timestamp   Tuple
  10        1, 0.5, 40.0, 8
1000        4, 0.7, 30.0, 6
1200        3, 0.89, 20.0, 12
2000        8, 0.4, 10.0, 4
h 8000
h 200000000

Example 8-87 sumoflogarithms Function Relation Output

Timestamp   Tuple Kind  Tuple
-9223372036854775808:+
  10:       - 
  10:       +           -Infinity
1000:       -           -Infinity
1000:       +           -Infinity
1200:       -           -Infinity
1200:       +           -Infinity
2000:       -           -Infinity
2000:       +           -Infinity

sumofpowerdeviations

Syntax

Purpose

sumofpowerdeviations is based on cern.jet.stat.Descriptive.sumOfPowerDeviations(DoubleArrayList data, int k, double c). It returns the sum of power deviations of a data sequence (see Figure 8-17) as a double.

Figure 8-17 cern.jet.stat.Descriptive.sumOfPowerDeviations(DoubleArrayList data, int k, double c)

This function is optimized for common parameters like c == 0.0, k == -2 .. 4, or both.

This function takes the following tuple arguments:

double1: data value.
int1: k.
double2: c.

For more information, see:

Examples

Consider the query qColtAggr50 in Example 8-88. Given the data stream SColtAggrFunc with schema (c1 integer, c2 float, c3 double, c4 bigint) in Example 8-89, the query returns the relation in Example 8-90.

Example 8-88 sumofpowerdeviations Function Query

<query id="qColtAggr50"><![CDATA[ 
    select sumofpowerdeviations(c3, c1, c3) from SColtAggrFunc
]]></query>

Example 8-89 sumofpowerdeviations Function Stream Input

Timestamp   Tuple
  10        1, 0.5, 40.0, 8
1000        4, 0.7, 30.0, 6
1200        3, 0.89, 20.0, 12
2000        8, 0.4, 10.0, 4
h 8000
h 200000000

Example 8-90 sumofpowerdeviations Function Relation Output

Timestamp   Tuple Kind  Tuple
-9223372036854775808:+
  10:       - 
  10:       +           0.0
1000:       -           0.0
1000:       +           10000.0
1200:       -           10000.0
1200:       +           9000.0
2000:       -           9000.0
2000:       +           6.818E11
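
The numbers in Example 8-90 can be reproduced by evaluating Sum((x - c)^k) over all c3 values seen so far, with k and c taken from the most recent tuple (k = c1, c = c3). That reading is inferred from the example data rather than stated in the text. A minimal Python sketch, with a hypothetical function name:

```python
def sum_of_power_deviations(values, k, c):
    """Sum of (x - c) ** k over the data sequence."""
    return sum((x - c) ** k for x in values)

# (c1, c3) pairs from Example 8-89; c1 supplies k, c3 supplies both the data and c
tuples = [(1, 40.0), (4, 30.0), (3, 20.0), (8, 10.0)]
seen = []
results = []
for c1, c3 in tuples:
    seen.append(c3)
    results.append(sum_of_power_deviations(seen, c1, c3))
print(results)
```

Because c equals the newest data value, that value always contributes zero to the sum, which is why the first result is 0.0.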

sumofpowers

Syntax

Purpose

sumofpowers is based on cern.jet.stat.Descriptive.sumOfPowers(DoubleArrayList data, int k). It returns the sum of powers of a data sequence (see Figure 8-18) as a double.

Figure 8-18 cern.jet.stat.Descriptive.sumOfPowers(DoubleArrayList data, int k)

This function takes the following tuple arguments:

double1: data value.
int1: k.

For more information, see:

Examples

Consider the query qColtAggr52 in Example 8-91. Given the data stream SColtAggrFunc with schema (c1 integer, c2 float, c3 double, c4 bigint) in Example 8-92, the query returns the relation in Example 8-93.

Example 8-91 sumofpowers Function Query

<query id="qColtAggr52"><![CDATA[ 
    select sumofpowers(c3, c1) from SColtAggrFunc
]]></query>

Example 8-92 sumofpowers Function Stream Input

Timestamp   Tuple
  10        1, 0.5, 40.0, 8
1000        4, 0.7, 30.0, 6
1200        3, 0.89, 20.0, 12
2000        8, 0.4, 10.0, 4
h 8000
h 200000000

Example 8-93 sumofpowers Function Relation Output

Timestamp   Tuple Kind  Tuple
-9223372036854775808:+
  10:       - 
  10:       +           40.0
1000:       -           40.0
1000:       +           3370000.0
1200:       -           3370000.0
1200:       +           99000.0
2000:       -           99000.0
2000:       +           7.2354E12
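
A sum of powers is the special case of a sum of power deviations with c = 0, that is, Sum(x^k). The Python sketch below recomputes Example 8-93 on the assumption, inferred from the output values, that k is taken from c1 of the most recent tuple and applied to every c3 value seen so far. The function name is hypothetical.

```python
def sum_of_powers(values, k):
    """Sum of x ** k over the data sequence."""
    return sum(x ** k for x in values)

# (c1, c3) pairs from Example 8-92; c1 supplies k
tuples = [(1, 40.0), (4, 30.0), (3, 20.0), (8, 10.0)]
seen = []
results = []
for c1, c3 in tuples:
    seen.append(c3)
    results.append(sum_of_powers(seen, c1))
print(results)
```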

sumofsquareddeviations

Syntax

Purpose

sumofsquareddeviations is based on cern.jet.stat.Descriptive.sumOfSquaredDeviations(int size, double variance). It returns the sum of squared mean deviations of a data sequence (see Figure 8-19) as a double.

Figure 8-19 cern.jet.stat.Descriptive.sumOfSquaredDeviations(int size, double variance)

This function takes the following tuple arguments:

double1: data value.

For more information, see

Examples

Consider the query qColtAggr53 in Example 8-94. Given the data stream SColtAggrFunc with schema (c1 integer, c2 float, c3 double, c4 bigint) in Example 8-95, the query returns the relation in Example 8-96.

Example 8-94 sumofsquareddeviations Function Query

<query id="qColtAggr53"><![CDATA[ 
    select sumofsquareddeviations(c3) from SColtAggrFunc
]]></query>

Example 8-95 sumofsquareddeviations Function Stream Input

Timestamp   Tuple
  10        1, 0.5, 40.0, 8
1000        4, 0.7, 30.0, 6
1200        3, 0.89, 20.0, 12
2000        8, 0.4, 10.0, 4
h 8000
h 200000000

Example 8-96 sumofsquareddeviations Function Relation Output

Timestamp   Tuple Kind  Tuple
-9223372036854775808:+
  10:       - 
  10:       +           0.0
1000:       -           0.0
1000:       +           25.0
1200:       -           25.0
1200:       +           133.33333333333334
2000:       -           133.33333333333334
2000:       +           375.0
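
The values in Example 8-96 match variance * (size - 1), where variance is the population variance of c3 (see the variance function). The Python sketch below shows that calculation next to a direct Sum((x - mean)^2) for comparison. The formula is an assumption inferred from the method signature and the example output, and the function names are hypothetical.

```python
import math

def sum_of_squared_deviations(size, variance):
    """Sum of squared deviations derived from a size and a variance."""
    return variance * (size - 1)

def direct_sum_of_squared_deviations(values):
    """Sum of (x - mean) ** 2 computed straight from the data."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values)

c3 = [40.0, 30.0, 20.0, 10.0]  # c3 column of Example 8-95
for size in range(1, len(c3) + 1):
    window = c3[:size]
    population_variance = direct_sum_of_squared_deviations(window) / size
    print(size,
          sum_of_squared_deviations(size, population_variance),
          direct_sum_of_squared_deviations(window))
```

The two columns differ because variance * (size - 1) equals the direct sum only when the variance uses the n - 1 divisor.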

sumofsquares

Syntax

Purpose

sumofsquares is based on cern.jet.stat.Descriptive.sumOfSquares(DoubleArrayList data). It returns the sum of squares of a data sequence (see Figure 8-20) as a double.

Figure 8-20 cern.jet.stat.Descriptive.sumOfSquares(DoubleArrayList data)

This function takes the following tuple arguments:

double1: data value.

For more information, see:

Examples

Consider the query qColtAggr54 in Example 8-97. Given the data stream SColtAggrFunc with schema (c1 integer, c2 float, c3 double, c4 bigint) in Example 8-98, the query returns the relation in Example 8-99.

Example 8-97 sumofsquares Function Query

<query id="qColtAggr54"><![CDATA[ 
    select sumofsquares(c3) from SColtAggrFunc
]]></query>

Example 8-98 sumofsquares Function Stream Input

Timestamp   Tuple
  10        1, 0.5, 40.0, 8
1000        4, 0.7, 30.0, 6
1200        3, 0.89, 20.0, 12
2000        8, 0.4, 10.0, 4
h 8000
h 200000000

Example 8-99 sumofsquares Function Relation Output

Timestamp   Tuple Kind  Tuple
-9223372036854775808:+
  10:       - 
  10:       +           1600.0
1000:       -           1600.0
1000:       +           2500.0
1200:       -           2500.0
1200:       +           2900.0
2000:       -           2900.0
2000:       +           3000.0

trimmedmean

Syntax

Purpose

trimmedmean is based on cern.jet.stat.Descriptive.trimmedMean(DoubleArrayList sortedData, double mean, int left, int right). It returns the trimmed mean of an ascending sorted data sequence as a double.

This function takes the following tuple arguments:

double1: data value.
int1: left.
int2: right.

For more information, see:

Examples

Consider the query qColtAggr55 in Example 8-100. Given the data stream SColtAggrFunc with schema (c1 integer, c2 float, c3 double, c4 bigint) in Example 8-101, the query returns the relation in Example 8-102.

Example 8-100 trimmedmean Function Query

<query id="qColtAggr55"><![CDATA[ 
    select trimmedmean(c3, c1, c1) from SColtAggrFunc
]]></query>

Example 8-101 trimmedmean Function Stream Input

Timestamp   Tuple
  10        1, 0.5, 40.0, 8
1000        4, 0.7, 30.0, 6
1200        3, 0.89, 20.0, 12
2000        8, 0.4, 10.0, 4
h 8000
h 200000000

Example 8-102 trimmedmean Function Relation Output

Timestamp   Tuple Kind  Tuple
-9223372036854775808:+
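
To illustrate what a trimmed mean computes, the Python sketch below sorts the data, drops the left smallest and right largest values, and averages the rest. It is a sketch of the textbook definition, not of the Colt implementation; the function name and the sample data are hypothetical. The sketch rejects left + right >= n because no values would remain to average. In Example 8-100 both arguments come from c1, and at every step of Example 8-101 that condition holds (2 >= 1, 8 >= 2, 6 >= 3, 16 >= 4).

```python
def trimmed_mean(values, left, right):
    """Mean after discarding the `left` smallest and `right` largest values."""
    data = sorted(values)
    n = len(data)
    if n == 0:
        raise ValueError("empty data")
    if left < 0 or right < 0:
        raise ValueError("left and right must be non-negative")
    if left + right >= n:
        raise ValueError("not enough data")
    kept = data[left:n - right]
    return sum(kept) / len(kept)

# Hypothetical data with one outlier; trimming one value per side removes it.
print(trimmed_mean([1000.0, 10.0, 30.0, 20.0, 40.0], 1, 1))
```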

variance

Syntax

Purpose

variance is based on cern.jet.stat.Descriptive.variance(int size, double sum, double sumOfSquares). It returns the variance of a data sequence (see Figure 8-21) as a double.

Figure 8-21 cern.jet.stat.Descriptive.variance(int size, double sum, double sumOfSquares)

This function takes the following tuple arguments:

double1: data value.

For more information, see:

Examples

Consider the query qColtAggr57 in Example 8-103. Given the data stream SColtAggrFunc with schema (c1 integer, c2 float, c3 double, c4 bigint) in Example 8-104, the query returns the relation in Example 8-105.

Example 8-103 variance Function Query

<query id="qColtAggr57"><![CDATA[ 
    select variance(c3) from SColtAggrFunc
]]></query>

Example 8-104 variance Function Stream Input

Timestamp   Tuple
  10        1, 0.5, 40.0, 8
1000        4, 0.7, 30.0, 6
1200        3, 0.89, 20.0, 12
2000        8, 0.4, 10.0, 4
h 8000
h 200000000

Example 8-105 variance Function Relation Output

Timestamp   Tuple Kind  Tuple
-9223372036854775808:+
  10:       - 
  10:       +           0.0
1000:       -           0.0
1000:       +           25.0
1200:       -           25.0
1200:       +           66.66666666666667
2000:       -           66.66666666666667
2000:       +           125.0
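
The signature variance(int size, double sum, double sumOfSquares) implies that the variance can be maintained from three running totals, without revisiting earlier tuples. The Python sketch below keeps those totals and evaluates (sumOfSquares - mean * sum) / size after each tuple. The formula is assumed here and agrees with Example 8-105; the class name is hypothetical.

```python
import math

class RunningVariance:
    """Population variance maintained from size, sum, and sum of squares."""

    def __init__(self):
        self.size = 0
        self.total = 0.0
        self.total_sq = 0.0

    def add(self, x):
        """Fold in one value and return the updated variance."""
        self.size += 1
        self.total += x
        self.total_sq += x * x
        return self.value()

    def value(self):
        if self.size == 0:
            return math.nan
        mean = self.total / self.size
        return (self.total_sq - mean * self.total) / self.size

running = RunningVariance()
for x in [40.0, 30.0, 20.0, 10.0]:  # c3 column of Example 8-104
    print(running.add(x))
```

This one-pass form is convenient for streams, although it can lose precision when the mean is large relative to the spread of the data.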

weightedmean

Syntax

Purpose

weightedmean is based on cern.jet.stat.Descriptive.weightedMean(DoubleArrayList data, DoubleArrayList weights). It returns the weighted mean of a data sequence (see Figure 8-22) as a double.

Figure 8-22 cern.jet.stat.Descriptive.weightedMean(DoubleArrayList data, DoubleArrayList weights)

This function takes the following tuple arguments:

double1: data value.
double2: weight value.

For more information, see:

Examples

Consider the query qColtAggr58 in Example 8-106. Given the data stream SColtAggrFunc with schema (c1 integer, c2 float, c3 double, c4 bigint) in Example 8-107, the query returns the relation in Example 8-108.

Example 8-106 weightedmean Function Query

<query id="qColtAggr58"><![CDATA[ 
    select weightedmean(c3, c3) from SColtAggrFunc
]]></query>

Example 8-107 weightedmean Function Stream Input

Timestamp   Tuple
  10        1, 0.5, 40.0, 8
1000        4, 0.7, 30.0, 6
1200        3, 0.89, 20.0, 12
2000        8, 0.4, 10.0, 4
h 8000
h 200000000

Example 8-108 weightedmean Function Relation Output

Timestamp   Tuple Kind  Tuple
-9223372036854775808:+
  10:       - 
  10:       +           40.0
1000:       -           40.0
1000:       +           35.714285714285715
1200:       -           35.714285714285715
1200:       +           32.22222222222222
2000:       -           32.22222222222222
2000:       +           30.0
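
A weighted mean is Sum(x * w) / Sum(w). In Example 8-106 the same column, c3, supplies both the data and the weights, so larger values pull the result toward themselves. The Python sketch below recomputes the example; the function name is hypothetical.

```python
def weighted_mean(values, weights):
    """Sum of value * weight divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(x * w for x, w in zip(values, weights)) / sum(weights)

c3 = [40.0, 30.0, 20.0, 10.0]  # c3 column of Example 8-107
for size in range(1, len(c3) + 1):
    window = c3[:size]
    print(size, weighted_mean(window, window))
```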

winsorizedmean

Syntax

Purpose

winsorizedmean is based on cern.jet.stat.Descriptive.winsorizedMean(DoubleArrayList sortedData, double mean, int left, int right). It returns the winsorized mean of a sorted data sequence as a double.

This function takes the following tuple arguments:

double1: data value.
int1: left.
int2: right.

For more information, see:

Examples

Consider the query qColtAggr60 in Example 8-109. Given the data stream SColtAggrFunc with schema (c1 integer, c2 float, c3 double, c4 bigint) in Example 8-110, the query returns the relation in Example 8-111.

Example 8-109 winsorizedmean Function Query

<query id="qColtAggr60"><![CDATA[ 
    select winsorizedmean(c3, c1, c1) from SColtAggrFunc
]]></query>

Example 8-110 winsorizedmean Function Stream Input

Timestamp   Tuple
  10        1, 0.5, 40.0, 8
1000        4, 0.7, 30.0, 6
1200        3, 0.89, 20.0, 12
2000        8, 0.4, 10.0, 4
h 8000
h 200000000

Example 8-111 winsorizedmean Function Relation Output

Timestamp   Tuple Kind  Tuple
-9223372036854775808:+
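
Unlike a trimmed mean, a winsorized mean keeps every data point but replaces the left smallest values with the next value up and the right largest values with the next value down. The Python sketch below follows that textbook definition; it is not the Colt implementation, and the function name and sample data are hypothetical. As with trimmedmean, the sketch requires left + right < n, a condition that no step of Example 8-110 satisfies when both arguments are c1.

```python
def winsorized_mean(values, left, right):
    """Mean after clamping the `left` smallest and `right` largest values."""
    data = sorted(values)
    n = len(data)
    if n == 0:
        raise ValueError("empty data")
    if left < 0 or right < 0:
        raise ValueError("left and right must be non-negative")
    if left + right >= n:
        raise ValueError("not enough data")
    low = data[left]            # replaces the `left` smallest values
    high = data[n - 1 - right]  # replaces the `right` largest values
    clamped = [min(max(x, low), high) for x in data]
    return sum(clamped) / n

# Hypothetical data with one outlier; the outlier is pulled in to 40.0.
print(winsorized_mean([1000.0, 10.0, 30.0, 20.0, 40.0], 1, 1))
```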