2.5 Oracle Machine Learning for SQL Statistical Functions
Various SQL statistical functions are available in Oracle Database to explore and analyze data.
A variety of scalable statistical functions are accessible through SQL in Oracle Database. These statistical functions are implemented as SQL functions. The SQL statistical functions can be used to compute standard univariate statistics such as MEAN
, MAX
, MIN
, MEDIAN
, MODE
, and standard deviation on the data. Users can also perform various other statistical functions such as ttest, ftest, aggregate functions, analytic functions, or ANOVA. The functions listed in the following table are available from SQL.
Table 25 SQL Statistical Functions Supported by OML4SQL
Function  Description 

APPROX_COUNT 
Returns approximate count of an expression 
APPROX_SUM 
Returns approximate sum of an expression 
APPROX_RANK 
Returns approximate value in a group of values 
CORR 
Retuns the coefficient of correlation of a set of number pairs 
CORR_S 
Calculates the Spearman's rho correlation coefficient 
CORR_K 
Calculates the Kendall's taub correlation coefficient 
COVAR_POP 
Returns the population covariance of a set of number pairs 
COVAR_SAMP 
Returns the sample covariance of a set of number pairs. 
LAG 
LAG is an analytic function. It provides access to more than one row of a table at the same time without a self join.

LEAD 
LEAD is an analytic function. It provides access to more than one row of a table at the same time without a self join.

STATS_BINOMIAL_TEST 
STATS_BINOMIAL_TEST is an exact probability test used for dichotomous variables, where only two possible values exist.

STATS_CROSSTAB 
STATS_CROSSTAB is a method used to analyze two nominal variables.

STATS_F_TEST 
STATS_F_TEST tests whether two variances are significantly different.

STATS_KS_TEST 
STATS_KS_TEST is a KolmogorovSmirnov function that compares two samples to test whether they are from the same population or from populations that have the same distribution.

STATS_MODE 
Takes as its argument a set of values and returns the value that occurs with the greatest frequency 
STATS_MW_TEST 
A Mann Whitney test compares two independent samples to test the null hypothesis that two populations have the same distribution function against the alternative hypothesis that the two distribution functions are different. 
STATS_ONE_WAY_ANOVA 
Tests differences in means (for groups or variables) for statistical significance by comparing two different estimates of variance 
STATS_T_TEST_* 
The ttest measures the significance of a difference of means 
STATS_T_TEST_ONE 
A onesample ttest 
STATS_T_TEST_PAIRED 
A twosample, paired ttest (also known as a crossed ttest) 
STATS_T_TEST_INDEP and STATS_T_TEST_INDEPU 
A ttest of two independent groups with the same variance (pooled variances)
A ttest of two independent groups with unequal variance (unpooled variances) 
STDDEV 
returns the sample standard deviation of a set of numbers 
STDDEV_POP 
Computes the population standard deviation and returns the square root of the population variance 
STDDEV_SAMP 
Computes the cumulative sample standard deviation and returns the square root of the sample variance 
SUM 
Returns the sum of values 
DBMS_STAT_FUNCS
PL/SQL package is also available for users.