Oracle9i SQL Reference Release 1 (9.0.1) Part Number A90125-01 |
|
Functions, 98 of 166
The linear regression functions are:
linear_regr::=
linear_regr
The linear regression functions fit an ordinary-least-squares regression line to a set of number pairs. You can use them as both aggregate and analytic functions.
Oracle applies the function to the set of (expr1, expr2) pairs after eliminating all pairs for which either expr1 or expr2 is null. Oracle computes all the regression functions simultaneously during a single pass through the data.
expr1 is interpreted as a value of the dependent variable (a "y value"), and expr2 is interpreted as a value of the independent variable (an "x value"). Both expressions must be numbers.
REGR_SLOPE
returns the slope of the line. The return value is a number and can be null. After the elimination of null (expr1, expr2) pairs, it makes the following computation:
COVAR_POP(expr1, expr2) / VAR_POP(expr2)
REGR_INTERCEPT
returns the y-intercept of the regression line. The return value is a number and can be null. After the elimination of null (expr1, expr2) pairs, it makes the following computation:
AVG(expr1) - REGR_SLOPE(expr1, expr2) * AVG(expr2)
REGR_COUNT
returns an integer that is the number of non-null number pairs used to fit the regression line.
REGR_R2
returns the coefficient of determination (also called "R-squared" or "goodness of fit") for the regression. The return value is a number and can be null. VAR_POP
(expr1) and VAR_POP
(expr2) are evaluated after the elimination of null pairs. The return values are:
NULL if VAR_POP(expr2) = 0 1 if VAR_POP(expr1) = 0 and VAR_POP(expr2) != 0 POWER(CORR(expr1,expr),2) if VAR_POP(expr1) > 0 and VAR_POP(expr2 != 0
All of the remaining regression functions return a number and can be null:
REGR_AVGX
evaluates the average of the independent variable (expr2) of the regression line. It makes the following computation after the elimination of null (expr1, expr2) pairs:
AVG(expr2)
REGR_AVGY
evaluates the average of the dependent variable (expr1) of the regression line. It makes the following computation after the elimination of null (expr1, expr2) pairs:
AVG(expr1)
REGR_SXY
, REGR_SXX
, REGR_SYY
are auxiliary functions that are used to compute various diagnostic statistics.
REGR_SXX
makes the following computation after the elimination of null (expr1, expr2) pairs:
REGR_COUNT(expr1, expr2) * VAR_POP(expr2)
REGR_SYY
makes the following computation after the elimination of null (expr1, expr2) pairs:
REGR_COUNT(expr1, expr2) * VAR_POP(expr1)
REGR_SXY
makes the following computation after the elimination of null (expr1, expr2) pairs:
REGR_COUNT(expr1, expr2) * COVAR_POP(expr1, expr2)
The following examples are based on the sample tables sh.sales
and sh.products
.
The following example provides a comparison of the various linear regression functions:
SELECT s.channel_id, REGR_SLOPE(s.quantity_sold, p.prod_list_price) SLOPE , REGR_INTERCEPT(s.quantity_sold, p.prod_list_price) INTCPT , REGR_R2(s.quantity_sold, p.prod_list_price) RSQR , REGR_COUNT(s.quantity_sold, p.prod_list_price) COUNT , REGR_AVGX(s.quantity_sold, p.prod_list_price) AVGLISTP , REGR_AVGY(s.quantity_sold, p.prod_list_price) AVGQSOLD FROM sales s, products p WHERE s.prod_id=p.prod_id AND p.prod_category='Men' AND s.time_id=to_DATE('10-OCT-2000') GROUP BY s.channel_id ; C SLOPE INTCPT RSQR COUNT AVGLISTP AVGQSOLD - ---------- ---------- ---------- ---------- ---------- ---------- C -.03529838 16.4548382 .217277422 17 87.8764706 13.3529412 I -.0108044 13.3082392 .028398018 43 116.77907 12.0465116 P -.01729665 11.3634927 .026191191 33 80.5818182 9.96969697 S -.01277499 13.488506 .000473089 71 52.571831 12.8169014 T -.01026734 5.01019929 .064283727 21 75.2 4.23809524
The following example determines the slope and intercept of the regression line for the amount of sales and sale profits for each fiscal year.
SELECT t.fiscal_year, REGR_SLOPE(s.amount_sold, s.quantity_sold) "Slope", REGR_INTERCEPT(s.amount_sold, s.quantity_sold) "Intercept" FROM sales s, times t WHERE s.time_id = t.time_id GROUP BY t.fiscal_year; FISCAL_YEAR Slope Intercept ----------- ---------- ---------- 1998 54.7377214 45.3884971 1999 54.4868592 44.3616117 2000 55.4035957 44.717026
The following example determines the cumulative slope and cumulative intercept of the regression line for the amount of and quantity of sales for the fourth quarter of 1998:
SELECT t.fiscal_month_number "Month", t.day_number_in_month "Day", REGR_SLOPE(s.amount_sold, s.quantity_sold) OVER (ORDER BY t.fiscal_month_desc, t.day_number_in_month) AS CUM_SLOPE, REGR_INTERCEPT(s.amount_sold, s.quantity_sold) OVER (ORDER BY t.fiscal_month_desc, t.day_number_in_month) AS CUM_ICPT FROM sales s, times t WHERE s.time_id = t.time_id AND t.fiscal_year=1998 AND t.fiscal_quarter_number = 4 ORDER BY t.fiscal_month_desc, t.day_number_in_month;
Month Day CUM_SLOPE CUM_ICPT ---------- ---------- ---------- ---------- ... 11 18 47.775583 40.028992 11 18 47.775583 40.028992 11 18 47.775583 40.028992 11 18 47.775583 40.028992 11 18 47.775583 40.028992 11 18 47.775583 40.028992 11 18 47.775583 40.028992 11 18 47.775583 40.028992 11 19 47.6878438 40.6296492 11 19 47.6878438 40.6296492 ...
The following example returns the number of customers in the customers
table (out of a total of 319) who have account managers.
SELECT REGR_COUNT(customer_id, account_mgr_id) FROM customers; REGR_COUNT(CUSTOMER_ID,ACCOUNT_MGR_ID) -------------------------------------- 231
The following example computes the cumulative number of transactions for each day in April of 1998:
SELECT UNIQUE t.day_number_in_month, REGR_COUNT(s.amount_sold, s.quantity_sold) OVER (PARTITION BY t.fiscal_month_number ORDER BY t.day_number_in_month) "Regr_Count" FROM sales s, times t WHERE s.time_id = t.time_id AND t.fiscal_year = 1998 AND t.fiscal_month_number = 4; DAY_NUMBER_IN_MONTH Regr_Count ------------------- ---------- 1 825 2 1650 3 2475 4 3300 . . . 26 21450 30 22200
The following example computes the coefficient of determination of the regression line for amount of sales greater than 5000 and quantity sold:
SELECT REGR_R2(amount_sold, quantity_sold) FROM sales WHERE amount_sold > 5000; REGR_R2(AMOUNT_SOLD,QUANTITY_SOLD) ---------------------------------- .005208421
The following example computes the cumulative coefficient of determination of the regression line for monthly sales amounts and quantities for each month during 1998:
SELECT t.fiscal_month_number, REGR_R2(SUM(s.amount_sold), SUM(s.quantity_sold)) OVER (ORDER BY t.fiscal_month_number) "Regr_R2" FROM sales s, times t WHERE s.time_id = t.time_id AND t.fiscal_year = 1998 GROUP BY t.fiscal_month_number ORDER BY t.fiscal_month_number; FISCAL_MONTH_NUMBER Regr_R2 ------------------- ---------- 1 2 1 3 .763816809 4 .581171805 5 .854723188 6 .877870333 7 .907073344 8 .905223336 9 .912142295 10 .858149007 11 .74838262 12 .738707443
The following example calculates the regression average for the amount and quantity of sales for each year:
SELECT t.fiscal_year, REGR_AVGY(s.amount_sold, s.quantity_sold) "Regr_AvgY", REGR_AVGX(s.amount_sold, s.quantity_sold) "Regr_AvgX" FROM sales s, times t WHERE s.time_id = t.time_id GROUP BY t.fiscal_year; FISCAL_YEAR Regr_AvgY Regr_AvgX ----------- ---------- ---------- 1998 745.788191 12.7955581 1999 741.839772 12.8008509 2000 752.701384 12.7786717
The following example calculates the cumulative averages for the amount and quantity of sales profits in December of 1998:
SELECT t.day_number_in_month, REGR_AVGY(s.amount_sold, s.quantity_sold) OVER (ORDER BY t.fiscal_month_desc, t.day_number_in_month) "Regr_AvgY", REGR_AVGX(s.amount_sold, s.quantity_sold) OVER (ORDER BY t.fiscal_month_desc, t.day_number_in_month) "Regr_AvgX" FROM sales s, times t WHERE s.time_id = t.time_id AND t.fiscal_month_desc = '1998-12' ORDER BY t.day_number_in_month; DAY_NUMBER_IN_MONTH Regr_AvgY Regr_AvgX ------------------- ---------- ---------- 1 695.028571 12.9 1 695.028571 12.9 1 695.028571 12.9 . . . 27 692.061411 12.9648677 27 692.061411 12.9648677 27 692.061411 12.9648677
The following example computes the REGR_SXY
, REGR_SXX
, and REGR_SYY
values for the regression analysis of amount and quantity of sales for each year in the sample sh.sales
table:
SELECT t.fiscal_year, REGR_SXY(s.amount_sold, s.quantity_sold) "Regr_sxy", REGR_SYY(s.amount_sold, s.quantity_sold) "Regr_syy", REGR_SXX(s.amount_sold, s.quantity_sold) "Regr_sxx" FROM sales s, times t WHERE s.time_id = t.time_id GROUP BY t.fiscal_year; FISCAL_YEAR Regr_sxy Regr_syy Regr_sxx ----------- ---------- ---------- ---------- 1998 1757092061 2.5677E+11 32100204.7 1999 2112447869 3.0619E+11 38769859.3 2000 2338925878 3.4321E+11 42216138.7
The following example computes the cumulative REGR_SXY
, REGR_SXX
, and REGR_SYY
statistics for amount and quantity of sales for each year-month value in 1998:
SELECT t.day_number_in_month, REGR_SXY(s.amount_sold, s.quantity_sold) OVER (ORDER BY t.fiscal_year, t.fiscal_month_desc) "Regr_sxy", REGR_SYY(s.amount_sold, s.quantity_sold) OVER (ORDER BY t.fiscal_year, t.fiscal_month_desc) "Regr_syy", REGR_SXX(s.amount_sold, s.quantity_sold) OVER (ORDER BY t.fiscal_year, t.fiscal_month_desc) "Regr_sxx" FROM sales s, times t WHERE s.time_id = t.time_id AND t.fiscal_month_desc = '1998-02' ORDER BY t.day_number_in_month; DAY_NUMBER_IN_MONTH Regr_sxy Regr_syy Regr_sxx ------------------- ---------- ---------- ---------- 1 144226271 2.1996E+10 2497704.77 . . . 30 144226271 2.1996E+10 2497704.77
|
Copyright © 1996-2001, Oracle Corporation. All Rights Reserved. |
|