WIDTH_BUCKET

For a given expression, the WIDTH_BUCKET function returns the bucket number into which the value of this expression would fall after being evaluated.

Return Value

An INTEGER.

Syntax

WIDTH_BUCKET (expr , min_value , max_value , num_buckets)

Arguments

expr

The expression for which the histogram is being created. This expression must evaluate to a number or a datetime value. When expr evaluates to NA, then the expression returns NA.

min_value

An expression that resolves to the minimum end point of the acceptable range for expr. This expression must evaluate to number or datetime values, and cannot evaluate to NA.

max_value

An expression that resolves to the maximum end point of the acceptable range for expr. This expression must evaluate to number or datetime values, and cannot evaluate to NA.

num_buckets

An expression that resolves to a constant indicating the number of buckets. This expression must evaluate to a positive INTEGER.

Notes

Underflow and Overflow Buckets

WIDTH_BUCKET also creates (when needed) an underflow bucket numbered 0 and an overflow bucket numbered num_buckets+1. These buckets handle values less than min_value and more than max_value and are helpful in checking the reasonableness of endpoints.

Constructing Equiwidth Histograms

WIDTH_BUCKET lets you construct equiwidth histograms, in which the histogram range is divided into intervals that have identical size. (Compare this function with NTILE, which creates equiheight histograms.) Ideally each bucket is a "closed-open" interval of the real number line. For example, a bucket can be assigned to cores between 10.00 and 19.999... to indicate that 10 is included in the interval and 20 is excluded. This is sometimes denoted (10, 20).

Examples

Example 26-43 Grouping Values Into a Ten-Bucket Histogram

The following example limits the customers to those customers who have a customer_id of 825 through 853 and, then, creates a ten-bucket histogram on the credit_limit variable for those customer. It uses the credit_group formula to calculate the bucket number for each customer. Customers with credit limits greater than the maximum value are assigned to the overflow bucket, 11.

DESCRIBE
DEFINE customer_id DIMENSION INTEGER
DEFINE cust_last_name VARIABLE TEXT <customer_id>
DEFINE credit_limit VARIABLE INTEGER <customer_id>
DEFINE credit_group FORMULA INTEGER <customer_id> 
EQ WIDTH_BUCKET(credit_limit, 100, 5000, 10)

LIMIT customer_id to 825 TO 853
REPORT cust_last_name, credit_limit, credit_group

CUSTOMER_ID     CUST_LAST_NAME    CREDIT_LIMIT     CREDIT_GROUP
-------------- ---------------- ---------------- ----------------
           825 Dreyfuss                      500                1
           826 Barkin                        500                1
           827 Siegel                        500                1
           828 Minnelli                    2,300                5
           829 Hunter                      2,300                5
           830 Dutt                        3,500                7
           831 Bel Geddes                  3,500                7
           832 Spacek                      3,500                7
           833 Moranis                     3,500                7
           834 Idle                        3,500                7
           835 Eastwood                    1,200                3
           836 Berenger                    1,200                3
           837 Stanton                     1,200                3
           838 Nicholson                   3,500                7
           839 Johnson                     3,500                7
           840 Elliott                     1,400                3
           841 Boyer                       1,400                3
           842 Stern                       1,400                3
           843 Oates                         700                2
           844 Julius                        700                2
           845 Fawcett                     5,000               11
           846 Brando                      5,000               11
           847 Streep                      5,000               11
           848 Olmos                       1,800                4
           849 Kaurusmdki                  1,800                4
           850 Finney                      2,300                5
           851 Brown                       2,300                5
           852 Tanner                      2,300                5
           853 Palin                         400                1