ORA_HASH

Syntax

Purpose

ORA_HASH is a function that computes a hash value for a given expression. Use this function to analyze a subset of data and generate a random sample.

  • The expr argument determines the data for which you want Oracle Database to compute a hash value. There are no restrictions on the length of data represented by expr, which commonly resolves to a column name. The expr cannot be a LONG or LOB type. It cannot be a user-defined object type unless it is a nested table type. The hash value for nested table types does not depend on the order of elements in the collection. All other data types are supported for expr.

  • The optional max_bucket argument determines the maximum bucket value returned by the hash function. You can specify any value between 0 and 4294967295. The default is 4294967295.

    Note that the default hash value is a 32 bit unsigned number. The default max_bucket of 2^32-1 simply returns this value.

    Bucketing is done using a MOD function to the default value. If max_bucket = N, then the bucket value is computed by (default hash value) MOD (N + 1), which results in a bucket value between 0 and N.

    Note that this technique does not result in a statistically uniform distribution of values across buckets and is somewhat biased towards smaller bucket numbers, except when N + 1 is a power of 2. This is not noticeable when max_bucket (i.e N) is small relative to the default (4294967295) but may be noticeable for max_bucket values that are very large, especially within an order of magnitude of the default, say > 100M.

  • The optional seed_value argument enables Oracle to produce many different results for the same set of data. Oracle applies the hash function to the combination of expr and seed_value. You can specify any value between 0 and 4294967295. The default is 0.

The function returns a NUMBER value.

Examples

The following example creates a hash value for each combination of customer ID and product ID in the sh.sales table, divides the hash values into a maximum of 100 buckets, and returns the sum of the amount_sold values in the first bucket (bucket 0). The third argument (5) provides a seed value for the hash function. You can obtain different hash results for the same query by changing the seed value.

SELECT SUM(amount_sold)
  FROM sales
  WHERE ORA_HASH(CONCAT(cust_id, prod_id), 99, 5) = 0;

SUM(AMOUNT_SOLD)
----------------
       989431.14