Optimizing Star Queries

A typical query in the access layer is a join between the fact table and some number of dimension tables and is often referred to as a star query. In a star query each dimension table is joined to the fact table using a primary key to foreign key join. Normally the dimension tables do not join to each other.

Typically, in this kind of query all of the WHERE clause predicates are on the dimension tables, and the fact table is joined to each dimension through its foreign key, primary key relationship. Optimizing this type of query is straightforward.

To optimize this query, do the following:

  • Create a bitmap index on each of the foreign key columns in the fact table or tables.

  • Set the initialization parameter STAR_TRANSFORMATION_ENABLED to TRUE.

Setting this parameter enables the optimizer feature for star queries, which is off by default for backward compatibility.
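
The two steps above can be sketched in SQL as follows. This is a minimal sketch, assuming a fact table named Sales with foreign key columns cust_id, prod_id, and time_id, as in the example later in this section. The index names are hypothetical, and the parameter is set at session level here only for illustration.

```sql
-- One bitmap index per foreign key column of the fact table.
-- Index names (sales_*_bix) are hypothetical; table and column names
-- are taken from the Sales example in this section.
-- Note: if the fact table is partitioned, bitmap indexes on it
-- must be created as LOCAL indexes.
CREATE BITMAP INDEX sales_cust_bix ON sales (cust_id);
CREATE BITMAP INDEX sales_prod_bix ON sales (prod_id);
CREATE BITMAP INDEX sales_time_bix ON sales (time_id);

-- Enable star transformation for the current session only.
ALTER SESSION SET star_transformation_enabled = TRUE;
```

Setting the parameter with ALTER SESSION affects only the current session, which makes it easy to compare plans with and without the feature before changing it for the whole instance.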

If your environment meets these two criteria, your star queries should use a powerful optimization technique called star transformation, which rewrites (transforms) your SQL. Star transformation executes the query in two phases:

  1. Retrieves the necessary rows from the fact table (row set).
  2. Joins this row set to the dimension tables.

Example 5-2 Star Transformation

This example shows the step-by-step process of using star transformation to optimize a star query.

A business question that could be asked against the star schema in Figure 3-1 would be "What was the total number of umbrellas sold in Boston during the month of May 2008?"

  1. The original query.

    select SUM(quantity_sold) total_umbrellas_sold_in_Boston
    From Sales s, Customers c, Products p, Times t
    Where s.cust_id = c.cust_id
    And s.prod_id = p.prod_id
    And s.time_id = t.time_id
    And c.cust_city = 'BOSTON'
    And p.product = 'UMBRELLA'
    And t.month = 'MAY'
    And t.year = 2008;

    As you can see, all of the WHERE clause predicates are on the dimension tables, and the fact table (Sales) is joined to each of the dimensions using their foreign key, primary key relationship.

  2. Take the following actions:

    1. Create a bitmap index on each of the foreign key columns in the fact table or tables.

    2. Set the initialization parameter STAR_TRANSFORMATION_ENABLED to TRUE.

  3. The rewritten query. Oracle rewrites and transforms the query to retrieve only the necessary rows from the fact table using bitmap indexes on the foreign key columns.

    select SUM(quantity_sold)
    From Sales s
    Where s.cust_id IN
    (select c.cust_id From Customers c Where c.cust_city='BOSTON')
    And s.prod_id IN
    (select p.prod_id From Products p Where p.product='UMBRELLA')
    And s.time_id IN
    (select t.time_id From Times t Where t.month='MAY' And t.year=2008);

    By rewriting the query in this fashion you can now leverage the strengths of bitmap indexes. Bitmap indexes provide set-based processing within the database, allowing you to use various fast methods for set operations such as AND, OR, MINUS, and COUNT. So, you use the bitmap index on time_id to identify the set of rows in the fact table corresponding to sales in May 2008. In the bitmap, the set of rows is represented as a string of 1s and 0s. A similar bitmap is retrieved for the fact table rows corresponding to the sale of umbrellas, and another is accessed for sales made in Boston. At this point there are three bitmaps, each representing a set of rows in the fact table that satisfy an individual dimension constraint. The three bitmaps are then combined using a bitmap AND operation, and this newly created final bitmap is used to extract the rows from the fact table needed to evaluate the query.

  4. Using the rewritten query, Oracle joins the rows from the fact table to the dimension tables.

    The join back to the dimension tables is normally done using a hash join, but the Oracle Optimizer selects the most efficient join method depending on the size of the dimension tables.

The rows from the fact table are retrieved by using bitmap joins between the bitmap indexes on all of the foreign key columns. The end user never needs to know any of the details of STAR_TRANSFORMATION, as the optimizer automatically chooses STAR_TRANSFORMATION when it is appropriate.
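
The bitmap AND described in step 3 of the example can be pictured in plain SQL as an intersection of three row sets. The query below is only a conceptual illustration under that assumption. It is not what Oracle executes internally, since the optimizer combines bitmaps rather than running INTERSECT, but it returns the same set of fact table rows.

```sql
-- Conceptual equivalent of the bitmap AND (illustration only).
-- Each branch selects the Sales rows that satisfy one dimension constraint;
-- INTERSECT keeps only the rows that satisfy all three.
SELECT s.rowid
FROM   Sales s
WHERE  s.time_id IN (SELECT t.time_id FROM Times t
                     WHERE  t.month = 'MAY' AND t.year = 2008)
INTERSECT
SELECT s.rowid
FROM   Sales s
WHERE  s.prod_id IN (SELECT p.prod_id FROM Products p
                     WHERE  p.product = 'UMBRELLA')
INTERSECT
SELECT s.rowid
FROM   Sales s
WHERE  s.cust_id IN (SELECT c.cust_id FROM Customers c
                     WHERE  c.cust_city = 'BOSTON');
```

With bitmap indexes, each of the three row sets is already available as a bitmap, so the intersection reduces to a bitwise AND instead of three separate scans of the fact table.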

The following figure shows the typical execution plan for a star query when STAR_TRANSFORMATION has kicked in. The execution plan may not look exactly as you expected. There is no join back to the Customers table after the rows have been successfully retrieved from the Sales table. If you look closely at the select list, you can see that nothing is actually selected from the Customers table, so the optimizer knows not to bother joining back to that dimension table. You may also notice that for some queries, even if STAR_TRANSFORMATION does kick in, it may not use all of the bitmap indexes on the fact table. The optimizer decides how many of the bitmap indexes are required to retrieve the necessary rows from the fact table. If an additional bitmap index would not improve the selectivity, the optimizer does not use it. The only time you see the dimension table that corresponds to the excluded bitmap in the execution plan is during the second phase, or the join back phase.
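
To check whether star transformation is used for a particular query, you can display its execution plan yourself. The sketch below uses EXPLAIN PLAN and DBMS_XPLAN.DISPLAY on the example query, assuming the tables and columns from this section exist in your schema. The exact plan shape depends on your data, statistics, and release.

```sql
-- Generate the plan for the example star query without executing it.
EXPLAIN PLAN FOR
SELECT SUM(s.quantity_sold) total_umbrellas_sold_in_Boston
FROM   Sales s, Customers c, Products p, Times t
WHERE  s.cust_id = c.cust_id
AND    s.prod_id = p.prod_id
AND    s.time_id = t.time_id
AND    c.cust_city = 'BOSTON'
AND    p.product = 'UMBRELLA'
AND    t.month = 'MAY'
AND    t.year = 2008;

-- Display the plan that was just generated.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

Operations such as BITMAP AND, BITMAP MERGE, and BITMAP KEY ITERATION on the fact table's bitmap indexes are a good indication that the transformation was applied. If they are missing, verify that the bitmap indexes exist and that STAR_TRANSFORMATION_ENABLED is set to TRUE.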