As Is and As Was Queries

Describes the two common query techniques, "as is" and "as was" queries:

Characteristics of an As Is Query

An As Is query has the following characteristics:

  • The resulting report shows the data as it happened.

  • The snowflake dimension tables are also joined using the surrogate key columns (that is the primary key and foreign key columns).

  • The fact table is joined with the dimension tables (at leaf level) using the surrogate key column.

  • Slowly-changing data in the dimensions are joined with their corresponding fact records and are presented individually.

  • It is possible to add up the components if the different versions share similar characteristics.

Characteristics of an As Was Query

An As Was query (also known as point-in-time analysis) has the following characteristics:

  • The resulting report shows the data that would result from freezing the dimensions and dimension hierarchy at a specific point in time.

  • Each snowflake table is initially filtered by applying a point-in-time date filter which selects the records or versions which are valid as of the analysis date. This structure is called the point-in-time version of the snowflake.

  • The filtered snowflake is joined with an unfiltered version of itself by using the natural key. All of the snowflake attributes are taken from the point-in-time version alias. The resulting structure is called the composite snowflake.

  • A composite dimension is formed by joining the individual snowflakes on the surrogate key.

  • The fact table is joined with the composite dimension table at the leaf level using the surrogate key column.

  • The point-in-time version is super-imposed on all other possible SCD versions of the same business entity -- both backward and forward in time. Joining in this fashion gives the impression that the dimension is composed of only the specific point-in-time records.

  • All of the fact components for various versions add up correctly due to the super-imposition of point-in-time attributes within the dimensions.

Example: As Is and As Was Queries Against Oracle Communications Data Model

Provides examples that illustrate the characteristics of As Is and As Was queries.

The following examples illustrate the characteristics of As Is and As Was queries:

Data used for the examples

Assume that your data warehouse has a Customer table, a County, and a TaxPaid fact table. As of January 1, 2012, these tables include the values shown below.

Customer Table

Table 5-2 Customer Table

Customer ID Customer Code Customer Name Gender Marital Status County ID County Code County Name ... Eff Frm Eff To

101

JoD

John Doe

Male

Single

5001

SV

Sunnyvale

...

1-Jan-12

31-Dec-99

102

JaD

Jane Doe

Female

Single

5001

SV

Sunnyvale

...

1-Jan-12

31-Dec-99

103

JiD

Jim Doe

Male

Married

5002

CU

Cupertino

...

1-Jan-12

31-Dec-99

Table 5-3 County Table

County ID County Cde County Name Population ... Eff Frm Eff To

5001

SV

Sunnyvale

Very High

...

1-Jan-12

31-Dec-99

5002

CU

Cupertino

High

...

1-Jan-12

31-Dec-99

Table 5-4 Tax Paid Table

Customer ID Day Tax Type Tax

101

1-Jan-12

Professional Tax

100

102

1-Jan-12

Professional Tax

100

103

1-Jan-12

Professional Tax

100

Assume that the following events occurred in January 2012:

  • On January 20, 2012, Jane Doe marries.

  • On Jan 29, 2012, John Doe moves from Sunnyvale to Cupertino.

Consequently, as shown below, on February 1, 2012, the Customer and TaxPaid tables have new data while the values in the County table stay the same.

Table 5-5 Customer table

Customer ID Customer Code Customer Name Gender Marital Status County Id County Code County Name ... Eff Frm Eff To

101

JoD

John Doe

Male

Single

5001

SV

Sunnyvale

...

1-Jan-12

29-Jan-12

102

JaD

Jane Doe

Female

Single

5001

SV

Sunnyvale

...

1-Jan-12

20-Jan-12

103

JiD

Jim Doe

Male

Married

5002

CU

Cupertino

...

1-Jan-12

31-Dec-99

104

JaD

Jane Doe

Female

Married

5001

SV

Sunnyvale

...

21-Jan-12

31-Dec-99

105

JoD

John Doe

Male

Single

5002

CD

Cupertino

...

30-Jan-12

31-Dec-99

Table 5-6 County table

County Id County CD County Name Population ... Eff Frm Eff To

5001

SV

Sunnyvale

Very High

...

1-Jan-12

31-Dec-99

5002

CU

Cupertino

High

...

1-Jan-12

31-Dec-99

Table 5-7 Tax Paid Table

Customer ID Day Tax Type Tax

101

1-Jan-12

Professional Tax

100

102

1-Jan-12

Professional Tax

100

103

1-Jan-12

Professional Tax

100

105

1-Feb-12

Professional Tax

100

104

1-Feb-12

Professional Tax

100

103

1-Feb-12

Professional Tax

100

As Is Query for Tax Collection Split by Marital Status

Assuming the Data used for the examples, to show the tax collection data split by martial status, the following SQL statement that joins the TaxPaid fact table and the Customer dimension table on the cust_id surrogate key and the Customer and County snowflakes on the cnty_id surrogate key:

SELECT cust.cust_nm, cust.m_status, SUM(fct.tx) 
FROM taxpaid fct, customer cust, county cnty 
WHERE fct.cust_id = cust.cust_id 
AND cust.cnty_id = cnt.cnt_id
GROUP BY cust.cust_nm, cust.m_status
ORDER BY 1,2,3;

Note that there are two rows for Jane Doe; one row for a marital status of Married and another for a marital status of Single.

Table 5-8 County Table

Customer Name Marital Status Tax

Jane Doe

Married

100

Jane Doe

Single

100

Jim Doe

Married

200

John Doe

Single

200

Example 5-6 As Was Queries for Tax Collection Split by Marital Status

Assuming the Data used for the examples, issue the following SQL statement to show the tax collection data split by marital status using an analysis date of January 15, 2012:

select
   cust.cust_nm, cust.m_status, sum(fct.tax)
from
   TaxPaid fct, 
   (
      select 
         cust_act.cust_id, cust_pit.cust_cd, cust_pit.cust_nm,
         cust_pit.m_status, cust_pit.gender,
         cust_pit.cnty_id, cust_pit.cnty_cd, cust_pit.cnty_nm
      from Customer cust_act
      inner join (
         select
            cust_id, cust_cd, cust_nm,
            m_status, gender,
            cnty_id, cnty_cd, cnty_nm
         from Customer cust_all
         where to_date('15-JAN-2012', 'DD-MON-YYYY') between eff_from and eff_to
      ) cust_pit   
      on (cust_act.cust_cd = cust_pit.cust_cd)
   ) cust,
   (
      select 
         cnty_act.cnty_id, cnty_pit.cnty_cd, cnty_pit.cnty_nm
      from County cnty_act
      inner join (
         select
            cnty_id, cnty_cd, cnty_nm
         from County cnty_all
         where to_date('15-JAN-2012', 'DD-MON-YYYY') between eff_from and eff_to
      ) cnty_pit
      on (cnty_act.cnty_cd = cnty_pit.cnty_cd)
   ) cnty  
where fct.cust_id = cust.cust_id
and cust.cnty_id = cnty.cnty_id
GROUP BY cust.cust_nm, cust.m_status
order by 1,2,3;

The results of this query are shown below. Since Jane Doe was single on January 15, 2012 (the analysis date), all tax for Jane Doe is accounted under her Single status.

Example 5-7 As Was Queries for Tax Collection Split by Marital Status

As Was Queries for Tax Collection Split by Marital Status

Table 5-9 County Table

Customer Name Marital Status Tax

Jane Doe

Single

200

Jim Doe

Married

200

John Doe

Single

200

Assume instead that you issued the exact same query except that for the to_date phrase you specify 09-FEB-2012 rather than 15-JAN-2012. Since Jane Doe was married on February 9, 2012, then, as shown below all tax for Jane Doe would be accounted under her Married status.

Table 5-10 County Table

Customer Name Marital Status Tax

Jane Doe

Married

200

Jim Doe

Married

200

John Doe

Single

200

Example 5-8 As Is Query for Tax Collection Data Split by County

As Is Query for Tax Collection Data Split by County

Assuming the Data used for the examples, issue the following SQL statement to show the tax collection data split by county:

SELECT cust.cust_nm, cnty.cnty_nm, SUM(fct.tax)
FROM TaxPaid fct, customer cust, county cnty
WHERE fct.cust_id = cust.cust_id
AND cust.cnty_id = cnty.cnty_ID
GROUP BY cut.cust_nm, cnty.cnty_nm
ORDER BY 1,2,3;

The results of this query are shown below. Note that since John Doe lived in two different counties, there are two rows of data for John Doe.

Table 5-11 County Table

Customer Name County Name Tax

Jane Doe

Sunnyvale

200

Jim Doe

Cupertino

200

John Doe

Cupertino

100

John Doe

Sunnyvale

100

Example 5-9 As Was Queries for Tax Collection Data Split by County

As Was Queries for Tax Collection Data Split by County

Assuming the Data used for the examples, issue the following SQL statement to show the tax collection data split by county using an analysis date of January 15, 2012:

select
   cust.cust_nm, cnty.cnty_nm, sum(fct.tax)
from
  TaxPaid fct,
   (
      select 
         cust_act.cust_id, cust_pit.cust_cd, cust_pit.cust_nm,
         cust_pit.m_status, cust_pit.gender,
         cust_pit.cnty_id, cust_pit.cnty_cd, cust_pit.cnty_nm
      from Customer cust_act
      inner join (
         select
            cust_id, cust_cd, cust_nm,
            m_status, gender,
            cnty_id, cnty_cd, cnty_nm
         from Customer cust_all
         where to_date('15-JAN-2012', 'DD-MON-YYYY') between eff_from and eff_to
      ) cust_pit
      on (cust_act.cust_cd = cust_pit.cust_cd
   ) cust,
   (
      select 
         cnty_act.cnty_id, cnty_pit.cnty_cd, cnty_pit.cnty_nm
      from County cnty_act
      inner join (
         select
            cnty_id, cnty_cd, cnty_nm
         from County cnty_all
         where to_date('15-JAN-2012', 'DD-MON-YYYY') between eff_from and eff_to
      ) cnty_pit
      on (cnty_act.cnty_cd = cnty_pit.cnty_cd)
   ) cnty
where fct.cust_id = cust.cust_id
and cust.cnty_id = cnty.cnty_id
GROUP BY cust.cust_nm, cnty.cnty_nm
order by 1,2,3;

The results of this query are shown below. Note that since John Doe was in Sunnyvale as of the analysis date of January 15, 2012, all tax for John Doe is accounted for under the Sunnyvale county.

Table 5-12 County Table

Customer Name County Name Tax

Jane Doe

Sunnyvale

200

Jim Doe

Cupertino

200

John Doe

Sunnyvale

200

Assume instead that you issued the exact same query except that for the to_date phrase you specify 09-FEB-2012 rather than 15-JAN-2012. Since John Doe lived in Cupertino on February 9, 2012, then, as shown below all tax for John Doe would be accounted under Cupertino.

Table 5-13 County Table

Customer Name County Name Tax

Jane Doe

Sunnyvale

200

Jim Doe

Cupertino

200

John Doe

Cupertino

200