Describes the two common query techniques, "as is" and "as was" queries:
Characteristics of an As Is Query
An As Is query has the following characteristics:
The resulting report shows the data as it happened.
The snowflake dimension tables are also joined using the surrogate key columns (that is the primary key and foreign key columns).
The fact table is joined with the dimension tables (at leaf level) using the surrogate key column.
Slowly-changing data in the dimensions are joined with their corresponding fact records and are presented individually.
It is possible to add up the components if the different versions share similar characteristics.
Characteristics of an As Was Query
An As Was query (also known as point-in-time analysis) has the following characteristics:
The resulting report shows the data that would result from freezing the dimensions and dimension hierarchy at a specific point in time.
Each snowflake table is initially filtered by applying a point-in-time date filter which selects the records or versions which are valid as of the analysis date. This structure is called the point-in-time version of the snowflake.
The filtered snowflake is joined with an unfiltered version of itself by using the natural key. All of the snowflake attributes are taken from the point-in-time version alias. The resulting structure is called the composite snowflake.
A composite dimension is formed by joining the individual snowflakes on the surrogate key.
The fact table is joined with the composite dimension table at the leaf level using the surrogate key column.
The point-in-time version is super-imposed on all other possible SCD versions of the same business entity -- both backward and forward in time. Joining in this fashion gives the impression that the dimension is composed of only the specific point-in-time records.
All of the fact components for various versions add up correctly due to the super-imposition of point-in-time attributes within the dimensions.
Provides examples that illustrate the characteristics of As Is and As Was queries.
The following examples illustrate the characteristics of As Is and As Was queries:
Data used for the examples
Assume that your data warehouse has a Customer
table, a County
, and a TaxPaid
fact table. As of January 1, 2012, these tables include the values shown below.
Customer Table
Table 5-2 Customer Table
Customer ID | Customer Code | Customer Name | Gender | Marital Status | County ID | County Code | County Name | ... | Eff Frm | Eff To |
---|---|---|---|---|---|---|---|---|---|---|
101 |
JoD |
John Doe |
Male |
Single |
5001 |
SV |
Sunnyvale |
... |
1-Jan-12 |
31-Dec-99 |
102 |
JaD |
Jane Doe |
Female |
Single |
5001 |
SV |
Sunnyvale |
... |
1-Jan-12 |
31-Dec-99 |
103 |
JiD |
Jim Doe |
Male |
Married |
5002 |
CU |
Cupertino |
... |
1-Jan-12 |
31-Dec-99 |
Table 5-3 County Table
County ID | County Cde | County Name | Population | ... | Eff Frm | Eff To |
---|---|---|---|---|---|---|
5001 |
SV |
Sunnyvale |
Very High |
... |
1-Jan-12 |
31-Dec-99 |
5002 |
CU |
Cupertino |
High |
... |
1-Jan-12 |
31-Dec-99 |
Table 5-4 Tax Paid Table
Customer ID | Day | Tax Type | Tax |
---|---|---|---|
101 |
1-Jan-12 |
Professional Tax |
100 |
102 |
1-Jan-12 |
Professional Tax |
100 |
103 |
1-Jan-12 |
Professional Tax |
100 |
Assume that the following events occurred in January 2012:
On January 20, 2012, Jane Doe marries.
On Jan 29, 2012, John Doe moves from Sunnyvale to Cupertino.
Consequently, as shown below, on February 1, 2012, the Customer
and TaxPaid
tables have new data while the values in the County
table stay the same.
Table 5-5 Customer table
Customer ID | Customer Code | Customer Name | Gender | Marital Status | County Id | County Code | County Name | ... | Eff Frm | Eff To |
---|---|---|---|---|---|---|---|---|---|---|
101 |
JoD |
John Doe |
Male |
Single |
5001 |
SV |
Sunnyvale |
... |
1-Jan-12 |
29-Jan-12 |
102 |
JaD |
Jane Doe |
Female |
Single |
5001 |
SV |
Sunnyvale |
... |
1-Jan-12 |
20-Jan-12 |
103 |
JiD |
Jim Doe |
Male |
Married |
5002 |
CU |
Cupertino |
... |
1-Jan-12 |
31-Dec-99 |
104 |
JaD |
Jane Doe |
Female |
Married |
5001 |
SV |
Sunnyvale |
... |
21-Jan-12 |
31-Dec-99 |
105 |
JoD |
John Doe |
Male |
Single |
5002 |
CD |
Cupertino |
... |
30-Jan-12 |
31-Dec-99 |
Table 5-6 County table
County Id | County CD | County Name | Population | ... | Eff Frm | Eff To |
---|---|---|---|---|---|---|
5001 |
SV |
Sunnyvale |
Very High |
... |
1-Jan-12 |
31-Dec-99 |
5002 |
CU |
Cupertino |
High |
... |
1-Jan-12 |
31-Dec-99 |
Table 5-7 Tax Paid Table
Customer ID | Day | Tax Type | Tax |
---|---|---|---|
101 |
1-Jan-12 |
Professional Tax |
100 |
102 |
1-Jan-12 |
Professional Tax |
100 |
103 |
1-Jan-12 |
Professional Tax |
100 |
105 |
1-Feb-12 |
Professional Tax |
100 |
104 |
1-Feb-12 |
Professional Tax |
100 |
103 |
1-Feb-12 |
Professional Tax |
100 |
As Is Query for Tax Collection Split by Marital Status
Assuming the Data used for the examples, to show the tax collection data split by martial status, the following SQL statement that joins the TaxPaid
fact table and the Customer
dimension table on the cust_id
surrogate key and the Customer
and County
snowflakes on the cnty_id
surrogate key:
SELECT cust.cust_nm, cust.m_status, SUM(fct.tx) FROM taxpaid fct, customer cust, county cnty WHERE fct.cust_id = cust.cust_id AND cust.cnty_id = cnt.cnt_id GROUP BY cust.cust_nm, cust.m_status ORDER BY 1,2,3;
Note that there are two rows for Jane Doe; one row for a marital status of Married and another for a marital status of Single.
Table 5-8 County Table
Customer Name | Marital Status | Tax |
---|---|---|
Jane Doe |
Married |
100 |
Jane Doe |
Single |
100 |
Jim Doe |
Married |
200 |
John Doe |
Single |
200 |
Example 5-6 As Was Queries for Tax Collection Split by Marital Status
Assuming the Data used for the examples, issue the following SQL statement to show the tax collection data split by marital status using an analysis date of January 15, 2012:
select cust.cust_nm, cust.m_status, sum(fct.tax) from TaxPaid fct, ( select cust_act.cust_id, cust_pit.cust_cd, cust_pit.cust_nm, cust_pit.m_status, cust_pit.gender, cust_pit.cnty_id, cust_pit.cnty_cd, cust_pit.cnty_nm from Customer cust_act inner join ( select cust_id, cust_cd, cust_nm, m_status, gender, cnty_id, cnty_cd, cnty_nm from Customer cust_all where to_date('15-JAN-2012', 'DD-MON-YYYY') between eff_from and eff_to ) cust_pit on (cust_act.cust_cd = cust_pit.cust_cd) ) cust, ( select cnty_act.cnty_id, cnty_pit.cnty_cd, cnty_pit.cnty_nm from County cnty_act inner join ( select cnty_id, cnty_cd, cnty_nm from County cnty_all where to_date('15-JAN-2012', 'DD-MON-YYYY') between eff_from and eff_to ) cnty_pit on (cnty_act.cnty_cd = cnty_pit.cnty_cd) ) cnty where fct.cust_id = cust.cust_id and cust.cnty_id = cnty.cnty_id GROUP BY cust.cust_nm, cust.m_status order by 1,2,3;
The results of this query are shown below. Since Jane Doe was single on January 15, 2012 (the analysis date), all tax for Jane Doe is accounted under her Single status.
Example 5-7 As Was Queries for Tax Collection Split by Marital Status
As Was Queries for Tax Collection Split by Marital Status
Table 5-9 County Table
Customer Name | Marital Status | Tax |
---|---|---|
Jane Doe |
Single |
200 |
Jim Doe |
Married |
200 |
John Doe |
Single |
200 |
Assume instead that you issued the exact same query except that for the to_date
phrase you specify 09-FEB-2012
rather than 15-JAN-2012
. Since Jane Doe was married on February 9, 2012, then, as shown below all tax for Jane Doe would be accounted under her Married status.
Table 5-10 County Table
Customer Name | Marital Status | Tax |
---|---|---|
Jane Doe |
Married |
200 |
Jim Doe |
Married |
200 |
John Doe |
Single |
200 |
Example 5-8 As Is Query for Tax Collection Data Split by County
As Is Query for Tax Collection Data Split by County
Assuming the Data used for the examples, issue the following SQL statement to show the tax collection data split by county:
SELECT cust.cust_nm, cnty.cnty_nm, SUM(fct.tax) FROM TaxPaid fct, customer cust, county cnty WHERE fct.cust_id = cust.cust_id AND cust.cnty_id = cnty.cnty_ID GROUP BY cut.cust_nm, cnty.cnty_nm ORDER BY 1,2,3;
The results of this query are shown below. Note that since John Doe lived in two different counties, there are two rows of data for John Doe.
Table 5-11 County Table
Customer Name | County Name | Tax |
---|---|---|
Jane Doe |
Sunnyvale |
200 |
Jim Doe |
Cupertino |
200 |
John Doe |
Cupertino |
100 |
John Doe |
Sunnyvale |
100 |
Example 5-9 As Was Queries for Tax Collection Data Split by County
As Was Queries for Tax Collection Data Split by County
Assuming the Data used for the examples, issue the following SQL statement to show the tax collection data split by county using an analysis date of January 15, 2012:
select cust.cust_nm, cnty.cnty_nm, sum(fct.tax) from TaxPaid fct, ( select cust_act.cust_id, cust_pit.cust_cd, cust_pit.cust_nm, cust_pit.m_status, cust_pit.gender, cust_pit.cnty_id, cust_pit.cnty_cd, cust_pit.cnty_nm from Customer cust_act inner join ( select cust_id, cust_cd, cust_nm, m_status, gender, cnty_id, cnty_cd, cnty_nm from Customer cust_all where to_date('15-JAN-2012', 'DD-MON-YYYY') between eff_from and eff_to ) cust_pit on (cust_act.cust_cd = cust_pit.cust_cd ) cust, ( select cnty_act.cnty_id, cnty_pit.cnty_cd, cnty_pit.cnty_nm from County cnty_act inner join ( select cnty_id, cnty_cd, cnty_nm from County cnty_all where to_date('15-JAN-2012', 'DD-MON-YYYY') between eff_from and eff_to ) cnty_pit on (cnty_act.cnty_cd = cnty_pit.cnty_cd) ) cnty where fct.cust_id = cust.cust_id and cust.cnty_id = cnty.cnty_id GROUP BY cust.cust_nm, cnty.cnty_nm order by 1,2,3;
The results of this query are shown below. Note that since John Doe was in Sunnyvale as of the analysis date of January 15, 2012, all tax for John Doe is accounted for under the Sunnyvale county.
Table 5-12 County Table
Customer Name | County Name | Tax |
---|---|---|
Jane Doe |
Sunnyvale |
200 |
Jim Doe |
Cupertino |
200 |
John Doe |
Sunnyvale |
200 |
Assume instead that you issued the exact same query except that for the to_date
phrase you specify 09-FEB-2012
rather than 15-JAN-2012
. Since John Doe lived in Cupertino on February 9, 2012, then, as shown below all tax for John Doe would be accounted under Cupertino.
Table 5-13 County Table
Customer Name | County Name | Tax |
---|---|---|
Jane Doe |
Sunnyvale |
200 |
Jim Doe |
Cupertino |
200 |
John Doe |
Cupertino |
200 |