A parent-child hierarchy is a hierarchy of members that all have the same type.
This contrasts with level-based hierarchies, where members of the same type occur only at a single level of the hierarchy.
This section contains the following topics:
The most common real-life occurrence of a parent-child hierarchy is an organizational reporting hierarchy chart.
In an organizational reporting hierarchy chart, the following apply:
Each individual in the organization is an employee.
Each employee, apart from the top-level managers, reports to a single manager.
The reporting hierarchy has many levels.
These conditions illustrate the basic features that define a parent-child hierarchy, namely:
A parent-child hierarchy is based on a single logical table, for example, the "Employees" table
Each row in the table contains two identifying keys, one to identify the member itself, the other to identify the "parent" of the member, for example, Emp_ID and Mgr_ID
The image shows an example of a multi-level parent-child hierarchy.
The following table shows how this parent-child hierarchy could be represented by the rows and key values in an Employees table.
Emp_ID | Mgr_ID |
---|---|
Andrew |
null |
Barbara |
Andrew |
Carlos |
Andrew |
Dawn |
Barbara |
Emre |
Barbara |
You can expose logical dimensions with parent-child hierarchies to Oracle BI Answers users by creating presentation hierarchies that are based on particular logical dimensions. Creating hierarchies in the Presentation layer enables users to create hierarchy-based queries. See Working with Presentation Hierarchies and Levels for more information.
This section contains the following topics:
All the dimension members of a parent-child hierarchy occur in a single logical column, unlike with level-based hierarchies.
In a parent-child hierarchy, the parent of a member is in another row in the same logical column, pointed to by the parent key. This is unlike a level-based hierarchy, where the parent of a member is in a different logical column in the same row. In other words, navigation in a parent-child hierarchy follows data values, while navigation in a level-based hierarchy follows the metadata structure.
In level-based hierarchies, each level is named, and occupies a position in the hierarchy that corresponds to a real-world attribute or category that is deemed useful for analysis. Unlike level-based hierarchies, where the number of levels is fixed at design time, there is no limit to the depth of a parent-child hierarchy, and the depth can change at run time due to new data.
Every Oracle BI Server parent-child hierarchy has two system-generated entities, Total and Detail, that are automatically defined for each parent-child hierarchy when the logical dimension is created. The Detail entity contains all the hierarchy members.
These two system-generated entities are different from the implicit, inter-member levels between ancestors and descendants in a parent-child hierarchy. These implicit levels are referred to as parent-child hierarchical levels in this section.
Closely associated with levels is the concept of distance in parent-child hierarchies. The distance of one member from another is the number of parent-child hierarchical levels from the member to an ancestor or to a descendant. For example, the distance from a member to its parent is always 1, and the distance from Andrew to Emre is 2. See About Parent-Child Hierarchies for an image of this example.
In relational tables, the relationships between different members in a parent-child hierarchy are implicitly defined by the identifier key values in the associated base table.
However, for each Oracle BI Server parent-child hierarchy defined on a relational table, you must also explicitly define the inter-member relationships in a separate parent-child relationship table.
The parent-child relationship table must include four columns, as follows:
A column that identifies the member
A column that identifies an ancestor of the member
Note:
The ancestor may be the parent of the member, or a higher-level ancestor.
A distance column that specifies the number of parent-child hierarchical levels from the member to the ancestor
A leaf column that indicates if the member is a leaf member (1=Yes, 0=No)
The column names can be user defined. The data types of the columns must satisfy the following conditions:
The member and ancestor identifier columns have the same data type as the associated columns in the logical table that contains the hierarchy members.
Note:
The example shown in the table uses text strings for readability, but you normally use integer surrogate keys for member_key and ancestor_key, if they exist in the source dimension table.
The distance and leaf columns are INTEGER
columns.
Note the following about the rows in a parent-child relationship table:
Each member must have a row pointing at itself, with distance zero.
Each member must have a row pointing at each of its ancestors. For a root member, this is a termination row with null for the parent and distance values.
The table shows an example of a parent-child relationship table with rows that represent the inter-member relationships for the employees. See the figure in About Parent-Child Hierarchies.
Member_Key | Ancestor_Key | Distance | Isleaf |
---|---|---|---|
Andrew |
Andrew |
0 |
0 |
Barbara |
Barbara |
0 |
0 |
Carlos |
Carlos |
0 |
0 |
Dawn |
Dawn |
0 |
0 |
Emre |
Emre |
0 |
0 |
Andrew |
null |
null |
0 |
Barbara |
Andrew |
1 |
0 |
Carlos |
Andrew |
1 |
1 |
Dawn |
Barbara |
1 |
1 |
Dawn |
Andrew |
2 |
1 |
Emre |
Barbara |
1 |
1 |
Emre |
Andrew |
2 |
1 |
You generate scripts to create and populate the parent-child relationship table through a wizard that you can invoke when you define the parent-child hierarchy. Note the following about the create and load scripts:
You run the create script only once, to create the parent-child relationship table in the data source.
You must run the load script after each time the data changes in the dimension table. Because of this, you typically call the load script in your ETL processing. The load script reloads the entire parent-child relationship table; it is not incremental.
If you do not choose to use the wizard, then you must create the parent-child relationship table manually and then import it into the Physical layer before associating it with the parent-child hierarchy. In this latter case, it is also your responsibility to populate the table with the data required to describe the inter-member relationships in the parent-child hierarchy.
The key elements that you must define for a parent-child hierarchy are the identifier columns for the member and the parent of the member.
This basic principle applies to all parent-child hierarchies, regardless of the data source from which the hierarchy is derived.
Parent-child hierarchies that are based on relational tables must have an accompanying parent-child relationship table. See About Parent-Child Relationship Tables for more information.
To create dimensions with a parent-child hierarchy:
For parent-child hierarchies based on relational tables, you must define a parent-child relationship table.
See About Parent-Child Relationship Tables for more information.
When you create the parent-child relationship table, you must choose one of the following options:
Select a previously-created parent-child relationship table
Use a wizard that will generate scripts to create and populate the parent-child relationship table
To define parent-child relationship tables:
Fact tables in level-based hierarchies might only contain facts for a single level of the hierarchy.
Facts for higher-level dimension members can be calculated by aggregating the facts from the lower-level fact table or from a higher-level summary table.
In contrast, parent-child hierarchies require data modelers to make some additional decisions related to the following:
How to store the base facts in the fact table
How to aggregate the base facts to obtain the facts for higher-level members of the parent-child hierarchy
This section describes how to store and aggregate facts for parent-child hierarchies and contains the following topics:
There are two options for storing the base facts in the fact table for parent-child hierarchies.
You can use the following options:
Store facts for only the leaf members of the parent-child hierarchy.
Store facts for members at any level of the parent-child hierarchy, including non-leaf members.
The first option is more appropriate if the facts for the non-leaf members of the parent-child hierarchy can be derived entirely from the facts of the leaf members. For example, if you have a parent-child product hierarchy in which the actual product members appear only as leaf members of the hierarchy, then it makes sense for a revenue fact table to only record revenue facts for the leaf members of this product hierarchy. The revenue figures for the non-leaf members of the product hierarchy such as the product categories can be derived entirely by aggregating the facts for the leaf product members at the bottom of the hierarchy.
The image shows example data for a situation where facts are stored only for leaf members in a parent-child hierarchy.
The following table shows example data for the dimension table PRODUCT_DIM:
MemberKey | Name | ParentKey |
---|---|---|
P1 |
Product1 |
C1 |
P2 |
Product2 |
C1 |
C1 |
Category1 |
C2 |
C2 |
Category2 |
C3 |
C3 |
Category3 |
- |
The following table shows example data for the fact table REVENUE_FACTS:
ProductKey | YearKey | Revenue |
---|---|---|
P1 |
2011 |
100,000 |
P1 |
2012 |
105,000 |
P2 |
2011 |
75,000 |
P2 |
2012 |
80,000 |
The second option in which facts are stored for members at any level of the parent-child hierarchy is necessary when the facts for the non-leaf members are not completely derived from facts of the leaf members. A good example is a sales person hierarchy in which a sales person might report to a manager who is also a sales person. Each individual sales person, including the manager, could have a different revenue figure stored in the fact table.
The table shows example data for this situation.
Facts Stored for Both Leaf and Non-Leaf Members
The following table shows example data for the dimension table SALES_REP_DIM:
MemberKey | Name | ParentKey |
---|---|---|
101 |
Phillip |
201 |
102 |
Vivian |
201 |
201 |
Jacob |
301 |
202 |
Audrey |
301 |
301 |
Ryan |
- |
The following table shows example data for the fact table REVENUE_FACTS:
SalesRepKey | YearKey | Revenue |
---|---|---|
101 |
2012 |
1,200,000 |
102 |
2012 |
1,100,000 |
201 |
2012 |
250,000 |
202 |
2012 |
1,400,000 |
Another case in which storing facts for both leaf and non-leaf members is appropriate is when the rules for aggregating the parent-child hierarchy are complex, or when aggregating the hierarchy at query time is expensive and would lead to unacceptably long query response times. In this case, the fact table would store preaggregated facts for the non-leaf members in addition to the facts stored for the leaf members.
Whether or not the fact table contains data for both leaf members and non-leaf members or just for leaf members, the data modeler must determine how to aggregate the stored facts to calculate the aggregated facts for higher level members of the parent-child hierarchy.
In addition to choosing the correct aggregation function for the measure, the data modeler must decide whether the fact values recorded for lower-level members should be rolled up to calculate the values for higher-level members. In some cases, rolling up the facts of lower-level members of the parent-child hierarchy makes sense. In other cases, such as with a preaggregated fact table or a measure that is intended to show each member's individual contribution, rolling up the facts from lower-level members of the parent-child hierarchy is incorrect.
Rolling up Facts from Lower-Level Members of a Parent-Child Hierarchy
If a fact table only stores facts for the leaf members of a parent-child hierarchy or if the fact table only records each member's individual contribution, then most likely the values stored in the fact table must be rolled up to obtain the correct aggregated value for higher-level members of the parent-child hierarchy. Rolling up the facts along a parent-child hierarchy is achieved by joining the fact table to the dimension table through the parent-child relationship table. See Adding the Parent-Child Relationship Table to the Model for more information.
For a fact table that stores facts only for the leaf members such as the product revenue fact table in the example, this modeling technique calculates aggregate values that correctly summarize all the facts for the leaf-level members.
For a fact table that stores the individual contribution of both leaf members and non-leaf members, this technique computes a hierarchical aggregate that summarizes the individual contributions of the member and all the members beneath it.
Modeling Individual Contribution Measures
To be able to report the individual contribution of each member in addition to reporting the summarized hierarchical aggregate that rolls up the individual contributions of multiple members, the data modeler must create two separate fact logical table sources. One fact logical table source maps the base fact table and the parent child relationship table. This is the logical table source for the hierarchical aggregate measure. The second fact logical table source maps only an alias of the fact table. This fact table alias should join directly with the dimension table rather that joining indirectly through the parent-child relationship table. This is the logical table source for the individual contribution measure.
Modeling Preaggregated Measures
Some fact tables contain preaggregated data that is populated for all members of the parent-child hierarchy. For example, the fact value for a root member might be populated with the aggregation of the data for all of its descendent members. It is important to ensure that queries do not aggregate the members from this dimension to avoid erroneous results.
To correctly model this type of parent-child hierarchy, you must still create a parent-child relationship table to support hierarchical filter functions like IsAncestor and IsDescendant. However, you can join the parent-child dimension table directly with the fact table rather than joining through the parent-child relationship table. Doing so ensures that the preaggregated member value is returned, rather than rolling up all the descendants.
Note:
Do not modify the parent-child relationship table script to only include the "self" rows, because doing so would break the IsAncestor and IsDescendant functions.
To achieve the correct aggregation for dimensions of this type, you must first determine what you want to see as a grand total when the parent-child hierarchy is aggregated. For example, assume that your hierarchy contains a single root member, and you want to display the preaggregated value for this root member. In this case, you first create an additional fact logical table source mapped at the Total level of the parent-child hierarchy. Then, in the logical table source, create a WHERE clause filter that selects only the root member.
With this model in place, for queries that are at the Total level of the parent-child hierarchy, the Oracle BI Server selects the aggregate logical table source and applies the root member WHERE clause filter. For queries that are at the Detail level, the Oracle BI Server selects the detailed logical table source and returns the preaggregated member values. In either case, it does not matter how the aggregation rule is set, because there is a preaggregated source at each level.
Note that this approach only works if the queries are either at the Total or Detail level of the parent-child dimension. However, for queries that group by some non-unique attribute of the parent-child dimension, the aggregation might not be correct. For example, if an Employee dimension has a Location attribute, and a query groups by Employee.Location, then there will likely be some double counting because an employee often reports to other employees at the same location. Because of this, when fact tables contain preaggregated member values, you should avoid grouping by non-unique attributes of the parent-child dimension. If grouping by those attributes is unavoidable, then you should model them as separate dimensions.
For measures in fact tables that should be aggregated by rolling up the facts from lower-level members, you must edit Physical layer joins to include the parent-child relationship table.
You also need to add the parent-child relationship table to the appropriate logical table source.
Note:
For fact tables containing preaggregated data for a parent-child hierarchy or for individual contribution measures, you should join the parent-child dimension table directly with the fact table rather than joining through the parent-child relationship table.
Joining the parent-child dimension table directly with the fact table ensures that the preaggregated value or individual contribution value is returned, rather than rolling up all the descendants. When preaggregated data is populated for all members, do not add the parent-child relationship table to the logical table source to avoid overcounting. See Modeling Preaggregated Measures for more information.
To add the parent-child relationship table to the model:
In the Administration Tool, in the Physical layer of the repository, open the Physical Diagram so that it shows the parent-child relationship table and associated dimension table and fact tables. To do this, right-click the appropriate physical tables and select Physical Diagram > Selected Object(s) Only.
Delete the direct joins from the dimension table to each of the fact tables.
Create joins from each the fact tables to the dimension table through the parent-child closure table, as follows:
Create a join from the parent-child relationship table to the dimension table using the ancestor key.
Create joins from the fact tables to the parent-child relationship table using the member key.
In the Business Model and Mapping layer, double-click the logical table source for the logical fact table that is used in your parent-child hierarchy.
In the General tab of the Logical Table Source dialog, click the Add button.
Browse to locate the parent-child relationship table in the Physical layer and click Select.
Click OK in the Logical Table Source dialog.
For parent-child hierarchies based on relational tables, you must ensure that the data in the parent-child relationship table accurately reflects the inter-member relationships in the dimension.
You may have created scripts manually to create and populate the parent-child relationship table, or you may have used the Generate Parent-Child Relationship Table Wizard to create the scripts. You must run these scripts, adapting them if necessary, as often as required to guarantee the integrity of the parent-child relationships in the hierarchy. You typically want to add the Populate script to your ETL process so that it runs after the dimension table is updated.