Using Trees

This section provides an overview of using trees with PS/nVision and discusses how to:

  • Enhance tree performance with SQL techniques.

  • Set tree performance options.

PS/nVision performance may suffer when trees are used, especially when the SQL statements used to retrieve data access two or more trees at once. On some database platforms, the Where criteria used with most tree joins can cause the database optimizer to select the wrong access path, making the PSTREESELECTnn table an obstacle to performance rather than an aid. This often happens when the tree uses ranges of detail values.

To address these performance issues, PS/nVision includes techniques for building SQL that implement tree criteria, and it also implements SQL that is readily understood by database optimizers, yielding better access paths with less need for index tuning and so on. Additionally, you have control over the techniques used, so you can tune the performance of individual reports.

PS/nVision includes a number of SQL techniques to improve performance whenever trees are used. You can use static selectors or dynamic selectors. You can specify where or how to use selectors. A selector represents nodes of a single tree and is represented by the set of rows in the PSTREESELECTnn table having a single SELECTOR_NUM value.

Static Selectors

A static selector represents the entire tree, and it remains valid until the tree changes. For all database platforms, these selectors contain ranges (unless the tree had no ranges).

Static selectors do not need to be rebuilt except when the tree changes. However, the SQL statements that join static selectors to fact tables (such as ledgers) can be complex because they include both range predicates (if the tree has ranges) and node criteria to select that portion of the tree required on a particular section of a report. This can be difficult for database optimizers to process, especially if multiple trees use this technique.

Note: You should not run reports while you are modifying trees; it could lead to incorrect results.

Dynamic Selectors

A dynamic selector is created for use in a section of a single report, so it only lasts to the end of the report request. This section, however, may be selected several times, especially if the report uses a scope to produce multiple instances.

In addition to the ability to use a pre-existing (static) selector, PS/nVision can build one on the fly when preparing to run a report. This technique can boost performance, but can also create more overhead, especially if there are multiple users running the same report (using static selectors, users can share selectors).

PS/nVision builds each dynamic tree selector for a specific set of criteria (such as a set of rows or the current instance node), so that a selector (SELECTOR_NUM value) has exactly the nodes needed for a group of rows or columns to be retrieved with a Select statement. This eliminates the need for the often-cumbersome selection criteria PS/nVision generates for a static selector:

TREE_NODE_NUM BETWEEN x AND y OR TREE_NODE_NUM BETWEEN...

A dynamic selector creates a new SELECTOR_NUM value that PS/nVision uses and then deletes once the report is complete. Therefore, distribution statistics (or skew statistics) are not present for that selector. (Distribution statistics are still a factor for static selectors.) With certain PS/nVision reports, the absence of distribution statistics can improve performance significantly; that’s because distribution st atistics can make the SELECTOR_NUM criteria appear less selective to the database system optimizer, preventing the optimizer from accessing the selector table first.

The disadvantage of dynamic selectors is that one or more selectors may be needed to process a single report. In some cases, the time used to create the selectors can exceed the time saved by using them. Dynamic selectors are most effective on joins that drive the access path; static selectors may be just as fast, or faster, for additional criteria that do not affect the access plan.

Single-Value Selectors

Use single-value selectors only in conjunction with dynamic selectors. Combining dynamic and single-value selector techniques improves the performance of PS/nVision in many cases where trees are used.

Single-value selectors enable a more efficient equi-join between PSTREESELECTnn.RANGE_FROM_nn and the criteria field in the fact table (the one you are selecting data from). In building the dynamic selector, we do not merely copy the ranges of values (such as account numbers) from PSTREELEAF into PSTREESELECTnn. Instead, we join the tree ranges to the underlying detail table (such as the GL_ACCOUNT_TBL), and insert the individual detail values into the RANGE_FROM_nn column of PSTREESELECTnn. This may generate more rows in this table, but it can also generate a more efficient join without maintaining the tree with individual detail values (the only way you could get equi-joins without this option).

A disadvantage of this technique is that, especially where the tree has large ranges containing many detail values, single value selectors can contain many more rows than ranged selectors have. Unless the join is processed in a more efficient manner, the number of rows in the selector can mean slower join processing.

Suppress Join

The suppress join technique eliminates a SQL join by retrieving the detail ranges associated with the selected node and coding them in the Select statement. This technique is most effective in the following cases:

  • The selected tree is used in the scope you expect to be used with this report, and each instance of that report is a tree node.

  • The node or nodes selected represent a relatively small number of detail values or ranges.

The suppress join technique cannot be used where PS/nVision needs to group the answer set by tree node number, because these numbers are not available without joining the data to the tree. This happens, for example, if multiple rows or columns with otherwise similar criteria select different nodes of this tree. This is typical in the rows of most financial reports. However, if nPlosion to underlying details is specified for these rows or columns, the suppress join technique can be used, because PS/nVision can Group By the detail values rather than tree nodes.

Additionally, when the selected node (or nodes) includes large numbers of detail ranges, the suppress join technique may not be practical or efficient. While PS/nVision can build the very large statements that can result in this case, use of the suppress join technique when tree nodes refer to very large lists of detail ranges can be slower than the other techniques, or even fail to run because the statements exceed the size limits imposed by your database platform.

Sub-Select Tree Selector

The sub-select method is very similar to a join. Instead of adding the tree selector to the From list of the main query, the tree selector criteria and its relation to the data (for example, DEPTID) in the main query is within an "Exists (Select ...)" clause in the Where portion of the main query. This is called a correlated sub-query, because part of the criteria in the sub-select relates to data in the main query (A.DEPTID=B.RANGE_FROM_05). This is what makes it so much like a join. Database engines and optimizers differ in how they handle this syntax. Some process a correlated sub-query just like a join, while others are subtly different. You may need to experiment to determine which works better for which reports on your database platform.

Data from the sub-select statement cannot be included in the main Select list; thus, none can be visible to the program (for example, PS/nVision) that is running the query. This is because the sub-select statement is hidden in the Where clause, rather than appearing in the From clause. When PS/nVision retrieves multiple nodes of data for different rows or columns of a report, it uses the node number (from the tree selector) to distinguish the data, and this means a join is required. If you specify either sub-select or Suppress Join in this case, PS/nVision ignores the option and forces a join. You need to specify join options even if not using a join method, because you may get a join after all. If you don't see a performance benefit using the sub-select method, we recommend using join instead, because it can be used whether or not tree node criteria are needed in the answer set.

Additional Options

Before the tree performance options were added, PS/nVision invoked the following type of syntax when joining a selector for a tree with ranges:

WHERE _ field >= L.RANGE_FROM_nn AND field <= L.RANGE_TO_nn _

This syntax is equivalent to using the Between predicate. It resulted in better access plans on the DB2/MVS platform. PS/nVision now includes an option to generate the following syntax:

WHERE _ field BETWEEN L.RANGE_FROM_nn AND L.RANGE_TO_nn _

This syntax should result in better access plans on certain database platforms.

Note: This option is only relevant to trees that use range selectors.

SQL Examples

Below is an example of the SQL alternatives made possible through the use of tree performance options. Here is the default query (with the node criteria highlighted):

SELECT L1.TREE_NODE_NUM, SUM(A.POSTED_TOTAL_AMT)
FROM PS_LEDGER A, PSTREESELECT05 L, PSTREESELECT06 L1
WHERE A.LEDGER='ACTUALS' AND
       A.FISCAL_YEAR=1996 AND
       A.ACCOUNTING_PERIOD BETWEEN 1 AND 8 AND
       L.SELECTOR_NUM=216 AND
       A.BUSINESS_UNIT>=L.RANGE_FROM_05 AND
       A.BUSINESS_UNIT<=L.RANGE_TO_05 AND
       (L.TREE_NODE_NUM BETWEEN 1000000000 AND 1666666665) AND
       A.CURRENCY_CD='USD' AND
       L1.SELECTOR_NUM=215 AND
       A.ACCOUNT>=L1.RANGE_FROM_06 AND
       A.ACCOUNT<=L1.RANGE_TO_06 AND
       (L1.TREE_NODE_NUM BETWEEN 1916275676 AND 1923430847) AND
       A.STATISTICS_CODE=' '
GROUP BY L1.TREE_NODE_NUM;

Here’s an alternative query using the Suppress Join technique for business unit criteria and a dynamic, single-value selector for ACCOUNT:

SELECT L1.TREE_NODE_NUM, SUM(A.POSTED_TOTAL_AMT)
FROM PS_LEDGER A, PSTREESELECT06 L1
WHERE A.LEDGER='ACTUALS' AND
      A.FISCAL_YEAR=1996 AND
      A.ACCOUNTING_PERIOD BETWEEN 1 AND 8 AND
      (A.BUSINESS_UNIT BETWEEN ‘B0006' AND ‘B0006'
       	OR A.BUSINESS_UNIT BETWEEN ‘B5030' AND ‘B5030'
       	OR A.BUSINESS_UNIT BETWEEN ‘B9013' AND ‘B9014'
       	OR A.BUSINESS_UNIT BETWEEN ‘B0015' AND ‘B0015'
       	OR A.BUSINESS_UNIT BETWEEN ‘B9026' AND ‘B9026'
       	OR A.BUSINESS_UNIT BETWEEN ‘B0019' AND ‘B0031'
       	OR A.BUSINESS_UNIT BETWEEN ‘B0016' AND ‘B0018') AND
       A.CURRENCY_CD='USD' AND
       L1.SELECTOR_NUM=1215 AND
       A.ACCOUNT=L1.RANGE_FROM_06 AND
       A.STATISTICS_CODE=' '
GROUP BY L1.TREE_NODE_NUM;

Next is another form of the same query, with dynamic business unit selectors and dynamic ACCOUNT selectors:

SELECT L1.TREE_NODE_NUM, SUM(A.POSTED_TOTAL_AMT)
FROM PS_LEDGER A, PSTREESELECT05 L, PSTREESELECT06 L1
WHERE A.LEDGER='ACTUALS' AND
      A.FISCAL_YEAR=1996 AND
      A.ACCOUNTING_PERIOD BETWEEN 1 AND 8 AND
      L.SELECTOR_NUM=1216 AND
      A.BUSINESS_UNIT = L.RANGE_FROM_05 AND
      A.CURRENCY_CD='USD' AND
      L1.SELECTOR_NUM=1215 AND
      A.ACCOUNT=L1.RANGE_FROM_06 AND
      A.STATISTICS_CODE=' '
GROUP BY L1.TREE_NODE_NUM;