Using CALCPARALLEL Parallel Calculation

To change from the default serial calculation to CALCPARALLEL parallel calculation, you change one or two configuration settings and restart the server, or add an instruction to the calculation script.

See Enabling CALCPARALLEL Parallel Calculation.

The following topics discuss the details of parallel calculation.

Analysis of Feasibility of CALCPARALLEL

Before each calculation pass for which you have enabled it, Essbase evaluates whether the CALCPARALLEL method of parallel calculation is possible.

Essbase analyzes the outline and the calculation requested for each calculation pass. Remember that one calculation may require multiple passes. Some situations may create the need for multiple passes, including dynamic calculation, the presence of a member tagged as two-pass, or calculations that create certain kinds of interdependencies. See Calculation Passes.

If Essbase determines that parallel calculation using the CALCPARALLEL method is possible, Essbase splits the calculation into smaller tasks that are independent of each other. During the calculation, Essbase performs the smaller tasks simultaneously.

However, Essbase resorts to serial calculation if there are complex interdependencies between formulas that participate in the pass. Such interdependencies can render parallel calculation impossible.

Consider whether the FIXPARALLEL method might be more suited to your use case. See Using FIXPARALLEL Parallel Calculation.

CALCPARALLEL Parallel Calculation Guidelines

Outline structure and application design determine whether enabling parallel calculation can improve calculation performance. Before you enable CALCPARALLEL parallel calculation, review the following guidelines, which will help you get the full benefit of parallel calculation:

  • One or more formulas present in a calculation may prevent Essbase from using CALCPARALLEL parallel calculation even if it is enabled. For a description of formulas that may force serial calculation regardless of parallel calculation settings, see Formula Limitations.

  • Calculation tasks are generated along the last n sparse dimensions of an outline. These sparse dimensions used to identify tasks are called task dimensions. Essbase selects the number of task dimensions, n, dynamically unless you override it by specifying a value for the calculation command SET CALCTASKDIMS (or the application configuration setting CALCTASKDIMS).

    Order the sparse dimensions in an outline from smallest to largest, based on actual size of the dimension (as reported by the MaxL statement query database DBS-NAME get dbstats dimension). This ordering recommendation is consistent with recommendations for optimizing calculator cache size and consistent with other outline recommendations. For a description of situations that may need to use additional dimensions (more than the last sparse dimension) and for instructions on how to increase the number of sparse dimensions used, see Identifying Additional Tasks for Parallel Calculation.

  • CALCPARALLEL parallel calculation is effective on non-partitioned applications and these partitioned applications:

    • Replicated partitions

    • Transparent partitions, if the calculation occurs at the target database. The number of sparse dimensions specified by SET CALCTASKDIMS in a calculation script must be set to 1. For information on limitations imposed by the use of parallel calculation with transparent partitions, see Transparent Partition Limitations; for information on using SET CALCTASKDIMS, see Identifying Additional Tasks for Parallel Calculation.

  • Update transactions, such as calculations and data updates, are more resource-consuming requests than MDX queries or report scripts.

Relationship Between CALCPARALLEL Parallel Calculation and Other Essbase Features

The following topics discuss the relationship between CALCPARALLEL parallel calculation and other Essbase functionality.

Retrieval Performance

Placing the largest sparse dimension at the end of the outline for maximum parallel calculation performance may slow retrieval performance. See Optimizing Query Performance.

Formula Limitations

The presence of some formulas may force serial calculation. The following formula placements likely will force serial calculation:

  • A formula on a dense member, including all stored members and any Dynamic Calc members upon which a stored member may be dependent, that causes a dependence on a member of the dimension that is used to identify tasks for parallel calculation.

  • A formula that contains references to variables declared in a calculation script that uses @VAR, @ARRAY, @XREF, or @XWRITE. Consider using FIXPARALLEL.

  • A sparse dimension member formula using @XREF, and the dimension for the sparse member is fully calculated. @XREF does not force serial calculation when it is on dense Dynamic Calc members that are not dependent on other stored members during the batch calculation.

  • A member formula that causes a circular dependence. For example, member A has a formula referring to member B, and member B has a formula referring to member C, and member C has a formula referring to member A.

  • A formula on a dense or sparse member with a dependency on a member or members from the dimension used to identify tasks for parallel processing.

  • A sparse dimension member formula that contains references to members from other sparse dimensions.

If you need to use a formula that might prevent parallel calculation, consider using FIXPARALLEL. Otherwise, you can either mark the member of the formula as Dynamic Calc, or exclude the formula from the scope of the calculation. To see whether a formula is preventing parallel calculation, check the application log. For relevant error messages, see Monitoring CALCPARALLEL Parallel Calculation.
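As a minimal sketch of the log check described above, the following Python helper collects the member names reported in messages of the form quoted in Monitoring CALCPARALLEL Parallel Calculation. The function name is hypothetical, and the exact spacing and punctuation of the message in a real application log is an assumption.

```python
import re

# Pattern based on the message text quoted in this document; how the member
# name is delimited in a real log (for example, with brackets) is an assumption.
PREVENTS_PARALLEL = re.compile(
    r"Formula on \(or backward dependence from\) mbr (.+?) "
    r"prevents calculation from running in parallel"
)


def members_forcing_serial(log_text):
    """Return member names, in first-seen order, whose formulas forced serial calculation."""
    seen = []
    for match in PREVENTS_PARALLEL.finditer(log_text):
        name = match.group(1).strip()
        if name not in seen:
            seen.append(name)
    return seen
```

Each returned member is a candidate for FIXPARALLEL, for tagging as Dynamic Calc, or for exclusion from the scope of the calculation.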

Calculator Cache

At the start of a calculation pass, Essbase checks the calculator cache size and the degree of parallelism and then uses the calculator cache bitmap option appropriate for maximum performance. Therefore, the bitmap option used for parallel calculation may be different from that used for serial calculation.

For example, assume Essbase performs a serial calculation and uses multiple bitmaps and a single anchoring dimension. Without explicit change of the calculator cache size, Essbase might perform a parallel calculation using only a single bitmap and a single anchoring dimension.

You can determine the calculator cache mode that controls the bitmap options by checking the application log at the start of each calculation pass for an entry similar to the following:

Multiple bitmap mode calculator cache memory usage has a limit of [50000] bitmaps.

Transparent Partition Limitations

Parallel calculation with transparent partitions has the following limitations:

  • You cannot use parallel calculation across transparent partitions unless the calculation occurs at the target.

  • You must set the number of task dimensions to 1. To do this, use the SET CALCTASKDIMS calculation command or the CALCTASKDIMS configuration setting.

  • You must increase the calculator cache so that multiple bitmaps can be used. You can identify the calculator cache mode that controls the bitmap options by checking the application log at the start of each calculation pass for an entry similar to the following:

    Multiple bitmap mode calculator cache memory usage has a limit of [50000] bitmaps.

Checking Current CALCPARALLEL Settings

You can check either the application configuration or the calculation script that you plan to use to see if parallel calculation is enabled.

To check whether parallel calculation has been enabled at the application level, search for the parameter CALCPARALLEL, and check its specified value.

The number of threads that can simultaneously perform tasks to complete a calculation is specified by a value between 1 and 128. Block storage and aggregate storage databases support up to 128 threads.

To check whether a calculation script sets parallel calculation, look for the SET CALCPARALLEL command. The number of threads that can simultaneously perform tasks to complete a calculation is specified by a value between 1 and 128. Block storage and aggregate storage databases running on 64-bit platforms support up to 128 threads. Review the script carefully, because the script may enable or disable parallel calculation more than once. Alternatively, a script can use FIXPARALLEL command blocks for parallel calculation.
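Because a script may set parallel calculation more than once, it can help to list every occurrence in order. The following Python sketch does this for a calculation script held as text. The helper name is hypothetical, and the sketch assumes the thread count is written as a literal integer; it does not handle substitution variables or commented-out lines.

```python
import re

# Matches "SET CALCPARALLEL <n>;" case-insensitively. Assumes a literal
# integer value; comments and substitution variables are not handled.
SET_CALCPARALLEL = re.compile(r"\bSET\s+CALCPARALLEL\s+(\d+)\s*;", re.IGNORECASE)


def calcparallel_settings(script_text):
    """Return (line_number, thread_count) for each SET CALCPARALLEL in the script."""
    found = []
    for line_number, line in enumerate(script_text.splitlines(), start=1):
        for match in SET_CALCPARALLEL.finditer(line):
            found.append((line_number, int(match.group(1))))
    return found
```

An empty result means the script contains no literal SET CALCPARALLEL command, in which case the application-level CALCPARALLEL setting, if any, applies.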

Enabling CALCPARALLEL Parallel Calculation

To use CALCPARALLEL parallel calculation, either set the CALCPARALLEL configuration setting for the application or add the SET CALCPARALLEL command to a calculation script.

To enable parallel calculation:

  1. If you plan to enable parallel calculation in the application configuration, check the current status to see whether an entry exists. Use the process described in Checking Current CALCPARALLEL Settings.
  2. Add or modify the CALCPARALLEL configuration setting, or add SET CALCPARALLEL to a calculation script.
  3. If needed, enable Essbase to use more than one sparse dimension to identify tasks for parallel calculation. See Identifying Additional Tasks for Parallel Calculation.
  4. Run the calculation.

Tip:

You can combine the use of CALCPARALLEL and SET CALCPARALLEL if the site requires it. For example, you can set CALCPARALLEL as off at the server level, and use a calculation script to enable and disable parallel calculation as needed.

Identifying Additional Tasks for Parallel Calculation

By default, Essbase uses an iterative technique to select the optimal number of task dimensions to use for CALCPARALLEL parallel calculation.

If necessary, you can enable Essbase to use a specific number, n, of task dimensions. For example, if you have a FIX statement on a member of the last sparse dimension, you can include the next-to-last sparse dimension from the outline as well. Because each unique member combination of these two dimensions is identified as a potential task, more and smaller tasks are created, increasing the opportunities for parallel processing and improving load balancing.
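To illustrate why adding a task dimension creates more and smaller tasks, the following Python sketch computes an upper bound on the number of potential tasks as the product of the member counts of the last n sparse dimensions. The function name and the dimension sizes are hypothetical, and the real task count can be lower, for example when a FIX statement restricts the members involved.

```python
from math import prod


def potential_tasks(sparse_member_counts, task_dims):
    """Upper bound on tasks: product of member counts of the last task_dims sparse dimensions."""
    if not 1 <= task_dims <= len(sparse_member_counts):
        raise ValueError("task_dims must be between 1 and the number of sparse dimensions")
    return prod(sparse_member_counts[-task_dims:])


# Hypothetical sparse dimension sizes, in outline order (smallest to largest).
sizes = [12, 40, 250]
one_dim = potential_tasks(sizes, 1)   # last sparse dimension only
two_dims = potential_tasks(sizes, 2)  # last two sparse dimensions
```

With these assumed sizes, one task dimension yields at most 250 potential tasks, and two task dimensions yield at most 10,000.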

To specify the number of task dimensions for parallel calculation:

  1. If you are not sure, verify whether parallel calculation is enabled, using the process described in Checking Current CALCPARALLEL Settings. Without CALCPARALLEL enabled (or SET CALCPARALLEL used in a calculation script), CALCTASKDIMS has no effect.
  2. Optional: Essbase selects a default number, n, of task dimensions to use for parallel calculation and prints this number in the application log file as an informational message; for example: Parallelizing using [2] task dimensions. To override the default n setting, add or modify the CALCTASKDIMS configuration setting for the application, or use the calculation script command SET CALCTASKDIMS.
  3. Run the calculation script.

Note:

In some cases, Essbase uses fewer dimensions to identify tasks than is specified by CALCTASKDIMS or SET CALCTASKDIMS.

Tuning CALCPARALLEL with Log Messages

If you are using CALCPARALLEL parallel calculation, you may encounter the following log messages:

Current selection of task dimensions [n] will generate insufficient number of tasks [n] for parallel calculation. See whether calculation time can be improved by increasing the number of task dimensions by one (see SET CALCTASKDIMS topic in the documentation). Also, consider using FIXPARALLEL to make custom task selections that are different from CALCPARALLEL.

This message means that, during a parallel calculation, Essbase refrained from increasing the number of task dimensions because doing so might have made the tasks too small. When tasks become too small, calculation scheduling overhead can outweigh the benefits of parallelism. However, when tasks are too large, there might not be enough tasks for the parallel calculation threads to work on.

If the next potential task dimension is not the first sparse dimension, consider increasing the number of task dimensions by one, using the SET CALCTASKDIMS calc command (or the CALCTASKDIMS configuration setting), and observe whether that improves the speed of the calculation. Also, consider using FIXPARALLEL to make custom task selections that are different from CALCPARALLEL (see Using FIXPARALLEL Parallel Calculation).

Current number of task dimensions [n] for parallel calculation might have caused too many tasks [n] to be generated. See whether calculation time can be improved by decreasing the number of task dimensions by one (see SET CALCTASKDIMS topic in the documentation). Also, consider using FIXPARALLEL to make custom task selections that are different from CALCPARALLEL.

For parallel calculation, having a sufficient number of tasks helps to reduce the effects of data skew. However, too many tasks (even for appropriately sized tasks) can cause the scheduling overhead to outweigh the benefits. Essbase targets an optimal range. If you see the above message, it means that Essbase tried to meet the recommended minimum number of tasks by adding one more task dimension; in doing so, it is possible that the upper boundary for task count may have been crossed.

If the last task dimension selected by Essbase is not the only task dimension, consider decreasing task dimensions by one, using the SET CALCTASKDIMS calc command (or the CALCTASKDIMS configuration setting), and observe whether that improves the speed of the calculation. Also, consider using FIXPARALLEL to make custom task selections that are different from CALCPARALLEL.
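The two tuning messages can be turned into a starting value for SET CALCTASKDIMS with a small script. The following Python sketch is one way to do that, assuming the log lines match the message text quoted above. The function name is hypothetical, and the sketch cannot check whether the next potential task dimension is the first sparse dimension, so treat an increase as a value to try rather than a recommendation.

```python
import re

# Patterns based on the two messages quoted in this section.
INSUFFICIENT = re.compile(
    r"Current selection of task dimensions \[(\d+)\] will generate "
    r"insufficient number of tasks \[(\d+)\]"
)
TOO_MANY = re.compile(
    r"Current number of task dimensions \[(\d+)\] for parallel calculation "
    r"might have caused too many tasks \[(\d+)\]"
)


def suggest_task_dims(log_line):
    """Return (current_dims, task_count, dims_to_try), or None for other log lines."""
    match = INSUFFICIENT.search(log_line)
    if match:
        dims, tasks = int(match.group(1)), int(match.group(2))
        return dims, tasks, dims + 1
    match = TOO_MANY.search(log_line)
    if match:
        dims, tasks = int(match.group(1)), int(match.group(2))
        # Never go below one task dimension.
        return dims, tasks, max(1, dims - 1)
    return None
```

After changing the value, rerun the calculation and compare elapsed times, because either message only indicates that the adjustment might help.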

Monitoring CALCPARALLEL Parallel Calculation

You can view events related to parallel calculation in the application log.

For each calculation pass, Essbase writes several types of information to the application log to support parallel calculation:

  • If you have enabled parallel calculation and Essbase has determined that parallel calculation can be performed, Essbase writes a message in the application log:

    Calculating in parallel with n threads

    n represents the number of concurrent tasks specified in CALCPARALLEL or SET CALCPARALLEL.

  • For each formula that prevents parallel calculation (forces serial calculation), Essbase writes a message to the application log:

    Formula on (or backward dependence from) mbr memberName prevents calculation from running in parallel.

    memberName represents the name of the member where the relevant formula exists. You can look in the application log for such messages and consider removing the formula or, if possible, tagging the relevant member or members as Dynamic Calc so they do not feature in the calculation pass.

  • Essbase writes a message to the application log specifying the number of tasks that can be executed concurrently (based on the data, not the value of CALCPARALLEL or SET CALCPARALLEL):

    Calculation task schedule [576,35,14,3,2,1]

    The example message indicates that 576 tasks can be executed concurrently. After the 576 tasks complete, 35 more can be performed concurrently, and so on.

    The benefit of parallel calculation is greatest in the first few steps and diminishes as fewer concurrent tasks are performed.

    The degree of parallelism depends on the number of tasks in the task schedule. The greater the number, the more tasks can run in parallel, and the greater the performance gains.

  • Essbase writes a message to the application log file indicating how many tasks are empty (contain no calculations):

    [Tue Jun 27 12:30:44 2007]Local/CCDemo/Finance/essexer/
    Info(1012681) Empty tasks [291,1,0,0,0,0]

    In the example, Essbase indicates that 291 of the tasks at level 0 were empty.

    If the ratio of empty tasks to the tasks specified in the calculation task schedule is greater than 50% (for example, 291 / 576), parallelism may not be giving you improved performance because of the high sparsity in the data model.

    You can change dense-sparse assignments to reduce the number of empty tasks and increase the performance gains from parallel calculation.
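The empty-task check described above can be sketched in Python as follows. The helper names are hypothetical, and the sketch assumes the log contains entries in the same form as the two example messages; it compares the first value of each list, as in the 291 / 576 example.

```python
import re


def bracket_list(log_text, label):
    """Return the integers in the first '<label> [a,b,...]' entry, or None if absent."""
    match = re.search(re.escape(label) + r"\s*\[([\d,\s]+)\]", log_text)
    if match is None:
        return None
    return [int(part) for part in match.group(1).split(",") if part.strip()]


def empty_task_ratio(log_text):
    """Ratio of empty tasks to scheduled tasks for the first schedule step, or None."""
    schedule = bracket_list(log_text, "Calculation task schedule")
    empty = bracket_list(log_text, "Empty tasks")
    if not schedule or not empty or schedule[0] == 0:
        return None
    return empty[0] / schedule[0]
```

For the example messages in this section, the ratio is 291 / 576, which is just over 50%, so the data model would be a candidate for a review of its dense-sparse assignments.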