Make Your Calcs Run Faster Than Ever!
This is Part Two of a two-part series.
In Part One of this series, we covered the basics of parallel calculation, the use of CALCPARALLEL, and how to view test results in the calc script’s execution log.
Understanding Log Information
One of the messages that you will see when using CALCPARALLEL is one like this:
Calculating in parallel with [8] threads
This tells you that Essbase can calculate in parallel mode and will use the number of threads you specified in your SET CALCPARALLEL command. It indicates that Essbase has assessed the feasibility of using parallel calc and determined that there are no conditions that prevent it. This is important, as there are several reasons that you may not be able to use parallel calc. These include the following:
- One or more member formulas used in the calculation is problematic, such as:
- A formula on a dense member that relies upon a member in a task dimension.
- Certain formulas that use @VAR, @ARRAY, @XREF, or @XWRITE.
- Multiple member formulas create a circular reference.
- Sparse member formulas refer to a member from another sparse dimension.
In the above cases, consider using FIXPARALLEL instead. We’ll cover this in a bit.
You will also see the following message when parallel calc is enabled:
Parallelizing using [x] task dimensions
As part of Essbase’s feasibility analysis, it was determined that [x] dimensions will be used to generate the list of tasks for parallelization. These are the last [x] dimensions in your outline which, in our “hourglass on a stick” order, includes dimensions that are probably single members in your main (baseline) FIX statement: Scenario, Version, Year, etc.
Finally, you will see the following messages at the beginning and end of the block of code being calculated in parallel:
Calculation task schedule [x,y,z,…]
Empty tasks [x,y,z,…]
These messages show you the task schedule – the number of tasks that Essbase will parallelize on first, second, etc. – and the number of empty tasks for each scheduled group. Empty tasks are simply tasks for which there was nothing to calculate, such as members that have no data. In general (and especially looking at the first value in each of these task statements) if the number of empty tasks is greater than 50% of those that are scheduled, then the sparsity of the task dimension is preventing a parallel calc from improving performance.
Using CALCTASKDIMS
When using CALCPARALLEL, Essbase automatically determines the number of task dimensions, and again, these are selected from the bottom of the outline up. Though you cannot change the order in which Essbase selects the task dimensions you can specify how many task dimensions Essbase uses in a CALCPARALLEL operation by adding the following statement:
SET CALCTASKDIMS x;
Essbase will attempt to use the specified number of dimensions for tasks. If this value is too small or too large you may receive errors about insufficient or too many tasks.
Pay close attention to your log, and test!
Using FIXPARALLEL
When using CALCPARALLEL and CALCTASKDIMS you are afforded some control over how Essbase parallelizes calculations. But there are limits, especially when it comes to selecting task dimensions. You could reorder your dimensions so that the bottom ones suit your parallel calc better, but this may not be appropriate for all your calcs and could create other issues. With that in mind, let’s take a look at FIXPARALLEL.
First, a reminder of the dimensions we’re working with:
- Account* (500 total / 400 stored)
- Period* (17 / 12)
- Entity (100 / 12)
- Product (1000 / 15)
- Currency (8 / 8)
- Scenario (3 / 3)
- Version (5 / 5)
- Year (10 / 10)
*dense dimensions
Say that in your calc you are going to perform an action on the children of USA, a member of the entity dimension that is the parent of the fifty states. Your calc script may look something like this:
FIX(Budget, Working, FY23, USD, @CHILDREN(“USA”))
[action to perform]
ENDFIX
In this case, @CHILDREN(“USA”) would be an ideal member set to parallelize on, as you are performing the same action on all fifty members. But using CALCPARALLEL you can only specify the entire Entity dimension, and even then, you would have to use CALCTASKDIMS with a value of 6, which would also include all the members of the Product dimension (Remember, CALCPARALLEL selects task dimensions from the bottom up).
Here is your opportunity to use FIXPARALLEL. To do so, modify your calc to look like this:
FIX(Budget, Working, FY23, USD)
FIXPARALLEL(8, @CHILDREN(“USA”))
[action to perform]
ENDFIXPARALLEL
ENDFIX
FIXPARALLEL uses two parameters: the number of threads to use (maximum 8), and the member set to use as tasks to parallelize. In this case, the fifty states will be used as “slices” of the database to divide among the eight threads provided for parallel calculation.
Using the above, you will find two messages in the log for each FIXPARALLEL/ENDFIXPARALLEL pair:
Calculating FIXPARALLEL: Task dimensions = [1]. Total tasks = [50]
(and later in the log)
Calculated in FIXPARALLEL with [8] threads. Task dimension = [Entity] with [50] fixed members. Task size = [1]. Total tasks = [50]
Very simple and easy to understand what Essbase did.
FIXPARALLEL can be a better choice in the following situations:
- Where CALCPARALLEL gives errors or doesn’t meet performance requirements.
- When using the DATACOPY, DATAEXPORT, or CLEARBLOCK commands or the @XREF or @XWRITE functions.
- When exporting regions of the database in parallel (note that each region will produce a separate export file).
Other Considerations
When considering whether to use parallel calculation (of either type), note the following:
- Your situation may not be as clear-cut as the example above, so a bit of experimentation may be in order.
- Parallel calc isn’t usually a good solution for run-on-save business rules on a data form. Forms usually have sparse dimensions as POV/Page members, so the form represents a single block, so there won’t be anything to gain.
- In a Hybrid BSO cube where one or more sparse dimensions is fully dynamic at upper levels, FIXPARALLEL will drop into serial mode if your script references an upper-level member of such a dimension. This is because of the hybrid aggregation necessary to derive that upper-level value.
- When used to perform aggregations, parallelization may not provide much benefit in Hybrid BSO cubes in general.
Summary
Using parallel calculation can decrease calculation times when used judiciously. They are best used for large, long-running batch calculations and aggregations. Keep a backup of your script, take notes on your optimization attempts, and save your log files for later comparison.
PARC Consulting has been engaged to tune and optimize many clients’ implementations. It’s a task I enjoy immensely because the results are often quite striking. We would be happy to discuss your system and how we could make it better than ever.
PARC Consulting – EPM Solutions by EPM Experts