Make Your Calcs Run Faster Than Ever!
This is Part One of a two-part series.
One of the biggest bottlenecks we see in EPM Cloud implementations is Business Rules (calc scripts) that take a long time to execute. In this blog post, we will explore these scripts and the use of parallel calculations to make them run more efficiently.
First, for this discussion, we will use a couple of hypothetical dimensions, Entity, and Product, as pictured below. Entity has about 100 members and Product about 1000. We will also assume that Account and Period are dense dimensions with dynamic upper-level members and that the remaining dimensions are sparse.
We will also assume that dimensions are in the following order in the outline (with number of total members / stored members):
- Account* (500 total / 400 stored)
- Period* (17 / 12)
- Entity (100 / 12)
- Product (1000 / 15)
- Currency (8 / 8)
- Scenario (3 / 3)
- Version (5 / 5)
- Year (10 / 10)
*dense dimensions
This order – wherein we order dense dimensions largest-to-smallest, then aggregating sparse dimensions smallest-to-largest, then non-aggregating sparse dimensions, often referred to as the “hourglass on a stick” order – is one of several recommendations you may read if you search the Internet for “Essbase dimension order”. This order will come into play later.
When You Should Try to Optimize Using Parallel Calc
Optimization takes time. It requires that you test, test again, and test some more, and keep meticulous records of everything you do, so choose your battles wisely. Optimization improvements can be small, they can be quite dramatic, or they can even fail to achieve any gains at all. I recently spent half a day trying to improve performance on a particular script – one that ran in about 20 minutes in an overnight ruleset – and every effort I took only made the calc times worse. Fortunately, another rule in that ruleset was optimized from about four hours to about 2 1/2 hours, so my overall optimization project was a success!
If this is a calc that you run once in a while and it takes 15 minutes it may not be worth your time to improve it to 12 minutes. On the other hand, if the calc takes an hour and you run it multiple times during the day – say during close – it may be a valuable attempt to achieve any gains.
Parallel Calculation Basics
Optimizing calculations using parallel calculation is all about understanding how Essbase – the database engine that underpins EPM products like Planning – performs calculation processes. By default, calculations run in serial mode – they use one processor to perform the operations. Using parallel calculation can take advantage of multiple processors to perform the same calculations, potentially resulting in better performance.
Parallel calculation relies upon Essbase’s ability to divide the workload into “tasks” that can be split among multiple processors (or threads). These tasks rely upon the members of one or more sparse dimensions of the cube which the calculation can divide among the multiple threads.
For example, say you will be performing an action on the various States under the USA member of the entity dimension, of which there are 50. If we can instruct Essbase to use those 50 states as task members, each of the processors can use a different state, one by one, to act until all the states have been calculated. If we use five processors for the calculation, each processor will handle ten states, and, in theory, the calc will take five times less to act. Do note that there is some overhead in using parallel calculation, as it takes time to evaluate and enumerate tasks, so your results won’t be quite that good, but you should still see a very noticeable performance improvement.
Using SET CALCPARALLEL
There are two methods in Essbase to invoke parallel calculation: SET CALCPARALLEL and FIXPARALLEL. SET CALCPARALLEL is a bit simpler to use, so we will look at it first. We will cover FIXPARALLEL in Part Two of this blog post.
SET CALCPARALLEL is the simplest way to enable parallel calculation, and you invoke it by adding this to your calc script:
SET CALCPARALLEL n; (where n is a value between 1 and 128 and represents the number of processor threads to use)
I wouldn’t use a value higher than 8 here for a variety of reasons, mostly because I haven’t seen a higher value provide a significant benefit. Also, be wary of the number of people who might run your script simultaneously, as each concurrent user will commandeer this number of threads on your processors, potentially affecting server performance. It isn’t always possible to know how many available threads your server has at any given time.
To disable parallel calc, you can use this command:
SET CALCPARALLEL 0;
This places the calc back into serial mode. If you intend to run the entire calc script in parallel mode, place the first command at the top of the script – there is no need to place the calc back into serial mode at the end of the calc; parallel mode will expire when the script finishes.
Viewing Test Results
In the EPM Cloud products, the only place you can view the execution log of a calc script is in Calculation Manager, so this is where you must execute your script. When the script is complete, select the Log Messages tab in the lower pane of Calc Manager.
It’s a good idea to export the log for safekeeping (using the Export item in the Actions menu just below the Variables tab), as the log will only be visible for the most recent execution of your script and only until you execute the script again, close the script, or close Calc Manager.
In this log, you will see messages that reveal the details about how Essbase performed the parallel calculation. Note that you may see multiple instances of these messages as Essbase processes the various instructions in the script.
Calculating in parallel with [8] threads
This indicates that you have entered parallel mode and the number of threads (processors) requested in your CALCPARALLEL command.
Calculation task schedule [x,y,z,…]
This shows how Essbase has allocated dimension members to be divided among the 7 processors you requested.
Parallelizing using [x] task dimensions
This log entry shows how many dimensions were used to determine the task schedule values in the prior message.
Further down the log, you will see an entry like this:
Empty tasks [x,y,z,…]
This message exposes exactly what Essbase did task-wise during the execution of your calc.
In Part Two, we will decode what these messages mean, how you can tweak what Essbase does with CALCPARALLEL, and how to use FIXPARALLEL to gain better control of parallel calculation when CALCPARALLEL doesn’t achieve your objective.
If you have questions about optimizing your scripts or EPM Cloud environments, please contact us for more information.
PARC Consulting – EPM Solutions by EPM Experts