Distributed Calculation

Distributed calculation, which you can enable in the context of Price Lists (PL), Live Price Grids (LPG), and Calculated Field Sets (CFS), can speed up your processing time dramatically.

A non-distributed calculation runs on a single node with a single thread. A distributed calculation dynamically selects a number of nodes, each of which creates a configurable number of threads to perform the calculation.

The following sections show examples of both calculation types.

Non-Distributed Calculation

This flow shows an example of a calculation process for non-distributed calculation.

Sample setup in the UI:  


Distributed Calculation

This flow shows an example of a calculation process for a distributed calculation.

It is based on the non-distributed calculation flow chart, but shows the process for a single node only (each node may in fact run many threads). Because the number of nodes is set dynamically, it is hard to predict how many nodes and threads you will get in total; the process, however, is the same on every node.

Sample setup in the UI: 

Example

Based on this flow, let's assume that you have 6000 items to be calculated and they are going to be split among all of the nodes.

For example, if you have 3 nodes available, NODE1 will get 2,000 items, NODE2 another 2,000 items, and NODE3 the remaining 2,000.

The next step is to split these items into batches and jobs to be calculated (each node and thread will work in parallel).

With 5 threads per node, each thread on NODE1 handles 400 items (2 batches), and the same applies to the other nodes. Note that api.global is shared only between threads on the same node, not between nodes, so you cannot pass data between nodes using api.global. Instead, use the api.getSharedCache(), api.setSharedCache() and api.removeSharedCache() methods, which are accessible across all nodes and their threads.
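As a sketch of the difference (illustrative only; it assumes a Pricefx Groovy logic context and the key name is made up), cross-node sharing could look like this:

```groovy
// Visible only to threads on the SAME node -- other nodes will not see it:
api.global.maxPrice = 123.45

// The shared cache is visible to all nodes and their threads.
// Values are stored as strings, so convert on read.
api.setSharedCache("maxPrice", "123.45")

def cached = api.getSharedCache("maxPrice")     // read on any node
def maxPrice = cached ? new BigDecimal(cached) : null

api.removeSharedCache("maxPrice")               // clean up when done
```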

In the pricefx-config.xml file, you can restrict the number of parallel slave threads for a job, based on the job's size. The entries must be sorted from the lowest to the highest key. Each key defines the maximum job size for the given slave thread count ("up to" semantics). For example, with the configuration below, a job of 18,000 items runs with at most 2 slave threads.

<maxSlaveThreadsPerJobMap>
  <upto_5000>1</upto_5000>
  <upto_20000>2</upto_20000>
  <upto_50000>3</upto_50000>
  <upto_10000000>4</upto_10000000>
</maxSlaveThreadsPerJobMap>
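The "up to" lookup can be sketched as follows (an illustrative Groovy snippet mirroring the configuration above, not the actual server code):

```groovy
// Keys are maximum job sizes, values are the allowed slave thread counts.
def maxSlaveThreadsPerJobMap = [5000: 1, 20000: 2, 50000: 3, 10000000: 4]

int maxSlaveThreads(long jobSize, Map<Integer, Integer> config) {
    // pick the first (lowest) key the job size fits under
    def entry = config.sort { it.key }.find { jobSize <= it.key }
    entry ? entry.value : config.values().max()
}

assert maxSlaveThreads(3_000, maxSlaveThreadsPerJobMap) == 1
assert maxSlaveThreads(18_000, maxSlaveThreadsPerJobMap) == 2
assert maxSlaveThreads(45_000, maxSlaveThreadsPerJobMap) == 3
```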

Q&A

Q: Is there a minimum limit that determines whether the job creates batches?

A: Yes but it may vary in different environments, so please verify with Support.

  • In a shared production environment, if there are more than 2,000 items, the calculation creates batches and then calculates them. If the limit is not reached, no batches are created (batch-related calls return null) and the calculation is performed only at the row calculation level.
  • This limit (5,000 by default) is set in the Pricefx configuration file (server/config/pricefx-config.xml) as follows:

    <!-- Configuration values applicable if the node is a calculation slave -->
    <calculationSlave>
        <!-- Minimum number of items to process to allow/use distributed mode -->
        <itemThreshold>5000</itemThreshold>
    </calculationSlave>

Q: Is it worth using in all cases?

A: Generally yes. In fact, if you are concerned about performance, this is the first thing you should check.


Q: I cannot use the api.addOrUpdate() method in a distributed calculation. Is there a workaround?

A: Yes, there is api.boundCall(), which can be called even inside batches, although its use is generally not recommended. It can work if you use it safely, for example when each thread adds or updates only its own distinct keys and never the same key as another thread.
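A heavily hedged sketch of this workaround follows. The bound partition name ("self"), the /loaddata endpoint, and the payload shape are assumptions for illustration; verify all of them against your partition's REST API and bound-partition configuration before use.

```groovy
// Illustrative only: write rows via a REST call instead of api.addOrUpdate().
// "self" is a hypothetical bound partition name; the /loaddata/PX endpoint
// and payload layout must be verified against your Pricefx REST API docs.
def payload = api.jsonEncode([
    data: [
        header : ["sku", "attribute1"],   // column names of the target table
        options: [true, false],           // true marks a key column
        data   : [["MB-0001", "someValue"]]
    ]
])

// Safety rule: each thread should send only its own distinct keys.
def response = api.boundCall("self", "/loaddata/PX", payload)
```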
