This article explains how timeouts are handled in general, and in particular when formulas and libraries call each other.
TL;DR
When a formula is executed, all timeouts are reset before each element is executed
The timeout specified on a library element applies to the cumulative time spent in that library element, not to the time spent in each individual call to it (see the example below)
The timeouts of elements are always capped by the maximum timeout configured at the cluster level
This setting defaults to 900 s (15 min)
It can be increased by Support via the setting formulaEngine.script.maxTimeoutInSec
The timeout of elements is set when the formula is pushed/saved to the partition
If the maximum timeout was increased in your cluster, remember to resave your formula where needed!
Since 10.0, PA Dataloads, PA Distributed Calculations, DMModel Calculations and Model Calculations completely ignore the timeouts in their formulas
Be aware that this does not apply to libraries, though!
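The capping and resave rules above can be illustrated with a small toy model. This is only a sketch in Java, not product code: `ElementTimeoutModel`, `save` and `effectiveSec` are hypothetical names, the 1800 s value is an invented example of a raised cluster maximum, and the model simply assumes that the effective timeout is fixed at save time as the smaller of the element timeout and the cluster maximum.

```java
// Toy model only: class and method names are hypothetical, not product API.
final class ElementTimeoutModel {
    private final int requestedSec; // timeout configured on the element
    private int effectiveSec;       // value fixed at the last save

    ElementTimeoutModel(int requestedSec, int clusterMaxSec) {
        this.requestedSec = requestedSec;
        save(clusterMaxSec);
    }

    /** Saving recomputes the cap against the current cluster maximum. */
    void save(int clusterMaxSec) {
        effectiveSec = Math.min(requestedSec, clusterMaxSec);
    }

    int effectiveSec() {
        return effectiveSec;
    }

    public static void main(String[] args) {
        int clusterMaxSec = 900; // default maximum quoted above
        ElementTimeoutModel element = new ElementTimeoutModel(1200, clusterMaxSec);
        System.out.println(element.effectiveSec()); // capped at the cluster maximum

        clusterMaxSec = 1800; // assumed new maximum after Support raised it
        System.out.println(element.effectiveSec()); // unchanged until resaved

        element.save(clusterMaxSec);
        System.out.println(element.effectiveSec()); // element's own timeout applies
    }
}
```

In this model, raising the cluster maximum has no effect on an element until `save` is called again, which mirrors the advice to resave the formula.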
Example when using a Library
In the example below, running libs.MyLib.ElementA.getAverageMargin() for a large set of SKUs will time out after 2 seconds. Even though each individual call to the getProductMargin method takes less than 2 seconds, the cumulative time spent inside that method (because it is called in the loop) exceeds 2 seconds.
Library element MyLib.ElementA (timeout: 300 seconds):
```groovy
Number getAverageMargin(List<String> skus) {
    List<Number> margins = []
    for (sku in skus) {
        // Every call counts against ElementB's own 2-second timeout
        margins.add(libs.MyLib.ElementB.getProductMargin(sku))
    }
    return margins.sum() / margins.size()
}
```
Library element MyLib.ElementB (timeout: 2 seconds):
```groovy
Number getProductMargin(String sku) {
    def costPrice = api.find("PX30")... // get cost from some data source
    def sellPrice = api.find("PLI")... // get sell price from another data source
    return sellPrice - costPrice
}
```
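The arithmetic behind this example can be sketched as follows. It is a simplified model written in Java so that it runs standalone: `firstFailingCall` is a hypothetical helper, the 50 ms per-call cost is an invented figure, and the model only adds up time spent in the called method, ignoring everything else.

```java
// Simplified model: the helper name and all durations are illustrative assumptions.
final class CumulativeTimeoutDemo {

    /**
     * Returns the 1-based index of the call during which the cumulative
     * time first exceeds the timeout, or -1 if the budget is never exceeded.
     */
    static int firstFailingCall(long timeoutMs, long perCallMs, int calls) {
        long elapsedMs = 0;
        for (int i = 1; i <= calls; i++) {
            elapsedMs += perCallMs; // each call is far below the timeout on its own
            if (elapsedMs > timeoutMs) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // 2-second timeout on the library element, 50 ms per call (assumed).
        System.out.println(firstFailingCall(2000, 50, 30));   // small SKU list
        System.out.println(firstFailingCall(2000, 50, 1000)); // large SKU list
    }
}
```

With these assumed numbers, a short SKU list finishes within the budget, while a long one fails part-way through the loop even though no single call comes close to 2 seconds.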
...
Timeouts are essentially safety catches against bad logic (such as accidental infinite loops). The JVM has no setting like "run this thread, but only for x seconds", so the engine uses a trick: at compile time it injects code that throws an exception once the system time has passed the start time plus the timeout.
The logic elements are essentially Groovy script classes, so there is no easy way to share a script-global "start time" variable in all conceivable cases. Instead, each element (normal or library element) has its own start time variable, initialized when the class is instantiated: in effect, a private member variable of the script class that is set to the system time in its constructor.
The engine "resets" that start time for library elements by re-instantiating them before executing every normal formula element. Normal elements are instantiated only once (hence the behavior that the timer starts running when the element is executed for the first time, which matters when you call previous elements further down).
In Collins 5.0, there will be some improvements: better reporting of exactly which element timed out, and recompilation when a timeout is changed (so that the change takes effect). The fundamentals will stay the same, however. Unless someone comes up with a way to implement this safety latch with a single timeout value at the element level without compromising its effectiveness, it will probably stay that way.
Hint: The suggestion "simply do not compile in timeouts for lib elements" does not work, as then an infinite loop in a lib would not be caught. From a system perspective we cannot trust any Groovy code to be "safe".
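The mechanism described above (a start time captured in the constructor, an injected check that throws, and a reset by re-instantiation) can be sketched roughly like this. It is a hand-written Java approximation, not the engine's generated code: `ElementScript`, `checkTimeout` and `FakeClock` are hypothetical names, and a manually advanced clock stands in for the system time so the behavior is deterministic.

```java
// Hand-written approximation; all names here are hypothetical.
final class TimeoutGuardSketch {

    /** Manually advanced clock standing in for the system time. */
    static final class FakeClock {
        long nowMs = 0;
    }

    /** Stand-in for a compiled element script class. */
    static final class ElementScript {
        private final FakeClock clock;
        private final long timeoutMs;
        private final long startMs; // set once, in the constructor

        ElementScript(FakeClock clock, long timeoutMs) {
            this.clock = clock;
            this.timeoutMs = timeoutMs;
            this.startMs = clock.nowMs;
        }

        /** The kind of check that would be injected at compile time. */
        void checkTimeout() {
            if (clock.nowMs > startMs + timeoutMs) {
                throw new IllegalStateException("element timed out");
            }
        }
    }

    /** Runs the check and reports whether it threw. */
    static boolean timesOut(ElementScript element) {
        try {
            element.checkTimeout();
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        FakeClock clock = new FakeClock();
        ElementScript lib = new ElementScript(clock, 2000);

        clock.nowMs = 2500;
        System.out.println(timesOut(lib)); // budget used up since instantiation

        lib = new ElementScript(clock, 2000); // re-instantiation resets the start time
        System.out.println(timesOut(lib));
    }
}
```

Because the start time lives in the instance, the only way to reset it in this sketch is to create a new instance, which matches the description of library elements being re-instantiated before every normal element.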