Third Iteration – Further Performance Boost
Loading Product Costs in a Batch
Now let’s optimize one more part of the logic. So far, the logic accesses the Product Cost PX on every line to fetch the current product cost. This means that if there are 100,000 products, there will be 100,000 round trips to the database, which is very inefficient.
Because the price list calculation is done in batches of 200 items by default (the number can vary), we can fetch the costs of all 200 products of a batch at once, cache them and then read them from the cache. If we have 100,000 products, we will access the database only 100,000 / 200 = 500 times, which is a huge performance improvement.
We will use the approach described in Performance Improvements (Batching, api.getBatchInfo), with raw API calls rather than the CommonLib.
ProductCost Element
Before
def sku = api.product("sku") // get the current or target date def targetDate = api.calendar().format("yyyy-MM-dd") // set up the filters def filters = [ Filter.equal("name", "Product_Costs"), // PX name Filter.equal("sku", sku), // Sku Filter.lessOrEqual("attribute4", targetDate) // Cost Date ] // fetch the latest cost from the PX // use the expansion operator to expand a list into a varargs array def data = api.find("PX", 0, 1, "-attribute4", ["attribute1"], *filters) if (!data) { api.redAlert("Cost not found for $sku") api.local.errors << "Cost not found for $sku" return 0.0 } return data[0].attribute1?.toBigDecimal()
After
def sku = api.product("sku") if (!api.global.batchCosts) { api.global.batchCosts = [:] } if (api.global.batchCosts[sku] == null) { api.global.batchCosts = [:] def targetDate = api.calendar() def skus = api.getBatchInfo()?.collect { it[0] } ?: [sku] as Set // set up the filters def filters = [ Filter.equal("name", "Product_Costs"), // PX name Filter.in("sku", skus), // Sku Filter.lessOrEqual("attribute4", targetDate) // Cost Date ] // fetch the latest cost from the PX // use the expansion operator to expand a list into a varargs array def iter = api.stream("PX", "-attribute4", ["sku", "attribute1"], *filters) while (iter.hasNext()) { def row = iter.next() if (!api.global.batchCosts.containsKey(row.sku)) { api.global.batchCosts[row.sku] = row.attribute1.toBigDecimal() } } iter.close() } return api.global.batchCosts[sku]
Step-by-step Analysis
if (!api.global.batchCosts) {
    api.global.batchCosts = [:]
}
Here we initialize the global map api.global.batchCosts. The map will hold the product costs of the current batch, keyed by product sku. Since api.global is shared by all items of the calculation run, this initialization happens only once.
if (api.global.batchCosts[sku] == null) {
    api.global.batchCosts.clear()
    ...
}

return api.global.batchCosts[sku]
If we do not find a cached cost for the current sku, it means we are starting a new batch. Therefore we clear the map to prepare it for the new batch. Otherwise we return the cached cost.
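To illustrate, here is a hypothetical trace, assuming a batch size of 2 and made-up SKUs processed in order:

// item 1: sku "MB-0001" -> no cached entry -> clear the map, batch-load costs for ["MB-0001", "MB-0002"]
// item 2: sku "MB-0002" -> cached entry found -> returned directly, no database access
// item 3: sku "MB-0003" -> no cached entry -> a new batch has started, clear and batch-load again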
def skus = api.getBatchInfo()?.collect { it[0] } ?: [sku] as Set
Here we load the batch info using api.getBatchInfo(). In the context of a price list, the method returns a list of [sku, key2] pairs of the products which will be processed during the batch. We need only the sku, therefore we pick just the first item of each array, it[0].
In case the number of processed products is too small, the method may return null, which means no batching will be performed. If this is the case, we assign the skus variable a list with a single element, [sku], which is the sku of the current product.
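As a minimal sketch with hypothetical SKUs (the shape of the second element depends on the price list type), the expression resolves like this:

// hypothetical return value of api.getBatchInfo() in a price list context:
// [["MB-0001", null], ["MB-0002", null], ["MB-0003", null], ...]
def skus = api.getBatchInfo()?.collect { it[0] } ?: [sku] as Set
// -> ["MB-0001", "MB-0002", "MB-0003", ...] while batching is active,
//    or just [sku] when api.getBatchInfo() returns null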
In the contexts of PLI, XPLI, PGI and XPGI, api.getBatchInfo() returns a list of arrays [sku, key2]. In the context of a CFS run on P, PX, C or CX objects, the method returns a list of arrays [itemId, typedId].
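For comparison, a short sketch of the CFS case (the values are hypothetical); the first element is picked the same way, it just carries an itemId instead of a sku:

// in a CFS run on P, PX, C or CX, each entry is [itemId, typedId]
def itemIds = api.getBatchInfo()?.collect { it[0] }   // e.g. ["MB-0001", "MB-0002", ...]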
// set up the filters
def filters = [
        Filter.equal("name", "Product_Costs"),       // PX name
        Filter.in("sku", skus),                      // Sku
        Filter.lessOrEqual("attribute4", targetDate) // Cost Date
]
We set up the filters in a similar fashion as before; the only difference is that we use Filter.in instead of Filter.equal and pass the list of product SKUs which we want to load. We are only interested in records whose Valid From is less than or equal to the target date.
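As a minimal illustration of the difference (with hypothetical SKUs):

Filter.equal("sku", "MB-0001")               // matches a single value
Filter.in("sku", ["MB-0001", "MB-0002"])     // matches any value in the list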
def iter = api.stream("PX", "-attribute4", ["sku", "attribute1"], *filters)
Next we stream the results. We sort them by the Valid From date in descending order using -attribute4, which is important because we may receive multiple records for each sku. We are only interested in the sku and the cost, which is stored in attribute1.
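To see why the descending sort matters, consider two hypothetical cost records of the same product:

// rows streamed with the sort order "-attribute4" (Valid From descending):
// [sku: "MB-0001", attribute1: 12.50, attribute4: "2023-06-01"]  <- most recent, arrives first
// [sku: "MB-0001", attribute1: 11.80, attribute4: "2023-01-01"]  <- older, arrives second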
while (iter.hasNext()) {
    def row = iter.next()
    if (!api.global.batchCosts.containsKey(row.sku)) {
        api.global.batchCosts[row.sku] = row.attribute1.toBigDecimal()
    }
}
Now we iterate over the results. Because we sorted the costs by Valid From in descending order, the most recent cost of each product arrives first. The containsKey check ensures that we store only this most recent cost and do not overwrite it with the older ones. As a result, the api.global.batchCosts map holds the most recent cost of each sku.
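Continuing the hypothetical example, the cache might end up looking like this; every remaining item of the batch is then served from memory without touching the database:

// api.global.batchCosts == ["MB-0001": 12.50, "MB-0002": 9.99, "MB-0003": 7.25]
return api.global.batchCosts[sku]   // cache hit for all other items of the batch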