Third Iteration – Further Performance Boost
Loading Product Costs in a Batch
Now let’s optimize one more part of the logic. So far, the logic accesses the Product Cost PX on every line to fetch the current product cost. This means that if there are 100,000 products, there will be 100,000 round trips to the database, which is very inefficient.
Because the price list calculation is done in batches of 200 items by default (the number can vary), we can fetch the costs of all 200 products of a batch at once, cache them and then read them from the cache. If we have 100,000 products, we will access the database only 100,000 / 200 = 500 times, which is a huge performance improvement.
We will use the approach described in Performance Improvements (Batching, api.getBatchInfo), with raw API calls rather than the CommonLib.
ProductCost Element
Before
def sku = api.product("sku") // get the current or target date def targetDate = api.calendar().format("yyyy-MM-dd") // set up the filters def filters = [ Filter.equal("name", "Product_Costs"), // PX name Filter.equal("sku", sku), // Sku Filter.lessOrEqual("attribute4", targetDate) // Cost Date ] // fetch the latest cost from the PX // use the expansion operator to expand a list into a varargs array def data = api.find("PX", 0, 1, "-attribute4", ["attribute1"], *filters) if (!data) { api.redAlert("Cost not found for $sku") api.local.errors << "Cost not found for $sku" return 0.0 } return data[0].attribute1?.toBigDecimal()
After
def sku = api.product("sku") if (!api.global.batchCosts) { api.global.batchCosts = [:] } if (api.global.batchCosts[sku] == null) { api.global.batchCosts = [:] def targetDate = api.calendar() def skus = api.getBatchInfo()?.collect { it[0] } ?: [sku] as Set // set up the filters def filters = [ Filter.equal("name", "Product_Costs"), // PX name Filter.in("sku", skus), // Sku Filter.lessOrEqual("attribute4", targetDate) // Cost Date ] // fetch the latest cost from the PX // use the expansion operator to expand a list into a varargs array def iter = api.stream("PX", "-attribute4", ["sku", "attribute1"], *filters) while (iter.hasNext()) { def row = iter.next() if (!api.global.batchCosts.containsKey(row.sku)) { api.global.batchCosts[row.sku] = row.attribute1.toBigDecimal() } } iter.close() } return api.global.batchCosts[sku]
Step-by-step Analysis
if (!api.global.batchCosts) {
    api.global.batchCosts = [:]
}
Here we initialize the global map api.global.batchCosts. The map will hold the product costs of the current batch, keyed by product sku. Since api.global is shared by all items of the calculation run, this initialization happens only once.
if (api.global.batchCosts[sku] == null) {
    api.global.batchCosts.clear()
    ...
}

return api.global.batchCosts[sku]
If we do not find a cached cost for the current sku, it means we are starting a new batch. Therefore we clear the map to prepare it for the new batch. Otherwise we return the cached cost.
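To illustrate, here is a hypothetical trace, assuming a batch size of 2 and made-up SKUs processed in order:

// item 1: sku "MB-0001" -> no cached entry -> clear the map, batch-load costs for ["MB-0001", "MB-0002"]
// item 2: sku "MB-0002" -> cached entry found -> returned directly, no database access
// item 3: sku "MB-0003" -> no cached entry -> a new batch has started, clear and batch-load again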
def skus = api.getBatchInfo()?.collect { it[0] } ?: [sku] as Set
Here we load the batch info using api.getBatchInfo(). In the context of a price list, the method returns a list of [sku, key2] pairs of the products which will be processed during the batch. We need only the sku, therefore we pick just the first item of each array, it[0].
In case the number of processed products is too small, the method may return null, which means no batching will be performed. If this is the case, we assign the skus variable a list with a single element, [sku], which is the sku of the current product.
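As a minimal sketch with hypothetical SKUs (the shape of the second element depends on the price list type), the expression resolves like this:

// hypothetical return value of api.getBatchInfo() in a price list context:
// [["MB-0001", null], ["MB-0002", null], ["MB-0003", null], ...]
def skus = api.getBatchInfo()?.collect { it[0] } ?: [sku] as Set
// -> ["MB-0001", "MB-0002", "MB-0003", ...] while batching is active,
//    or just [sku] when api.getBatchInfo() returns null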
In the contexts of PLI, XPLI, PGI and XPGI, api.getBatchInfo() returns a list of arrays [sku, key2]. In the context of a CFS run on P, PX, C or CX objects, the method returns a list of arrays [itemId, typedId].
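For comparison, a short sketch of the CFS case (the values are hypothetical); the first element is picked the same way, it just carries an itemId instead of a sku:

// in a CFS run on P, PX, C or CX, each entry is [itemId, typedId]
def itemIds = api.getBatchInfo()?.collect { it[0] }   // e.g. ["MB-0001", "MB-0002", ...]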
// set up the filters
def filters = [
        Filter.equal("name", "Product_Costs"),       // PX name
        Filter.in("sku", skus),                      // Sku
        Filter.lessOrEqual("attribute4", targetDate) // Cost Date
]
We set up the filters in a similar fashion as before; the only difference is that we use Filter.in instead of Filter.equal and pass the list of product SKUs which we want to load. We are only interested in records whose Valid From is less than or equal to the target date.
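As a minimal illustration of the difference (with hypothetical SKUs):

Filter.equal("sku", "MB-0001")               // matches a single value
Filter.in("sku", ["MB-0001", "MB-0002"])     // matches any value in the list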
def iter = api.stream("PX", "-attribute4", ["sku", "attribute1"], *filters)
Next we stream the results. We sort them by the Valid From date in descending order using -attribute4, which is important because we may receive multiple records for each sku. We are only interested in the sku and the cost, which is stored in attribute1.
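To see why the descending sort matters, consider two hypothetical cost records of the same product:

// rows streamed with the sort order "-attribute4" (Valid From descending):
// [sku: "MB-0001", attribute1: 12.50, attribute4: "2023-06-01"]  <- most recent, arrives first
// [sku: "MB-0001", attribute1: 11.80, attribute4: "2023-01-01"]  <- older, arrives second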
while (iter.hasNext()) {
    def row = iter.next()
    if (!api.global.batchCosts.containsKey(row.sku)) {
        api.global.batchCosts[row.sku] = row.attribute1.toBigDecimal()
    }
}
Now we iterate over the results. Because we sorted the costs by Valid From in descending order, the most recent cost of each product arrives first. The containsKey check ensures that we store only this most recent cost and do not overwrite it with the older ones. As a result, the api.global.batchCosts map holds the most recent cost of each sku.
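Continuing the hypothetical example, the cache might end up looking like this; every remaining item of the batch is then served from memory without touching the database:

// api.global.batchCosts == ["MB-0001": 12.50, "MB-0002": 9.99, "MB-0003": 7.25]
return api.global.batchCosts[sku]   // cache hit for all other items of the batch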