When to Use api.find and api.stream

As a rule, whenever you need to retrieve a known, limited number of records, use api.find and pass the limit as a parameter.

Calling api.find once with either a hard-coded limit, such as api.find("P", 0, 200, ...), or with api.find("P", 0, api.getMaxFindResultsLimit(), ...) is bad practice when you need all matching rows, because it assumes that the table never holds more rows than the limit; rows beyond the limit are not returned. The exceptions are when you really want to load only a certain number of rows or, in certain cases, when you call api.find in a loop.

When you do not know the number of records, you have two options: either use api.stream or use api.find in a loop. In most cases, api.stream is the preferred way.


Example api.stream:
def iter = api.stream("P", "sku", ["sku", "attribute1"], *filters)
while (iter.hasNext()) {
  def row = iter.next()
  // process the row...
  // if performance-intensive work is done here,
  // such as another DB access or a Datamart query,
  // use api.find instead
}
iter.close()
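Because the iterator holds a database connection until it is closed, it is worth making sure that close() also runs when row processing throws. The following is a minimal sketch in plain Groovy, runnable outside the platform. FakeStream is a hypothetical stand-in for the iterator returned by api.stream, and the failure is simulated; only the try/finally shape is the point.

```groovy
// Hypothetical stand-in for the iterator returned by api.stream:
// an Iterator that also records whether close() was called.
class FakeStream implements Iterator<Map>, Closeable {
  private final Iterator<Map> rows
  boolean closed = false

  FakeStream(List<Map> data) { rows = data.iterator() }

  boolean hasNext() { rows.hasNext() }
  Map next() { rows.next() }
  void close() { closed = true }
}

def iter = new FakeStream([[sku: "A"], [sku: "B"], [sku: "C"]])
def processed = []
try {
  while (iter.hasNext()) {
    def row = iter.next()
    // Simulated processing error on the last row.
    if (row.sku == "C") throw new IllegalStateException("simulated failure")
    processed << row.sku
  }
} catch (IllegalStateException e) {
  // Swallowed only for the purpose of this demonstration.
} finally {
  iter.close()   // runs on both the normal and the failing path
}

println "processed=${processed}, closed=${iter.closed}"
```

In real logic the same shape would wrap the api.stream loop, with iter.close() moved into the finally block.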



Example api.find in a loop:
def start = 0
def data = null
while (data = api.find("P", start, api.getMaxFindResultsLimit(), 
                        "sku", ["sku", "attribute1"], *filters)) {
  start += data.size()
  for (row in data) {
    // process the row
  }
}
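The paging logic above can be tried out without a database. The following is a minimal sketch in plain Groovy, where fakeFind is a hypothetical stand-in for api.find and pageSize plays the role of api.getMaxFindResultsLimit(). It shows that the loop ends once a page comes back empty and that every row is visited exactly once.

```groovy
// Hypothetical in-memory "table" of 25 rows standing in for the product table.
def table = (1..25).collect { [sku: "SKU-${it}".toString()] }

// Hypothetical stand-in for api.find(type, start, limit, ...):
// returns at most `limit` rows beginning at `offset`, or an empty list past the end.
def fakeFind = { int offset, int limit ->
  offset >= table.size() ? [] : table.subList(offset, Math.min(offset + limit, table.size()))
}

int pageSize = 10   // plays the role of api.getMaxFindResultsLimit()
int start = 0
int pages = 0
def seen = []

while (true) {
  def page = fakeFind(start, pageSize)
  if (!page) break            // an empty list is falsy in Groovy, so this ends the loop
  pages++
  seen.addAll(page*.sku)
  start += page.size()        // advance by the number of rows actually returned
}

println "loaded ${seen.size()} rows in ${pages} pages"
```

Advancing start by page.size() rather than by pageSize keeps the offset correct when the last page is shorter than the limit.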


The preferred way to load an unknown amount of data from the database is api.stream, with these exceptions:

  • If the code within the loop takes significant time, use api.find instead. api.stream keeps a database connection open for the whole duration of the processing, which can have a negative impact on performance, whereas api.find fetches the data at once and does not hold a connection.
  • If the input generation (syntax check) mode is enabled.

See also Data Querying using api.find() and General Queries (Quick Reference).

Beware of Groovy Closures Performance

Groovy closures carry overhead, so be careful when iterating over large amounts of data. To demonstrate this, here is simple logic that sums up the numbers in a list, written in four ways.


collect + sum:
(1..n).collect { it }.sum()


while:
long sum = 0
long i = 1
while (i <= n) {
  sum += i
  ++i
}
return sum



each:
long sum = 0
(1..n).each { sum += it }
return sum


for:
long sum = 0
for (long i = 1; i <= n; ++i) {
  sum += i
}
return sum


Here are the measured results for a list of size n. The duration is in milliseconds.

Duration for list of size n [ms]

Method          1 000 x   10 000 x   100 000 x   1 000 000 x   10 000 000 x
collect + sum        12        130         904         9 014         90 242
each                 12         93         881         8 747         88 896
while                 2         12         111           708          6 925
for                   1         11         110           705          6 820
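A comparison of this kind can be reproduced with a small harness. The following is a minimal sketch in plain Groovy, not the setup used for the measurements above: the method names and timeMs are illustrative, the ranges use long values (1L..n) so that the sum does not overflow an int for large n, and absolute timings will differ by machine, JVM, and Groovy version.

```groovy
// Four ways to sum 1..n, mirroring the variants compared above.
long collectSum(long n) {
  return (1L..n).collect { it }.sum()
}

long eachSum(long n) {
  long sum = 0
  (1L..n).each { sum += it }
  return sum
}

long whileSum(long n) {
  long sum = 0
  long i = 1
  while (i <= n) {
    sum += i
    ++i
  }
  return sum
}

long forSum(long n) {
  long sum = 0
  for (long i = 1; i <= n; ++i) {
    sum += i
  }
  return sum
}

// Returns the elapsed wall-clock time of one call of `work`, in milliseconds.
long timeMs(Closure work) {
  long t0 = System.currentTimeMillis()
  work()
  return System.currentTimeMillis() - t0
}

long n = 100000
def variants = [
  'collect + sum': { collectSum(n) },
  'each'         : { eachSum(n) },
  'while'        : { whileSum(n) },
  'for'          : { forSum(n) }
]

variants.each { name, work ->
  work()   // one warm-up call, so JIT compilation is less likely to skew the timing
  println "${name}: ${timeMs(work)} ms"
}
```

A single timed call is a rough indicator only; repeating each variant several times and averaging gives more stable numbers.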

Here is a different example with slightly more complex logic: https://dzone.com/articles/loops-performance-in-groovy

Note

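For small lists, the closure overhead does not play a significant role in the total calculation time, but for larger lists it is much better to stick to the classic while loop or for loop: at 10 000 000 elements the closure-based variants took roughly 90 seconds, compared with roughly 7 seconds for the loops.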
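The same choice comes up when aggregating rows loaded from the database. The following is a minimal sketch, assuming rows are maps with a numeric attribute1 key; the sample data is hypothetical. Both forms give the same result, and the plain loop avoids a closure call per element.

```groovy
// Hypothetical rows, shaped like the maps a product query might return.
def rows = (1..5).collect { [sku: "SKU-${it}".toString(), attribute1: it * 10] }

// Closure-based: concise, and fine for small lists.
def viaClosure = rows.sum { it.attribute1 }

// Plain loop: same result without invoking a closure for every row.
long viaLoop = 0
for (row in rows) {
  viaLoop += row.attribute1
}

println "closure=${viaClosure}, loop=${viaLoop}"
```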