Does the database transaction also start and end with the calculation logic (i.e., is there a separate transaction for each Data Feed row)?
The consumer (i.e., the thread running the ‘Calculation Logic’) performs its own transaction management, independent of the feeder. It’s important to distinguish between transactions in the domain object DB and transactions in the Analytics DB:
In the domain object DB we keep things like Analytics object metadata (DM, DS definitions, DataLoad task definitions, JobStatusTrackers…). In the feeder scenario, those transactions are committed after each emitted item is processed, mainly to expose progress to the clients.
In the Analytics DB, we actually have just one transaction for the whole DL job. It works like this:
On job start, all the relevant data is loaded locally.
The calculation logic processes the data and stores all results locally.
On completion, the entire result is bulk-uploaded to the target table.
Then the Analytics transaction is committed.
The reason behind this design is that it’s much faster to process the data locally, and also much faster to bulk-load the results in one go than to update individual rows or upload smaller batches.
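The two commit patterns described above can be sketched roughly as follows. This is only an illustration, not the actual implementation: it uses Python's built-in sqlite3 module as a stand-in for both databases, and the names run_data_load, make_demo_dbs, job_status, source_rows and target_rows are hypothetical. The domain object DB connection commits after every processed item, while the Analytics DB connection commits exactly once, after the bulk upload.

```python
import sqlite3


def run_data_load(domain_db, analytics_db, job_id, calculate):
    """Hypothetical sketch of one DL job using two independent connections."""
    # Step 1: on job start, load all the relevant data locally.
    rows = analytics_db.execute(
        "SELECT id, value FROM source_rows ORDER BY id"
    ).fetchall()

    results = []
    for processed, (row_id, value) in enumerate(rows, start=1):
        # Step 2: calculate and keep the result in local memory only.
        results.append((row_id, calculate(value)))

        # Domain object DB: commit after each item so that clients
        # polling the job status can see the progress.
        domain_db.execute(
            "UPDATE job_status SET processed = ? WHERE job_id = ?",
            (processed, job_id),
        )
        domain_db.commit()

    # Step 3: bulk-upload the entire result to the target table.
    # With sqlite3's default settings, the write transaction is opened
    # implicitly by the first INSERT of this call.
    analytics_db.executemany(
        "INSERT INTO target_rows (id, result) VALUES (?, ?)", results
    )

    # Step 4: a single Analytics commit for the whole job.
    analytics_db.commit()
    return len(results)


def make_demo_dbs():
    """Create two in-memory databases with made-up tables and sample rows."""
    domain_db = sqlite3.connect(":memory:")
    domain_db.execute(
        "CREATE TABLE job_status (job_id INTEGER PRIMARY KEY, processed INTEGER)"
    )
    domain_db.execute("INSERT INTO job_status VALUES (1, 0)")
    domain_db.commit()

    analytics_db = sqlite3.connect(":memory:")
    analytics_db.execute(
        "CREATE TABLE source_rows (id INTEGER PRIMARY KEY, value REAL)"
    )
    analytics_db.execute(
        "CREATE TABLE target_rows (id INTEGER PRIMARY KEY, result REAL)"
    )
    analytics_db.executemany(
        "INSERT INTO source_rows VALUES (?, ?)",
        [(1, 10.0), (2, 20.0), (3, 30.0)],
    )
    analytics_db.commit()
    return domain_db, analytics_db


if __name__ == "__main__":
    domain_db, analytics_db = make_demo_dbs()
    count = run_data_load(domain_db, analytics_db, 1, lambda v: v * 2)
    print("rows uploaded:", count)
```

If the job fails before the final commit in this sketch, nothing has been written to target_rows, whereas the progress counter in job_status reflects the last item that was processed.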