
Since version 10.0

You will use a Distributed Calculation Dataload when you need to:

  • Enrich/process a Datamart, Data Source or Data Feed with a large amount of data, for example to:

    • Add new records - copy and transform data from another table into this table.

    • Modify existing records - pre-calculate values which were not imported from an external system but are needed for analysis, and whose calculation takes too long to be done on demand.

Note that the Distributed Calculation Dataload can still be set NOT to be executed in the distributed mode. The use cases could be, for example:

  • The processing cannot be split into batches and executed concurrently because of dependencies between the rows, e.g., you may need to iterate over the rows in a given order to track price changes.

  • You want to replace a legacy Calculation Dataload with a noticeably faster version.

  • You need to test the process and would like everything to run in the same thread.

Calculation Item

A Calculation Item represents a "batch" of records which will be processed together in one execution.

Unlike in the legacy Calculation Dataload, in the distributed calculation the calculation elements are executed once for each Calculation Item, instead of once for each row.
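For intuition, a Calculation Item is typically a small, serializable description of a batch (for example a key or key range), not the batch's rows themselves. Below is a minimal, hypothetical sketch of an item list as the calculation-init stage might produce it; the structure of each item (here a map with made-up region/month keys) is entirely up to the logic:

```groovy
// Each map describes one batch; the calculation elements will later be
// executed once per entry in this list (not once per data row).
def calculationItems = [
    [region: 'EU', yearMonth: '2023-01'],
    [region: 'US', yearMonth: '2023-01'],
]
return calculationItems
```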

Logic API

  • Logic Nature: distPACalc

    • Logic Type: Calculation/Pricing

  • Execution Types - each logic element can belong to (and be executed in) one of the following element contexts, which correspond to the 3 stages of the distributed calculation dataload process:

    1. calculation-init - prepares the list of Calculation Items. The system then stores them in the company parameter DistributedCalculation [xxx].

    2. calculation - processes one Calculation Item. These calculation elements will be executed once for each Calculation Item. Multiple Calculation Items will likely be processed in parallel, possibly on different machines.

    3. calculation-summary - summarizes statistics about the process.

| element context / Execution Type   | init: Input Generation | init: Normal | calculation: Input Generation | calculation: Normal | summary: Input Generation | summary: Normal |
|------------------------------------|------------------------|--------------|-------------------------------|---------------------|---------------------------|-----------------|
| dist : DistFormulaContext          | yes                    | yes          | yes                           | yes                 | yes                       | yes             |
| build input field definitions      | yes                    |              | yes                           |                     | yes                       |                 |
| input : Map                        |                        | yes          |                               | yes                 |                           | yes             |
| api.currentItem()                  |                        | yes          |                               | yes                 |                           | yes             |
| generate rows for target table     |                        | yes          |                               | yes                 |                           | yes             |
| generate list of Calculation Items |                        | yes          |                               |                     |                           |                 |
| process a Calculation Item         |                        |              |                               | yes                 |                           |                 |
| calculate summary of the process   |                        |              |                               |                     |                           | yes             |

  • Information provided to the logic

    • binding variable dist : DistFormulaContext

    • binding variable input : Map - contains the values of all input fields created by the logic and set by the user

    • api.currentItem() : Map - definition of the Dataload. Not available during testing of the logic.

    • Allow object modification - true. This process can update data in tables (e.g., via api.update()).

  • Expected logic outcome

    • Input fields - during configuration of the Dataload, each element will be executed in Input Generation mode so that it can build the input fields it needs.

    • In each execution stage, the logic can generate rows for the target table.

    • generate list of Calculation Items - the init stage can generate the Calculation Items. The system will store them in the company parameter DistributedCalculation [xxx] (where xxx is the ID of the Dataload).

    • process a Calculation Item - in the calculation stage, the system will execute the calculation elements once for each Calculation Item found.

    • calculate summary of the process (e.g., some statistics about the generated data rows)

      • Summary values are returned via elements with Display Mode Everywhere. Those will be stored in the Dataload definition.
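The Input Generation behavior described above can be sketched as follows. This is a hypothetical element body: `api.stringUserEntry()` and `api.isInputGenerationExecution()` are standard Pricefx logic API calls (verify their availability for this logic nature), and the field name is made up:

```groovy
// During configuration the element runs in Input Generation mode and this
// call only registers the input field; during Normal execution it returns
// the value the user entered (also available via the input binding).
def skuFilter = api.stringUserEntry('SkuFilter')   // hypothetical field name
if (api.isInputGenerationExecution()) {
    return null   // nothing to compute yet; we only needed the field definition
}
return skuFilter   // Normal execution: use the entered value
```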

Configuration

Besides Studio, the logic can be configured at Administration > Logics > Calculation Logic > Analytics Calculations.

The dataload is configured via Analytics > Data Manager > Data Load. See also Distributed Calculations in Analytics in the documentation.

Code Samples
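As a starting point, here is a minimal, hypothetical sketch of the three stages. Only dist, input, api.currentItem() and api.update() are named on this page; the element names, the item structure (one batch per SKU prefix) and all business logic are assumptions for illustration, and the way the current Calculation Item is obtained from the dist : DistFormulaContext binding is deliberately left as a comment (check the DistFormulaContext reference for the real accessor):

```groovy
// --- Element "CalcItems" (element context: calculation-init) -------------
// Returns the list of Calculation Items; the system stores them in the
// company parameter DistributedCalculation [xxx].
def items = []
('A'..'Z').each { prefix ->
    items << [skuPrefix: prefix]   // one Calculation Item per SKU prefix
}
return items

// --- Element "ProcessItem" (element context: calculation) ----------------
// Executed once per Calculation Item, likely in parallel and possibly on
// different machines.
def item = null   // obtain the current Calculation Item from the dist binding
// ... read the rows of this batch, enrich them, and write the results back,
// e.g. via api.update() (Allow object modification is true for this logic)

// --- Element "Summary" (element context: calculation-summary) ------------
// Display Mode: Everywhere, so the returned value is stored in the
// Dataload definition.
return [batchesProcessed: 26]   // e.g. some stats about the run
```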
