pfx-api:loaddataFile

loaddataFile is used to process large data files and load them into a partition. The file can be in Excel or CSV format, either raw or compressed with GZIP or ZIP. IM reads the data from the file, validates it, splits it into batches if needed, compresses it, and sends it to the partition.

Data can be loaded into a partition in two ways:

  • Sync immediately returns the result of the operation.

  • Async returns a job ID instead of the final result. An async load creates one or more jobs on the partition and lets the partition process them. You can check the job status in Unity UI under Job/Task Tracking. Optionally, IM can wait for the jobs to complete; in that case, set waitForCompletionStrategy and asyncTimeout appropriately.

Properties

Option

Type

Default

Description

mapper

string

 

Defines the mapper name. The mapper is a Spring bean or a definition in a header or property.

objectType

object

 

Defines the object type. Possible values are listed at Type Codes.

connection

string

 

Defines a connection to Pricefx. This is an optional parameter and if it is omitted, the connection is taken from the Spring bean named pricefx.

pricingParameterName

string

 

Defines the Price Parameter name for lookup table values.

pricingParameterId

number

 

Defines the Price Parameter ID for lookup table values.

detectJoinFields

Boolean

 

Defines whether join field definitions should be taken from the server.

businessKeys

string

 

Defines the business key.

businessKeysMaxLengths

string

 

Defines the maximum lengths of the business keys as a comma-separated list. The number of values must match the number of business keys, and the sum of the lengths must be less than or equal to 1024.

batchSize

number

5000

Defines the number of records processed in a single batch. The default value is 5000, but it should be adjusted based on the number of records and the memory available on the machine. The default is 5000 because the same value is also used for fetch, where fetching data in chunks of 5000 rows is recommended.

For loaddataFile, a much higher value is recommended, such as 100 000 to 500 000.

async

Boolean

false

Defines whether the load is done asynchronously.

  • If it is set to true, the load is done asynchronously. An async load creates a job on the partition side, and the partition is responsible for loading the data in the background. The job is completed when the data is loaded. IM receives the job ID and, depending on waitForCompletionStrategy, either checks the job status periodically or continues to the next route step.

  • If it is set to false, the load is done synchronously.

asyncTimeout

number

30000 (ms)

Defines how long IM waits for the async jobs to complete. This parameter applies only if async=true and waitForCompletionStrategy is Always or AlwaysAndIgnoreFailures. Choose a value long enough for the load to finish; otherwise, IM throws an exception when the timeout is reached.

waitForCompletionStrategy

enum

Never

This parameter is used only when async=true and it determines the waiting mode. There are three options:

  • Never – Load is done asynchronously but IM does not wait for the async jobs to be finished.

  • Always – IM waits for the async jobs to be finished before continuing.

  • AlwaysAndIgnoreFailures – IM waits for the async jobs to be finished before it continues and it ignores errors.
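As a sketch of how businessKeys and businessKeysMaxLengths relate, the endpoint below passes two business keys and one maximum length per key. The key names (sku, name), the comma-separated form of the businessKeys value, and the lengths are assumptions chosen for illustration. Only the rule that the counts must match and the lengths must sum to at most 1024 comes from the table above.

```xml
<!-- Hypothetical values: two business keys, two maximum lengths (255 + 255 = 510, within the 1024 limit). -->
<!-- The comma-separated businessKeys format is an assumption, not confirmed by this page. -->
<to uri="pfx-api:loaddataFile?objectType=PX&amp;mapper=pxMapper&amp;businessKeys=sku,name&amp;businessKeysMaxLengths=255,255"/>
```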

Job/Task Tracking

An overview of the jobs is available in Unity UI under Job/Task Tracking.

Examples

loaddataFile Sync

The following code describes how to use loaddataFile sync.

<loadMapper id="productMapper" includeUnmappedProperties="false">
    <body in="sku"/>
    <body in="name" out="label"/>
</loadMapper>

<routes xmlns="http://camel.apache.org/schema/spring">
    <route>
        <from uri="direct:create"/>
        <to uri="pfx-io:streamCompressedFile"/>
        <to uri="pfx-csv:streamingUnmarshal?skipHeaderRecord=true"/>
        <!-- async is omitted, so the default (false) applies and the load is synchronous -->
        <to uri="pfx-api:loaddataFile?batchSize=500000&amp;mapper=productMapper&amp;objectType=P"/>
        <log message="Done"/>
    </route>
</routes>

loaddataFile for extension objects

The following code describes how to use loaddataFile for extension objects. Note that you have to set the useReusableParser=true option in pfx-csv:streamingUnmarshal.

<loadMapper id="pxMapper" includeUnmappedProperties="false">
    <body in="sku"/>
    <body in="name" out="label"/>
    <constant expression="SamplePX" out="name"/>
</loadMapper>

<routes xmlns="http://camel.apache.org/schema/spring">
    <route>
        <from uri="direct:create"/>
        <to uri="pfx-io:streamCompressedFile"/>
        <!-- useReusableParser=true is required for extension objects -->
        <to uri="pfx-csv:streamingUnmarshal?skipHeaderRecord=true&amp;useReusableParser=true"/>
        <to uri="pfx-api:loaddataFile?batchSize=500000&amp;mapper=pxMapper&amp;objectType=PX"/>
        <log message="Done"/>
    </route>
</routes>

loaddataFile async without Waiting

<loadMapper id="productMapper" includeUnmappedProperties="false">
    <body in="sku"/>
    <body in="name" out="label"/>
</loadMapper>

<routes xmlns="http://camel.apache.org/schema/spring">
    <route>
        <from uri="direct:create"/>
        <to uri="pfx-io:streamCompressedFile"/>
        <to uri="pfx-csv:streamingUnmarshal?skipHeaderRecord=true"/>
        <!-- async=true with the default waitForCompletionStrategy (Never): IM does not wait for the jobs -->
        <to uri="pfx-api:loaddataFile?batchSize=500000&amp;async=true&amp;mapper=productMapper&amp;objectType=P"/>
        <log message="Done"/>
    </route>
</routes>

loaddataFile async and Wait for Jobs to Complete
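A minimal sketch of this case, assuming the same productMapper and route as in the previous example, combines async=true with the waitForCompletionStrategy and asyncTimeout options described in Properties. The asyncTimeout value of 600000 ms is an illustrative assumption, not a recommended setting; choose it according to how long the load is expected to take.

```xml
<loadMapper id="productMapper" includeUnmappedProperties="false">
    <body in="sku"/>
    <body in="name" out="label"/>
</loadMapper>

<routes xmlns="http://camel.apache.org/schema/spring">
    <route>
        <from uri="direct:create"/>
        <to uri="pfx-io:streamCompressedFile"/>
        <to uri="pfx-csv:streamingUnmarshal?skipHeaderRecord=true"/>
        <!-- async=true creates jobs on the partition; waitForCompletionStrategy=Always makes IM wait for them. -->
        <!-- asyncTimeout=600000 (ms) is an illustrative value, not a recommendation. -->
        <to uri="pfx-api:loaddataFile?batchSize=500000&amp;async=true&amp;waitForCompletionStrategy=Always&amp;asyncTimeout=600000&amp;mapper=productMapper&amp;objectType=P"/>
        <log message="Done"/>
    </route>
</routes>
```

With waitForCompletionStrategy=AlwaysAndIgnoreFailures, IM would still wait for the jobs to finish but would ignore errors before continuing to the next route step.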

 

IntegrationManager version 5.8.0