This Rampur migration series discusses the considerations and steps involved in migrating from Greenplum to Citus for Pricefx's PA (Pricing Analytics) solution. It covers creating a Citus DB cluster, identifying the large Data Sources/Data Marts (DS/DMs) to distribute, configuring distribution keys, rebuilding the distributed DS/DMs, and using DM Publishing to improve performance.
Articles in this series are:
Rampur PA and DM Migration Key Points
Migrate to NextGen because Citus in NextGen is more robust than Greenplum in a legacy environment.
Create a Citus DB cluster with one Coordinator node and two Worker nodes. The Coordinator node should have sufficient resources to hold the PA data.
Identify the largest DS/DMs (typically more than 100 million rows) to distribute, taking into account the dependency between a DM's distribution key and the distribution keys of its constituent DS(s).
Configure the distribution key for both the DM and its main DS, because mismatched keys result in validation errors.
Rebuild distributed DS/DMs using the IndexMaintenance DL to convert the tables to the distributed format.
Use DM Publishing to improve performance: the published data table is column-oriented and compressed.
Schedule the DM Publishing DL appropriately in the overall PA data load sequence.
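
The selection and key-matching points above can be sketched as a small planning check before any tables are rebuilt. This is only an illustrative sketch, not Pricefx's API or validation logic: the `Table` class, its field names, the helper functions, and the example table names and row counts are all hypothetical. The only values taken from the key points are the rough 100 million row threshold and the rule that a DM and its main DS must use the same distribution key.

```python
from dataclasses import dataclass
from typing import List, Optional

# Threshold from the key points ("typically more than 100 million rows").
ROW_THRESHOLD = 100_000_000


@dataclass
class Table:
    """Hypothetical description of a DS or DM; not a Pricefx object."""
    name: str
    kind: str                               # "DS" or "DM"
    rows: int
    distribution_key: Optional[str] = None  # None means "not distributed"
    main_ds: Optional[str] = None           # for a DM: name of its main DS


def distribution_candidates(tables: List[Table],
                            threshold: int = ROW_THRESHOLD) -> List[str]:
    """Return the names of tables large enough to be worth distributing."""
    return [t.name for t in tables if t.rows > threshold]


def key_mismatches(tables: List[Table]) -> List[str]:
    """Return the names of distributed DMs whose key differs from their main DS."""
    by_name = {t.name: t for t in tables}
    problems = []
    for t in tables:
        if t.kind != "DM" or t.distribution_key is None:
            continue  # only distributed Data Marts need the check
        ds = by_name.get(t.main_ds) if t.main_ds else None
        if ds is None or ds.distribution_key != t.distribution_key:
            problems.append(t.name)
    return problems


# Hypothetical inventory used only to exercise the helpers.
inventory = [
    Table("Transactions", "DS", 250_000_000, distribution_key="customerId"),
    Table("TransactionsDM", "DM", 250_000_000,
          distribution_key="customerId", main_ds="Transactions"),
    Table("Product", "DS", 50_000),
]

candidates = distribution_candidates(inventory)
mismatches = key_mismatches(inventory)
```

With this inventory, `candidates` holds the two large tables and `mismatches` is empty because the DM and its main DS share the same key. Changing the key on only one of them would put the DM in `mismatches`, which mirrors the situation the key points describe as a validation error.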