...
By transitioning to the Citus solution in NextGen, customers can benefit from a more streamlined and reliable database infrastructure capable of meeting the complex and dynamic requirements inherent to Pricefx's PA ecosystem.
Rampur Upgrade Flowchart
The following illustration depicts the flowchart for upgrading to Rampur version 13 and the subsequent steps, which depend on a variety of conditions.
...
Rampur Upgrade Flowchart Steps
Here is a detailed breakdown of the flowchart:
Start
Upgrade to 13
Using Greenplum?
Yes: Move to the next decision point.
No: End of the process.
Migrate to NextGen?
Yes: Create Citus DB Cluster, then Identify large DS/DMs, followed by Configure Distribution Keys, and finally Done.
No: Move to the next decision point.
Use DM Publishing?
Yes: Schedule DM Publishing DL(s), then Done.
No: Rebuild distributed DS/DMs, then Done.
Rampur Upgrade Process Insights:
Upgrade Path: The process starts with an upgrade to version 13.
Greenplum Usage: The first decision point checks whether Greenplum is being used. If not, the process ends.
NextGen Migration: If Greenplum is used, the next decision point is whether to migrate to NextGen. If migrating, then a Citus DB Cluster is created, large DS/DMs are identified, and distribution keys are configured.
DM Publishing: If not migrating to NextGen, the next decision checks for DM Publishing usage.
If using DM Publishing, scheduling DL(s) is done.
If not using DM Publishing, distributed DS/DMs are rebuilt.
This flowchart provides a clear and structured approach to handle database upgrades and migrations based on specific conditions and requirements.
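The decision logic in the flowchart can be sketched as a small function. This is only an illustration of the branching described above; the function name and parameter names are hypothetical and do not correspond to any Pricefx API.

```python
from typing import List


def rampur_upgrade_steps(uses_greenplum: bool,
                         migrate_to_nextgen: bool,
                         use_dm_publishing: bool) -> List[str]:
    """Return the ordered flowchart steps for the given answers (illustrative only)."""
    steps = ["Upgrade to 13"]

    # First decision point: without Greenplum, the process ends after the upgrade.
    if not uses_greenplum:
        return steps

    # Second decision point: migrating to NextGen means setting up Citus.
    if migrate_to_nextgen:
        steps += [
            "Create Citus DB Cluster",
            "Identify large DS/DMs",
            "Configure Distribution Keys",
        ]
    # Third decision point: only reached when staying on Greenplum.
    elif use_dm_publishing:
        steps.append("Schedule DM Publishing DL(s)")
    else:
        steps.append("Rebuild distributed DS/DMs")

    return steps


if __name__ == "__main__":
    # A Greenplum customer who stays on Greenplum and uses DM Publishing.
    print(rampur_upgrade_steps(True, False, True))
```

Note that the DM Publishing question is only asked when the answer to the NextGen migration question is No, so the third argument is ignored in the other branches.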
Additional PA Considerations for Rampur
Migrate to NextGen
The rationale for migration is that the Citus database solution employed in the NextGen environment is considered a more robust and capable option compared to the legacy Greenplum deployment. Greenplum, while functional, represents a more complex database system that requires extensive configuration and tuning efforts to ensure optimal performance across the wide-ranging and often dynamic requirements of Pricefx customers.
These customer-specific demands can encompass varied PA data schemas, significant data volumes, diverse reporting and dashboard queries, as well as the intricate pricing logic governing quotes, agreements, and batch processing workflows. Migrating to the NextGen platform with its Citus-based architecture provides a more streamlined and reliable database solution capable of meeting these complex operational needs.
Create Citus DB Cluster
The initial Citus cluster configuration for the migration involves a single Coordinator node paired with two Worker nodes. In this setup, all of the existing PA data is first migrated to the Coordinator node, which must be provisioned with sufficient computing resources to accommodate this data payload.
...
LEARN MORE: To learn more about this process, click here.
Identify Large Data Sources (DS) and Data Marts (DM)
Unlike the legacy Greenplum deployment, the Citus-based migration does not automatically distribute the data of all Data Sources (DSs) and Data Marts (DMs). Instead, a more selective approach is adopted, because most tables do not contain enough rows to benefit from distributed data storage.
...
When choosing which tables to distribute, we also need to consider the dependency between a DM’s distribution key and that of its constituent DS(s). See the next step.
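As a rough sketch of the selection step, candidate tables could be shortlisted by row count. The threshold value, the `TableInfo` type, and the function name below are all hypothetical; the appropriate cut-off depends on the environment and should not be read as a Pricefx recommendation.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical cut-off, chosen only for illustration.
ROW_THRESHOLD = 10_000_000


@dataclass
class TableInfo:
    name: str        # name of the DS or DM
    kind: str        # "DS" or "DM"
    row_count: int   # total number of rows in the table


def distribution_candidates(tables: List[TableInfo],
                            threshold: int = ROW_THRESHOLD) -> List[TableInfo]:
    """Return tables at or above the row threshold, largest first."""
    large = [t for t in tables if t.row_count >= threshold]
    return sorted(large, key=lambda t: t.row_count, reverse=True)


if __name__ == "__main__":
    inventory = [
        TableInfo("Transactions", "DM", 50_000_000),
        TableInfo("Product", "DS", 20_000),
        TableInfo("TransactionsDS", "DS", 48_000_000),
    ]
    for table in distribution_candidates(inventory):
        print(table.kind, table.name, table.row_count)
```

A shortlist like this is only a starting point, since the key dependency between each DM and its constituent DS(s) still has to be reviewed by hand.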
Configure Distribution Keys
This concept can be best illustrated through an example. When starting with a large Transactions Data Mart (DM), the obvious choice for the distribution key would typically be the sku or productId field. This is because the out-of-the-box functionality in Pricefx's PA solution is often oriented around product-centric data and workflows.
...
NOTE: This is not a universal rule, however, as the optimal distribution key can vary based on the specific nuances of each customer's data landscape.
Distribution Key Examples
For example, in the case of the Company X deployment, the reverse scenario was true: the customer data (reflected in the customerId field) provided the more suitable distribution key for the Transactions DM.
...
These examples illustrate the importance of carefully evaluating the unique data characteristics and relationships within each customer's PA environment to identify the most suitable distribution key for the DMs.
Distribution Key Configuration Summary
Once the optimal distribution key has been identified, it is crucial that this configuration is applied consistently across both the Data Mart (DM) and its corresponding primary Data Source (DS).
...
Careful coordination of the distribution key settings across the DMs and their primary DSs is essential to maintain the integrity and operational reliability of the Pricefx PA solution.
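The consistency requirement between a DM and its primary DS can be checked with a simple comparison. The sketch below assumes the distribution keys have already been collected into plain dictionaries; the function name and the dictionary layout are hypothetical and are not part of the Pricefx configuration model.

```python
from typing import Dict, List, Optional


def find_key_mismatches(dm_keys: Dict[str, Optional[str]],
                        ds_keys: Dict[str, Optional[str]],
                        primary_ds: Dict[str, str]) -> List[str]:
    """Report every DM whose distribution key differs from its primary DS.

    dm_keys:    DM name -> distribution key (None if not distributed)
    ds_keys:    DS name -> distribution key (None if not distributed)
    primary_ds: DM name -> name of its primary DS
    """
    problems = []
    for dm_name, dm_key in dm_keys.items():
        ds_name = primary_ds[dm_name]
        ds_key = ds_keys.get(ds_name)
        if dm_key != ds_key:
            problems.append(
                f"{dm_name}: key {dm_key!r} differs from {ds_name} key {ds_key!r}"
            )
    return problems


if __name__ == "__main__":
    issues = find_key_mismatches(
        dm_keys={"Transactions": "customerId"},
        ds_keys={"TransactionsDS": "sku"},
        primary_ds={"Transactions": "TransactionsDS"},
    )
    for issue in issues:
        print(issue)
```

An empty result means every DM in the input shares its distribution key with its primary DS.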
Rebuild distributed DS/DMs
It is important to note that when an existing Data Source (DS) or Data Mart (DM) is configured to be distributed, simply deploying the new configuration does not automatically convert the underlying database table structure. Instead, an additional step is required to physically rebuild the table to align with the distributed architecture.
...
This same method can be used when removing or changing the distribution key of a DS/DM.
Use DM Publishing
Why use this? There is a functional case and a performance-based one.
LEARN MORE: For a detailed explanation of what DM publishing entails, including its functional and performance cases, click here /wiki/spaces/EN/pages/4591845377.
With the Citus database solution, using the Publish DM database table can yield significant performance benefits. These come from the column-oriented structure of the Publish DM table, combined with the data compression built into its design.
...
NOTE: When using Citus, there is significant performance to be gained from the fact that the Publish DM DB table is column oriented, and its data is compressed.
Scheduling DM Publishing DL(s)
As soon as a DM’s Publishing DL has run for the first time, any client or logic querying the DM data will see only the published data. New or modified data loaded into its DSs will not show until after the next Publishing DL run. This new DL therefore needs to be scheduled appropriately within the overall PA data load sequence.
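The visibility rule can be illustrated with a toy model. This is not how the platform implements DM Publishing; the class and method names are hypothetical, and the behavior before the first publish (queries reading the loaded data directly) is an assumption inferred from the description above.

```python
from typing import List


class PublishedDataMart:
    """Toy model: once published, readers see only the last published snapshot."""

    def __init__(self) -> None:
        self._loaded_rows: List[dict] = []      # data loaded from the DSs
        self._published_rows: List[dict] = []   # snapshot taken by the Publishing DL
        self._published_once = False

    def load(self, rows: List[dict]) -> None:
        """Simulate a data load into the DM's underlying data."""
        self._loaded_rows.extend(rows)

    def run_publishing_dl(self) -> None:
        """Simulate a Publishing DL run by snapshotting the loaded data."""
        self._published_rows = list(self._loaded_rows)
        self._published_once = True

    def query(self) -> List[dict]:
        """Return what a client would see when querying the DM."""
        if self._published_once:
            return list(self._published_rows)
        return list(self._loaded_rows)  # assumption: unpublished DMs read loaded data


if __name__ == "__main__":
    dm = PublishedDataMart()
    dm.load([{"sku": "A"}])
    dm.run_publishing_dl()
    dm.load([{"sku": "B"}])          # loaded after the publish
    print(len(dm.query()))           # row B is not visible yet
    dm.run_publishing_dl()
    print(len(dm.query()))           # row B is visible after the next run
```

In this model, a Publishing DL that runs before the DS loads finish would leave readers on stale data until the following run, which is why its position in the load sequence matters.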
...