Technical User Reference (Optimization - Product Similarity)
This section details the Model Class and the logics that the Product Similarity Accelerator deploys. For each step, it explains the aim, the outputs, and the main reasons to modify the logics.
Product Similarity Model Class
The Product Similarity Model Class organizes a list of logics to create the model architecture. It is a JSON file that references these logics and is transformed into an optimized UI in the Pricefx platform.
The general architecture of the Product Similarity Model Class is:
It defines five steps:
Definition – Sets the scope of the product tables and transactions, and sets the parameters for the similarity exploration model.
Similarity Weighting – Runs similarity measures and screens for the most similar products, then lets the user set the weights and threshold for a finer comparison of products.
Product Similarity – Shows the outputs of the similarity analysis and the similar products of any product, then lets the user configure the grouping.
Product Grouping – Shows the groups and the products in each group.
Additional Products – Shows the groups and how new products are assigned to these groups.
Library
The logic is PSim_Lib.
Definition Step
There is no calculation logic in this step. There are three tabs, each with a related dashboard and evaluation logic: PSim_1_definitions_eval_productData, PSim_1_definitions_eval_transactionData, and PSim_1_definitions_eval_modelConfiguration.
Similarity Weighting Step
Contains one calculation sequence that chains four logics (PSim_2_simWeights_calc_loadData, PSim_2_simWeights_calc_textTransformers, PSim_2_simWeights_calc_approxNearestNeighbors, and PSim_2_simWeights_calc_coProductMetaData), which are executed when accessing this step. The dashboard is split into two panels: one for user inputs, the other for evaluation.
Calculation: Data Aggregation
The logic is PSim_2_simWeights_calc_loadData.
Calculation: Text Transformation
The logic is PSim_2_simWeights_calc_textTransformers.
Calculation: Raw Similarity
The logic is PSim_2_simWeights_calc_approxNearestNeighbors.
Calculation: CoProductMetaData
The logic is PSim_2_simWeights_calc_coProductMetaData.
Setup Panel
The logic is PSim_2_simWeights_eval_simWeights and uses PSim_2_simWeights_eval_inputSimWeights_Configurator.
Evaluation Panel
The logic is PSim_2_simWeights_eval_simWeights.
Product Similarity Step
Starts with one calculation logic, PSim_3_similarity_calc_productSimilarity, that is executed when accessing this step. The step is split into three tabs: Similarity Overview, Similarity Dashboard, and Similarity Grouping. First, this calculation subsets the product pairs that fulfill the minimum similarity criterion and saves them in a model table. Then, other model tables are prepared so that the data is ready for display in the dashboard's histograms and tables, particularly similarityTable.
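The pair subsetting and histogram preparation described for this calculation can be pictured with a minimal Python sketch. It is an illustration only, not the code of the logic: the names Pair, filter_pairs, and histogram are hypothetical, similarities are assumed to lie in [0, 1], and the threshold is assumed to be inclusive.

```python
from typing import NamedTuple


class Pair(NamedTuple):
    """One candidate pair of products with its similarity (hypothetical structure)."""
    product_a: str
    product_b: str
    similarity: float


def filter_pairs(pairs, min_similarity):
    """Keep only the pairs whose similarity reaches the minimum criterion."""
    return [p for p in pairs if p.similarity >= min_similarity]


def histogram(pairs, bins=10):
    """Count pairs per equal-width similarity bin over [0, 1]."""
    counts = [0] * bins
    for p in pairs:
        # A similarity of exactly 1.0 falls into the last bin.
        idx = min(int(p.similarity * bins), bins - 1)
        counts[idx] += 1
    return counts


pairs = [Pair("A", "B", 0.9), Pair("A", "C", 0.4), Pair("B", "C", 0.75)]
kept = filter_pairs(pairs, min_similarity=0.7)
bin_counts = histogram(kept, bins=4)
```

In this sketch, kept plays the role of the model table of retained pairs, and bin_counts stands in for the data prepared for a dashboard histogram.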
Similarity Overview
The logic is PSim_3_prodSimilarity_eval_simOverview.
Similarity Dashboard
The logic is PSim_3_prodSimilarity_eval_simDashboard, which provides an interactive dashboard for exploring the similarities of one product.
Similarity Grouping
The logic is PSim_3_prodSimilarity_eval_simGrouping.
Product Grouping Step
Starts with two calculation logics, PSim_4_community_calc_wavgCommunity and PSim_4_community_calc_namingCommunity, called in sequence.
Similarity Grouping Dashboard
The logic is PSim_4_prodGrouping_eval_simGrouping.
Product Overview
The logic is PSim_4_prodGrouping_eval_prodOverview.
Setting for Additional Products
The logic is PSim_4_prodGrouping_eval_additionalProduct.
More Products Step
Contains two calculation logics named PSim_5_newProducts_calc_loadNewData and PSim_5_newProducts_calc_labelingNewProducts that are automatically triggered when accessing this step. The first one loads the data about new products using the filters defined in the previous step. The second one is more complex:
Prepares the data for comparison purposes.
Computes embeddings only for new products that are unknown (to reduce resource usage and computation time).
Labels new products using the parameters selected by the user (metric type, most similar or majority) and a nearest-neighbor approach that lets each new product find its best neighbors in the similarity graph built from the original products. This process is multi-threaded: each new product explores the graph of already labeled products in an independent thread to find its right place. The number of simultaneous threads equals the number of available CPUs. The graph search uses the function process_new_product, available in the Python Engine starting with version v9.
Saves the results in model tables, particularly newProductTable.
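The labeling idea above can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the Python Engine implementation: it replaces the graph search of process_new_product with a brute-force cosine comparison, and the names cosine, label_one, label_all, and the mode values "most_similar" and "majority" are hypothetical.

```python
import math
import os
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def cosine(u, v):
    """Cosine similarity between two embedding vectors (0.0 if one is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def label_one(new_vec, labelled, k=3, mode="majority"):
    """Pick a group for one new product from its k most similar labelled products.

    labelled is a list of (vector, group) tuples for already grouped products.
    """
    ranked = sorted(labelled, key=lambda item: cosine(new_vec, item[0]), reverse=True)
    neighbors = ranked[:k]
    if mode == "most_similar":
        # Take the group of the single closest product.
        return neighbors[0][1]
    # Majority vote among the k closest products.
    votes = Counter(group for _, group in neighbors)
    return votes.most_common(1)[0][0]


def label_all(new_products, labelled, k=3, mode="majority"):
    """Label every new product in its own task, with one worker per available CPU."""
    workers = os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        groups = pool.map(
            lambda vec: label_one(vec, labelled, k, mode), new_products.values()
        )
        return dict(zip(new_products.keys(), groups))


labelled = [
    ((1.0, 0.0), "G1"),
    ((0.9, 0.1), "G1"),
    ((0.0, 1.0), "G2"),
    ((0.1, 0.9), "G2"),
]
new_products = {"N1": (1.0, 0.05), "N2": (0.05, 1.0)}
assignments = label_all(new_products, labelled)
```

Each new product is handled independently of the others, which is what makes the per-product threading described above possible. The thread pool here only mirrors that structure; the real search runs against the similarity graph rather than a full scan.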
This step can be re-run several times on different subsets of products by changing the settings in the last tab of the previous step. Each new run extends the table that stores the results with the new product assignments, including the timestamp of the run.
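The accumulation of results across runs can be pictured as follows. This is a hypothetical sketch: the function append_run and the column names product, group, and run_timestamp are assumptions for illustration and do not reflect the actual schema of the model table.

```python
from datetime import datetime, timezone


def append_run(results_table, assignments, run_time=None):
    """Extend the results table with one row per newly assigned product."""
    run_time = run_time or datetime.now(timezone.utc)
    stamp = run_time.isoformat()
    for product, group in assignments.items():
        # Earlier rows are kept; each run only adds rows with its own timestamp.
        results_table.append(
            {"product": product, "group": group, "run_timestamp": stamp}
        )
    return results_table


table = []
append_run(table, {"N1": "G1"}, datetime(2024, 1, 1, tzinfo=timezone.utc))
append_run(table, {"N2": "G2"}, datetime(2024, 1, 2, tzinfo=timezone.utc))
```

After the second call, the table holds the rows from both runs, and the timestamp shows which run produced each assignment.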
New Products
The logic is PSim_5_moreProducts_eval_newProducts.
Updated Groups
The logic is PSim_5_moreProducts_eval_updatedGroups.
Evaluations
The model has one evaluation, PSim_ModelEvaluation_Eval, which lets you retrieve, for one product or a list of products, all the raw similarities that have been computed for them. For more details about model evaluations, see Query Optimization Engine Results | Using the Evaluator.