Overview (Product Similarity)

Purpose

Managing a large number of products is usually difficult, and it is not realistic to maintain a separate pricing strategy for each individual product. Defining a pricing strategy for the right subset of products is therefore key.

In this scenario, Product Similarity Accelerator is a science brick that provides a similarity score between products and can group similar products together. These groups can then be leveraged to apply a pricing strategy.

In addition, similarity groups help enrich data for further processes, such as Clustering or Negotiation Guidance, and similar products can be offered as alternatives in a quote.

Other typical use cases:

For a Pricing Manager to understand relationships between products and to group them appropriately in order to steer pricing strategy at this level.
For a Data Manager to match competitive products with their own portfolio.
For a Spare Part Manager to match newly created parts within a meaningful product category.

Pricefx Solution

Product Similarity Accelerator walks you through the steps to compute a similarity score and group products by similarity based on product specifications. To do that, the model relies on three types of information used to define the products:

Text Attributes – Any textual data that describes the product, such as product name or product description. Several fields can be used (they will be combined). The resulting text is encoded into a set of numbers by a “Transformer” (the “T” in ChatGPT). This encoding captures the meaning of the text, so that texts can be compared by meaning, including synonyms. The following transformers are provided:
- English text transformer
- Multilingual text transformer, including Arabic, Chinese, Dutch, English, French, German, Italian, Korean, Polish, Portuguese, Russian, Spanish, Turkish.
Categorical Attributes – Can be any kind of specification that defines the product, such as product category, brand, type, color…
Numerical Attributes – Can be any kind of numerical specification, such as size, power… or even price (you can define a price threshold to avoid comparison of products that have prices too far apart).

From those attributes, a similarity score is computed, with the possibility to give a specific weight to each attribute type. Then similarity groups are created based on the relationships and similarity among all products.
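As a rough illustration of how the three attribute types could be combined, the sketch below computes a weighted similarity score between two products. It is not the accelerator's actual implementation: the function names, the per-type measures (cosine similarity on precomputed text embeddings, share of matching categorical values, relative closeness of numerical values), the weights, the toy embeddings, and the `max_price_ratio` price threshold are all assumptions made for illustration.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def categorical_similarity(a, b):
    """Share of shared categorical attributes with identical values."""
    keys = a.keys() & b.keys()
    if not keys:
        return 0.0
    return sum(a[k] == b[k] for k in keys) / len(keys)

def numerical_similarity(a, b):
    """Average relative closeness (1 = identical, 0 = far apart)."""
    keys = a.keys() & b.keys()
    if not keys:
        return 0.0
    total = 0.0
    for k in keys:
        largest = max(abs(a[k]), abs(b[k]))
        if largest == 0:
            total += 1.0
        else:
            total += max(0.0, 1.0 - abs(a[k] - b[k]) / largest)
    return total / len(keys)

def similarity_score(p, q, weights, max_price_ratio=None):
    """Weighted average of the three per-type similarities.

    max_price_ratio is a hypothetical price threshold: pairs whose
    prices differ by more than this ratio are not compared (score 0).
    """
    if max_price_ratio is not None:
        low, high = sorted((p["price"], q["price"]))
        if low <= 0 or high / low > max_price_ratio:
            return 0.0
    parts = {
        "text": cosine_similarity(p["embedding"], q["embedding"]),
        "categorical": categorical_similarity(p["categorical"], q["categorical"]),
        "numerical": numerical_similarity(p["numerical"], q["numerical"]),
    }
    total_weight = sum(weights.values())
    return sum(weights[k] * parts[k] for k in parts) / total_weight

# Toy products; the embeddings stand in for the transformer output.
drill_a = {"embedding": [1.0, 0.0], "price": 100.0,
           "categorical": {"brand": "Acme", "color": "red"},
           "numerical": {"power": 500.0}}
drill_b = {"embedding": [1.0, 0.0], "price": 120.0,
           "categorical": {"brand": "Acme", "color": "blue"},
           "numerical": {"power": 400.0}}

weights = {"text": 0.5, "categorical": 0.25, "numerical": 0.25}
score = similarity_score(drill_a, drill_b, weights, max_price_ratio=1.5)
print(round(score, 3))
```

Raising the weight of one attribute type makes differences in that type count more in the final score, while a stricter price ratio excludes more pairs from comparison altogether.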

Outputs

The outputs of the model are:

List of products and their similar products, including the similarity score between them.
List of products with their similarity group.
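To make the two outputs concrete, the sketch below derives both from a handful of pairwise scores. The sample scores, the threshold, and the grouping rule (products linked by a score at or above the threshold end up in the same group, i.e. connected components) are assumptions for illustration only; they are not a description of how the accelerator forms its groups.

```python
def build_outputs(scores, threshold):
    """scores maps (product_a, product_b) -> similarity score."""
    # Output 1: similar product pairs with their score, best first.
    similar_pairs = sorted(
        ((a, b, s) for (a, b), s in scores.items() if s >= threshold),
        key=lambda row: row[2],
        reverse=True,
    )

    # Output 2: group assignment via union-find over the kept pairs.
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for a, b in scores:          # register every product, even isolated ones
        find(a)
        find(b)
    for a, b, _ in similar_pairs:
        parent[find(a)] = find(b)

    group_ids = {}
    groups = {}
    for product in sorted(parent):
        root = find(product)
        groups[product] = group_ids.setdefault(root, len(group_ids) + 1)
    return similar_pairs, groups

# Hypothetical pairwise scores for four products.
scores = {
    ("P1", "P2"): 0.9,
    ("P2", "P3"): 0.8,
    ("P1", "P3"): 0.4,
    ("P3", "P4"): 0.2,
}
similar_pairs, groups = build_outputs(scores, threshold=0.7)
print(similar_pairs)
print(groups)
```

Note that with this rule P1 and P3 share a group even though their direct score is below the threshold, because both are similar to P2.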

A set of dashboards is also provided in order to review and assess the outputs.

Limitations

Number of products – The current implementation can handle up to 150 000 products at once.
Supported languages – Arabic, Chinese, Dutch, English, French, German, Italian, Korean, Polish, Portuguese, Russian, Spanish, Turkish.
It is possible to use English only, which slightly increases accuracy. Several languages can be mixed and still be compared, possibly with some loss of accuracy.
No out-of-the-box way to manually adjust outputs, such as reassigning products to another product group. If needed, this should be configured separately, e.g. in Custom Forms.
No predefined extension point – There is currently no out-of-the-box extension point. If you intend to use your own metric, custom code must be written. The accelerator then becomes specific to your implementation, so it cannot be updated without extra effort to port those modifications.
Data requirements – See Data Requirements (Product Similarity).