How Product Recommendations Are Defined

This section describes the theory behind Product Recommendations. The methodology applies frequency-based recommendations at three levels:

  • Customer Specific – How often did a customer buy a specific product?

  • Product Specific – How often were two products purchased together?

  • Customer Segment Specific – How often did customers in a segment buy a specific product?

This gives recommendations based on the buying history of the quoted customer, the buying history of the quoted products, and the buying history of similar customers. The key identifier for frequency-based recommendations is the field BasketId. It is the join/grouping key, such as “InvoiceNumber” or “TransactionID”, that links multiple purchases into a single basket. It is used to identify Customer baskets and the co-occurrences of product purchases in them.

The remaining recommendations are all product-specific and similarity-based. In addition to transactional data, they incorporate product-to-product similarity from the selected Similar Products model.

“Frequently Purchased” Methodology

To define the Customer Specific recommendation “Frequently Purchased”, a frequency rate metric is calculated using the count of baskets of the specific Customer as the denominator and the basket count of Customer-Product purchases as the numerator.

  • Customer A basket Count: 100

  • Customer A → Product A basket Count: 20

  • Customer A → Product A Frequency Rate: 20 / 100 = 20%

Next, the frequency rate is multiplied by the product's historical profitability metric to produce a recommendation Score.

  • Product A Historical Margin: $200

  • Customer A → Product A Frequency Rate: 20%

  • Customer A → Product A Score: 20% x $200 = 40

The Score is used to rank recommendations in descending order and the Top N recommendations for each Customer are stored based on thresholds defined in the Model Class definitions.
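
The calculation above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Model Class implementation: the function name frequently_purchased, the tuple layout of the transactions, the margins dictionary and the top_n parameter are all hypothetical.

```python
from collections import defaultdict

def frequently_purchased(transactions, margins, top_n=3):
    """Rank products per customer by frequency rate x historical margin.

    transactions: iterable of (customer_id, basket_id, product_id) tuples.
    margins: dict product_id -> historical margin (hypothetical input).
    """
    customer_baskets = defaultdict(set)   # customer -> all of its basket ids
    pair_baskets = defaultdict(set)       # (customer, product) -> basket ids
    for customer, basket, product in transactions:
        customer_baskets[customer].add(basket)
        pair_baskets[(customer, product)].add(basket)

    rows_by_customer = defaultdict(list)
    for (customer, product), baskets in pair_baskets.items():
        rate = len(baskets) / len(customer_baskets[customer])
        rows_by_customer[customer].append((product, rate, rate * margins[product]))

    # Sort by Score in descending order and keep the Top N per customer.
    return {
        customer: sorted(rows, key=lambda r: r[2], reverse=True)[:top_n]
        for customer, rows in rows_by_customer.items()
    }

# Customer A has 5 baskets; product P appears in 1 of them (20%).
transactions = [("A", f"inv{i}", "Q") for i in range(5)] + [("A", "inv0", "P")]
result = frequently_purchased(transactions, {"P": 200.0, "Q": 10.0})
```

In this toy data, product P is bought less often than Q but ranks first because its margin is higher, which mirrors the 20% x $200 example.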

Customer Specific Recommendations Table

 

“Bought Together” Methodology

To define the Product Specific recommendation “Bought Together”, a frequency rate metric is calculated using the count of baskets containing the quoted Product as the denominator and the count of baskets containing both the Product and the Co-product as the numerator.

  • Product A basket Count: 100

  • Product A → Product B (Co-product) basket Count: 20

  • Product A → Product B (Co-product) Frequency Rate: 20 / 100 = 20%

Next, the frequency rate is multiplied by the Co-product's historical profitability metric to produce a recommendation Score.

  • Product B (Co-product) Historical Margin: $200

  • Product A → Product B (Co-product) Frequency Rate: 20%

  • Product A → Product B (Co-product) Score: 20% x $200 = 40

The Score is used to rank recommendations in descending order and the Top N recommendations for each Product are stored based on thresholds defined in the Model Class definitions.
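
A minimal sketch of the co-occurrence counting, assuming the transactions are available as (basket_id, product_id) pairs. The function name bought_together, the margins dictionary and top_n are hypothetical names chosen for the illustration.

```python
from collections import defaultdict
from itertools import permutations

def bought_together(basket_lines, margins, top_n=3):
    """basket_lines: iterable of (basket_id, product_id) pairs."""
    baskets = defaultdict(set)
    for basket, product in basket_lines:
        baskets[basket].add(product)

    product_count = defaultdict(int)   # product -> baskets containing it
    pair_count = defaultdict(int)      # (product, co_product) -> baskets with both
    for products in baskets.values():
        for product in products:
            product_count[product] += 1
        for product, co_product in permutations(products, 2):
            pair_count[(product, co_product)] += 1

    recs = defaultdict(list)
    for (product, co_product), n in pair_count.items():
        rate = n / product_count[product]            # denominator: quoted Product
        recs[product].append((co_product, rate, rate * margins[co_product]))
    return {
        product: sorted(rows, key=lambda r: r[2], reverse=True)[:top_n]
        for product, rows in recs.items()
    }

# Product A is in 5 baskets; B and C each appear with it once (20%).
lines = [("b1", "A"), ("b1", "B"), ("b2", "A"), ("b3", "A"), ("b3", "C"),
         ("b4", "A"), ("b5", "A")]
recs = bought_together(lines, {"A": 100.0, "B": 200.0, "C": 50.0})
```

Note that the rate is directional: A → B uses the basket count of A as the denominator, while B → A uses the basket count of B.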

Outputs are stored in Model Table “Product Recommendations”:

“Bought Together” Product Specific Recommendations Table

“Others Buy” Methodology

To define the Customer Segment Specific recommendation “Others Buy”, a frequency rate metric is calculated using the count of baskets of the Customer Segment as the denominator and the basket count of Customer Segment - Product purchases as the numerator.

First, a mapping table of Customer Segments to individual Customers is created. The mapping is either taken from the transaction source or generated within the model using a hierarchical clustering algorithm (a future iteration will also allow for an external clustering Model Class).

Once the mapping table is created, each Customer basket in the historical transactions is linked to its Customer Segment, so that the frequency and score can be calculated at the segment level.

  • Segment A basket Count: 100

  • Segment A → Product A basket Count: 20

  • Segment A → Product A Frequency Rate: 20 / 100 = 20%

Next, the frequency rate is multiplied by the product's historical profitability metric to produce a recommendation Score.

  • Product A Historical Margin: $200

  • Segment A → Product A Frequency Rate: 20%

  • Segment A → Product A Score: 20% x $200 = 40

The Score is used to rank recommendations in descending order and the Top N recommendations for each Segment are stored based on thresholds defined in the Model Class definitions.
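
The segment-level variant differs from the customer-level one only in the mapping step. The sketch below assumes a customer-to-segment dictionary and basket IDs that are unique across customers; others_buy, segment_of and margins are hypothetical names.

```python
from collections import defaultdict

def others_buy(transactions, segment_of, margins, top_n=3):
    """transactions: iterable of (customer_id, basket_id, product_id) tuples.
    segment_of: dict customer_id -> segment_id (the mapping table).
    """
    segment_baskets = defaultdict(set)   # segment -> basket ids
    pair_baskets = defaultdict(set)      # (segment, product) -> basket ids
    for customer, basket, product in transactions:
        segment = segment_of[customer]   # link the basket to its segment
        segment_baskets[segment].add(basket)
        pair_baskets[(segment, product)].add(basket)

    recs = defaultdict(list)
    for (segment, product), baskets in pair_baskets.items():
        rate = len(baskets) / len(segment_baskets[segment])
        recs[segment].append((product, rate, rate * margins[product]))
    return {
        segment: sorted(rows, key=lambda r: r[2], reverse=True)[:top_n]
        for segment, rows in recs.items()
    }

segment_of = {"C1": "S1", "C2": "S1"}
transactions = [
    ("C1", "b1", "P"), ("C1", "b1", "Q"), ("C1", "b2", "Q"),
    ("C2", "b3", "Q"), ("C2", "b4", "Q"), ("C2", "b5", "Q"),
]
segment_recs = others_buy(transactions, segment_of, {"P": 200.0, "Q": 10.0})
```

Here customer C2 never bought product P, yet P is the top recommendation for the whole segment S1, which is the point of the “Others Buy” recommendation.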

Outputs are stored in Model Table “Customer Segment Recommendations”:

Model Generated Customer Segments

If no customer segment is present in the transaction source, users can generate customer segments in the Model Class. Segmentation is based on Product Affinity, meaning that customers who buy similar products are grouped together.

  1. Frequency matrix of Customer x Products is created.

  2. Frequency matrix is then normalized using chi-square based standardized residuals.
    Standardized Residuals: ( Actual Frequency - Expected Frequency ) / SQRT( Expected Frequency )
    The reason for using Standardized Residuals rather than absolute frequency counts is that most B2B businesses have long-tail distributions, so with absolute counts Large Enterprise Customers would tend to be clustered together. This is not desired, because the goal is to cluster Customers by similar product purchases.

  3. Matrix of standardized residuals is used as an input for the Cosine Similarity calculation between Customers.
    The similarity metric ranges from -1.00 to 1.00.
    Example: a Customer A - Customer B similarity score of 1.00 means they buy the same mix of products.

  4. Cosine Similarity matrix is then converted to distance matrix by taking 1 - Cosine Similarity.

  5. Distance Matrix is utilized as an input into hierarchical clustering algorithm using complete linkage.

  6. Customers and their Segment IDs are extracted from clustering algorithm and exported as a mapping table into the Model Class.
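
The six steps can be sketched in plain Python as follows. This is an illustrative sketch, not the model's code. Two points are assumptions that the text above does not specify: the expected frequency is taken as row total x column total / grand total (the usual chi-square definition), and the tree is cut with a hypothetical distance threshold. All function and variable names are made up for the example.

```python
import math

def standardized_residuals(freq):
    """Chi-square standardized residuals of a Customer x Product count matrix."""
    total = sum(sum(row) for row in freq)
    row_totals = [sum(row) for row in freq]
    col_totals = [sum(col) for col in zip(*freq)]
    residuals = []
    for i, row in enumerate(freq):
        out = []
        for j, actual in enumerate(row):
            # Assumed expected frequency: row total x column total / grand total.
            expected = row_totals[i] * col_totals[j] / total
            out.append((actual - expected) / math.sqrt(expected))
        residuals.append(out)
    return residuals

def cosine_distance(u, v):
    """1 - cosine similarity (assumes neither vector is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm

def complete_linkage(dist, threshold):
    """Agglomerative clustering: repeatedly merge the two clusters whose
    farthest members are closest, until that distance exceeds threshold."""
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = max(dist[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        if best[0] > threshold:
            break
        _, a, b = best
        clusters[a] += clusters.pop(b)   # b > a, so index a stays valid
    return clusters

customers = ["C0", "C1", "C2", "C3"]
freq = [
    [100, 100, 0, 0],   # large customer buying products 0 and 1
    [1, 1, 0, 0],       # small customer with the same product mix
    [0, 0, 5, 5],
    [0, 0, 50, 50],
]
residuals = standardized_residuals(freq)
dist = [[cosine_distance(u, v) for v in residuals] for u in residuals]
clusters = complete_linkage(dist, threshold=0.5)
# Mapping table of Customer -> Segment ID.
segment_of = {customers[i]: seg for seg, members in enumerate(clusters) for i in members}
```

In this toy matrix the large customer C0 ends up in the same segment as the small customer C1 because they share a product mix, rather than with the other large customer C3. For real data volumes, a library routine such as SciPy's hierarchical clustering would be a more practical choice than this quadratic loop.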

“Similar Products” Methodology

This recommendation leverages the Product Similarity Accelerator. All computations are performed directly in the Product Similarity model, so the methodology here is simply to fetch the similar products for the products defined in the scope. For further details, see the Product Similarity Accelerator.

For those “similar products”, the score is defined as:

  • Score = Product Similarity * Average Co-product Margin

Outputs are stored in Model Table “Similar Products Recommendations”:

“Similar Products from Brand” Methodology

This recommendation also leverages the Product Similarity Accelerator. All computations are performed directly in the Product Similarity model, so the methodology here is simply to fetch the similar products from a specific brand for the products defined in the scope. For further details, see the Product Similarity Accelerator.

For those “similar products from the selected brand”, the score is defined as:

  • Score = Product Similarity * Average Co-product Margin

Outputs are stored in Model Table “Similar Products from selected brand”:

“Up-Sell” Methodology

From the similar products fetched from the “Product Similarity” model, the average unit margin is computed for every product from the transactions and the products with a higher unit margin are recommended. For further details see the Product Similarity Accelerator.

For these “Up-sell products”, the score is defined as:

  • Score = Product Similarity * Average Co-product Margin

The ranking of products can be defined at the Configuration Step using one of these options:

  • Higher margin (the highest unit margin first)

  • Product similarity (the most similar product first)

  • Combined margin and similarity, based on the above score

Each option has a field that holds the value used for its ranking, and the value of the selected option is used as the score.
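
The three ranking options can be illustrated with a short sketch. The option names "margin", "similarity" and "combined", the function up_sell, and its input dictionaries are hypothetical. The sketch also assumes that "higher unit margin" means higher than the unit margin of the quoted product.

```python
def up_sell(product, similar, unit_margin, priority="combined", top_n=3):
    """similar: dict co_product -> similarity to `product`.
    unit_margin: dict product -> average unit margin from the transactions.
    priority: "margin", "similarity" or "combined" (hypothetical option names).
    """
    rows = []
    for co_product, similarity in similar.items():
        # Assumption: only products with a higher unit margin than the
        # quoted product qualify as up-sell candidates.
        if unit_margin[co_product] <= unit_margin[product]:
            continue
        ranking_values = {
            "margin": unit_margin[co_product],
            "similarity": similarity,
            "combined": similarity * unit_margin[co_product],
        }
        # The value of the selected option is used as the score.
        rows.append((co_product, ranking_values[priority]))
    return sorted(rows, key=lambda r: r[1], reverse=True)[:top_n]

similar = {"X": 0.9, "Y": 0.5, "Z": 0.95}
unit_margin = {"P": 10.0, "X": 12.0, "Y": 30.0, "Z": 8.0}
by_margin = up_sell("P", similar, unit_margin, priority="margin")
by_similarity = up_sell("P", similar, unit_margin, priority="similarity")
by_combined = up_sell("P", similar, unit_margin, priority="combined")
```

Product Z is the most similar candidate but is never recommended here, because its unit margin is lower than that of the quoted product P.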

Outputs are stored in Model Table “Up-Sell recommendations”:

“Down-Sell” Methodology

From the similar products fetched from the “Product Similarity” model, the average unit price is computed for every product from the transactions and the products with a lower unit price are recommended. For further details see the Product Similarity Accelerator.

For these “Down-sell products”, the score is defined as:

  • Score = Product Similarity * Average Co-product Margin

The ranking of products can be defined at the Configuration Step using one of these options:

  • Lower price (the highest difference in price between the product and the co-product first)

  • Product similarity (the most similar product first)

  • Combined margin and similarity, based on the above score

Each option has a field that holds the value used for its ranking, and the value of the selected option is used as the score.

Outputs are stored in Model Table “Down-Sell recommendations”:

Aggregation of All Product Specific Recommendations

The following product-specific recommendations are aggregated together in one table, “ProductAggretatedTable”:

  • Bought Together

  • Similar Products

  • Similar Products from Brand

  • Up-sell

  • Down-sell

Each record in the table corresponds to one recommendation, which can come from several types of recommendations. The types are listed in the column “RecommendationsList”, and the aggregated score is calculated using the weights defined in the Configuration Step.
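
A minimal sketch of such an aggregation is shown below. The text does not state how the weights are combined, so the weighted sum used here is an assumption, and the function name aggregate, the record layout and the weight values are hypothetical.

```python
from collections import defaultdict

def aggregate(recommendations, weights):
    """recommendations: iterable of (product, co_product, rec_type, score).
    weights: dict rec_type -> weight (as set in the Configuration Step).
    """
    merged = defaultdict(lambda: {"RecommendationsList": [], "Score": 0.0})
    for product, co_product, rec_type, score in recommendations:
        record = merged[(product, co_product)]
        record["RecommendationsList"].append(rec_type)
        # Assumption: the aggregated score is a weighted sum of type scores.
        record["Score"] += weights[rec_type] * score
    return dict(merged)

recommendations = [
    ("A", "B", "Bought Together", 40.0),
    ("A", "B", "Up-sell", 10.0),
    ("A", "C", "Similar Products", 5.0),
]
weights = {"Bought Together": 0.5, "Up-sell": 0.25, "Similar Products": 1.0}
table = aggregate(recommendations, weights)
```

The pair A → B appears once in the result, with both contributing recommendation types listed and a single aggregated score.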