Create a Segmentation Model

This implementation is DEPRECATED. Refer instead to the Negotiation Guidance Accelerator.

 

To describe the pricing of a segment, you cannot use the actual prices directly, because different products have different prices. You need a “target metric” that is comparable across all the products. To create the segments, you must define all the attributes (parameters) and observe the distribution of the target metric for each segment. The goal of this price optimization/Segmentation model is to reshape this distribution and increase the profit of each segment by targeting a higher aggregate margin %.
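As a concrete illustration (not Pricefx code), margin % is one such comparable target metric: two products with very different absolute prices can still be compared on it. The function and sample values below are hypothetical:

```python
def margin_pct(invoice_price, cost):
    """Margin % of a transaction - comparable across products,
    unlike the absolute prices themselves."""
    return (invoice_price - cost) / invoice_price

# Two products with very different price levels yield the same metric:
print(margin_pct(100.0, 80.0))      # 0.2 -> 20% margin
print(margin_pct(10000.0, 8000.0))  # 0.2 -> same 20% margin
```

Because the metric is a ratio, its distribution can be observed and reshaped per segment regardless of each product's price level.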

    

This exercise provides guidelines for creating a Segmentation model using an existing Datamart and model logic (algorithm).

The Optimization capability provides a framework that helps you analyze and segment your business, optimize your product portfolio, and improve your pricing. At the end of this exercise, you will have price recommendation values to be used in your sales quotes, product price lists, and other business areas. Today you are a Pricing Analyst, and you must determine the ideal target price for your company's products by analyzing competitor pricing and market expectations.

 

For this exercise, you will navigate through an existing Datamart and Model Type, identify the optimized prices required, and create a Segmentation model.

 

Review a Datamart

Select Datamarts from the main menu path: Analytics > Data Manager > Datamarts, as follows:

 

The system displays a list of existing Datamarts. For this exercise, you will use the Standard Sales Data Datamart; click on it:

 

 

The system displays the fields contained in the selected Datamart. Notice the different Data Sources used under the Owner column.

 

Scroll down and locate the fields created to store specific calculated data: Customer Discount and Customer Rebate Pct. Notice these fields have no source associated in the Owner column. They will be required in the Segmentation process later.

Click on the Data tab to display the actual data included in this Datamart. 

 

Click on the blue arrow to go back to the main window. 

Review Model Types

Select Model Types from the main menu path: Optimization > Pricing Guidance > Model Types, as follows:

 

The system displays a list of existing model types; the one you will use in this exercise is Segmentation. You can review the algorithm behind the model by clicking View definition under the Definition column to display the elements and steps used in the calculation.

Review Models

 

Select Models from the main menu path: Optimization > Pricing Guidance > Models, as follows:

 

The system displays the existing models with their model type and the number of records calculated.

 

On the top-right corner of the window, click on the Add Model icon.

 

Provide a name with your Student number (no spaces), a label/name to identify your model, and select the Segmentation option from the Model Type drop-down box. Click on the Add icon.

 

The system displays a confirmation message at the top of the window: 

 

The system displays the main tabs to build the model: 

 

Definition – the area where the logic is determined to identify the calculation context, with user inputs (filters, mapping parameters, percentiles, etc.).

Analysis – the area where the data is uploaded, selected, and distributed for the Segmentation.

Calculation – the execution that generates the Segmentation, including several steps with calculation results.

Results – the display of the Segmentation results (tree branches), with an Evaluation area to search for a specific segment (tree search) using different attribute values.

 

To populate the Field rows in the right area, you must select the data source (Datamart) for this model. Open the drop-down and select the Standard Sales Data (DM) option:

 

As soon as you select the Datamart, the filters are displayed. It is very important to avoid negative or null values in the model, so you must create a filter for data quality purposes. Click on the Set Filter option:

Using the drop-down arrows, select the displayed filter options for Gross Margin Percent.

 

You will need to create more than one filter. Click on the Add Rule option and select the AND condition:

 

Using the drop-downs, complete the following filters for Gross Margin and Global List Price, and click on the Apply icon.

  

Note: These filters prevent negative values (from the Datamart) for margin and list prices from entering the model calculations.
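Outside Pricefx, the effect of these AND-combined quality filters can be sketched in plain Python. The field names, sample rows, and the exact comparison (non-null and non-negative) are assumptions for illustration only:

```python
# Hypothetical Datamart rows; field names mirror the filters above but are illustrative.
rows = [
    {"GrossMarginPercent": 0.18, "GrossMargin": 120.0, "GlobalListPrice": 650.0},
    {"GrossMarginPercent": -0.05, "GrossMargin": -30.0, "GlobalListPrice": 600.0},  # negative margin
    {"GrossMarginPercent": 0.22, "GrossMargin": None, "GlobalListPrice": 710.0},    # null value
]

def passes_quality_filter(row):
    """AND-combined rules: every field must be present and non-negative."""
    for field in ("GrossMarginPercent", "GrossMargin", "GlobalListPrice"):
        value = row.get(field)
        if value is None or value < 0:
            return False
    return True

clean = [r for r in rows if passes_quality_filter(r)]
print(len(clean))  # only the first row survives
```

Rows failing any single rule are dropped, which is why the conditions are joined with AND rather than OR.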

 

The additional fields displayed form the Mapping area, which tells the system logic (behind the model) which data to bring into the model from the different Data Sources/Datamarts to perform the calculations. Using the drop-downs, select the following options:

 

Note: These selections are based on the business requirements and Datamart’s content to perform the model’s calculation. 

 

The price optimization/Segmentation is based on percentiles of the target metric distribution. Every segment is given floor and stretch percentiles that define guardrails for the target percentile. The system already displays the Percentile Values, such as floor and ceiling (stretch) prices; you can update them here based on your business requirements. If the Target is left blank, it will be calculated automatically by the optimization/Segmentation model. For this exercise, you can leave this area at its defaults.

 

Floor Percentile – represents the floor optimization target in the segment. 

Target Percentile – used to propose an optimization target value between the floor and stretch percentiles. It can either be entered into the model or calculated from each segment's score.

Ceiling (stretch) Percentile – represents the stretch optimization target value in the segment. 
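To make the guardrail idea concrete, here is a minimal, non-Pricefx sketch that derives floor, target, and stretch values from a segment's margin distribution. The percentile choices (10th/70th/90th), the nearest-rank method, and the data are all assumptions:

```python
def percentile(values, p):
    """Nearest-rank percentile (p in 0..100) over a list of numbers."""
    ordered = sorted(values)
    index = round(p / 100 * (len(ordered) - 1))
    return ordered[index]

# Hypothetical margin % observed across one segment's transactions.
segment_margins = [0.08, 0.10, 0.12, 0.15, 0.18, 0.21, 0.25, 0.30]

floor = percentile(segment_margins, 10)    # floor guardrail
target = percentile(segment_margins, 70)   # target (if not derived from the score)
stretch = percentile(segment_margins, 90)  # ceiling (stretch) guardrail
print(floor, target, stretch)
```

The floor and stretch values bound where the target may land; raising the target percentile is exactly the "reshape the distribution" goal described earlier.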

The Additional Parameters area helps determine how much of the available Datamart transaction data is used in the Segmentation model calculation. These selections help build the dimensions, data profiling, and drivers within the model. Update the following parameters.

 

Max Cardinality – the largest number of distinct values a dimension may have to be taken into account for price drivers (across the spectrum).

Min #Transactions in Segment – the smallest size a segment can have. If a segment is too small, it won’t be created, and the model will use the parent level in the segment tree instead.

Min #Customers/#Products in Segment – the minimum number of customers/products in a segment.

Policy Start Date/End Date – provide the time bounds applied when using the policy records.
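The Min #Transactions rule — falling back to the parent level when a leaf segment is too small — can be sketched as a walk up the segment tree. The tree path, counts, and threshold below are hypothetical:

```python
def choose_segment(path, transaction_counts, min_transactions):
    """Walk up the segment tree from the leaf until a level meets
    the minimum transaction count; () denotes the tree root."""
    for depth in range(len(path), -1, -1):
        key = tuple(path[:depth])
        if transaction_counts.get(key, 0) >= min_transactions:
            return key
    return ()

# Hypothetical counts per tree level: region > country > channel.
counts = {(): 500, ("EMEA",): 40, ("EMEA", "DE"): 12, ("EMEA", "DE", "Retail"): 3}

# The leaf has only 3 transactions, so the parent level is used instead.
print(choose_segment(["EMEA", "DE", "Retail"], counts, min_transactions=5))
```

A higher threshold trades segment granularity for statistical reliability: small leaves are absorbed into their parents rather than producing noisy guardrails.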

 

After completing all the parameters, filters, and data mapping sections, click on the Save Model icon on the top-right corner of the window. 

 

The system displays a confirmation message at the top of the window. 

Now the Field Rows are available based on the data mapping fields and the Datamart used as a source in the previous steps. 

 

Click on the Source Data tab. The model is empty because you first need to complete the Analysis of the model and execute the data upload process to populate the records from the Datamart.

 

Click on the Analysis tab on the top-left corner of the window. 

 

This section is used to prepare the data for the model the way you require it. In this Data Preparation section, click on the Calculate icon to start uploading the data into the model from the Datamart.

 

Note: The DataPrep step materializes the data, applying all the filters, and helps reduce the processing time of the next steps. The “Incremental Calculate” icon can be used for partial calculations covering specific changes/updates.

 

Click on the Continue icon in the validation message. 

 

The system displays the status of the model preparation, passing from Pending > Processing > Ready (allow the system a couple of minutes to process the request; click on the Refresh icon if necessary). The system then displays the number of rows of data loaded from the Datamart that will be used in this model.

You can validate the actual data loaded into the model by going back to the Definition tab. 

 

Click on the Source Data tab. 

 

 

The system displays all the data loaded into the model. Notice the function icons in the top-right area; they are available to maintain the records within the model in case of changes.

 

Purge Cache – clears the cache. The cache applies to queries on the model from the Data Preparation calculation. 

Truncate – deletes all/filtered rows from the calculation. This option is used when filters in the Definition area are changed and to avoid duplications.  

Upload – loads an XLS/XLSX sheet or a CSV/TXT file.

Mass Edit – allows you to apply mass changes to editable attributes. 

Status History – displays information related to the records status history.  

Export – allows you to export the records to an external XLS or CSV format file. 

Analysis Tab

Click back on the Analysis tab. 

Data Profile Tab

Click on the DataProfile tab. 

 

In this Data Profile step, the system provides a summary of the filtered transaction dataset. Click on the Calculate icon to calculate the dimension values.

Note: This step is optional; it displays the result table with all the dimension values and automatically removes fields whose cardinality exceeds the previously defined limit from the possible Segmentation levels (we recommend executing this step only for models with changes, for data troubleshooting purposes).

 

Click on the Continue icon. 

 

Note: If you get an error message at the top of the screen about the version used, simply refresh your browser using the icon at the top-right corner:

The system will take a couple of minutes to process the Data Profiling. When the status is “Ready”, it displays a list of all the dimensions (flagged fields) from the Datamart with their minimum and maximum values, the number of records found for each dimension, and other status information resulting from the calculation. Pick any dimension from the list and click on the “Show” icon (for example, Country).

 

Note: The Profile calculation provides the minimum and maximum values for each dimension based on the parameters determined in the Definition tab.  

 

As an example, for the “Country” dimension previously selected, the system displays all the records found for each country between the minimum value (Australia) and the maximum value (Vietnam). Click on the OK icon at the bottom-right corner to close the window.

Back in the Data Profile section, scroll down the list to see if there are any dimensions with “Null” values. Notice the ProductImage dimension has null values. The system marks these dimensions to be excluded from the Segmentation.

 

 

Other dimensions to analyze for inclusion in the Segmentation process are those related to “periods”. Keep scrolling to the bottom of the list and locate the Pricing Dates fields (week/month/quarter/year).

 

Note: Notice the PricingDateWeek dimension has 105 distinct records, which exceeds the cardinality limit (100), so this field will be excluded from the Segmentation. A dimension spanning several periods (e.g., years 2020 to 2022) would not work well for the Segmentation either, since it is too wide.
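The two exclusion rules just seen — null values (ProductImage) and distinct counts above Max Cardinality (PricingDateWeek) — amount to a simple eligibility check. The sketch and its tiny dataset are illustrative, not the actual profiling logic:

```python
def eligible_dimensions(rows, dimensions, max_cardinality=100):
    """Keep dimensions with no null values and a distinct-value
    count at or below the cardinality limit."""
    eligible = []
    for dim in dimensions:
        values = [row.get(dim) for row in rows]
        if any(v is None for v in values):
            continue  # null values exclude the dimension (like ProductImage)
        if len(set(values)) > max_cardinality:
            continue  # too many distinct values (like PricingDateWeek)
        eligible.append(dim)
    return eligible

# Tiny illustrative dataset: 3 rows, 3 candidate dimensions.
rows = [
    {"Country": "DE", "ProductImage": None, "UniqueId": 1},
    {"Country": "FR", "ProductImage": "img2", "UniqueId": 2},
    {"Country": "DE", "ProductImage": "img3", "UniqueId": 3},
]
print(eligible_dimensions(rows, ["Country", "ProductImage", "UniqueId"], max_cardinality=2))
```

With a limit of 2 distinct values, only Country survives: ProductImage fails the null check and UniqueId the cardinality check — the same reasoning the Data Profile applies at scale.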

PriceDrivers Tab

 

Click on the PriceDrivers tab. 

The Price Driver step helps calculate the significance of each attribute (dimension) when it is used in the Segmentation (decision tree). The system displays the list of eligible dimensions (attributes) in the left area; these are the dimensions whose distinct-record counts are under the Max Cardinality value. Click on the Expand arrow icon in the lower corner:

 

The system displays the whole list of eligible dimensions, where you can easily filter, search, and select/unselect dimensions. You can click on the Dimension column label to sort the list, or use the browser search (Ctrl+F) to look for a specific dimension. You can search for any dimension to see whether it is displayed in the list. For example, the UniqueId dimension is not part of the list, since it has more than one hundred distinct records and exceeds the Max Cardinality value.

 

Note: These filtering options help identify the dimensions that were excluded from the Data Profile because they exceed the Max Cardinality.

 

For this exercise, to reduce the calculation time, unselect the following dimensions using the check boxes:

Note: The unselected dimensions are not relevant to the Segmentation results expected in this exercise. When creating your company's Segmentation model, you and your pricing team can decide which of the available dimensions are relevant (or not) for your Segmentation results.

 

Click on the Close icon at the bottom corner. 

 

Back in the main Price Drivers window, the system offers more than one Regression Type to select from a drop-down field; this defines how the system calculates the significance values for each dimension. For this exercise, leave the default.

 

Regression options: 

Per Individual Product and Customer – the significance of each dimension is calculated first per customer, then per product. The results allow calculating the average and standard deviation of the significance.

On full Transaction data set – one decision tree is built resulting in one significance value per dimension. 

Click on the Calculate icon located on the top-right area.  Click on the Continue icon to confirm the information message. 

Note: Pricefx uses a simple CART (regression tree) approach with the Gini decision criterion, and the attribute significance (level of importance) follows from that. CART is used only to identify possible “price drivers”, i.e., the variables that best explain the relevance of an attribute.
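To give an intuition for how a tree criterion turns into attribute significance, here is a toy scorer that measures how much target variance a split on one attribute removes. This is not the Pricefx algorithm — it uses variance reduction as a stand-in for the decision criterion, and the data and field names are hypothetical:

```python
from statistics import pvariance

def split_gain(rows, attribute, target="margin_pct"):
    """Variance of the target removed by splitting on one attribute -
    a toy stand-in for a regression-tree decision criterion."""
    parent = [row[target] for row in rows]
    parent_var = pvariance(parent)
    weighted_child_var = 0.0
    for value in {row[attribute] for row in rows}:
        child = [row[target] for row in rows if row[attribute] == value]
        child_var = pvariance(child) if len(child) > 1 else 0.0
        weighted_child_var += len(child) / len(parent) * child_var
    return parent_var - weighted_child_var

# Hypothetical transactions: City separates margins cleanly, Channel does not.
rows = [
    {"City": "Berlin", "Channel": "Web", "margin_pct": 0.10},
    {"City": "Berlin", "Channel": "Shop", "margin_pct": 0.12},
    {"City": "Paris", "Channel": "Web", "margin_pct": 0.30},
    {"City": "Paris", "Channel": "Shop", "margin_pct": 0.32},
]
print(split_gain(rows, "City") > split_gain(rows, "Channel"))  # City is the stronger driver
```

An attribute whose values separate the target metric well earns a high gain, which is why City can rank above Channel (or Country) in the importance histogram.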

 

The system will take a couple of minutes to process the Price Drivers (you can use the Refresh icon if needed). When the status is “Ready”, it displays a histogram chart with the 10 most relevant dimensions ranked by level of importance. The system runs the “regression” algorithm and provides these results based on the previously selected eligible dimensions. Notice that “City” has the highest importance in this model (while Country sits at the other end of the importance scale), meaning that City has the biggest impact on the price recommendation for this Segmentation scenario. (If you unselect the City dimension from the previous list, Country, State, or Region may take the highest relevance for the pricing Segmentation.)

 

Note: You can decide which of the available dimensions are meaningful for your Segmentation by selecting/unselecting dimensions and re-running the calculations as many times as necessary. By running this Drivers step, the Segmentation automatically sorts all the dimensions from the most important to the least important and preselects the significant ones based on historical statistical information.

 

On the right-side of the window, you can see the results from the Drivers processing. Click on the “Show” icons to display the criteria selected and the results of the processing (if required). 

 

In every step, you will see additional icons in the top-right corner; these help you check response times for each calculation process, attach additional information to the model, and view the internal tables. Click on the Job Tracking icon:

 

Job Tracking – lists an overview of all the jobs/tasks processed within the model.

Attachments – allows the user to upload external files relevant for the model. 

Tables – lists the model and parameters tables used in the Segmentation. 

 

The system displays the time the calculation takes each time you click on the “Calculate” icon. These results can help you estimate the processing time depending on the number of dimensions you select for the Segmentation (Data Profile and Price Drivers processing times).

 

Click on the Calculation tab on the top-main steps menu.  

 

The Segmentation tab offers several Segmentation methods (with algorithms behind them) and the available dimensions within the cardinality limit. Under the Parameters section, open the Segmentation Method drop-down and select the Manual Attribute Selection option.

 

Note: This Segmentation method prepares the calculation records (a work list) for the next “scoring” step, which creates and populates the lookup tables with intermediate results.

 

The system displays the Segmentation Levels with the already-defined dimensions and their scores (the levels of importance recommended earlier) resulting from the Drivers processing. If you decide to change the level of importance of the dimensions as a last-minute change, you can do it here before the next calculation process. Click on the Expand arrow icon.

 

You can select any other available dimension and prioritize it in the list as needed. For example, select the State dimension, hover over its check box to grab it, and drag it to the top of the list for the first level of importance.

Notice that some previously selected dimensions were unselected to slightly change the importance levels for a new Segmentation. The system gives you the flexibility to change the levels of importance as needed to get a different Segmentation result.

Note: These changes are very rare in real business; usually companies agree to use the recommended levels of importance that the system provides. However, if changes are relevant as part of Segmentation testing, this is where you can make them. These changes will affect how the Segmentation branches are displayed at the end of the process.

 

Click the Close icon to return to the Segmentation section. Click on the Calculate icon to start the Segmentation process. The system will create all the segments based on the dimensions and parameters previously defined. 

 

Click on the Continue icon to confirm the calculation. 

 

The system will take a couple of minutes to process the Segmentation (you can use the Refresh icon if needed). When the status is “Ready”, the system displays, for each dimension (segment), totals for the number of transactions, customers, products, etc. in the Explained Variance area:

 

Note: The “Export to Excel” icon is available to share the information in spreadsheet format. The “Settings” icon allows you to change the columns displayed in the Explained Variance section.

 

Scroll the bottom bar to the right to see additional results related to revenue, volume, and margin for each dimension. The “R2” column represents how well the Segmentation reduces the variance of the Optimization Target (margin %, for example).

 

Note: All these columns show the percentage of the values present in the segments at each Segmentation level. Here you can get an idea of whether the selected dimensions and definition parameters are the best choice for the Segmentation. The wider the segment, the better the opportunity for the target metric.
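The R² idea — how much of the target metric's variance the segmentation explains — can be sketched as one minus the ratio of within-segment variance to total variance. This is a generic statistical illustration, not the Pricefx calculation, and the data is made up:

```python
from statistics import pvariance

def segmentation_r2(rows, segment_field, target="margin_pct"):
    """1 - (pooled within-segment variance / total variance):
    values close to 1 mean the segments explain the target well."""
    total = [row[target] for row in rows]
    total_var = pvariance(total)
    within = 0.0
    for value in {row[segment_field] for row in rows}:
        group = [row[target] for row in rows if row[segment_field] == value]
        group_var = pvariance(group) if len(group) > 1 else 0.0
        within += len(group) / len(total) * group_var
    return 1.0 - within / total_var

# Hypothetical data where the segments separate margins almost perfectly.
rows = [
    {"Segment": "A", "margin_pct": 0.10},
    {"Segment": "A", "margin_pct": 0.12},
    {"Segment": "B", "margin_pct": 0.30},
    {"Segment": "B", "margin_pct": 0.32},
]
print(round(segmentation_r2(rows, "Segment"), 3))  # close to 1: variance well explained
```

A low R² suggests revisiting the chosen dimensions or definition parameters before trusting the resulting guardrails.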

In the All Results area on the right side, the system displays the parameters used in the calculation. Click on “Show” for the GenerateSegments field:

Notice the generated segments and the number of items (branches) within each segment. A segment was created for each dimension. Close the window by clicking the OK icon at the bottom-right corner.

Click on the Scoring tab. This Score approach calculates a score (statistical distribution) for each segment. The score is used to define a target percentile (if one was not provided in the Definition step).

 

Click on the Calculate icon, then Continue icon to confirm the calculation process.  

The system displays a Score Distribution chart. For each segment, the score is linearly transformed into the target price percentile (score = 0 corresponds to the floor percentile and score = 100 to the stretch percentile).
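That linear transformation can be written out directly; the guardrail values used below (floor at the 30th percentile, stretch at the 90th) are illustrative:

```python
def score_to_target_percentile(score, floor_percentile, stretch_percentile):
    """Linearly map a segment score in [0, 100] onto the percentile guardrails:
    score 0 -> floor percentile, score 100 -> stretch percentile."""
    return floor_percentile + (score / 100.0) * (stretch_percentile - floor_percentile)

print(score_to_target_percentile(0, 30, 90))    # 30.0 (floor)
print(score_to_target_percentile(100, 30, 90))  # 90.0 (stretch)
print(score_to_target_percentile(50, 30, 90))   # 60.0 (midway)
```

Because the mapping is bounded by the guardrails, no segment's target can fall below the floor or above the stretch percentile, whatever its score.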

 

Note: The wider the segment, the higher the score. 

On the right side, the system displays the Results for the scoring calculation's successful/failed records with their respective explanations (alerts), including the number of segments processed in the scoring calculation. For this example, no errors were reported.