Data Requirements (Optimization - Product Similarity)
When using Product Similarity Accelerator, it is crucial to understand the type of data needed to make the most of the module's capabilities. Both mandatory and optional fields play an essential role in producing accurate and meaningful results.
Data can be loaded from either a Data Source or a Datamart.
Product Master, Product Extensions or Company Parameter tables are not supported; all product information should be loaded in a Data Source.
1. Mandatory Fields
These fields are necessary for the basic functioning of the system. Without them, Product Similarity cannot perform its core operations.
For Product Data
Product ID – A unique identifier for each product. It ensures that each product is treated as a distinct entity.
Some attributes, at least one of these 3 types:
Textual Attributes
Categorical Attributes
Numerical Attributes
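The two mandatory conditions for product data (a unique Product ID and at least one attribute) can be checked before loading. The following is a minimal sketch in plain Python, not part of the accelerator; the function name, the row layout (a list of dictionaries) and the field names ProductID, Brand and Weight are assumptions made for illustration.

```python
from collections import Counter


def check_product_rows(rows, id_field, attribute_fields):
    """Return a list of problems found in the product rows (empty if none).

    Hypothetical helper: it only mirrors the two mandatory conditions
    described above and does not call any accelerator API.
    """
    problems = []

    # At least one textual, categorical or numerical attribute is needed.
    if not attribute_fields:
        problems.append("no attribute field selected")

    ids = [row.get(id_field) for row in rows]

    # Every row needs a product ID.
    missing = sum(1 for pid in ids if pid in (None, ""))
    if missing:
        problems.append(f"{missing} row(s) without a product ID")

    # Each product ID must identify exactly one product.
    counts = Counter(pid for pid in ids if pid not in (None, ""))
    duplicates = sorted((pid for pid, n in counts.items() if n > 1), key=str)
    if duplicates:
        problems.append(f"duplicate product IDs: {duplicates}")

    return problems


# Example rows with assumed field names.
sample_rows = [
    {"ProductID": "P1", "Brand": "Acme", "Weight": 1.2},
    {"ProductID": "P1", "Brand": "Acme", "Weight": 1.5},
    {"ProductID": "", "Brand": "Other", "Weight": 0.4},
]
print(check_product_rows(sample_rows, "ProductID", ["Brand", "Weight"]))
```

An empty result means both conditions hold for the rows that were checked.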
For Transactional Data (Optional)
Transactional data is optional; however, if it is selected, the following fields are required:
Product Identifier within Transactional Data – This should match the ProductID from the product data, so both tables can be joined.
Some attributes, at least one of these 3 types:
Textual Attributes
Categorical Attributes
Numerical Attributes
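Because the two tables are joined on the product identifier, it can help to check in advance which transactional rows would find no matching product. The sketch below assumes rows held as Python dictionaries and uses hypothetical field names (ProductID in the product data, ProductId in the transactions); it illustrates the matching requirement only and is not the accelerator's own join logic.

```python
def unmatched_transaction_ids(product_rows, transaction_rows,
                              product_id_field, transaction_id_field):
    """Return the transaction product identifiers that have no product row.

    Hypothetical helper for a pre-load check; field names are supplied
    by the caller because they depend on your own tables.
    """
    known_ids = {row[product_id_field] for row in product_rows}
    used_ids = {row[transaction_id_field] for row in transaction_rows}
    # Identifiers used in transactions but absent from the product data.
    return sorted(used_ids - known_ids, key=str)


# Example data with assumed field names.
products = [{"ProductID": "P1"}, {"ProductID": "P2"}]
transactions = [
    {"ProductId": "P1", "Quantity": 3},
    {"ProductId": "P9", "Quantity": 1},
    {"ProductId": "P2", "Quantity": 7},
]
print(unmatched_transaction_ids(products, transactions, "ProductID", "ProductId"))
```

Any identifier returned here points to a transaction that cannot be linked to a product.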
2. Optional Fields
These fields, while not strictly mandatory, significantly enhance the system's outputs, offering richer insights to define product similarities. Most of the business value of the Product Similarity Accelerator comes from choosing the right product attributes for your business. These can vary widely by industry, so take a moment to list what distinguishes a product in your own business. Here are some examples:
Brand – The manufacturer or brand name associated with the product.
Size/Dimensions – The physical size, measurements, or dimensions of the product.
Packaging – Type of packaging or any information about how the product is packaged for shipping and storage, such as bottles, bags, or boxes of different sizes.
Weight – The weight of the product.
Color – The color of the product.
Material – The materials used to make the product, especially important for clothing, furniture, and electronics.
Power – Power rating or any other value that demonstrates the capabilities of the product, e.g. cooling capacity for an air conditioner.
Specific Features – Capabilities of the product, such as a camera's megapixels or a smartphone's operating system.
Specifications – Detailed technical specifications, such as processor speed, storage capacity, resolution, etc., depending on the product type.
Power Source – For electronics or appliances, information about the power source or energy requirements.
Certifications – Any industry-specific certifications or compliance with safety standards.
Country of Origin – Where the product is manufactured or produced.
For Product Data
Product Name – Highly recommended. A textual name or label of the product that aids user recognition and can be used in text-based similarity computations. A product description can serve as an alternative. The length is limited to 255 characters.
Textual Attributes – These can be additional textual attributes providing more context about the product, such as a long description (again, with the limitation of 255 characters).
Categorical Attributes – Fields such as hierarchical descriptors or product categories can provide valuable context for grouping products.
Numerical Attributes – Numerical specifications, such as size, power or unit price. Where values are aggregated, use data that can be either summed up or averaged.
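Since textual values are limited to 255 characters, it may be worth flagging longer values before loading. The following sketch assumes rows held as Python dictionaries; the helper name and the field names are hypothetical. How the accelerator treats longer values (for example, truncating or rejecting them) is not stated here, so the sketch only reports them.

```python
MAX_TEXT_LENGTH = 255  # limit stated above for textual attributes


def overlong_text_values(rows, text_fields, limit=MAX_TEXT_LENGTH):
    """Return (row index, field name, length) for each value above the limit.

    Hypothetical pre-load check; it does not modify the rows.
    """
    found = []
    for index, row in enumerate(rows):
        for field in text_fields:
            value = row.get(field) or ""  # treat missing values as empty
            if len(value) > limit:
                found.append((index, field, len(value)))
    return found


# Example rows with assumed field names.
product_texts = [
    {"ProductName": "Cordless drill", "Description": "x" * 300},
    {"ProductName": "Hammer", "Description": "Steel claw hammer"},
]
print(overlong_text_values(product_texts, ["ProductName", "Description"]))
```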
For Transactional Data
Criteria for Filters – Users might want to limit the scope of analysis to specific time frames, e.g., the last two years.
Textual Attributes – Can include product names or other descriptions that provide more context for each transaction, limited to 255 characters.
Categorical Attributes – Similar to product data, this could be hierarchical descriptors or transaction categories.
Numerical Attributes – Data that can provide context for each transaction, such as transaction amount, quantity, or any other relevant metric.
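To make the effect of a time-frame filter concrete, the sketch below keeps only transactions from the last two years relative to a given date. It is an illustration in plain Python under stated assumptions: the helper names and the InvoiceDate field are hypothetical, and in practice the filter would be defined in the accelerator's own configuration rather than in code.

```python
from datetime import date


def years_before(day, years):
    """Return the same calendar day a number of years earlier."""
    try:
        return day.replace(year=day.year - years)
    except ValueError:
        # 29 February has no counterpart in a non-leap year; use 28 February.
        return day.replace(year=day.year - years, day=28)


def within_last_years(rows, date_field, years, today):
    """Keep rows whose date lies between the cutoff and today, inclusive."""
    cutoff = years_before(today, years)
    return [row for row in rows if cutoff <= row[date_field] <= today]


# Example transactions with an assumed date field.
sales = [
    {"ProductId": "P1", "InvoiceDate": date(2021, 6, 30)},
    {"ProductId": "P1", "InvoiceDate": date(2023, 3, 15)},
    {"ProductId": "P2", "InvoiceDate": date(2024, 1, 10)},
]
recent = within_last_years(sales, "InvoiceDate", 2, today=date(2024, 6, 1))
print([row["InvoiceDate"].isoformat() for row in recent])
```

Passing the reference date explicitly, rather than reading the system clock inside the function, keeps the result reproducible.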
By fulfilling the mandatory data requirements and supplementing them with the optional fields, you can maximize the value of Product Similarity Accelerator. It is advisable to provide as many relevant fields as possible to ensure nuanced, accurate, and comprehensive results.
Language Support
For text attributes, only the following languages are supported: Arabic, Chinese, Dutch, English, French, German, Italian, Korean, Polish, Portuguese, Russian, Spanish, Turkish.