Data Requirements (Optimization - Forecast)
A Forecast model needs a transactions source to run. It can be either a Data Source or a Datamart.
The source should contain a minimum of 2 years of transaction history, with the following columns:
Field | Required? | Comment |
---|---|---|
Date | Yes | Should be defined as a dimension in the Data Source/Datamart. |
Product | Yes | Typically Product ID. |
Revenue | Yes | Typically end-customer price x quantity (i.e. the price extended to the quantity). It is better to avoid negative or null values. |
Quantity | Yes | It is better to avoid negative or null values (they are filtered out by default by the model itself). It is possible to run the model with the log of the quantity. This may produce better outputs when there are small and large quantities in the dataset (e.g. many long-tail products). |
Revenue at List Price | No | Typically list price x quantity. This is used to calculate the discount rate relative to the revenue. It is highly recommended to include this where possible. |
Store | Yes | Not used, will be removed in the next version. |
Product categorical features | No | Here you may include any categorical columns related to the product that may influence sales. While not strictly required to run the model, this is key to getting proper outputs. Examples include Product Category, Competitor Name, Product Life cycle, Tag for promotions/discounts. |
Product numerical features | No | Numerical features may include any numerical attributes that can be averaged over the selected time period. |
Customer | Yes | Required only if the “Add a Customer Dimension” checkbox is selected. Otherwise, this user entry is not set. |
Customer Categorical Features | No | This user entry is set only if the “Add a Customer Dimension” checkbox is selected. Here you may include any categorical columns related to the customer that may influence sales. |
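
The requirements in the table can be checked on an extract of the source before running the model. The following is a minimal sketch only, not part of the product: the column names (`date`, `product`, `revenue`, `quantity`, `revenue_at_list_price`) are hypothetical, "2 years" is approximated as 730 days, and the discount rate formula (1 minus revenue divided by revenue at list price) is an assumption about how such a rate is usually derived, not the model's documented calculation.

```python
import math
from datetime import date

# Hypothetical column names; map them to the fields of your own Data Source or Datamart.
REQUIRED_COLUMNS = ("date", "product", "revenue", "quantity")


def validate_transactions(rows, min_history_days=730):
    """Return a list of human-readable problems; an empty list means the checks passed."""
    if not rows:
        return ["no transactions"]
    problems = []
    for column in REQUIRED_COLUMNS:
        if any(column not in row for row in rows):
            problems.append(f"missing column: {column}")
    if problems:
        return problems
    # Assumption: "2 years of history" is approximated as 730 days between first and last date.
    dates = [row["date"] for row in rows]
    span_days = (max(dates) - min(dates)).days
    if span_days < min_history_days:
        problems.append(
            f"history covers {span_days} days, expected at least {min_history_days}"
        )
    for column in ("revenue", "quantity"):
        bad = sum(1 for row in rows if row[column] is None or row[column] < 0)
        if bad:
            problems.append(f"{bad} row(s) with null or negative {column}")
    return problems


def derived_features(row):
    """Compute the log of the quantity and, if list-price revenue exists, a discount rate.

    Assumes a strictly positive quantity (math.log fails otherwise).
    """
    features = {"log_quantity": math.log(row["quantity"])}
    list_revenue = row.get("revenue_at_list_price")
    if list_revenue:
        # Assumed formula: share of the list-price revenue that was discounted away.
        features["discount_rate"] = 1 - row["revenue"] / list_revenue
    return features
```

An empty list from `validate_transactions` means that the required columns are present, the history is long enough, and no null or negative revenue or quantity was found.
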
The data filter in “Filter” must also be set so that the source covers only complete periods (for example, no partial week at the beginning or end of the scope).
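
To find the date bounds to enter in the filter, the first and last transaction dates can be trimmed to whole periods. The sketch below assumes weekly periods running Monday to Sunday; both the function names and the `date` column name are hypothetical, and the filter itself is still configured in the model, not in code.

```python
from datetime import date, timedelta


def complete_week_bounds(first_day, last_day):
    """Return (start, end) spanning only full Monday-to-Sunday weeks inside the range.

    If the range contains no full week, start will be later than end.
    """
    # Move the start forward to the next Monday (weekday() is 0 on Monday).
    start = first_day + timedelta(days=(7 - first_day.weekday()) % 7)
    # Move the end back to the previous Sunday (weekday() is 6 on Sunday).
    end = last_day - timedelta(days=(last_day.weekday() + 1) % 7)
    return start, end


def keep_complete_weeks(rows):
    """Keep only the rows whose date falls inside the full-week bounds."""
    dates = [row["date"] for row in rows]
    start, end = complete_week_bounds(min(dates), max(dates))
    return [row for row in rows if start <= row["date"] <= end]
```

For example, a scope running from Wednesday 2021-01-06 to Wednesday 2023-01-04 would be trimmed to Monday 2021-01-11 through Sunday 2023-01-01.
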