Getting started with statistical models in demand planning

By Erik de Vos

In our previous blog on the value of statistical forecasting, you’ve discovered that introducing statistical models in your demand planning setup will help you kickstart your forecast automation journey. In this blog, let’s take a closer look at what it means to incorporate statistical models in demand planning. First, it is good to emphasize that statistical forecast models alone are rarely enough. If you look at the entire portfolio of the company, there will be parts that will already benefit from statistical forecasting alone, while there will also be parts that require enrichment of the forecast to cover all demand dynamics.   

The typical setup combines statistical forecast models with human enrichment. This requires a certain setup in terms of process, people, data, and tools. Let’s focus here on what the core forecasting setup would look like.




In a setup with statistical models in demand planning, the starting point will always be historical sales quantities. From there, several core activities need to be part of the forecasting setup:

  1. Ensure full sales data availability 
  2. Validate the categorization of the portfolio and the demand 
  3. Clean the sales history so only baseline-related sales remain 
  4. Organize seasonality detection and application 
  5. Select the best-fit statistical model based on a back-tested performance measure 
  6. Organize for forecast enrichment 
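
Step 5 in the list above, selecting a best-fit model on a back-tested performance measure, can be sketched as follows. This is a minimal illustration, not a production setup: the three candidate models, the mean absolute error (MAE) as performance measure, the 6-period holdout and all function names are assumptions chosen for the example.

```python
from statistics import mean

def naive(history, horizon):
    """Repeat the last observed value."""
    return [history[-1]] * horizon

def moving_average(history, horizon, window=3):
    """Repeat the mean of the last `window` observations."""
    return [mean(history[-window:])] * horizon

def seasonal_naive(history, horizon, season=12):
    """Repeat the value observed one season earlier (needs >= `season` periods)."""
    return [history[-season + (h % season)] for h in range(horizon)]

# Hypothetical candidate set; a real setup would hold more models.
MODELS = {
    "naive": naive,
    "moving_average": moving_average,
    "seasonal_naive": seasonal_naive,
}

def backtest_mae(model, sales, holdout):
    """Forecast the last `holdout` periods from the earlier ones; return the MAE."""
    train, test = sales[:-holdout], sales[-holdout:]
    forecast = model(train, holdout)
    return mean(abs(f - a) for f, a in zip(forecast, test))

def best_fit(sales, holdout=6):
    """Return the name of the model with the lowest back-tested MAE, plus all scores."""
    scores = {name: backtest_mae(m, sales, holdout) for name, m in MODELS.items()}
    return min(scores, key=scores.get), scores

# Three years of made-up monthly sales with a summer peak.
pattern = [10, 12, 15, 20, 30, 40, 45, 40, 30, 20, 15, 12]
winner, scores = best_fit(pattern * 3)
print(winner, scores)
```

On this perfectly repeating series the seasonal naive model wins the back-test. On real, cleaned sales history the ranking will differ per DFU, which is why the selection is made per DFU rather than once for the whole portfolio.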

The potential of using statistical forecasting in a demand planning process is mostly determined by the value it adds compared to the current way of working: where does best-in-class statistical forecasting already outperform the current process, or is it on par with a fraction of the effort? But to get the most out of statistical forecasting, it is best to first understand what drives demand in the business.   

Before you start with statistical models in demand planning, we encourage you to consider a few pre-analysis steps. These steps will help you understand how to set up your statistical design. Often this is related to what you already know about your portfolio, but seeing it quantified gives you the confirmation you need to move forward. We distinguish four pre-analysis activities in the creation of the statistical forecast, outlined below.


1. Business characteristics 

Statistical forecasting requires structured master data. Usually this means developing a hierarchy setup for both the product and customer angles that matches how the business forecasts demand. Often ERP hierarchies are used at the start, and equally often it turns out that these hierarchies have gaps that keep them from performing well in forecasting. This should not be surprising, as ERP hierarchies are constructed to serve the key purposes of the ERP system and not necessarily demand planning.  

Before you start with statistical forecast models, execute a hierarchy screening and validate that you are combining the right products, that you have a consistent hierarchy build-up, and that you know how to deal with complex products. Doing this will help you discover the hierarchy level at which you will probably install your statistical forecasting.  

Besides deciding on the product and customer aggregation level before setting up statistical forecasting, you also need to determine your forecast time bucket setup. Whether you use calendar month, week, day or something else, at least get your definitions straight and align them to how your business works.
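
To illustrate why time bucket definitions matter, the sketch below aggregates the same daily sales into calendar months and into ISO weeks. The bucket names, the key format and the sample quantities are assumptions for the example; your business may use fiscal periods or another week definition.

```python
from collections import defaultdict
from datetime import date

def bucket_key(day, bucket):
    """Map a date to a bucket label: calendar month or ISO 8601 week."""
    if bucket == "month":
        return f"{day.year}-{day.month:02d}"
    if bucket == "week":
        iso = day.isocalendar()  # (ISO year, ISO week number, ISO weekday)
        return f"{iso[0]}-W{iso[1]:02d}"
    raise ValueError(f"unknown bucket: {bucket}")

def aggregate(daily_sales, bucket):
    """Sum (date, quantity) pairs per time bucket."""
    totals = defaultdict(float)
    for day, qty in daily_sales:
        totals[bucket_key(day, bucket)] += qty
    return dict(totals)

# Made-up daily sales around a year boundary.
daily = [(date(2024, 12, 29), 5), (date(2024, 12, 30), 7), (date(2025, 1, 2), 3)]
print(aggregate(daily, "month"))
print(aggregate(daily, "week"))
```

The sale on 30 December 2024 counts toward December in the monthly view but toward ISO week 1 of 2025 in the weekly view, which is exactly the kind of definition to agree on up front.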


2. Categorization

The next element is categorizing your portfolio. In essence, categorization will not make you a better forecaster, but it should make you realize how diverse your portfolio is and what the implications are for forecasting.   

To explain this further, we want to introduce the concept of a Demand Forecasting Unit (DFU): the level at which a statistical forecast is made, i.e. a product, a combination of product and region, or even a combination of product, customer and region. The higher the level of aggregation, the fewer DFUs there are and, typically, the more accurate the statistical forecast for each DFU. However, you may want to select a lower level of aggregation if it better suits the planning steps in your S&OP process (for example, promotion planning or your supply, raw material, or capacity planning). A categorization matrix distinguishes between New Product Introductions (NPIs), End of Lives (EOLs), and active DFUs.   

NPIs are defined as DFUs that have been introduced in the last few months or are about to be introduced. With limited data points, statistical forecast models cannot provide a reliable future statistical forecast based on DFU data alone. In a setup with statistical forecasting, you are left with a few options:   

  1. Forecast them fully manually 
  2. Forecast them based on a set of defined parameters like distribution, shelf space, and load-in period 
  3. Forecast them using reference products on which you apply enrichment 
  4. Forecast them using NPI profiles based on historical sales patterns from other NPIs 

EOLs are defined as DFUs that have not been sold in a certain period or that have been marked for discontinuation. Forecasting is not a priority for these, although it helps to determine the service and inventory risks you may face. Forecasting should be kept simple, ideally without a lot of enrichment, and with a clear end date after which the DFU will no longer be forecast, matching the actual point in time when the DFU stops. EOLs are determined by the specific dynamics of your business and can often be based on agreements with your customers. EOLs will require some manual intervention and especially follow-up to ensure a smooth phase-out.  
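
The NPI, EOL and active definitions above can be turned into a simple rule. The sketch below is one possible reading: the 6-month thresholds, the `discontinued` flag and the choice to treat a DFU without any sales as an upcoming NPI are assumptions to adapt to your business.

```python
def classify_dfu(monthly_sales, discontinued=False, npi_months=6, eol_months=6):
    """Classify a DFU as 'NPI', 'EOL' or 'ACTIVE'.

    monthly_sales: quantities ordered from oldest to newest month;
    zeros before the first sale are treated as 'not yet launched'.
    """
    if discontinued:
        return "EOL"  # marked for discontinuation
    sold = [i for i, qty in enumerate(monthly_sales) if qty > 0]
    if not sold:
        return "NPI"  # assumption: no sales yet means about to be introduced
    months_since_first_sale = len(monthly_sales) - sold[0]
    months_without_sales = len(monthly_sales) - 1 - sold[-1]
    if months_without_sales >= eol_months:
        return "EOL"  # not sold in the last `eol_months` months
    if months_since_first_sale <= npi_months:
        return "NPI"  # too little history for a reliable statistical forecast
    return "ACTIVE"

print(classify_dfu([0] * 20 + [5, 8, 9, 12]))   # launched 4 months ago
print(classify_dfu([10] * 12 + [0] * 6))        # no sales for 6 months
print(classify_dfu([10] * 24))                  # mature DFU
```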


[Figure: the use of different types of statistical models in demand planning]


For the group of mature DFUs, a category is determined that combines two dimensions: the Pareto of total value (margin, revenue, or sometimes volume if there is no price or value information) with ABC categorization, and demand volatility with XYZ categorization. On the ABC axis, the smallest set of DFUs that generates 80% of the total value gets an A, the DFUs covering the next 15% get a B, and the DFUs covering the remaining 5% get a C. On the XYZ axis, the X category products are relatively stable and therefore the easiest to forecast using statistical forecast models, while the Z category contains the highly volatile, often intermittent items.   

The ABC-XYZ categorization helps you decide which DFUs to spend the most time on and which DFUs can ideally be left untouched by statistical forecasting. This is not meant to be the holy grail, but for anyone who wants to understand where statistical models in demand planning can help and where you need to organize differently, this is a good place to start. Identifying the part of your portfolio where forecasting may not be the solution can already trigger thinking about what to do next with supply, inventory, and service setups.
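
A minimal sketch of the ABC-XYZ categorization described above. The 80% and 95% cumulative value cut-offs follow the text; measuring volatility with the coefficient of variation and the 0.5 and 1.0 thresholds are assumptions, since other volatility measures are equally possible.

```python
from statistics import mean, pstdev

def abc_classes(value_by_dfu, a_share=0.80, b_share=0.95):
    """Rank DFUs by value; A = smallest set reaching 80% of value, B = next 15%, C = rest."""
    total = sum(value_by_dfu.values())
    classes, cumulative = {}, 0.0
    ranked = sorted(value_by_dfu.items(), key=lambda kv: kv[1], reverse=True)
    for dfu, value in ranked:
        share_before = cumulative / total  # value share covered by higher-ranked DFUs
        if share_before < a_share:
            classes[dfu] = "A"
        elif share_before < b_share:
            classes[dfu] = "B"
        else:
            classes[dfu] = "C"
        cumulative += value
    return classes

def xyz_class(sales, x_max=0.5, y_max=1.0):
    """Classify volatility by coefficient of variation (assumed thresholds)."""
    avg = mean(sales)
    if avg == 0:
        return "Z"  # no demand at all: treat as hardest to forecast
    cv = pstdev(sales) / avg
    return "X" if cv <= x_max else "Y" if cv <= y_max else "Z"

# Made-up revenue per DFU and made-up monthly sales histories.
revenue = {"P1": 500, "P2": 300, "P3": 120, "P4": 50, "P5": 30}
print(abc_classes(revenue))
print(xyz_class([100, 105, 95, 100]), xyz_class([0, 0, 40, 0, 0, 0, 60, 0]))
```

Combining both labels per DFU (for example "AX" or "CZ") gives the cells of the matrix, which you can then map to a forecasting approach.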


3. Outlier cleaning 

The goal of statistical forecasting is to find historical patterns to build future forecasts. Extreme outliers caused by one-time events (such as the COVID-19 pandemic, the Suez Canal blockage, or a shift between products due to supply issues) should not be included in the baseline statistical forecast. Therefore, an element of outlier and event cleaning should be present in any setup with statistical models in demand planning. Without cleaning of the historical sales, all sorts of effects will remain in your data, be picked up by the statistical models, and affect the baseline forecast. Think of promotions, out-of-stocks, one-off marketing events, one-off market impacts, or forward buying due to price changes: all these effects have happened in the past, but there is no guarantee that they will happen again in the future.    


[Figure: statistical models: outlier detection]


If you want to discover what statistical forecast models can do for you, it is important to understand the outliers in your sales data. Are they related to buying behavior, events, or things outside of your control? Do you have data that you could use to facilitate some automated cleaning of outliers, such as out-of-stock information? Getting an understanding of outliers will help you determine how to organize yourself for cleaning in an efficient way that helps statistical forecasting to drive quality. Remember, as with any system that uses data, “crap in = crap out”.    

To conclude this topic, just a few thoughts on cleaning sales data. A first step is to remove relevant outliers that you can associate with known drivers. For example, promotions are usually known, so if you do not want the statistical forecast models to predict them, you should clean for them. In an ideal world, you would then clean out the exact effect of the promotion, but in the real world, this data is often not available. So, consider ground rules that use the event data to clean for specific products, customers, and time periods where there might have been an event effect.   
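
Such a ground rule for known events could look like the sketch below. It assumes you have a flag per period marking where an event such as a promotion took place, and it replaces each flagged period with the median of nearby unflagged periods. Both the replacement rule and the window of 3 periods on each side are assumptions, not the only way to clean.

```python
from statistics import median

def clean_known_events(sales, event_flags, window=3):
    """Replace flagged periods with the median of up to `window` unflagged periods on each side."""
    cleaned = list(sales)
    for i, flagged in enumerate(event_flags):
        if not flagged:
            continue
        # Nearest unflagged periods before and after the event period.
        before = [sales[j] for j in range(i - 1, -1, -1) if not event_flags[j]][:window]
        after = [sales[j] for j in range(i + 1, len(sales)) if not event_flags[j]][:window]
        neighbours = before + after
        if neighbours:  # leave the value as-is if every period is flagged
            cleaned[i] = median(neighbours)
    return cleaned

# Made-up weekly sales with a known promotion in the fourth week.
sales = [100, 102, 98, 250, 101, 99, 103]
promo = [False, False, False, True, False, False, False]
print(clean_known_events(sales, promo))
```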

The second level of cleaning is related to unknown outliers. This is where automated outlier cleaning comes in. The intent is not to clean heavily, but only to take out the truly exceptional sales that are very different from the normal sales pattern. Usually this means cleaning within a bandwidth, and that bandwidth can be defined in multiple ways. Just make sure that the approach you use is transparent.
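
As one example of a transparent bandwidth, the sketch below uses the median plus or minus k times the scaled median absolute deviation (MAD) and caps values outside that band at its edges. The choice of median and MAD, the factor k = 3 and the capping rule are assumptions; a band based on mean and standard deviation or on percentiles would follow the same structure.

```python
from statistics import median

def clean_to_bandwidth(sales, k=3.0):
    """Cap values outside median +/- k * scaled MAD; return cleaned series and the band."""
    centre = median(sales)
    mad = median(abs(x - centre) for x in sales)
    if mad == 0:
        # At least half of the values equal the median: no usable spread, skip cleaning.
        return list(sales), (centre, centre)
    spread = 1.4826 * mad  # scales MAD to a standard deviation for normally distributed data
    lower, upper = centre - k * spread, centre + k * spread
    cleaned = [min(max(x, lower), upper) for x in sales]
    return cleaned, (lower, upper)

# Made-up sales with one exceptional period.
history = [100, 104, 96, 101, 99, 400, 102, 98]
cleaned, (lower, upper) = clean_to_bandwidth(history)
print(cleaned, lower, upper)
```

Only the value of 400 is capped here, while the normal variation between 96 and 104 stays untouched, in line with the intent not to clean heavily.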

4. Seasonality & trend detection    

The final piece of insight to gain when starting with statistical models in demand planning is the importance of seasonality and trend detection to your business. One of the core expectations of statistical forecasting is to capture seasonality patterns and identify growth trends as part of the baseline forecast. However, it is easy to see trends where there are none, or to mistake event-driven sales for seasonality.   

Let’s look at seasonality first. Seasonal patterns can be identified on the outlier-cleaned actuals. This is usually done at the DFU level and a higher level of aggregation, often product group (versus product) or region (versus country). Seasonality is often detected monthly but can also be relevant weekly. Usually, a company already has a good sense of when sales are higher or lower due to seasonality. However, testing this statistically can provide surprising insights. Seasonality may not be statistically evident at the aggregate level where it is expected, or seasonality may appear where it was not previously expected. Digging into the details can then reveal an underlying cause for this seasonality.   

In essence, seasonality must represent the sales pattern associated with the characteristics of the product. Ice cream sells more in the summer and hot soup sells more in the winter. In several product categories, event planning may cause sales that look like seasonality. For example, if deodorant is promoted in March, this does not mean that March sales would have been just as high without the promotion. The sales would probably still increase due to the warmer weather, but in a smoother way.
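
A simple way to test for seasonality statistically is sketched below: seasonal indices show the shape of the pattern, and the autocorrelation at the seasonal lag is compared with the approximate 95% band for a series without structure (1.96 divided by the square root of the number of observations). The function names and the monthly season of 12 are assumptions, and the check should be run on outlier-cleaned history without a strong trend, because a trend also inflates the autocorrelation.

```python
from math import sqrt
from statistics import mean

def seasonal_indices(sales, season=12):
    """Average sales per position in the season, relative to the overall average.

    Assumes the overall average is positive; position 0 is the first observed period.
    """
    overall = mean(sales)
    return [mean(sales[p::season]) / overall for p in range(season)]

def autocorrelation(sales, lag):
    """Sample autocorrelation of the series at the given lag."""
    avg = mean(sales)
    dev = [x - avg for x in sales]
    denom = sum(d * d for d in dev)
    if denom == 0:
        return 0.0  # constant series: no pattern to correlate
    return sum(dev[t] * dev[t - lag] for t in range(lag, len(sales))) / denom

def looks_seasonal(sales, season=12):
    """True if the seasonal-lag autocorrelation exceeds the approximate 95% band."""
    return autocorrelation(sales, season) > 1.96 / sqrt(len(sales))

# Three years of made-up monthly sales with a summer peak.
pattern = [10, 12, 15, 20, 30, 40, 45, 40, 30, 20, 15, 12]
monthly = pattern * 3
print(looks_seasonal(monthly), [round(i, 2) for i in seasonal_indices(monthly)])
```

Running the same check at DFU level and at product group level makes it visible where seasonality is statistically evident and where it is not.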


[Figure: the effect of demand drivers on statistical models in demand planning]


Second, there is the growth trend element. For each DFU, trend detection shows whether it is trending up or down in a statistically significant way. If so, it provides details such as the direction (increase or decrease) and the slope. For example, if a company has an ambitious volume growth target for the next few years, a thorough trend analysis will provide interesting insights. It can determine which part of the portfolio is already showing an increasing trend, both to validate the company’s gut feeling and to identify future growth champions.     

The first instinct when looking at sales data is often to see trends, but this can be wishful thinking or driven by the growth target that needs to be met. Doing the quantitative analysis can show where the trend is and how to best set up the trend adoption in your models.  
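
One way to do this quantitative analysis is sketched below. It uses the Mann-Kendall test (with a normal approximation and tie correction) to decide whether a trend is statistically significant, and the Theil-Sen estimator for the slope per period. These two techniques, the 5% significance level and the function names are assumptions for the example; a linear regression with a significance test on the slope is a common alternative.

```python
from collections import Counter
from math import sqrt
from statistics import NormalDist, median

def mann_kendall(sales):
    """Return (z, two-sided p-value) of the Mann-Kendall trend test."""
    n = len(sales)
    # S counts increasing pairs minus decreasing pairs over all i < j.
    s = sum((sales[j] > sales[i]) - (sales[j] < sales[i])
            for i in range(n - 1) for j in range(i + 1, n))
    ties = sum(t * (t - 1) * (2 * t + 5) for t in Counter(sales).values())
    variance = (n * (n - 1) * (2 * n + 5) - ties) / 18
    if variance == 0:
        return 0.0, 1.0  # constant series: no trend
    z = (s - 1) / sqrt(variance) if s > 0 else (s + 1) / sqrt(variance) if s < 0 else 0.0
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p

def theil_sen_slope(sales):
    """Median of all pairwise slopes: change in sales per period."""
    n = len(sales)
    return median((sales[j] - sales[i]) / (j - i)
                  for i in range(n - 1) for j in range(i + 1, n))

def detect_trend(sales, alpha=0.05):
    """Report direction, slope and p-value; direction is 'none' if not significant."""
    z, p = mann_kendall(sales)
    direction = "none" if p >= alpha else "up" if z > 0 else "down"
    return {"direction": direction, "slope": theil_sen_slope(sales), "p_value": p}

# Made-up histories: steady growth versus alternating sales without growth.
print(detect_trend([100 + 2 * t for t in range(24)]))
print(detect_trend([100, 104] * 12))
```

As with seasonality, run this on cleaned history, since an uncleaned promotion near the end of the series can easily look like growth.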


Ready to kickstart your automation journey with statistical models in demand planning?

A high-quality statistical forecasting approach will help identify past demand patterns and apply them to future forecasts more accurately and efficiently than manual methods, freeing up planners to enrich the forecast where necessary. But most importantly: it will kickstart your forecast automation journey. 

In our e-book, we guide you on your journey to getting the basics of statistical forecasting right. You will discover: 

  • How to get started with statistical forecasting 
  • Practical steps on how to get the most out of your statistical setup & effectively tackle challenges 
  • How to turn your statistical forecast setup into AI-based forecasting 

In doing so, we will provide you with the stepping stone to optimal “smart-touch” forecasting: a balance of machine and human efforts to deliver the best possible forecast in this ever-changing environment. Get your copy of the e-book here.

