Demand Forecasting using Python
A Project done for TinyCo Retail Store
Introduction
Supply chain performance has almost become a measure of how well an entity will perform internally (in meeting its own targets) and externally (in contending with the business environment and its competitors).
According to a survey by Deloitte from 2014, 79% of companies with high-performing supply chains achieve revenue growth superior to the average within their industries. Conversely, just 8% of businesses with less capable supply chains report above-average growth. These figures show how closely an enterprise's performance is tied to its supply chain.
A supply chain involves many components, which may include supplier contracts, transport logistics, storage capacities and product sales. Long-term contracts with suppliers often involve forecasts, so that the suppliers know when to supply a company with products and in what quantities. Globalisation has recently disrupted businesses worldwide. Companies now source raw materials and products from different suppliers internationally. These orders may involve very long lead times, given that sea freight is the cheapest mode of transportation. To maximise sales, a business needs to know exactly when to trigger these international orders so that it is not left with empty warehouses when the time comes. Stock orders also need to be optimised, since storage space is finite.
Now the question is, how do we know when to make the appropriate triggers and what strategy do we use to make forecasts? The answer to this is Supply Chain Planning. Supply chain planning (SCP) is the component of supply chain management (SCM) that develops a strategy for balancing supply and demand, predicting future requirements and monitoring fulfilment. SCP covers a range of supply chain processes such as manufacturing, production scheduling, predictive modelling and sustainability.

Future requirements are predicted using a method known as Demand Forecasting. Demand Forecasting involves building a predictive model to forecast sales in a given period of time, usually a quarter or a financial year. It takes into account trends and seasonality.
From a supply chain perspective, this sales forecast is then used to plan for the requirements that would drive those sales: raw materials in the case of manufacturing, and stock in the case of retail. This project deals with the latter.
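To make the idea of a forecast that accounts for trend and seasonality concrete, here is a minimal sketch in plain Python. It is not the model built in this project; it is a simple baseline, often called a seasonal naive forecast with drift, that repeats the value from the same point in the previous cycle and adds the average cycle-over-cycle change. The function name, the `period` and `horizon` parameters, and the sales figures are all made up for illustration.

```python
from statistics import mean


def seasonal_naive_with_drift(history, period, horizon):
    """Forecast `horizon` steps ahead from `history`.

    Seasonality: reuse the most recent observation from the same season.
    Trend: add the average change between observations one full cycle apart.
    """
    if len(history) < 2 * period:
        raise ValueError("need at least two full seasonal cycles")

    # Average change per cycle (e.g. year over year when period=12 on monthly data)
    drift = mean(history[t] - history[t - period]
                 for t in range(period, len(history)))

    n = len(history)
    forecasts = []
    for h in range(1, horizon + 1):
        # How many full cycles back we must go to land inside the history
        cycles_back = (h - 1) // period + 1
        same_season = n + h - 1 - cycles_back * period
        forecasts.append(history[same_season] + cycles_back * drift)
    return forecasts


# Synthetic quarterly unit sales (assumed values): upward trend plus a
# repeating four-step seasonal pattern.
SEASONAL_EFFECT = [10, -5, 0, -5]
history = [100 + 2 * t + SEASONAL_EFFECT[t % 4] for t in range(8)]

print(seasonal_naive_with_drift(history, period=4, horizon=6))
```

A baseline like this is mainly useful as a yardstick: a more elaborate model is only worth keeping if it beats the baseline on held-out data.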
Defining the problem
TinyCo is a retail business which sells 9 different SKUs in 5 different locations/stores. The Supply Chain Management department would like to know the demand for these SKUs in all of the company's stores, so as to design a Supply Chain Plan for the next financial year (the financial year runs from August to September of the next year).
Upon requesting some data dating back 3 years for the business, the manager for one of the stores sent the following email:
Welcome to TinyCO! I have attached the data you requested for in the spreadsheet. It has 6 tabs: SKU master and separate tabs for each of the following stores: 312, 323, 415, 521 and 632. The store tabs contain all of the sales information for the past 2 years (that is all I could find) for that respective store. It lists the database or transaction ID, date, the SKU, quantity sold that day and total revenue from those sales. The SKU master tab contains some information on each SKU, such as the name, weight, cube, unit cost, etc. This was really hard to get, so I hope it is all you need.

Objectives
Primary Objective
- Develop a Demand Forecast to assist the business with Supply Planning.
Secondary Objectives
- Which month has the highest average sales in dollars for each store?
- Which SKU is the most popular in each store?
- Are the sales (in units) between the different SKUs from China Imports correlated for each store?
- TinyCo is thinking of running a one-day promo each week. Which day of the week makes the most sense for each store?
- How do Unit Sales change over time?
- Are the sales across all stores correlated?
Skills Demonstrated
- Data wrangling/cleaning
- Exploratory data analysis
- Data visualisation
- Model building (selection, train/test validation, etc.)
- Storytelling
Presentation of Project
The project was carried out entirely using the Python programming language, within a Jupyter Notebook environment.
For presentation purposes, the project was split into three notebooks, one for each of the following areas:
- Part 1: Data Cleaning
- Part 2: Exploratory Data Analysis
- Part 3: Model Building and Validation
To download the source code and raw data for the entire project, please visit my GitHub page.