Kaggle Winning Solution Series:
Retail sales forecasting
YAN XU
HOUSTON MACHINE LEARNING
JUNE 19, 2021
Retail sales forecasting
Rossmann Store Sales
(https://www.kaggle.com/c/rossmann-store-sales)
Corporación Favorita Grocery Sales Forecasting
(https://www.kaggle.com/c/favorita-grocery-sales-forecasting)
Walmart M5 Forecasting – Accuracy
(https://www.kaggle.com/c/m5-forecasting-accuracy)
1115 stores
predicting their daily sales for up to six weeks in advance
Data – Historical sales
Data - Stores
Metric
Root Mean Square Percentage Error (RMSPE)
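A minimal sketch of RMSPE in plain Python. The function name `rmspe` and the sample numbers are illustrative, not from the slides. Rows with zero actual sales are skipped here, because the percentage error is undefined for them; to my understanding the competition also ignores such rows when scoring.

```python
import math

def rmspe(actual, predicted):
    """Root Mean Square Percentage Error over rows with non-zero actual sales."""
    # Skip rows where actual sales are zero: the relative error would divide by zero.
    pairs = [(a, p) for a, p in zip(actual, predicted) if a != 0]
    if not pairs:
        raise ValueError("need at least one non-zero actual value")
    squared = [((a - p) / a) ** 2 for a, p in pairs]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical daily sales for one store; the third day (store closed) is ignored.
score = rmspe([100, 200, 0], [110, 180, 5])
print(round(score, 4))
```

Because the error is relative, missing a small store by 10 units costs more than missing a large store by 10 units.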
1st place solution with feature extraction
Recent data
Temporal information
Current trends
Store information
Weather
https://www.kaggle.com/c/rossmann-store-sales/discussion/18024
Feature extraction – Recent data
Stats
• Median
• Mean
• Harmonic mean
• Standard deviation
• Skewness
• Kurtosis
• 10%/90% percentiles
Time period
• Previous month
• Last quarter
• Last half year
• Last year
• Last 2 years
Keys
• Store
• Day of week
• Promotions
• Holidays
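The recent-data features combine a grouping key, a look-back period and a statistic. The sketch below shows one way to compute a few of them with the standard library only. The row layout, the column names (`store`, `dow`, `sales`, `date`) and the function `recent_stats` are assumptions for illustration; skewness and kurtosis are left out to keep it short.

```python
import statistics
from collections import defaultdict
from datetime import date, timedelta

def recent_stats(rows, keys, end, days):
    """Summarise sales per key over the `days` days before `end` (exclusive)."""
    start = end - timedelta(days=days)
    groups = defaultdict(list)
    for r in rows:
        if start <= r["date"] < end:
            groups[tuple(r[k] for k in keys)].append(r["sales"])
    out = {}
    for key, values in groups.items():
        if len(values) < 2:  # stdev and quantiles need at least two points
            continue
        deciles = statistics.quantiles(values, n=10)  # 9 cut points
        out[key] = {
            "median": statistics.median(values),
            "mean": statistics.mean(values),
            "harmonic_mean": statistics.harmonic_mean(values),
            "std": statistics.stdev(values),
            "p10": deciles[0],
            "p90": deciles[-1],
        }
    return out

# Hypothetical rows: store 1 on three recent Mondays (dow=0) plus one old row.
rows = [
    {"store": 1, "dow": 0, "date": date(2015, 7, 6), "sales": 100},
    {"store": 1, "dow": 0, "date": date(2015, 7, 13), "sales": 200},
    {"store": 1, "dow": 0, "date": date(2015, 7, 20), "sales": 300},
    {"store": 1, "dow": 0, "date": date(2014, 1, 6), "sales": 999},
]
stats = recent_stats(rows, keys=("store", "dow"), end=date(2015, 8, 1), days=90)
print(stats[(1, 0)]["median"], stats[(1, 0)]["mean"])
```

Calling the function once per (keys, period) pair gives the full grid of features.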
Feature extraction – Temporal information
Day counters (how each record relates to events or cycles)
◦ The number of days before, after or within the event
◦ Events
◦ Promotion cycle
◦ Summer holidays
◦ Store refurbishment
◦ Start of competition and start of secondary promotion cycle
Day of week, day of month, day/week/month of year
Number of holidays during the current week, last week and next week
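A day counter can be sketched as the distance from each record to the nearest past and next occurrence of an event. The function name `days_since_and_until` and the example dates are hypothetical; the slides do not say how the winning solution encoded missing events, so `None` is used here.

```python
from datetime import date

def days_since_and_until(day, event_dates):
    """Return (days since the latest event on or before `day`,
    days until the next event on or after `day`); None if there is no such event."""
    past = [d for d in event_dates if d <= day]
    future = [d for d in event_dates if d >= day]
    since = (day - max(past)).days if past else None
    until = (min(future) - day).days if future else None
    return since, until

# Hypothetical promotion start dates for one store.
promo_starts = [date(2015, 7, 1), date(2015, 7, 15)]
print(days_since_and_until(date(2015, 7, 5), promo_starts))
```

A record that falls on an event day gets 0 for both counters, which marks "within the event" for single-day events.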
Feature extraction – Current trends
Last quarter and last year
Store specific linear model (ridge regression) on
◦ The day number, to extrapolate into six weeks in test
◦ Day of week
◦ Promotions
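The store-specific trend model can be illustrated with a one-feature ridge regression on the day number, written in closed form. This is a simplified sketch: the slides also list day of week and promotions as inputs, which would add further columns, and the penalty `alpha` and all names here are assumptions.

```python
def fit_ridge_trend(day_numbers, sales, alpha=1.0):
    """One-feature ridge regression of sales on day number (intercept not penalised)."""
    n = len(day_numbers)
    x_mean = sum(day_numbers) / n
    y_mean = sum(sales) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(day_numbers, sales))
    sxx = sum((x - x_mean) ** 2 for x in day_numbers)
    slope = sxy / (sxx + alpha)  # the penalty alpha shrinks the slope towards zero
    intercept = y_mean - slope * x_mean
    return slope, intercept

def extrapolate(slope, intercept, day_numbers):
    """Project the fitted trend onto future day numbers."""
    return [intercept + slope * d for d in day_numbers]

# Hypothetical store with a steady upward trend over ten days.
days = list(range(10))
sales = [100 + 2 * d for d in days]
slope, intercept = fit_ridge_trend(days, sales, alpha=1.0)
print(extrapolate(slope, intercept, [10, 11, 12]))
```

The extrapolated values for the test period then serve as a feature for the main model rather than as the final forecast.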
Feature extraction – Other features
Store features
◦ Assortment
◦ Store type
◦ Aggregates by store
◦ Average sales per customer
◦ Ratio of sales during promotions/holidays/Saturdays
◦ Proportion of school holidays and of days that the store is open
State specific weather
◦ Max temperature
◦ Precipitation (mm)
Model: Gradient Boosted Trees
Model training
•XGBoost Models on random selections of features
•Handpicked models
•500 random models and validate on each pair of ensemble models
•Take features from all the selected models and combine into one
•Separate models on the months May to Sep
•Month ahead models
•Log-transformed the target variable and applied a multiplier factor (0.985) to the predictions
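The log transform and the multiplier can be sketched as a pair of helper functions. Whether the original solution used log(x) or log(1 + x) is not stated in the slides; `log1p` is an assumption here, as are the function names.

```python
import math

def to_log(sales):
    """Transform the target before training (log(1 + x) is assumed)."""
    return [math.log1p(s) for s in sales]

def from_log(log_predictions, multiplier=0.985):
    """Map model output back to sales and apply the correction factor."""
    return [math.expm1(p) * multiplier for p in log_predictions]

# Round trip on hypothetical sales values, with and without the multiplier.
print(from_log(to_log([100.0, 250.0]), multiplier=1.0))
print(from_log(to_log([100.0, 250.0])))
```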
3rd place solution with entity embeddings
https://arxiv.org/pdf/1604.06737.pdf
EE improves different methods
Data
•unit_sales by date
•store_nbr
•item_nbr
•onpromotion - whether that item_nbr was on promotion for a specified date and
store_nbr.
•Store metadata, including city, state, type, and cluster.
•Item metadata, including family, class, and perishable
•The count of sales transactions for each date/store_nbr combination.
•Daily oil price
•Holidays and Events, with metadata
Metric
Normalized Weighted Root Mean Squared Logarithmic Error
$$\mathrm{NWRMSLE} = \sqrt{\frac{\sum_{i=1}^{n} w_i \left(\ln(\hat{y}_i + 1) - \ln(y_i + 1)\right)^2}{\sum_{i=1}^{n} w_i}}$$
The weights w_i can be found in the items.csv file (see the Data page). Perishable items are given a weight of 1.25; all other items are given a weight of 1.00.
RMSLE penalizes underestimating the actual value more heavily than overestimating it by the same absolute amount.
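The metric and its asymmetry can be checked with a few lines of Python. The function name `nwrmsle` and the sample values are illustrative only.

```python
import math

def nwrmsle(actual, predicted, weights):
    """Normalized Weighted Root Mean Squared Logarithmic Error."""
    weighted_sq = sum(
        w * (math.log1p(p) - math.log1p(a)) ** 2
        for a, p, w in zip(actual, predicted, weights)
    )
    return math.sqrt(weighted_sq / sum(weights))

# Same absolute miss of 50 units on an actual value of 100.
under = nwrmsle([100], [50], [1.0])
over = nwrmsle([100], [150], [1.0])
print(under > over)

# A perishable item (weight 1.25) next to a non-perishable one (weight 1.00).
print(nwrmsle([10, 20], [12, 18], [1.25, 1.00]))
```

The log compresses large values, so a forecast 50 units too low sits further from the actual value on the log scale than one 50 units too high.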
1st place solution with ensemble
LSTM model
4th place: Encoder-decoder with dilated causal convolutions
5th place: More ensembles
Data
Metric
Weighted Root Mean Squared Scaled Error (WRMSSE)
Naïve one step forecast
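A simplified sketch of the scaled error: the forecast's mean squared error is divided by the in-sample mean squared error of the naive one-step forecast (yesterday's value). The function names are assumptions, and details of the official metric, such as how leading zeros in a series are handled and how the weights are derived, are not reproduced here.

```python
import math

def rmsse(train, actual, forecast):
    """Root Mean Squared Scaled Error for one series."""
    # Scale: mean squared error of the naive one-step forecast on the training data.
    naive_mse = sum(
        (train[t] - train[t - 1]) ** 2 for t in range(1, len(train))
    ) / (len(train) - 1)
    mse = sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)
    return math.sqrt(mse / naive_mse)

def wrmsse(series, weights):
    """Weighted sum of per-series RMSSE; `series` holds (train, actual, forecast) tuples."""
    return sum(w * rmsse(*s) for s, w in zip(series, weights))

# Hypothetical series: history, true future values, forecast.
one = ([1, 2, 3, 4], [5, 6], [5, 8])
print(rmsse(*one))
print(wrmsse([one, one], [0.5, 0.5]))
```

Scaling by the naive error makes series with very different sales volumes comparable before they are weighted and summed.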
1st place solution
•Single LGBM model, with objective = tweedie
•Divide the data into groups of similar time series and model each group
(e.g., by store, by store and category, by store and department)
•Select the final model using mean(CV scores, public score) and std(CV scores,
public score)
•Multiple validation sets
•Ensemble of non-recursive and recursive models
Recursive:
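The difference between recursive and non-recursive forecasting can be sketched with a toy predictor. Everything here is hypothetical: `mean_of_last_7` stands in for a trained model, and the equal-weight average at the end is only one possible way to ensemble the two.

```python
def recursive_forecast(history, horizon, predict_next):
    """Forecast one step at a time, feeding each prediction back in as history."""
    extended = list(history)
    for _ in range(horizon):
        extended.append(predict_next(extended))
    return extended[len(history):]

def non_recursive_forecast(history, horizon, predict_at):
    """Forecast every step directly from the observed history only."""
    return [predict_at(history, step) for step in range(1, horizon + 1)]

def mean_of_last_7(values):
    """Toy stand-in for a model: the mean of the last seven values."""
    return sum(values[-7:]) / 7

history = [1, 2, 3, 4, 5, 6, 7]
rec = recursive_forecast(history, 3, mean_of_last_7)
direct = non_recursive_forecast(history, 3, lambda h, step: mean_of_last_7(h))
ensemble = [(r + d) / 2 for r, d in zip(rec, direct)]
print(rec, direct, ensemble)
```

The recursive version can use very recent lags but accumulates its own errors; the non-recursive version avoids that at the cost of relying on older information.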
1st place solution
Store/Item Price
◦ Max
◦ Min
◦ Std
◦ Mean
◦ Price_norm (price divided by max price)
◦ Price_nunique
◦ Item_nunique that has the same price
◦ Price momentum by month/year
• Calendar features
• Day
• Week
• Month
• Year index
• Day of week
• Weekend
• Event
• State
Lag features
◦ 28 day shift
◦ for 14 days
Lag rolling features
◦ 28 day shift
◦ Time window of [7,14, 30, 60, 180]
◦ Rolling mean/std
Rolling with shift [1,7,14]
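Lag and lag-rolling features can be sketched on a plain list of daily sales. The helper names are hypothetical; the second function mirrors the common pattern of shifting a series and then taking a rolling mean, so that no value newer than `shift` days is used.

```python
def lag_feature(series, lag):
    """Value observed `lag` days earlier; None where it is not available."""
    return [series[t - lag] if t >= lag else None for t in range(len(series))]

def rolling_mean_after_shift(series, shift, window):
    """Mean of the `window` values ending `shift` days before each day."""
    out = []
    for t in range(len(series)):
        end = t - shift + 1   # exclusive end of the window
        start = end - window
        out.append(sum(series[start:end]) / window if start >= 0 else None)
    return out

# Hypothetical 40 days of sales for one item in one store.
sales = list(range(40))
lag_28 = lag_feature(sales, 28)
roll_28_7 = rolling_mean_after_shift(sales, shift=28, window=7)
print(lag_28[39], roll_28_7[39])
```

With a 28-day shift, every feature for the 28-day forecast horizon can be computed from data known at prediction time.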
• Mean encoding features (mean and std)
• ['state_id']
• ['store_id']
• ['cat_id']
• ['dept_id']
• ['state_id', 'cat_id']
• ['state_id', 'dept_id']
• ['store_id', 'cat_id']
• ['store_id', 'dept_id']
• ['item_id']
• ['item_id', 'state_id']
• ['item_id', 'store_id']
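Mean encoding replaces a key combination with the mean and standard deviation of the target within that group. A minimal sketch follows; the row layout and the function name `mean_encode` are assumptions, and the population standard deviation is used so that single-row groups do not fail. In practice the statistics would be computed on the training period only to avoid leaking the target.

```python
import statistics
from collections import defaultdict

def mean_encode(rows, keys, target="sales"):
    """Return one (group mean, group std) pair per row for the given key columns."""
    groups = defaultdict(list)
    for r in rows:
        groups[tuple(r[k] for k in keys)].append(r[target])
    stats = {
        k: (statistics.mean(v), statistics.pstdev(v)) for k, v in groups.items()
    }
    return [stats[tuple(r[k] for k in keys)] for r in rows]

# Hypothetical rows keyed by state and category.
rows = [
    {"state_id": "CA", "cat_id": "FOODS", "sales": 2},
    {"state_id": "CA", "cat_id": "FOODS", "sales": 4},
    {"state_id": "TX", "cat_id": "FOODS", "sales": 10},
]
encoded = mean_encode(rows, ["state_id", "cat_id"])
print(encoded)
```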
2nd place – Aligning top and bottom
Bottom level: lgb model for each store without lag/rolling
Top 5 levels: [ALL, STATE, STORE, CATEGORY, DEPARTMENT]
[Figure: panels for the ALL, STATE and STORE levels]
2nd place – Aligning top and bottom
Magic multipliers
Overshot fit (multiplier > 1)
Level 1 : all_id
“Build a set of bottom level models with multipliers ranging from 0.9 to 1.23, the optimum is somewhere
in the range 0.93-0.95 and built an ensemble with 0.9, 0.93, 0.95, 0.97 and 0.99”
2nd place – Aligning top and bottom
Alignment at level 1
N-beats prediction at level 1
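One way to picture alignment at level 1 is to pick, among candidate multipliers, the one whose aggregated bottom-level forecast is closest to the top-level forecast. This is a simplified illustration, not the 2nd place team's code: the multiplier is applied here as a plain scaling of the predictions, and all names and numbers are hypothetical.

```python
def pick_multiplier(bottom_forecasts, top_forecast, multipliers):
    """Choose the multiplier whose scaled bottom-level total best matches the top level."""
    def gap(m):
        # Sum the bottom-level series day by day, then scale by the candidate multiplier.
        total_by_day = [m * sum(day) for day in zip(*bottom_forecasts)]
        return sum((t - b) ** 2 for t, b in zip(top_forecast, total_by_day))
    return min(multipliers, key=gap)

# Two hypothetical bottom-level series over two days, and a top-level forecast.
bottom = [[10, 10], [20, 20]]
top = [28.5, 28.5]
best = pick_multiplier(bottom, top, [0.9, 0.93, 0.95, 0.97, 0.99])
print(best)
```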
Conclusion
•Data is the key: promotions, competitions, holidays in addition to sales data.
•Metric needs to be customized for different problems
•Statistical features can be helpful (lag, rolling by different combination of keys)
•GBM is still the most popular model for prediction
•Separate models may be needed for different stores/locations/duration of days
•Low level prediction can be improved by aligning high level prediction
Win a Kaggle competition?
• Ideas and Inspiration
• Best practice

 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 

Kaggle winning solutions: Retail Sales Forecasting

  • 1. Kaggle Winning Solution Series: Retail sales forecasting YAN XU HOUSTON MACHINE LEARNING JUNE 19, 2021
  • 2. Retail sales forecasting Rossmann Store Sales (https://www.kaggle.com/c/rossmann-store-sales) Corporación Favorita Grocery Sales Forecasting (https://www.kaggle.com/c/favorita-grocery-sales-forecasting) Walmart M5 Forecasting – Accuracy (https://www.kaggle.com/c/m5-forecasting-accuracy)
  • 3. Rossmann Store Sales: predicting the daily sales of 1115 stores for up to six weeks in advance
  • 6. Metric: Root Mean Square Percentage Error (RMSPE)
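A minimal sketch of RMSPE, for readers who want to score their own forecasts. The function name `rmspe` is hypothetical, and skipping rows whose actual sales are zero is an assumption made here so that the percentage error stays defined.

```python
import numpy as np

def rmspe(y_true, y_pred):
    """Root mean square percentage error over rows with non-zero actual sales."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    mask = y_true != 0  # assumption: zero-sales rows are left out of the score
    pct_err = (y_true[mask] - y_pred[mask]) / y_true[mask]
    return float(np.sqrt(np.mean(pct_err ** 2)))

score = rmspe([100, 200], [90, 220])  # both rows are off by 10%
```

Because the error is relative, missing a small store by 10 units costs more than missing a large store by the same amount.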
  • 7. 1st place solution with feature extraction: recent data, temporal information, current trends, store information, weather. https://www.kaggle.com/c/rossmann-store-sales/discussion/18024
  • 8. Feature extraction – Recent data. Keys: store, day of week, promotions, holidays. Time period: previous month, last quarter, last half year, last year, last 2 years. Stats: median, mean, harmonic mean, standard deviation, skewness, kurtosis, 10%/90% percentiles.
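The keys / time period / stats combination above could be computed roughly as follows with pandas. This is only a sketch: the column names (`store`, `date`, `sales`), the helper `recent_stats`, and the toy data are assumptions, and only a subset of the listed statistics is shown.

```python
import pandas as pd

def recent_stats(sales, keys, end_date, days):
    """Summarize sales in the `days` days before `end_date`, grouped by `keys`."""
    start = end_date - pd.Timedelta(days=days)
    window = sales[(sales["date"] >= start) & (sales["date"] < end_date)]
    grouped = window.groupby(keys)["sales"]
    stats = grouped.agg(["median", "mean", "std", "skew"])
    stats["p10"] = grouped.quantile(0.10)
    stats["p90"] = grouped.quantile(0.90)
    # Suffix the statistic columns with the window length, then expose the keys.
    return stats.add_suffix(f"_{days}d").reset_index()

# Toy data (hypothetical values): one store, ten days, sales 1..10.
demo = pd.DataFrame({
    "store": 1,
    "date": pd.date_range("2015-01-01", periods=10, freq="D"),
    "sales": range(1, 11),
})
feats = recent_stats(demo, ["store"], pd.Timestamp("2015-01-11"), days=5)
```

Calling the helper once per key set and window length, then merging the results back onto the training rows by key, gives one feature block per combination.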
  • 9. Feature extraction – Temporal information Day counters (how each record relates to events or cycles) ◦ The number of days before, after or within the event ◦ Events ◦ Promotion cycle ◦ Summer holidays ◦ Store refurbishment ◦ Start of competition and start of secondary promotion cycle Day of week, day of month, day/week/month of year Number of holidays during the current week, last week and next week
  • 10. Feature extraction – Current trends Last quarter and last year Store specific linear model (ridge regression) on ◦ The day number, to extrapolate into six weeks in test ◦ Day of week ◦ Promotions
  • 11. Feature extraction – Other features Store features ◦ Assortment ◦ Store type ◦ Aggregates by store ◦ Average sales per customer ◦ Ratio of sales during promotions/holidays/Saturdays ◦ Proportion school holidays and days that the store is open State specific weather ◦ Max temperature ◦ Mm precipitation
  • 13. Model training • XGBoost models on random selections of features • Handpicked models • 500 random models, validated on each pair of ensemble models • Features from all the selected models combined into one • Separate models for the months May to Sep • Month-ahead models • Log-transformed the target variable and applied a multiplier factor (0.985)
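The last bullet above (log-transformed target plus a 0.985 multiplier) might look like this in code. The slide does not say which log variant was used or where the multiplier is applied; `log1p`/`expm1` and scaling the back-transformed predictions are assumptions, and both function names are hypothetical.

```python
import numpy as np

def to_log_target(sales):
    """Training target: log(1 + sales) (assumed variant of the log transform)."""
    return np.log1p(np.asarray(sales, dtype=float))

def from_log_prediction(log_pred, multiplier=0.985):
    """Undo the log transform, then scale predictions by the multiplier (assumed placement)."""
    return np.expm1(np.asarray(log_pred, dtype=float)) * multiplier

# Round trip on a single value: 100 units become 98.5 after the multiplier.
adjusted = from_log_prediction(to_log_target(100.0))
```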
  • 15. 3rd place solution with entity embeddings https://arxiv.org/pdf/1604.06737.pdf
  • 18. Data • unit_sales by date • store_nbr • item_nbr • onpromotion: whether that item_nbr was on promotion for a specified date and store_nbr • Store metadata, including city, state, type, and cluster • Item metadata, including family, class, and perishable • The count of sales transactions for each date/store_nbr combination • Daily oil price • Holidays and events, with metadata
  • 19. Metric: Normalized Weighted Root Mean Squared Logarithmic Error (NWRMSLE) $\mathrm{NWRMSLE} = \sqrt{\frac{\sum_{i=1}^{n} w_i \left(\ln(\hat{y}_i + 1) - \ln(y_i + 1)\right)^2}{\sum_{i=1}^{n} w_i}}$, where $\hat{y}_i$ is the predicted and $y_i$ the actual unit_sales. The weights $w_i$ can be found in the items.csv file (see the Data page): perishable items are given a weight of 1.25 and all other items a weight of 1.00. RMSLE penalizes underestimation of the actual value more heavily than overestimation.
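The metric above can be sketched in a few lines of NumPy. The function name `nwrmsle` is hypothetical, and clipping negative values to zero before taking the logarithm is an assumption made here to keep the expression defined.

```python
import numpy as np

def nwrmsle(y_true, y_pred, weights):
    """Normalized weighted root mean squared logarithmic error."""
    y_true = np.clip(np.asarray(y_true, dtype=float), 0, None)  # assumption: clip negatives
    y_pred = np.clip(np.asarray(y_pred, dtype=float), 0, None)
    weights = np.asarray(weights, dtype=float)
    log_diff = np.log1p(y_pred) - np.log1p(y_true)
    return float(np.sqrt(np.sum(weights * log_diff ** 2) / np.sum(weights)))

# One perishable item (weight 1.25) that is mispredicted, one regular item that is exact.
score = nwrmsle(y_true=[0.0, 3.0], y_pred=[np.e - 1, 3.0], weights=[1.25, 1.00])
```

In this example the log error on the perishable item is 1, so the score is sqrt(1.25 / 2.25).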
  • 20. 1st place solution with ensemble LSTM model
  • 22. 4th place: Encoder-decoder with dilated causal convolutions
  • 23. 5th place: More ensembles
  • 25. Data
  • 26. Data
  • 27. Metric: Weighted Root Mean Squared Scaled Error (WRMSSE); each series' RMSSE is scaled by the in-sample error of a naïve one-step forecast
  • 28. 1st place solution • Single LGBM model, with objective = tweedie • Divide the data into groups with similar time series and model each group (e.g. by store, by store and category, by store and department) • Select the final model using mean(cvs, public score) and std(cvs, public score) • Multiple validation sets • Ensemble of non-recursive and recursive models
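A simplified sketch of the scaled error behind this metric: the forecast's mean squared error is divided by the in-sample mean squared error of a naïve one-step forecast (predicting each day with the previous day's value). The function names are hypothetical, the competition's weighting scheme is reduced to a plain weight vector, and details such as trimming leading zeros from the training history are left out.

```python
import numpy as np

def rmsse(train, actual, forecast):
    """Root mean squared scaled error for a single series."""
    train = np.asarray(train, dtype=float)
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    naive_mse = np.mean(np.diff(train) ** 2)  # error of the naive one-step forecast
    forecast_mse = np.mean((actual - forecast) ** 2)
    return float(np.sqrt(forecast_mse / naive_mse))

def weighted_rmsse(rmsse_values, weights):
    """Weighted average of per-series RMSSE values (weights are normalized here)."""
    weights = np.asarray(weights, dtype=float)
    return float(np.sum(weights / weights.sum() * np.asarray(rmsse_values, dtype=float)))

series_score = rmsse(train=[1, 2, 3, 4], actual=[5, 6], forecast=[4, 8])
```

A value below 1 means the forecast beats the naïve one-step benchmark on that series.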
  • 29. 1st place solution Store/Item Price ◦ Max ◦ Min ◦ Std ◦ Mean ◦ Price_norm (price divided by price max) ◦ Price_nunique ◦ Item_nunique (number of items that have the same price) ◦ Price momentum by month/year • Calendar features • Day • Week • Month • Year index • Day of week • Weekend • Event • State
  • 30. Lag features ◦ 28-day shift ◦ for 14 days • Lag rolling features ◦ 28-day shift ◦ Time window of [7, 14, 30, 60, 180] ◦ Rolling mean/std • Rolling with shift [1, 7, 14] • Mean encoding features (mean and std) • ['state_id'] • ['store_id'] • ['cat_id'] • ['dept_id'] • ['state_id', 'cat_id'] • ['state_id', 'dept_id'] • ['store_id', 'cat_id'] • ['store_id', 'dept_id'] • ['item_id'] • ['item_id', 'state_id'] • ['item_id', 'store_id']
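The lag and lag-rolling features above can be sketched with pandas as follows. The column names (`id`, `d`, `sales`) and the helper `add_lag_features` are assumptions, and reading "28 day shift, for 14 days" as lags 28 through 41 is one interpretation of the slide. The demo uses small parameters so the result is easy to check by hand.

```python
import pandas as pd

def add_lag_features(df, shift=28, n_lags=14, windows=(7, 14, 30, 60, 180)):
    """Add per-series lag and shifted rolling mean/std columns."""
    df = df.sort_values(["id", "d"]).reset_index(drop=True)
    grouped = df.groupby("id")["sales"]
    # Plain lags: shift, shift + 1, ..., shift + n_lags - 1 days back.
    for lag in range(shift, shift + n_lags):
        df[f"lag_{lag}"] = grouped.shift(lag)
    # Rolling statistics over values that are at least `shift` days old.
    for w in windows:
        df[f"rmean_{shift}_{w}"] = grouped.transform(
            lambda s: s.shift(shift).rolling(w).mean())
        df[f"rstd_{shift}_{w}"] = grouped.transform(
            lambda s: s.shift(shift).rolling(w).std())
    return df

# Toy data (hypothetical): two series of ten days each.
demo = pd.DataFrame({
    "id": ["a"] * 10 + ["b"] * 10,
    "d": list(range(10)) * 2,
    "sales": list(range(10)) + list(range(100, 110)),
})
feats = add_lag_features(demo, shift=2, n_lags=2, windows=(3,))
```

Shifting by at least the forecast horizon (28 days in this competition) means every feature for a target day uses only sales that would already be known when the forecast is made.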
  • 31. 2nd place – Aligning top and bottom. Bottom level: lgb model for each store without lag/rolling. Top 5 levels: [ALL, STATE, STORE, CATEGORY, DEPARTMENT]
  • 32. 2nd place – Aligning top and bottom Magic multipliers Overshot fit (multiplier > 1) Level 1 : all_id “Build a set of bottom level models with multipliers ranging from 0.9 to 1.23, the optimum is somewhere in the range 0.93-0.95 and built an ensemble with 0.9, 0.93, 0.95, 0.97 and 0.99”
  • 33. 2nd place – Aligning top and bottom Alignment at level 1 N-beats prediction at level 1
  • 34. Conclusion • Data is the key: promotions, competitions, and holidays in addition to sales data • Metrics need to be customized for different problems • Statistical features can be helpful (lag, rolling by different combinations of keys) • GBM is still the most popular model for prediction • Separate models may be needed for different stores/locations/durations of days • Low-level predictions can be improved by aligning them with high-level predictions
  • 35. Win a Kaggle competition? • Ideas and Inspiration • Best practice