This document provides an overview of data-driven attribution modeling using Shapley values and Markov chains. It discusses defining attribution models in BigQuery, comparing different model outputs, and getting an overview of model decision parameters. The speakers then describe how to implement Shapley values and Markov chain attribution models, including available packages for each. Lastly, it recommends starting simply, testing multiple models and time frames, and gradually expanding the attribution methodology.
2. What To Expect from this Session
1. How to view rule-based vs data-driven attribution
2. How to implement Shapley & Markov chain attribution
3. How to compare different model outputs
4. Get an overview of model decision parameters
Slides & Checklist: via Search Seekers
3. Your Speakers: Steffi and Chris
Digital Marketer
Tech nerd
Climber
199. 2008 2013 2020
Dad of 2
● Mountain enthusiast, most likely biking or skiing
● Passionate about finding new insights from data
4. Bergzeit: Combining Love for Mountains & Data
Online Store for Mountain Gear
Aiming for +100M Revenue in 2020
14 Countries & 4 Languages
Content. Commerce. Experience
7 data experts in Team A&O
5. Recap: The Attribution Challenge
#1 #2 #3
Model Parameters:
- Channel definition
- Time frame
- Path length
- Multi & Repurchase
850€
6. We’re at the beginning of the journey
Steffi
Data Analyst
Kira
Performance
Chris
Support
10. Get to Know your Data: Find Your Ideal Time Lag
Start with exploring your data!
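A minimal sketch of such an exploration, assuming session rows with a user ID, a date, and an optional transaction ID (the same fields as the session-level data shown later in this deck). The sample rows and the function name `time_lags_in_days` are hypothetical and only illustrate the idea: measure the days between a user's first session and first purchase, then look at the distribution.

```python
from datetime import date
from statistics import median

# Hypothetical sample rows: (userId, session date, transactionId or None)
sessions = [
    ("123", date(2020, 9, 1), None),
    ("123", date(2020, 9, 3), None),
    ("124", date(2020, 9, 1), None),
    ("124", date(2020, 9, 5), "002"),
    ("125", date(2020, 9, 2), "003"),
]

def time_lags_in_days(rows):
    """Days from a user's first session to their first purchase (converting users only)."""
    first_seen = {}
    first_purchase = {}
    for user, day, transaction in sorted(rows, key=lambda r: r[1]):
        first_seen.setdefault(user, day)
        if transaction is not None:
            first_purchase.setdefault(user, day)
    return {u: (first_purchase[u] - first_seen[u]).days for u in first_purchase}

lags = time_lags_in_days(sessions)
print(lags)                    # time lag per converting user
print(median(lags.values()))   # one candidate summary for choosing a lookback window
```

On real data, a histogram or percentiles of these lags would likely be more informative than a single median when choosing the lookback period.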
11. Three Outputs of a Custom Attribution Model
Reporting:
- Channel to ad unit
- Long lookback period
- Tool independent
Media Bidding:
- Servable as API
- Data enhancement
- Integrations
Sales Planning:
- Data enhancement
- Revenue planning
- Budget forecasting
12. What Failed For Us: Multi-Channel Funnel UI/API
Data can’t be stored or is awkward to query
13. What Failed For Us: Google Attribution
Data didn’t make much sense, and there was no API to call
14. Data Driven Attribution is Like the “Tre Cime”
It might not be worth it, but you don’t know beforehand!
15. Why Customize? Get Closer to Your “Truth”
Rule-Based:
- Simple logic
- Easier to implement
vs
Customized Data Driven:
- Includes ALL customer journeys
- Distribute by actual contribution
Best Practice: Test many models before deciding!
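To make the "simple logic" of rule-based models concrete, here is a minimal sketch of two common rules, last click and linear. The journeys, revenues, and function names are hypothetical examples, not data or code from the talk.

```python
from collections import defaultdict

# Hypothetical converting journeys: (ordered channel path, revenue)
journeys = [
    (["PPC Generic", "SEO Brand"], 100.0),
    (["SEO Magazin", "PPC Brand", "SEO Brand"], 60.0),
]

def last_click(paths):
    """Give all revenue to the final channel before the purchase."""
    credit = defaultdict(float)
    for path, revenue in paths:
        credit[path[-1]] += revenue
    return dict(credit)

def linear(paths):
    """Split revenue evenly across every touchpoint in the path."""
    credit = defaultdict(float)
    for path, revenue in paths:
        for channel in path:
            credit[channel] += revenue / len(path)
    return dict(credit)

last_click_credit = last_click(journeys)
linear_credit = linear(journeys)
print(last_click_credit)
print(linear_credit)
```

Both rules hand out the same total revenue and differ only in how it is split, which is why comparing several models side by side is useful before deciding.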
19. Google Fractribution Package for Shapley Values
Python Cloud Function (or Docker image)
Soon to be open-sourced on GitHub
Optimized for GA360 data
Structure: main.py plus SQL statements
Not included: recency, repurchase
Mapping to revenue via transactionId
Bias towards “through-way” channels
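For intuition about what a Shapley-value attribution computes, here is a minimal textbook-style sketch. It is not the Fractribution implementation. It assumes a hypothetical input of conversion counts per set of channels touched, and defines the value of a coalition as the conversions from journeys that used only channels in that coalition. All names and numbers are illustrative.

```python
from itertools import combinations
from math import factorial

# Hypothetical input: conversions per set of channels touched on the journey.
conversions = {
    frozenset({"SEO"}): 10,
    frozenset({"PPC"}): 20,
    frozenset({"SEO", "PPC"}): 30,
}

def coalition_value(coalition, conversions):
    """v(S): conversions from journeys that only used channels in S (an assumption)."""
    return sum(n for channels, n in conversions.items() if channels <= coalition)

def shapley_values(conversions):
    """Weighted average marginal contribution of each channel over all coalitions."""
    channels = sorted(set().union(*conversions))
    n = len(channels)
    values = {}
    for channel in channels:
        others = [c for c in channels if c != channel]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                without = frozenset(subset)
                gain = (coalition_value(without | {channel}, conversions)
                        - coalition_value(without, conversions))
                total += weight * gain
        values[channel] = total
    return values

shapley = shapley_values(conversions)
print(shapley)
```

The credited values add up to the total number of conversions. The loop enumerates every coalition, so this naive version is only practical for a small number of channels.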
20. Google Fractribution: Example BigQuery Output
SQL reference for pivoting conversions to channel revenue:
https://towardsdatascience.com/how-to-unpivot-multiple-columns-into-tidy-pairs-with-sql-and-bigquery-d9d0e74ce675
21. Attribution Modeling with Markov Chains
Based on sequential probabilities
Leverages removal effect (see C3)
Discounts “through-way” channels
Detailed reference:
medium.com/@mortenhegewald/marketing-channel-attribution-using-markov-chains-101-in-python-78fb181ebf1e
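The removal effect can be sketched with a first-order Markov chain in a few lines. This is a simplified illustration under stated assumptions, not the code of any particular package: the journeys are hypothetical, a removed channel is treated as a dead end, and the conversion probability is approximated with a fixed number of iteration sweeps.

```python
from collections import defaultdict

# Hypothetical journeys: (ordered channel path, converted?)
journeys = [
    (["A", "B"], True),
    (["A"], False),
    (["B"], True),
    (["A", "B"], False),
]

START, CONV, NULL = "(start)", "(conversion)", "(null)"

def transition_probabilities(journeys):
    """Estimate P(next state | current state) from observed step counts."""
    counts = defaultdict(lambda: defaultdict(int))
    for path, converted in journeys:
        states = [START, *path, CONV if converted else NULL]
        for src, dst in zip(states, states[1:]):
            counts[src][dst] += 1
    return {src: {dst: n / sum(dsts.values()) for dst, n in dsts.items()}
            for src, dsts in counts.items()}

def conversion_probability(probs, removed=None, sweeps=1000):
    """Approximate P(start reaches conversion); a removed channel never converts."""
    reach = defaultdict(float)
    reach[CONV] = 1.0
    for _ in range(sweeps):
        for src, dsts in probs.items():
            if src != removed:
                reach[src] = sum(p * reach[dst] for dst, p in dsts.items()
                                 if dst != removed)
    return reach[START]

def markov_attribution(journeys):
    """Share total conversions in proportion to each channel's removal effect."""
    probs = transition_probabilities(journeys)
    base = conversion_probability(probs)
    channels = [s for s in probs if s != START]
    effects = {c: 1 - conversion_probability(probs, removed=c) / base
               for c in channels}
    total_effect = sum(effects.values())
    total_conversions = sum(converted for _, converted in journeys)
    return {c: total_conversions * e / total_effect for c, e in effects.items()}

attribution = markov_attribution(journeys)
print(attribution)
```

In this toy data, removing B blocks every conversion while removing A blocks only some, so B receives the larger share.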
22. Channel Attribution Package for Markov Chains
Available in Python and R
Outputs to cloud storage bucket
Manual data preprocessing
Not included: hit-level, recency, repurchase
23. Model Comparison to Last Click: Our Results
Note: There is no Golden Rule for Model choice
24. What Data do we Need? Session-Level Data
userId | sessionId | date       | channel     | transactionId | revenue
123    | 101       | 2020-09-01 | PPC Generic | -             | -
123    | 102       | 2020-09-03 | PPC Brand   | -             | -
124    | 201       | 2020-09-01 | SEO Magazin | -             | -
124    | 202       | 2020-09-05 | SEO Brand   | 002           | 87.45
Data could come from any tracking source
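Session-level rows like the example above need to be turned into per-user channel paths before any attribution model can run. A minimal sketch follows, using the four example rows. The function name `build_journeys` and the choice to start a fresh path after each transaction are assumptions for illustration, and other repurchase handling is equally possible.

```python
from collections import defaultdict

# Rows from the example table:
# (userId, sessionId, date, channel, transactionId, revenue)
rows = [
    ("123", "101", "2020-09-01", "PPC Generic", None, None),
    ("123", "102", "2020-09-03", "PPC Brand", None, None),
    ("124", "201", "2020-09-01", "SEO Magazin", None, None),
    ("124", "202", "2020-09-05", "SEO Brand", "002", 87.45),
]

def build_journeys(rows):
    """Group sessions per user in date order; close a journey at each transaction."""
    by_user = defaultdict(list)
    # ISO dates sort correctly as strings; sessionId breaks ties within a day.
    for row in sorted(rows, key=lambda r: (r[0], r[2], r[1])):
        by_user[row[0]].append(row)
    journeys = []
    for user, sessions in by_user.items():
        path = []
        for _, _, _, channel, transaction_id, revenue in sessions:
            path.append(channel)
            if transaction_id is not None:
                journeys.append({"userId": user, "path": path,
                                 "transactionId": transaction_id, "revenue": revenue})
                path = []  # assumption: a new journey starts after a purchase
        if path:  # sessions that have not (yet) led to a purchase
            journeys.append({"userId": user, "path": path,
                             "transactionId": None, "revenue": 0.0})
    return journeys

journeys = build_journeys(rows)
for journey in journeys:
    print(journey)
```

Keeping the `transactionId` on each journey is what later allows attributed conversions to be mapped back to revenue.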
25. Three Options to Get Your Data into BigQuery
GA 360 Export:
- Best Google support
- Free cost tier
- 360 needed
GA IV Export:
- Launched recently
- New event model
- GA IV needed
Data Connector:
- Works with Non-360
- No GA IV needed
- Extra cost
26. Our Decision Space for Attribution Model Choice
Model definition, Model evaluation, Delivery
Set channel definition
Define time lag
Set max path length
Define multiple models to compare
Compare different time frames
Evaluate model differences
Find best fit for your channels
Build delivery pipelines
Define retraining frequency
Define historical window