1. The document describes Adform's efforts to perform cross-device tracking by linking cookies and devices that belong to the same user.
2. Adform analyzes over 1.5 billion events per day from deterministic sources and bid responses to build a cross-device graph and cluster cookies into user profiles.
3. A gradient boosted decision tree classifier is used to classify cookie pairs as belonging to the same user or different users based on features like location, device type, and behavior over time.
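15. Cookie association by pair-wise classification
Cookies A, B, C, D with ground truth: A and B belong to one user, C and D to another.
Create cookie pairs as the classification data set:
Positive: A B, C D
Negative: A C, A D, B C, B D
New cookie E: pairs to classify are A E, B E, C E, D E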
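The pair construction on this slide can be sketched as follows. This is a minimal illustration, assuming the ground truth is available as a mapping from user to cookies; the names `build_pairs`, `candidate_pairs`, `user_1` and `user_2` are hypothetical and do not come from the slides.

```python
from itertools import combinations


def build_pairs(user_cookies):
    """Split all cookie pairs into positives (same user) and negatives (different users)."""
    # Invert the mapping so each cookie points at its owning user.
    owner = {cookie: user for user, cookies in user_cookies.items() for cookie in cookies}
    positive, negative = [], []
    for a, b in combinations(sorted(owner), 2):
        if owner[a] == owner[b]:
            positive.append((a, b))
        else:
            negative.append((a, b))
    return positive, negative


def candidate_pairs(new_cookie, known_cookies):
    """Pair an unlabeled cookie with every known cookie; these pairs go to the classifier."""
    return [(known, new_cookie) for known in sorted(known_cookies)]


# Hypothetical ground truth matching the slide: A and B share a user, C and D share a user.
ground_truth = {"user_1": ["A", "B"], "user_2": ["C", "D"]}
positive, negative = build_pairs(ground_truth)
candidates = candidate_pairs("E", ["A", "B", "C", "D"])
```

Enumerating all pairs grows quadratically with the number of cookies, which is presumably why the deck restricts pairing to smaller groups before classification.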
17. Crafting Features
for each cookie pair (A, B)…
Observations per cookie:
ID: cookie_id
Device: device_type_id, os_id, browser_id, browser_language_codes, mobile_app_id
Location: ip_v4, country_id, region_id, city_id, zip_code_id
Time: log_time
Aggregation: pageviews over 0 to 24 hours and over 0 to 31 days; observed values as sets (set a, set b)
Similarity: Jaccard Index, Cosine Similarity, Correlation Similarity, Hellinger Distance, Max Overlap
Result: Features
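Three of the similarity measures named on this slide can be sketched in plain Python. This is an illustration under assumptions: the slides do not say which measure is applied to which field, so the sets of IP addresses and the pageview histograms below are hypothetical inputs, and Correlation Similarity and Max Overlap are left out because their exact definitions are not given.

```python
import math


def jaccard_index(set_a, set_b):
    """Size of the intersection divided by size of the union (0.0 if both sets are empty)."""
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def cosine_similarity(x, y):
    """Cosine of the angle between two equal-length count vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return dot / norm if norm else 0.0


def hellinger_distance(x, y):
    """Hellinger distance between two histograms after normalizing each to sum to 1."""
    total_x, total_y = sum(x), sum(y)
    if not total_x or not total_y:
        return 1.0  # convention chosen here for an empty histogram (an assumption)
    p = [a / total_x for a in x]
    q = [b / total_y for b in y]
    squared = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(squared) / math.sqrt(2)


# Hypothetical observations for cookies A and B.
ips_a = {"10.0.0.1", "10.0.0.2", "10.0.0.3"}
ips_b = {"10.0.0.2", "10.0.0.3", "10.0.0.4"}
views_a = [0, 3, 5, 1]  # pageviews per time bucket (shortened for the example)
views_b = [1, 2, 6, 0]

features = [
    jaccard_index(ips_a, ips_b),
    cosine_similarity(views_a, views_b),
    hellinger_distance(views_a, views_b),
]
```

Each measure maps a pair of per-cookie aggregates to a single number, so stacking several of them over several fields yields a fixed-length feature vector per cookie pair.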
18. Crafting Features
for each cookie pair (A, B)…
Observations per cookie:
ID: cookie_id
Device: device_type_id, os_id, browser_id, browser_language_codes, mobile_app_id
Location: ip_v4, country_id, region_id, city_id, zip_code_id
Time: log_time
Features: Vectorial, Sets, Numerical; PC 1+2; 25 dim. feature vector
Classifier: Gradient Boosted Decision Trees (http://arogozhnikov.github.io/2016/06/24/gradient_boosting_explained.html)
Output: a score for the pair A B, e.g. 0.78
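To illustrate how gradient boosting turns a feature vector into a pair score, here is a deliberately simplified sketch in plain Python. It boosts depth-1 trees (stumps) on the log-loss gradient and is a stand-in for teaching purposes, not the model used in the talk; the two-dimensional feature vectors, labels, and all function names are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_stump(X, residuals):
    """Return (feature, threshold, left_value, right_value) with least squared error, or None."""
    best = None
    for j in range(len(X[0])):
        values = sorted({row[j] for row in X})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            err = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if best is None or err < best[0]:
                best = (err, j, t, lm, rm)
    return None if best is None else best[1:]


def fit_boosted_stumps(X, y, n_rounds=50, learning_rate=0.3):
    """Each round fits a stump to the negative gradient of the log loss (y - p)."""
    scores = [0.0] * len(X)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - sigmoid(s) for yi, s in zip(y, scores)]
        stump = fit_stump(X, residuals)
        if stump is None:  # no feature has two distinct values to split on
            break
        j, t, lv, rv = stump
        lv, rv = learning_rate * lv, learning_rate * rv  # shrink the leaf values
        stumps.append((j, t, lv, rv))
        scores = [s + (lv if row[j] <= t else rv) for row, s in zip(X, scores)]
    return stumps


def predict_proba(stumps, row):
    """Probability that the cookie pair described by `row` belongs to one user."""
    return sigmoid(sum(lv if row[j] <= t else rv for j, t, lv, rv in stumps))


# Hypothetical training data: [set similarity, activity similarity] per cookie pair.
X = [[0.9, 0.8], [0.7, 0.9], [0.8, 0.6], [0.1, 0.2], [0.0, 0.4], [0.2, 0.1]]
y = [1, 1, 1, 0, 0, 0]
model = fit_boosted_stumps(X, y)
same_user_score = predict_proba(model, [0.85, 0.7])
different_user_score = predict_proba(model, [0.05, 0.3])
```

A production library such as XGBoost adds deeper trees, regularization, and second-order updates; the sketch only shows the additive structure that produces a score like the 0.78 on the slide.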
20. Classification, Clustering
Example graph with cookies A, B, C, D, E, F, G
Pre-Clustering (Connected Components Algorithm): Full Graph -> Components -> Sub-Graph
Classification (XGBoost)
Clustering (Connected Components Algorithm), with Pruning
Clusters -> Users
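The clustering step can be sketched with a small union-find implementation of connected components. Pruning is interpreted here as dropping edges whose classifier score falls below a threshold; that interpretation, the threshold of 0.5, and the example scores are assumptions rather than details from the slides.

```python
def connected_components(nodes, edges):
    """Group nodes into connected components using union-find with path halving."""
    parent = {node: node for node in nodes}

    def find(node):
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = {}
    for node in nodes:
        groups.setdefault(find(node), []).append(node)
    return sorted(sorted(group) for group in groups.values())


def cluster_cookies(nodes, scored_pairs, threshold=0.5):
    """Prune low-scoring edges, then treat each remaining component as one user."""
    kept = [(a, b) for a, b, score in scored_pairs if score >= threshold]
    return connected_components(nodes, kept)


# Hypothetical classifier scores for cookie pairs.
cookies = ["A", "B", "C", "D", "E", "F", "G"]
scored_pairs = [
    ("A", "B", 0.90),
    ("B", "C", 0.80),
    ("C", "D", 0.20),  # pruned: below threshold
    ("D", "E", 0.95),
    ("F", "G", 0.40),  # pruned: below threshold
]
users = cluster_cookies(cookies, scored_pairs)
```

The same `connected_components` helper could serve the pre-clustering step as well, where the edges would come from shared observations instead of classifier scores.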
29. More of it…
Timeline:
2017-05-01 to 2017-05-08: 7 days, Germany, 1.3B events (10%)
2017-06-25 to 2017-07-24: 30 days, Germany, 56B events
Counts shown: 300M, 300M, 175M, 138M, 210M
30. More of it…
Timeline:
2017-05-01 to 2017-05-08: 7 days, Germany, 1.3B events (10%)
2017-06-25 to 2017-07-24: 30 days, Germany, 56B events
Model parameters
Predicting: Pre-Processing -> Pre-Clustering -> Predict Cross-Device IDs -> Cross-Device Cookies Ready (XD Data)
Durations shown: 30 min., 2 hours
35. References
[1] Tran et al., “Classification and Learning-to-rank Approaches for Cross-Device Matching”, CIKM Cup 2016
[2] Dasgupta et al., “Overcoming Browser Cookie Churn with Clustering”, CIKM Cup 2016
[3] Blondel et al., “Fast unfolding of communities in large networks”, 2008
[4] Malloy et al., “Internet Device Graphs”, KDD 2017