Data Science Behind Display Ads in Digital Marketing
2. Data Science behind Display Ads in Digital Marketing
Kushal Wadhwani
Senior Data Scientist
3. We Help Marketers Increase Digital Share of Business
• $30M funding
• Offices: Singapore (South East Asia), Bangalore (India), Dubai (UAE), Dallas (USA)
5. Use Case: Bring back a prospective user
1) User visits the HDFC website and browses for a personal loan
2) Drops off without submitting a lead
3) Visits our publisher network
4) Vizury shows an ad with personalized banners and quotes
5) User clicks the banner
6) Returns to the HDFC website
6. Some of the Channels Powered by Vizury
• Programmatic
• Mobile Push
• Browser Push
• Facebook / Instagram
9. Parameters to Optimize
1. What to bid
• Depends on the probability that this user clicks
• Depends on the probability of a click in this ad slot
bidValue ∝ P(click | ad slot, user)
ctr (click-through rate) = 100 * P(click | ad slot, user)
2. What to show
• Products visited by the user
• Products and messages suggested by the client
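The bid relation above can be sketched in a few lines; `bid_value` and `base_bid` are illustrative names for this sketch, not Vizury's actual bidding code:

```python
# Sketch of bidValue ∝ P(click | ad slot, user).
# base_bid is a hypothetical scaling constant chosen by the bidder.

def bid_value(p_click, base_bid=1.0):
    """Scale a base bid by the predicted click probability."""
    return base_bid * p_click

def ctr_percent(p_click):
    """CTR as a percentage: 100 * P(click | ad slot, user)."""
    return 100.0 * p_click

print(bid_value(0.02, base_bid=50.0))  # a 2% predicted CTR scales the base bid
print(ctr_percent(0.02))
```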
12. User variables and Ad slot variables
User variables
1) Time spent on the website
2) Products visited
3) Number of impressions shown
4) Number of clicks
Ad slot variables
1) Size of the banner
2) URL of the ad slot
13. Problem formulation
• Classification problem
• 50–100 variables
• Both numerical and categorical variables
• Massive amount of data to train

Historical data (ad slot variables and user level variables, with the observed click flag):

Id | Categorical variable 1 | Categorical variable 2 | Numerical variable 1 | Numerical variable 2 | ... | Click flag
1  | xyz                    | abc                    | 1                    | 0                    | ... | 0
2  | -                      | -                      | -                    | -                    | ... | 1
3  | -                      | -                      | -                    | -                    | ... | 0

New bid request (click flag to be predicted):

-  | xyz                    | abc                    | ?                    | ?                    | ... | ?
15. Logistic Regression
Pros:
• Handles all linear interactions between variables
• There are established, scalable algorithms for training
• Handles high-cardinality categorical variables
Cons:
• Assumes that variables are linearly related to the log odds ratio
• Does not handle non-linear interactions well
ln[p/(1-p)] = α + WᵀX
• p is the probability that the event Y occurs, p(Y=1)
• p/(1-p) is the "odds ratio"
• ln[p/(1-p)] is the log odds ratio, or "logit"
p = 1/[1 + exp(-α - WᵀX)]
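A minimal sketch of the model above, fit on synthetic data generated from the same formula (the data and coefficients are assumptions for illustration, not the data described in the talk):

```python
# Logistic regression sketch with scikit-learn.
# True model: logit = alpha + w^T x with alpha = -3.0, w = (1.5, 0, 0).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 5000
X = rng.normal(size=(n, 3))              # numerical user / ad-slot features
logit = -3.0 + 1.5 * X[:, 0]             # ln[p/(1-p)] = alpha + w^T x
p = 1.0 / (1.0 + np.exp(-logit))         # p = 1 / (1 + exp(-alpha - w^T x))
y = rng.binomial(1, p)                   # click flag

model = LogisticRegression().fit(X, y)
p_click = model.predict_proba(X[:1])[0, 1]   # predicted P(click | ad slot, user)
print(model.intercept_[0], model.coef_[0])   # should be near -3.0 and (1.5, 0, 0)
```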
16. Decision Tree Based Models
Pros:
• Handles non-linear correlation of input variables with the output variable
• Handles non-linear interactions
• Models are intuitive, easy to understand and explain
Cons:
• Challenges in handling high-cardinality categorical variables
Examples: Random Forest, XGBoost
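A small sketch of the non-linearity point: on a target that is a purely non-linear function of the input, a tree ensemble fits easily while a linear model cannot. `GradientBoostingClassifier` stands in here for XGBoost, and the data is synthetic, an assumption for illustration:

```python
# Tree ensembles vs. a linear model on a non-linear target.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
X = rng.uniform(-2.0, 2.0, size=(4000, 1))
y = (X[:, 0] ** 2 > 1.0).astype(int)     # click iff |x| > 1: non-linear in x

tree_acc = GradientBoostingClassifier(random_state=0).fit(X, y).score(X, y)
lin_acc = LogisticRegression().fit(X, y).score(X, y)
print(tree_acc, lin_acc)                 # trees fit the thresholds; logistic cannot
```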
17. Neural Networks
Pros:
• Handles non-linear correlation of input variables with the output variable
• Handles non-linear interactions of variables
• Handles high-cardinality categorical variables
• Works well for large data sets
Cons:
• Models are not readable
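For completeness, a small network learns the same non-linear target; scikit-learn's `MLPClassifier` is used here as an assumed stand-in for a production network, on synthetic data:

```python
# A small neural network on a non-linear target.
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(2)
X = rng.uniform(-2.0, 2.0, size=(4000, 1))
y = (X[:, 0] ** 2 > 1.0).astype(int)     # same non-linear click rule as above

mlp = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
acc = mlp.fit(X, y).score(X, y)
print(acc)
```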
18. Variable Insights and Triage
1. Visualize variables
• Plot distributions
• Variable vs. CTR: visually inspect the nature of the correlation
• Cardinality of categorical variables
2. Decide how to preprocess each variable
3. Evaluate each variable against the ML techniques
19. Variable Insights: Numerical Variables
[Plots: skewed distributions of var1 and var2; variable vs. CTR showing non-linear correlation]
20. Handling Skew and Non-linearity

Model                      | Non-linear correlation | Skewed distribution
Logistic regression        | N                      | N
Decision tree based models | Y                      | Y
Neural networks            | Y                      | Y

• In general it is better to preprocess variables with skew
• Log transformation: newvalue = log(oldvalue)
• Bucketization
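Both preprocessing options above can be sketched in a few lines; the skewed variable here is synthetic (a hypothetical time-on-site feature), not data from the talk:

```python
# Log transformation and bucketization of a skewed variable.
import numpy as np
import pandas as pd

rng = np.random.default_rng(3)
time_on_site = rng.lognormal(mean=3.0, sigma=1.0, size=1000)  # heavily right-skewed

# Log transformation: newvalue = log(oldvalue) removes most of the skew
log_time = np.log(time_on_site)

# Bucketization: equal-frequency buckets via pandas qcut (codes 0..4)
buckets = pd.qcut(time_on_site, q=5, labels=False)

print(pd.Series(time_on_site).skew(), pd.Series(log_time).skew())
```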
21. Handling Skew and Non-linearity: Log Transformation
[Before/after plots: distribution and correlation with CTR]
22. Handling Skew and Non-linearity: Bucketization
[Plots: bucketized var1; distribution within buckets]
23. Variable Insights: Interaction of Variables

Model                      | Non-linear interaction
Logistic regression        | N
Decision tree based models | Y
Neural networks            | Y

[Plot: var1 vs. var2, with circle size representing CTR]
25. Categorical Variables
Neural networks and logistic regression don't handle categorical variables
out of the box; variables have to be converted into numerical variables:
1. One-hot encoding – creates one new variable for each categorical value
2. Replace the categorical value with its class weight, in our case CTR.
Interactions with other variables cannot be captured this way.

Model                      | High-cardinality categorical variables | Interaction between categorical variables
Logistic regression        | Y                                      | N
Decision tree based models | N                                      | Y
Neural networks            | Y                                      | Y
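Both encodings above can be sketched with pandas; the five-row click log is toy data invented for illustration:

```python
# One-hot encoding and CTR (class-weight) encoding of a categorical variable.
import pandas as pd

df = pd.DataFrame({
    "ad_slot_url": ["a.com", "b.com", "a.com", "b.com", "a.com"],
    "click":       [1,       0,       0,       0,       1],
})

# 1) One-hot encoding: one new 0/1 column per categorical value
one_hot = pd.get_dummies(df["ad_slot_url"], prefix="slot")

# 2) CTR encoding: replace each category with its observed click rate
ctr_by_slot = df.groupby("ad_slot_url")["click"].mean()
df["slot_ctr"] = df["ad_slot_url"].map(ctr_by_slot)

print(df)  # a.com rows get CTR 2/3, b.com rows get CTR 0
```

The CTR encoding keeps a single column even for very high-cardinality variables, which is why it scales where one-hot encoding does not, at the cost of losing interactions.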
26. Evaluation Metrics
AUC (Area Under the Curve): area under the ROC curve, a 2-D plot of false positive
rate vs. true positive rate obtained by varying the classification threshold
• A random classifier gives an AUC of 0.5
• The higher the AUC, the better the classification
• Quantifies how well the model has ranked the test data, but doesn't consider
the magnitude of the predicted probabilities
Log Loss: -(1/N) Σ [yᵢ ln(pᵢ) + (1-yᵢ) ln(1-pᵢ)] – unlike AUC, it penalizes the
magnitude of the predicted probabilities, not just their ranking
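Both metrics are one call each in scikit-learn; the four labels and probabilities below are a hand-made example, not results from the talk:

```python
# AUC and log loss on a tiny hand-made example.
from sklearn.metrics import roc_auc_score, log_loss

y_true = [0, 0, 1, 1]
y_prob = [0.1, 0.4, 0.35, 0.8]        # predicted P(click)

auc = roc_auc_score(y_true, y_prob)   # ranking quality; 0.5 = random
ll = log_loss(y_true, y_prob)         # penalizes confident wrong probabilities
print(auc)                            # 0.75: 3 of 4 positive/negative pairs ranked correctly
```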
27. Q & A
My Coordinates
LinkedIn : https://www.linkedin.com/in/kushal-wadhwani-02109a1a/
Email : kushal.wadhwani@vizury.com
To know more about Vizury visit : https://www.vizury.com/