Data Engineering In Practice:
The DMP System Behind SmartNews Ads
Lan
Who am I
• Lan
• Veteran hacker but new to the AD world
• someone who can make a computer do what he wants—whether the computer
wants to or not. (http://paulgraham.com/gba.html)
• ex-{Rakuten, GREE}
• Distributed Systems, Information Retrieval, ML
Today’s Talk
• DMP in SmartNews Ads
• #1. Prediction
• #2. Targeting
• Future Work & Summary
DMP = Data Management Platform
DMP in SmartNews Ads
• Private DMP (90%+ 1st-party data)
• Data collection, cleaning, aggregation
• ID Mapping
• User Profiling
• User Clustering
• CTR / CVR Prediction
• Lookalike
• Custom Audience
[Figure: DMP architecture. Components shown: AD delivery cluster, Video AD delivery cluster, AD tracker, AD log in S3, Kinesis, SmartNews log, DMP clusters (streaming, Hadoop, ML, analytics), audience data in DynamoDB / RDB, models & targeting.]
Small company but not small data
• Article Meta > 200K/day
• Article x {read, share, read_related …}
• Channel x {subscribe, preview, view, …}
• Push, Live, Weather, Setting, …
• Survey result
• Audience Data > 14M (~5M MAU)
• AD Meta
• AD History
• AD Conversions
• AD Optout
• Managed/Compressed Data > 130TB
• Lookalike seeds
• ~1TB Data for training CTR prediction model
• > 1M unique features
• User Demographics
• Device
• Locations
• …
#1. Prediction
Pick up an AD to feed here
Similar to Recommendation but DIFFERENT
• optimization goal
• accuracy of the probability
More than Ranking
• When we do AD auction
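• Expected revenue per impression = CTR (Click Through Rate) x CPC (Cost per Click); eCPM (effective Cost per Mille) is 1000 x this value
• Suppose we have
• True CTR: CTRad1 = 0.05 > CTRad2 = 0.04 > CTRad3 = 0.03
• CPCad1 = 10 JPY, CPCad2 = 13 JPY, CPCad3 = 20 JPY, so CTR x CPC = 0.5, 0.52, 0.6 and ad3 should win
• But if the predictions are pCTRad1 = 0.2 > pCTRad2 = 0.1 > pCTRad3 = 0.03, ad1 wins instead (0.2 x 10 = 2.0)
• Then we lose 0.1 JPY of potential income per impression (0.6 - 0.5)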
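The auction example above can be replayed in a few lines. This is only a sketch: the CTR and CPC numbers are the ones on the slide, while the function and variable names are assumptions made for illustration.

```python
# Rank ads by expected revenue per impression (CTR x CPC).
# Numbers are from the slide; all names here are hypothetical.

def pick_winner(ctr_by_ad, cpc_by_ad):
    """Return the ad with the highest CTR x CPC."""
    return max(ctr_by_ad, key=lambda ad: ctr_by_ad[ad] * cpc_by_ad[ad])

cpc = {"ad1": 10.0, "ad2": 13.0, "ad3": 20.0}        # JPY per click
true_ctr = {"ad1": 0.05, "ad2": 0.04, "ad3": 0.03}   # actual click probability
pred_ctr = {"ad1": 0.20, "ad2": 0.10, "ad3": 0.03}   # miscalibrated predictions

ideal = pick_winner(true_ctr, cpc)    # winner if the CTRs were known exactly
actual = pick_winner(pred_ctr, cpc)   # winner chosen from the predictions

# Revenue is realized at the true CTR, whichever ad the auction picked.
ideal_revenue = true_ctr[ideal] * cpc[ideal]
actual_revenue = true_ctr[actual] * cpc[actual]
loss_per_impression = ideal_revenue - actual_revenue
```

The predicted ordering of the three ads matches the true ordering, yet the auction still picks the wrong ad. That is why the accuracy of the probability matters, not just the ranking.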
The CTR(CVR) prediction Problem
μ(a, u, c) = p(click | a,u,c)
CTR Prediction v1
• Train and score daily
• One GBDT (Gradient Boosting Decision Tree) model per AD campaign
• using ~1 month of data
• Hundreds of small batches inside Hadoop YARN
• Quick and Simple
• developed in 1 month
• picks the best features for every campaign
• minutes ~ 1 hour for model training
• explainable Tree models
• no need for AD features
• Same approach for CVR prediction (CPC / CVR = CPA (Cost Per Acquisition))
[Figure: CTR Prediction v1 pipeline. Delivery results and user features generate samples; on YARN, each batch runs sample, model, scoring; the output is per-user predictions.]
Metrics
• NE (Normalized Cross-Entropy)
• the average log loss per impression with the predicted CTR, divided by the average log loss if every impression were predicted at the background (average) CTR
• https://facebook.com//download/321355358042503/adkdd_2014_camera_ready_junfeng.pdf
• AUC (Area under the ROC curve, AUROC)
• measure ranking quality
• others: Precision/Recall, ECS (Effective Catalog Size), CTR / CVR / Sales, etc.
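A minimal sketch of the two main metrics, in plain Python. The toy labels and predictions are hypothetical, and the function names are assumptions; a production job would compute these over the full impression log.

```python
import math

def log_loss(labels, probs):
    """Average log loss per impression; labels are 0/1, probs are predicted CTRs."""
    total = 0.0
    for y, p in zip(labels, probs):
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(labels)

def normalized_entropy(labels, probs):
    """Model log loss divided by the log loss of always predicting the average CTR."""
    base_ctr = sum(labels) / len(labels)
    baseline = log_loss(labels, [base_ctr] * len(labels))
    return log_loss(labels, probs) / baseline

def auc(labels, probs):
    """Chance that a random click is scored above a random non-click (ties count 0.5)."""
    pos = [p for y, p in zip(labels, probs) if y == 1]
    neg = [p for y, p in zip(labels, probs) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical toy data: 1 = click, 0 = no click
labels = [1, 0, 0, 1, 0]
probs = [0.8, 0.2, 0.3, 0.6, 0.1]

ne = normalized_entropy(labels, probs)   # below 1 means better than the baseline
ranking_quality = auc(labels, probs)     # depends only on the ordering of probs
```

NE is sensitive to calibration, while AUC only looks at ordering, so the two are usually read together.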
Review of CTR Prediction v1
• Marked improvement, moderate AUC & NE
• But
• hard to tune overall
• hard to predict online (feature sets differ per campaign)
• latency for new campaigns
• relatively poor performance on new campaigns (cold start)
• loses the connections between campaigns, even for the same advertiser
• …
CTR Prediction v2
• A single model for all campaigns
• AD features added
• Dynamic feature extraction
• All computation distributed
• GBDT + LogisticRegression
• Train once per day, score twice
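One common way to combine GBDT with logistic regression is to use the leaf that each tree assigns to a sample as a categorical feature for the linear model. The slides do not say whether v2 works exactly this way, so the sketch below is an assumption. The two trees are hand-written stand-ins for learned ones, and every feature name and weight is hypothetical.

```python
import math

# Two hand-written "trees"; each maps a feature dict to a leaf index.
# In a real pipeline these would be learned by a GBDT library.
def tree_0(x):
    if x["age"] < 30:
        return 0 if x["is_ios"] else 1
    return 2

def tree_1(x):
    return 0 if x["past_clicks"] > 3 else 1

TREES = [(tree_0, 3), (tree_1, 2)]  # (tree function, number of leaves)

def leaf_one_hot(x):
    """Concatenate one one-hot vector per tree, marking the leaf x falls into."""
    encoded = []
    for tree, n_leaves in TREES:
        vec = [0] * n_leaves
        vec[tree(x)] = 1
        encoded.extend(vec)
    return encoded

def predict_ctr(x, weights, bias):
    """Logistic regression over the leaf indicators."""
    z = bias + sum(w * v for w, v in zip(weights, leaf_one_hot(x)))
    return 1.0 / (1.0 + math.exp(-z))

user = {"age": 25, "is_ios": True, "past_clicks": 5}
features = leaf_one_hot(user)
weights = [0.4, -0.2, -0.5, 0.9, -0.3]   # hypothetical learned weights
p_click = predict_ctr(user, weights, bias=-3.0)
```

The trees handle feature crosses and thresholds, and the linear layer on top stays cheap to retrain and to score.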
About the Features
• >1M unique features, sparse
• GBDT provides great feature engineering
• (sometimes) feature engineering comes down to intuition and trial and error
• demographic, device, location, reading interests…
• AD history is helpful
• Feature Hashing, Binarization & Discretization, …
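Feature hashing, discretization and binarization can be sketched together as below. The bucket count, the age boundaries, the attribute names and the choice of SHA-256 are all assumptions for illustration, not details from the talk.

```python
import hashlib

NUM_BUCKETS = 2 ** 20  # assumed size, on the order of ">1M unique features"

def hash_feature(name, value, num_buckets=NUM_BUCKETS):
    """Map a (name, value) pair to a stable bucket index."""
    key = f"{name}={value}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:8], "big") % num_buckets

def discretize(value, boundaries):
    """Return the index of the interval that value falls into."""
    for i, upper in enumerate(boundaries):
        if value < upper:
            return i
    return len(boundaries)

def to_sparse(raw):
    """Turn raw attributes into a sorted list of active (binary) feature indices."""
    raw = dict(raw)  # copy so the caller's dict is not modified
    raw["age_bucket"] = discretize(raw.pop("age"), [20, 30, 40, 50])
    return sorted({hash_feature(k, v) for k, v in raw.items()})

row = {"gender": "female", "device": "ios", "region": "kansai", "age": 34}
active = to_sparse(row)
```

Each sample becomes a short list of active indices in a fixed-size space, which suits sparse data and needs no global feature dictionary.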
Performance improvement
#2. Targeting
[Figure: names (Watabe, TamTam, Komiya, Takei, Ikeishi, Nagase, Lan) and interest labels (Niku, Game, Beer, Snack, Costume, Gourmet, Princess). Caption, truncated in the source: "It’s difficult comparing to"]
Profiling User by Statistics and ML
• Gender Prediction (precision: 0.90+), Age Prediction, …
• News Channel / Source Preference
• AD Slot Preference
• …
Standard Targeting
• Example: females in Kansai who subscribe to the Travel channel
Lookalike Targeting
• Our solution
• Solve it as a classification problem
• Seed users as positive samples
• All targeting candidates as negative samples (w/ random sampling)
• based on Spark MLlib Logistic Regression
• 30%~50% higher CVR compared to standard targeting
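The classification setup above can be sketched without Spark. The talk uses Spark MLlib Logistic Regression; the block below swaps in a small hand-written gradient-descent version so it runs standalone. The features, users and hyperparameters are hypothetical.

```python
import math
import random

def train_logistic_regression(samples, labels, epochs=200, lr=0.5):
    """Plain batch gradient descent on log loss; returns (weights, bias)."""
    n_features = len(samples[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for x, y in zip(samples, labels):
            z = bias + sum(w * v for w, v in zip(weights, x))
            err = 1.0 / (1.0 + math.exp(-z)) - y
            for j, v in enumerate(x):
                grad_w[j] += err * v
            grad_b += err
        weights = [w - lr * g / len(samples) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(samples)
    return weights, bias

def score(x, weights, bias):
    """Predicted probability that user x looks like a seed user."""
    return 1.0 / (1.0 + math.exp(-(bias + sum(w * v for w, v in zip(weights, x)))))

# Hypothetical binary user features: [reads_travel, reads_sports, is_ios]
seed_users = [[1, 0, 1], [1, 0, 0], [1, 1, 1], [1, 0, 1]]
candidates = [[0, 1, 0], [0, 0, 1], [1, 0, 1], [0, 1, 1],
              [0, 0, 0], [0, 1, 0], [1, 1, 0], [0, 0, 1]]

random.seed(0)
negatives = random.sample(candidates, 6)  # randomly sampled candidates as negatives

weights, bias = train_logistic_regression(
    seed_users + negatives, [1] * len(seed_users) + [0] * len(negatives))

# Candidates with the highest scores form the lookalike audience.
ranked = sorted(candidates, key=lambda u: score(u, weights, bias), reverse=True)
```

Some sampled negatives may in fact resemble the seeds. The setup tolerates this because such users are a small share of a large candidate pool.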
Article Keyword Targeting
• Keyword
• Reach UU calculated in real time
• Only users who exceed a certain read-time threshold are included
Custom Audience
[Figure: Your service / app / site sends any custom event (S2S request, web beacon, etc.) to the SmartNews AD tracker. Events update an audience BloomFilter object every several minutes. The SmartNews AD delivery cluster uses it for AD targeting / delete targeting and lookalike targeting.]
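The audience BloomFilter object in the Custom Audience flow can be illustrated with a minimal Bloom filter. The sizes, the hash construction and the device IDs are assumptions for this sketch; the talk does not describe the actual implementation.

```python
import hashlib

class BloomFilter:
    """Fixed-size bit array with k hash functions: no false negatives, rare false positives."""

    def __init__(self, num_bits=8192, num_hashes=4):
        self.num_bits = num_bits          # assumed to be a multiple of 8
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8)

    def _positions(self, item):
        """Yield the bit positions for item, one per hash function."""
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))

# Hypothetical device IDs reported through an advertiser's custom events
audience = BloomFilter()
for device_id in ["device-001", "device-002", "device-003"]:
    audience.add(device_id)

in_audience = "device-002" in audience
```

A structure like this keeps a large audience in a small fixed-size object that is cheap to ship to delivery servers, at the cost of occasionally matching a user who is not in the audience.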
Future Work
• Targeting audience by interests
• Collect negative signals to optimize UX
Summary of My 1st SmartNews Year
• A challenging place. We’re a startup, so we can move fast and break things.
• Learn from the industry leaders. Keep up the trial and error.
• Numbers don’t lie. Don’t trust your intuition over the numbers.
• But if you really doubt the numbers, look closely. There may be a hidden bug.

2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 

Kürzlich hochgeladen (20)

Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 

SmartNews TechNight Vol.5: AD Data Engineering in Practice: The Data Systems Behind SmartNews Ads

  • 1. Data Engineering In Practice: The DMP System Behind SmartNews Ads (Lan)
  • 2. Who am I • Lan • Veteran hacker but new in AD world • someone who can make a computer do what he wants—whether the computer wants to or not. (http://paulgraham.com/gba.html) • ex-{Rakuten, GREE} • Distribution System, Info Retrieval, ML
  • 3. Today’s Talk • DMP in SmartNews Ads • #1. Prediction • #2. Targeting • Future Work & Summary
  • 4. DMP = Data Management Platform
  • 5. DMP in SmartNews Ads • Private DMP (90%+ 1st-party data) • Data Collection, Cleaning, Aggregation • ID Mapping • User Profiling • User Clustering • CTR / CVR Prediction • Lookalike • Custom Audience
  • 6. [Architecture diagram, labels: DMP Clusters, AD delivery cluster, Video AD delivery cluster, AD tracker, AD Log in S3, Kinesis, DMP streaming, Audience Data in DynamoDB, RDB, Hadoop, ML, Analytics, Models & Targeting, SmartNews Log] Small company but not small data • Article Meta > 200K/day • Article x {read, share, read_related, …} • Channel x {subscribe, preview, view, …} • Push, Live, Weather, Setting, … • Survey result • Audience Data > 14M (~5M MAU) • AD Meta • AD History • AD Conversions • AD Optout • Managed/Compressed Data > 130TB • Lookalike seeds • ~1TB Data for training CTR prediction model • > 1M unique features • User Demographics • Device • Locations • …
  • 8. Pick up an AD to feed here
  • 9. Similar to Recommendation but DIFFERENT • optimization goal • accuracy of the probability
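  • 10. More than Ranking • When we run an AD auction • eCPM (effective Cost per Mille) = CTR (Click Through Rate) x CPC (Cost per Click) x 1000; the values below are per impression (CTR x CPC) • Suppose we have • CTR_ad1 = 0.05 > CTR_ad2 = 0.04 > CTR_ad3 = 0.03 • CPC_ad1 = 10 JPY, CPC_ad2 = 13 JPY, CPC_ad3 = 20 JPY, so ad3 wins (0.03 x 20 = 0.6 JPY) • but if we predict pCTR_ad1 = 0.2 > pCTR_ad2 = 0.1 > pCTR_ad3 = 0.03, ad1 wins instead (0.2 x 10 = 2.0 JPY predicted, 0.05 x 10 = 0.5 JPY actual) • then we lose 0.1 JPY of potential income per impression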
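The auction example on slide 10 can be checked with a few lines of Python. This is only a sketch of the arithmetic: the function names `expected_revenue` and `pick_winner` are hypothetical, and the numbers are the ones from the slide.

```python
# Hypothetical helper names; the CTR / CPC numbers come from slide 10.

def expected_revenue(ctr, cpc):
    """Expected revenue per impression in JPY (eCPM is this value x 1000)."""
    return ctr * cpc

def pick_winner(ctrs, cpcs):
    """Return the ad id with the highest ctr * cpc."""
    return max(ctrs, key=lambda ad: expected_revenue(ctrs[ad], cpcs[ad]))

true_ctr = {"ad1": 0.05, "ad2": 0.04, "ad3": 0.03}
predicted_ctr = {"ad1": 0.2, "ad2": 0.1, "ad3": 0.03}
cpc = {"ad1": 10.0, "ad2": 13.0, "ad3": 20.0}

ideal = pick_winner(true_ctr, cpc)        # winner if CTRs were known exactly
actual = pick_winner(predicted_ctr, cpc)  # winner chosen from predicted CTRs

# Income lost per impression because the wrong ad won the auction.
loss = (expected_revenue(true_ctr[ideal], cpc[ideal])
        - expected_revenue(true_ctr[actual], cpc[actual]))

print(ideal, actual, round(loss, 6))
```

The ranking of predicted CTRs is identical to the ranking of true CTRs here, yet the auction still picks the wrong ad, which is why the accuracy of the probability matters and not only the ranking.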
  • 11. The CTR (CVR) Prediction Problem • μ(a, u, c) = p(click | a, u, c)
  • 12. CTR Prediction v1 • Train and score daily • One GBDT (Gradient Boosting Decision Tree) model per AD campaign • using ~1 month of data • Hundreds of small batches inside Hadoop Yarn • Quick and Simple • developed in 1 month • pick the best features for every campaign • minutes to ~1 hour for model training • explainable tree models • no need for AD features • Same approach for CVR prediction (CPC / CVR = CPA (Cost Per Acquisition)) • [Pipeline diagram, labels: delivery result, User Features, generate samples, Yarn, repeated "sample, model, scoring" batches, Users predictions]
  • 13. Metrics • NE (Normalized Cross-Entropy) • the average log loss per impression of the predicted CTR, divided by the average log loss per impression if the model always predicted the background (average) CTR • https://facebook.com//download/321355358042503/adkdd_2014_camera_ready_junfeng.pdf • AUC (Area under the ROC curve, AUROC) • measures ranking quality • others: Precision/Recall, ECS (Effective Catalog Size), CTR / CVR / Sales, etc.
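A minimal, dependency-free sketch of the two metrics on slide 13 may help make the definitions concrete. The function names and the toy labels are assumptions for illustration; the talk does not show how the metrics were actually computed. The sketch assumes both clicked and non-clicked impressions are present.

```python
import math

def log_loss(labels, probs):
    """Average log loss per impression."""
    eps = 1e-15  # clamp to avoid log(0)
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)

def normalized_entropy(labels, probs):
    """Model log loss divided by the log loss of always predicting the average CTR."""
    background_ctr = sum(labels) / len(labels)
    baseline = log_loss(labels, [background_ctr] * len(labels))
    return log_loss(labels, probs) / baseline

def auc(labels, probs):
    """Probability that a random click is scored above a random non-click."""
    pos = [p for y, p in zip(labels, probs) if y == 1]
    neg = [p for y, p in zip(labels, probs) if y == 0]
    wins = 0.0
    for pp in pos:
        for pn in neg:
            wins += 1.0 if pp > pn else 0.5 if pp == pn else 0.0
    return wins / (len(pos) * len(neg))

# Toy data (made up): 1 = click, 0 = no click.
labels = [1, 0, 0, 1, 0]
probs = [0.9, 0.1, 0.2, 0.8, 0.3]
print(normalized_entropy(labels, probs), auc(labels, probs))
```

An NE below 1 means the model beats the constant background-CTR predictor; AUC only looks at ordering, so a model can have a high AUC and still be badly calibrated.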
  • 14. Review of CTR Prediction v1 • Marked improvement, moderate AUC & NE • However • hard to do overall tuning • hard to predict online (feature set differs per campaign) • latency for new campaigns • relatively poor performance on new campaigns (cold start) • loses the connections between campaigns, even for the same advertiser • …
  • 15. CTR Prediction v2 • A simple model for all campaigns • AD features added • Dynamic feature extraction • All computation distributed • GBDT + Logistic Regression • Train once per day, score twice
  • 16. About the Features • >1M unique features, sparse • GBDT provides great feature engineering • (sometimes) feature engineering is a matter of intuition and trial and error • demographics, device, location, reading interests, … • AD history is helpful • Feature Hashing, Binarization & Discretization, …
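Slide 16 names feature hashing and discretization without showing them. The sketch below illustrates both techniques in plain Python; the bucket count, the feature names, and the function names `hash_features` and `discretize` are assumptions, not details from the talk.

```python
import bisect
import hashlib

def hash_features(raw, num_buckets=2**20):
    """Map {name: value} categorical features to sorted sparse indices.

    Each "name=value" token is hashed into one of num_buckets slots, so the
    model never needs a dictionary of all >1M distinct features.
    """
    indices = set()
    for name, value in raw.items():
        token = f"{name}={value}".encode("utf-8")
        digest = hashlib.md5(token).digest()  # stable across processes
        indices.add(int.from_bytes(digest[:8], "big") % num_buckets)
    return sorted(indices)

def discretize(value, boundaries):
    """Turn a numeric value into a bucket id; boundaries must be sorted."""
    return bisect.bisect_right(boundaries, value)

# Hypothetical user features for illustration only.
user = {
    "gender": "female",
    "device": "ios",
    "region": "kansai",
    "age_bucket": discretize(25, [18, 30, 45]),
}
print(hash_features(user))
```

`hashlib.md5` is used here only because Python's built-in `hash` for strings is randomized per process; any stable hash would do. Collisions are possible and are usually accepted as noise when the bucket count is large relative to the number of active features.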
  • 20. Profiling Users with Statistics and ML • Gender Prediction (precision: 0.90+), Age Prediction, … • News Channel / Source Preference • AD Slot Preference • …
  • 21. Standard Targeting • e.g. females in Kansai who subscribe to the Travel channel
  • 23. Lookalike Targeting • Our solution • Solve it as a classification problem • Seed users as positive samples • All targeting candidates as negative samples (with random sampling) • based on Spark MLlib Logistic Regression • 30% to 50% higher CVR compared with normal targeting
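The lookalike formulation on slide 23 (seeds as positives, randomly sampled candidates as negatives, logistic regression on top) can be sketched as follows. The talk uses Spark MLlib Logistic Regression; this sketch swaps in a tiny pure-Python SGD logistic regression so that it runs without a cluster. All names, hyperparameters, and the toy feature vectors are hypothetical.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict(w, b, x):
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def train_logreg(samples, labels, epochs=200, lr=0.5):
    """Plain SGD logistic regression (stand-in for Spark MLlib)."""
    w, b = [0.0] * len(samples[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            err = predict(w, b, x) - y
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b

def lookalike_scores(seeds, candidates, negatives_per_seed=3, rng_seed=0):
    """Score every candidate by how much it resembles the seed users."""
    rng = random.Random(rng_seed)
    k = min(len(candidates), negatives_per_seed * len(seeds))
    negatives = rng.sample(list(candidates.values()), k)  # random negatives
    samples = list(seeds.values()) + negatives
    labels = [1] * len(seeds) + [0] * len(negatives)
    w, b = train_logreg(samples, labels)
    return {uid: predict(w, b, x) for uid, x in candidates.items()}

# Toy binary features, e.g. [reads_travel, reads_sports, reads_finance].
seeds = {"s1": [1, 1, 0], "s2": [1, 0, 0], "s3": [1, 1, 0]}
candidates = {"c1": [1, 1, 0], "c2": [0, 0, 1], "c3": [0, 1, 1],
              "c4": [0, 0, 1], "c5": [0, 0, 1], "c6": [1, 0, 0]}
scores = lookalike_scores(seeds, candidates)
print(sorted(scores, key=scores.get, reverse=True))
```

Candidates that look like seeds end up labeled negative during training, which is the price of treating the unlabeled pool as negatives; the classifier still ranks them above clearly dissimilar users, and the top of that ranking is what gets targeted.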
  • 24. Article Keyword Targeting • Keyword • Reach UU calculated in real time • Only users who exceed a certain read-time threshold are included
  • 25. Custom Audience • [Diagram, labels: Your Service / App / Site sends any custom event (S2S request, web beacon, etc.) to the SmartNews AD tracker; Event; Audience stored as a BloomFilter object, updated every several minutes; SmartNews AD Delivery Cluster; AD targeting / Delete Targeting; Lookalike; Lookalike Targeting]
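Slide 25 stores each custom audience as a Bloom filter object that the delivery cluster can query. The following is a minimal Bloom filter sketch to show the idea; the bit-array size, the number of hashes, the double-hashing scheme, and the device id strings are assumptions, not the talk's actual implementation.

```python
import hashlib

class BloomFilter:
    """Compact set membership with false positives but no false negatives."""

    def __init__(self, num_bits=1 << 16, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions from one SHA-256 digest
        # using double hashing: h1 + i * h2.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))

# Hypothetical usage: events from the tracker add device ids to an audience,
# and the delivery side checks membership at ad-request time.
audience = BloomFilter()
for device_id in ["device-123", "device-456"]:
    audience.add(device_id)

print("device-123" in audience)
```

A Bloom filter can answer "definitely not in the audience" or "probably in the audience", so a small share of non-members may be targeted by mistake; in exchange, millions of ids fit in a fixed-size byte array that is cheap to rebuild and ship every few minutes.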
  • 29. Summary of My 1st SmartNews Year • A challenging place. We're a startup, so we can move fast and break things. • Learn from the industry leaders. Keep up the trial and error. • Numbers don't lie. Don't trust your intuition over the numbers. • But if you really doubt the numbers, look closely: there may be a hidden bug.