SlideShare ist ein Scribd-Unternehmen logo
1 von 15
How Criteo Scaled and Supported
Massive Growth with MongoDB
Julien SIMON
Vice President, Engineering
j.simon@criteo.com @julsimon
CRITEO
2
• R&D EFFORT
• RETARGETING
• CPC
PHASE 1 : 2005-2008
CRITEO CREATION
• MORE THAN 3000 CLIENTS
• 35 COUNTRIES, 15 OFFICES
• R&D: MORE THAN 300 PEOPLE
PHASE 2 : 2008-2012
GLOBAL LEADER : + 700 EMPLOYEES!
2007
15
EMPLOYEES 2009
84
EMPLOYEES
6
EMPLOYEES
2005
2010
203
EMPLOYEES
2012
+700
EMPLOYEES SO FAR
2006
2011
395
EMPLOYEES
2008
33
EMPLOYEES
GLOBAL PRESENCE
3
SYDNEY
PARIS
LONDON
BARCELONA
MILAN
MUNICH
BOSTON
NEW YORK
SAO PAULO
PALO ALTO
TOKYO
SEOUL
STOCKHOLM
AMSTERDAM
15 OFFICES, 30+
COUNTRIES
CHICAGO
GO GOGO
Powered by
PERFORMANCE DISPLAY
Copyright © 2013 Criteo. Confidential
A user sees products
on your …
… and sees
After on the banner,
the user goes back to the product page.
...then browses
the
4
REAL-TIME PERSONALIZATION
5Copyright © 2013 Criteo. Confidential.
Boutons
all original #represent
SHOP NOW
CouleursFond Disposition
WARM MEETS LIGHT
SWEET NOTHING
ADDIDAS IS ALL IN
ALL ORIGINALS
#REPRESENT
Slogans
JOIN NOW
SEE MORE
CLICK HERE
“Call to action”
Lien
opt-out
SEE MORE
JOIN NOW
SEE MORE
CLICK HERE
SHOP NOW
SHOP NOW
JOIN NOW
JOIN NOW
PREDICTION & RECOMMENDATION
2 CORE
TECHNOLOGIES
choose the right product
to display
choose the right users / advertiser /
publisher to display
RECOMMENDATION
ENGINE
CTR + CR
increase
PREDICTION
ENGINE
INFRASTRUCTURE
7Copyright © 2013 Criteo. Confidential.
DAILY TRAFFIC
- HTTP REQUESTS: 30+ BILLION
- BANNERS SERVED: 1+ BILLION
PEAK TRAFFIC (PER SECOND)
- HTTP REQUESTS: 500,000+
- BANNERS: 25,000+
7 DATA CENTERS
SET UP AND MANAGED
IN-HOUSE
AVAILABILITY > 99.95%
8Copyright © 2013 Criteo. Confidential.
HIGH PERFORMANCE COMPUTING
FETCH, STORE, CRUNCH, QUERY 20 additional TB EVERY DAY ?
…SUBTITLED « HOW I LEARNED TO STOP WORRYING AND LOVE HPC »
PRODUCT CATALOGUES
• Catalogue = product feed provided by advertisers (product
id, description, category, price, URL, etc)
• 3000+ catalogues, ranging from a few MB to several tens of GB
• About 50% of products change every day
• Imported at least once a day by an in-house application
• Data replicated within a geographical zone
• Accessed through a cache layer by web servers
• Microsoft SQL Server used from day 1
• Running fine in Europe, but…
– Number of databases (1 per advertiser)… and servers
– Size of databases
– SQL Server issues hard to debug and understand
• Running kind of fine in the US, until dead end in Q1 2011
– transactional replication over high latency links
Copyright © 2010 Criteo. Confidential.
REQUIREMENTS FOR A NEW DB
• Scale-out architecture running on commodity hardware
(aka « Intel CPUs in metal boxes »)
• No transactions needed, eventual consistency OK
• High availability
• Distributed clusters, with replication over high latency links
• Requestable (key-value not enough)
• Open source
… with active user community
… backed by a stable organization with long-term commitment (not one guy in a garage)
… no licence fees for production use
… commercial support available at reasonable cost
• Easy to learn, (re)deploy, monitor and upgrade
• « Low maintenance » (don’t need a 10-people team just to run it)
• Multi-language support
• Ability to export everything to Hadoop multiple times per day
Copyright © 2010 Criteo. Confidential.
FROM SQL SERVER TO MONGODB
• Ah, database migrations… everyone loves them 
• 1st step: solve replication issue
– Import and replicate catalogues in MongoDB
– Push content to SQL Server, still queried by web servers
• 2nd step: prove that MongoDB can survive our web traffic
– Modify web applications to query MongoDB
– C-a-r-e-f-u-l-l-y switch web queries to MongoDB for a small set of catalogues
– Observe, measure, A/B test… and generally make sure that the system still works
• 3rd step: scale !
– Migrate thousands of catalogues away from SQL Server
– Monitor and tweak the MongoDB clusters
– Add more MongoDB servers… and more shards
– Update ops processes (monitoring, backups, etc)
Copyright © 2010 Criteo. Confidential.
OUR MONGODB DEPLOYMENT
• Europe
– 18 3-server shards (1+1+1)
– 800M products, 1TB
– 1B requests/day (peak at 40K/s)
– 350M updates/day (peak at 11K/s)
• US
– 14 4-server shards (2+2)
– 400M products, 650GB
• APAC
– 12 3-server shards (2+1)
– 300M products, 500GB
• 146 servers total :
2.0 (+ Criteo patches)  2.2  2.4.3
Copyright © 2010 Criteo. Confidential.
MONGODB, 2+ YEARS LATER
• Stable (2.4.3 much better)
• Easy to (re)install and administer
• Great for small datasets (i.e. smaller than server RAM)
• Good performance if read/write ratio is high
• Failover and inter-DC replication work (but shard early!)
• Performance suffers when :
– dataset much larger than RAM
– read/write ratio is low
– Multiple applications coexist on the same cluster
• Some scalability issues remain (master-slave, connections)
• Criteo is very interested in the 10gen roadmap 
Copyright © 2010 Criteo. Confidential.
THANKS A LOT FOR YOUR ATTENTION!
14Copyright © 2013 Criteo. Confidential.
www.criteo.com
engineering.criteo.com
Business Track: How Criteo Scaled and Supported Massive Growth with MongoDB

Weitere ähnliche Inhalte

Andere mochten auch

PostgreSQL and CockroachDB SQL
PostgreSQL and CockroachDB SQLPostgreSQL and CockroachDB SQL
PostgreSQL and CockroachDB SQLCockroachDB
 
Couchbase live 2016
Couchbase live 2016Couchbase live 2016
Couchbase live 2016Pierre Mavro
 
Spark SQL Deep Dive @ Melbourne Spark Meetup
Spark SQL Deep Dive @ Melbourne Spark MeetupSpark SQL Deep Dive @ Melbourne Spark Meetup
Spark SQL Deep Dive @ Melbourne Spark MeetupDatabricks
 
Linking words
Linking wordsLinking words
Linking wordsBochica
 
Intro to Spark and Spark SQL
Intro to Spark and Spark SQLIntro to Spark and Spark SQL
Intro to Spark and Spark SQLjeykottalam
 

Andere mochten auch (6)

PostgreSQL and CockroachDB SQL
PostgreSQL and CockroachDB SQLPostgreSQL and CockroachDB SQL
PostgreSQL and CockroachDB SQL
 
Couchbase live 2016
Couchbase live 2016Couchbase live 2016
Couchbase live 2016
 
Spark SQL Deep Dive @ Melbourne Spark Meetup
Spark SQL Deep Dive @ Melbourne Spark MeetupSpark SQL Deep Dive @ Melbourne Spark Meetup
Spark SQL Deep Dive @ Melbourne Spark Meetup
 
Linking words
Linking wordsLinking words
Linking words
 
Road to Analytics
Road to AnalyticsRoad to Analytics
Road to Analytics
 
Intro to Spark and Spark SQL
Intro to Spark and Spark SQLIntro to Spark and Spark SQL
Intro to Spark and Spark SQL
 

Ähnlich wie Business Track: How Criteo Scaled and Supported Massive Growth with MongoDB

TiConf Australia 2013
TiConf Australia 2013TiConf Australia 2013
TiConf Australia 2013Jeff Haynie
 
Titanium Conf Baltimore Keynote 2013
Titanium Conf Baltimore Keynote 2013Titanium Conf Baltimore Keynote 2013
Titanium Conf Baltimore Keynote 2013Jeff Haynie
 
Enabling Telco to Build and Run Modern Applications
Enabling Telco to Build and Run Modern Applications Enabling Telco to Build and Run Modern Applications
Enabling Telco to Build and Run Modern Applications Tugdual Grall
 
Webinar: 2 Billion Data Points Each Day
Webinar: 2 Billion Data Points Each DayWebinar: 2 Billion Data Points Each Day
Webinar: 2 Billion Data Points Each DayDataStax
 
ARC202:real world real time analytics
ARC202:real world real time analyticsARC202:real world real time analytics
ARC202:real world real time analyticsSebastian Montini
 
From prototype to production - The journey of re-designing SmartUp.io
From prototype to production - The journey of re-designing SmartUp.ioFrom prototype to production - The journey of re-designing SmartUp.io
From prototype to production - The journey of re-designing SmartUp.ioMáté Lang
 
Dubbo and Weidian's practice on micro-service architecture
Dubbo and Weidian's practice on micro-service architectureDubbo and Weidian's practice on micro-service architecture
Dubbo and Weidian's practice on micro-service architectureHuxing Zhang
 
Cincom Smalltalk: Present, Future & Smalltalk Advocacy
Cincom Smalltalk: Present, Future & Smalltalk AdvocacyCincom Smalltalk: Present, Future & Smalltalk Advocacy
Cincom Smalltalk: Present, Future & Smalltalk AdvocacyESUG
 
TIBCO Advanced Analytics Meetup (TAAM) - June 2015
TIBCO Advanced Analytics Meetup (TAAM) - June 2015TIBCO Advanced Analytics Meetup (TAAM) - June 2015
TIBCO Advanced Analytics Meetup (TAAM) - June 2015Bipin Singh
 
Le big data à l'épreuve des projets d'entreprise
Le big data à l'épreuve des projets d'entrepriseLe big data à l'épreuve des projets d'entreprise
Le big data à l'épreuve des projets d'entrepriseRubedo, a WebTales solution
 
Presenting Data – An Alternative to the View Control
Presenting Data – An Alternative to the View ControlPresenting Data – An Alternative to the View Control
Presenting Data – An Alternative to the View ControlTeamstudio
 
SOA 11g Upgrade Experience - SNI
SOA 11g Upgrade Experience - SNISOA 11g Upgrade Experience - SNI
SOA 11g Upgrade Experience - SNIjtreague
 
QCon 2015 - Microservices Track Notes
QCon 2015 - Microservices Track Notes QCon 2015 - Microservices Track Notes
QCon 2015 - Microservices Track Notes Abdul Basit Munda
 
Done in 60 seconds - Creating Web 2.0 applications made easy
Done in 60 seconds - Creating Web 2.0 applications made easyDone in 60 seconds - Creating Web 2.0 applications made easy
Done in 60 seconds - Creating Web 2.0 applications made easyRoel Hartman
 
Migration of a high-traffic E-commerce website from Legacy Monolith to Micros...
Migration of a high-traffic E-commerce website from Legacy Monolith to Micros...Migration of a high-traffic E-commerce website from Legacy Monolith to Micros...
Migration of a high-traffic E-commerce website from Legacy Monolith to Micros...Pavel Pratyush
 
U of A Web Strategy and Sitecore
U of A Web Strategy and SitecoreU of A Web Strategy and Sitecore
U of A Web Strategy and SitecoreTim Schneider
 
Hadoop Summit 2016 - Evolution of Big Data Pipelines At Intuit
Hadoop Summit 2016 - Evolution of Big Data Pipelines At IntuitHadoop Summit 2016 - Evolution of Big Data Pipelines At Intuit
Hadoop Summit 2016 - Evolution of Big Data Pipelines At IntuitRekha Joshi
 
(ARC202) Real-World Real-Time Analytics | AWS re:Invent 2014
(ARC202) Real-World Real-Time Analytics | AWS re:Invent 2014(ARC202) Real-World Real-Time Analytics | AWS re:Invent 2014
(ARC202) Real-World Real-Time Analytics | AWS re:Invent 2014Amazon Web Services
 
Effective Microservices In a Data-centric World
Effective Microservices In a Data-centric WorldEffective Microservices In a Data-centric World
Effective Microservices In a Data-centric WorldRandy Shoup
 

Ähnlich wie Business Track: How Criteo Scaled and Supported Massive Growth with MongoDB (20)

TiConf Australia 2013
TiConf Australia 2013TiConf Australia 2013
TiConf Australia 2013
 
Titanium Conf Baltimore Keynote 2013
Titanium Conf Baltimore Keynote 2013Titanium Conf Baltimore Keynote 2013
Titanium Conf Baltimore Keynote 2013
 
Enabling Telco to Build and Run Modern Applications
Enabling Telco to Build and Run Modern Applications Enabling Telco to Build and Run Modern Applications
Enabling Telco to Build and Run Modern Applications
 
Webinar: 2 Billion Data Points Each Day
Webinar: 2 Billion Data Points Each DayWebinar: 2 Billion Data Points Each Day
Webinar: 2 Billion Data Points Each Day
 
ARC202:real world real time analytics
ARC202:real world real time analyticsARC202:real world real time analytics
ARC202:real world real time analytics
 
From prototype to production - The journey of re-designing SmartUp.io
From prototype to production - The journey of re-designing SmartUp.ioFrom prototype to production - The journey of re-designing SmartUp.io
From prototype to production - The journey of re-designing SmartUp.io
 
Dubbo and Weidian's practice on micro-service architecture
Dubbo and Weidian's practice on micro-service architectureDubbo and Weidian's practice on micro-service architecture
Dubbo and Weidian's practice on micro-service architecture
 
Cincom Smalltalk: Present, Future & Smalltalk Advocacy
Cincom Smalltalk: Present, Future & Smalltalk AdvocacyCincom Smalltalk: Present, Future & Smalltalk Advocacy
Cincom Smalltalk: Present, Future & Smalltalk Advocacy
 
TIBCO Advanced Analytics Meetup (TAAM) - June 2015
TIBCO Advanced Analytics Meetup (TAAM) - June 2015TIBCO Advanced Analytics Meetup (TAAM) - June 2015
TIBCO Advanced Analytics Meetup (TAAM) - June 2015
 
Le big data à l'épreuve des projets d'entreprise
Le big data à l'épreuve des projets d'entrepriseLe big data à l'épreuve des projets d'entreprise
Le big data à l'épreuve des projets d'entreprise
 
Presenting Data – An Alternative to the View Control
Presenting Data – An Alternative to the View ControlPresenting Data – An Alternative to the View Control
Presenting Data – An Alternative to the View Control
 
Cognos Overview
Cognos Overview Cognos Overview
Cognos Overview
 
SOA 11g Upgrade Experience - SNI
SOA 11g Upgrade Experience - SNISOA 11g Upgrade Experience - SNI
SOA 11g Upgrade Experience - SNI
 
QCon 2015 - Microservices Track Notes
QCon 2015 - Microservices Track Notes QCon 2015 - Microservices Track Notes
QCon 2015 - Microservices Track Notes
 
Done in 60 seconds - Creating Web 2.0 applications made easy
Done in 60 seconds - Creating Web 2.0 applications made easyDone in 60 seconds - Creating Web 2.0 applications made easy
Done in 60 seconds - Creating Web 2.0 applications made easy
 
Migration of a high-traffic E-commerce website from Legacy Monolith to Micros...
Migration of a high-traffic E-commerce website from Legacy Monolith to Micros...Migration of a high-traffic E-commerce website from Legacy Monolith to Micros...
Migration of a high-traffic E-commerce website from Legacy Monolith to Micros...
 
U of A Web Strategy and Sitecore
U of A Web Strategy and SitecoreU of A Web Strategy and Sitecore
U of A Web Strategy and Sitecore
 
Hadoop Summit 2016 - Evolution of Big Data Pipelines At Intuit
Hadoop Summit 2016 - Evolution of Big Data Pipelines At IntuitHadoop Summit 2016 - Evolution of Big Data Pipelines At Intuit
Hadoop Summit 2016 - Evolution of Big Data Pipelines At Intuit
 
(ARC202) Real-World Real-Time Analytics | AWS re:Invent 2014
(ARC202) Real-World Real-Time Analytics | AWS re:Invent 2014(ARC202) Real-World Real-Time Analytics | AWS re:Invent 2014
(ARC202) Real-World Real-Time Analytics | AWS re:Invent 2014
 
Effective Microservices In a Data-centric World
Effective Microservices In a Data-centric WorldEffective Microservices In a Data-centric World
Effective Microservices In a Data-centric World
 

Mehr von MongoDB

MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB
 
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!MongoDB
 
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB
 
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDBMongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDBMongoDB
 
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB
 
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB
 
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 MongoDB SoCal 2020: MongoDB Atlas Jump Start MongoDB SoCal 2020: MongoDB Atlas Jump Start
MongoDB SoCal 2020: MongoDB Atlas Jump StartMongoDB
 
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB
 
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB
 
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...MongoDB
 
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!MongoDB
 
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB
 
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB
 
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...MongoDB
 
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB
 
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB
 
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB
 
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB
 
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB
 
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB
 

Mehr von MongoDB (20)

MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
 
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
 
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
 
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDBMongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
 
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
 
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
 
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 MongoDB SoCal 2020: MongoDB Atlas Jump Start MongoDB SoCal 2020: MongoDB Atlas Jump Start
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
 
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
 
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
 
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
 
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
 
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
 
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
 
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
 
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
 
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
 
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
 
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
 
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
 

Kürzlich hochgeladen

The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsNathaniel Shimoni
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
What is Artificial Intelligence?????????
What is Artificial Intelligence?????????What is Artificial Intelligence?????????
What is Artificial Intelligence?????????blackmambaettijean
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESmohitsingh558521
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 

Kürzlich hochgeladen (20)

The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
What is Artificial Intelligence?????????
What is Artificial Intelligence?????????What is Artificial Intelligence?????????
What is Artificial Intelligence?????????
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 

Business Track: How Criteo Scaled and Supported Massive Growth with MongoDB

  • 1. How Criteo Scaled and Supported Massive Growth with MongoDB Julien SIMON Vice President, Engineering j.simon@criteo.com @julsimon
  • 2. CRITEO 2 • R&D EFFORT • RETARGETING • CPC PHASE 1 : 2005-2008 CRITEO CREATION • MORE THAN 3000 CLIENTS • 35 COUNTRIES, 15 OFFICES • R&D: MORE THAN 300 PEOPLE PHASE 2 : 2008-2012 GLOBAL LEADER : + 700 EMPLOYEES! 2007 15 EMPLOYEES 2009 84 EMPLOYEES 6 EMPLOYEES 2005 2010 203 EMPLOYEES 2012 +700 EMPLOYEES SO FAR 2006 2011 395 EMPLOYEES 2008 33 EMPLOYEES
  • 3. GLOBAL PRESENCE 3 SYDNEY PARIS LONDON BARCELONA MILAN MUNICH BOSTON NEW YORK SAO PAULO PALO ALTO TOKYO SEOUL STOCKHOLM AMSTERDAM 15 OFFICES, 30+ COUNTRIES CHICAGO
  • 4. GO GOGO Powered by PERFORMANCE DISPLAY Copyright © 2013 Criteo. Confidential A user sees products on your … … and sees After on the banner, the user goes back to the product page. ...then browses the 4
  • 5. REAL-TIME PERSONALIZATION 5Copyright © 2013 Criteo. Confidential. Boutons all original #represent SHOP NOW CouleursFond Disposition WARM MEETS LIGHT SWEET NOTHING ADDIDAS IS ALL IN ALL ORIGINALS #REPRESENT Slogans JOIN NOW SEE MORE CLICK HERE “Call to action” Lien opt-out SEE MORE JOIN NOW SEE MORE CLICK HERE SHOP NOW SHOP NOW JOIN NOW JOIN NOW
  • 6. PREDICTION & RECOMMENDATION 2 CORE TECHNOLOGIES choose the right product to display choose the right users / advertiser / publisher to display RECOMMENDATION ENGINE CTR + CR increase PREDICTION ENGINE
  • 7. INFRASTRUCTURE 7Copyright © 2013 Criteo. Confidential. DAILY TRAFFIC - HTTP REQUESTS: 30+ BILLION - BANNERS SERVED: 1+ BILLION PEAK TRAFFIC (PER SECOND) - HTTP REQUESTS: 500,000+ - BANNERS: 25,000+ 7 DATA CENTERS SET UP AND MANAGED IN-HOUSE AVAILABILITY > 99.95%
  • 8. 8Copyright © 2013 Criteo. Confidential. HIGH PERFORMANCE COMPUTING FETCH, STORE, CRUNCH, QUERY 20 additional TB EVERY DAY ? …SUBTITLED « HOW I LEARNED TO STOP WORRYING AND LOVE HPC »
  • 9. PRODUCT CATALOGUES • Catalogue = product feed provided by advertisers (product id, description, category, price, URL, etc) • 3000+ catalogues, ranging from a few MB to several tens of GB • About 50% of products change every day • Imported at least once a day by an in-house application • Data replicated within a geographical zone • Accessed through a cache layer by web servers • Microsoft SQL Server used from day 1 • Running fine in Europe, but… – Number of databases (1 per advertiser)… and servers – Size of databases – SQL Server issues hard to debug and understand • Running kind of fine in the US, until dead end in Q1 2011 – transactional replication over high latency links Copyright © 2010 Criteo. Confidential.
  • 10. REQUIREMENTS FOR A NEW DB • Scale-out architecture running on commodity hardware (aka « Intel CPUs in metal boxes ») • No transactions needed, eventual consistency OK • High availability • Distributed clusters, with replication over high latency links • Requestable (key-value not enough) • Open source … with active user community … backed by a stable organization with long-term commitment (not one guy in a garage) … no licence fees for production use … commercial support available at reasonable cost • Easy to learn, (re)deploy, monitor and upgrade • « Low maintenance » (don’t need a 10-people team just to run it) • Multi-language support • Ability to export everything to Hadoop multiple times per day Copyright © 2010 Criteo. Confidential.
  • 11. FROM SQL SERVER TO MONGODB • Ah, database migrations… everyone loves them  • 1st step: solve replication issue – Import and replicate catalogues in MongoDB – Push content to SQL Server, still queried by web servers • 2nd step: prove that MongoDB can survive our web traffic – Modify web applications to query MongoDB – C-a-r-e-f-u-l-l-y switch web queries to MongoDB for a small set of catalogues – Observe, measure, A/B test… and generally make sure that the system still works • 3rd step: scale ! – Migrate thousands of catalogues away from SQL Server – Monitor and tweak the MongoDB clusters – Add more MongoDB servers… and more shards – Update ops processes (monitoring, backups, etc) Copyright © 2010 Criteo. Confidential.
  • 12. OUR MONGODB DEPLOYMENT • Europe – 18 3-server shards (1+1+1) – 800M products, 1TB – 1B requests/day (peak at 40K/s) – 350M updates/day (peak at 11K/s) • US – 14 4-server shards (2+2) – 400M products, 650GB • APAC – 12 3-server shards (2+1) – 300M products, 500GB • 146 servers total : 2.0 (+ Criteo patches)  2.2  2.4.3 Copyright © 2010 Criteo. Confidential.
  • 13. MONGODB, 2+ YEARS LATER • Stable (2.4.3 much better) • Easy to (re)install and administer • Great for small datasets (i.e. smaller than server RAM) • Good performance if read/write ratio is high • Failover and inter-DC replication work (but shard early!) • Performance suffers when : – dataset much larger than RAM – read/write ratio is low – Multiple applications coexist on the same cluster • Some scalability issues remain (master-slave, connections) • Criteo is very interested in the 10gen roadmap  Copyright © 2010 Criteo. Confidential.
  • 14. THANKS A LOT FOR YOUR ATTENTION! 14Copyright © 2013 Criteo. Confidential. www.criteo.com engineering.criteo.com

Hinweis der Redaktion

  1. http://www.istockphoto.com/stock-photo-14605073-world-map.php?st=71d31fc
  2. http://www.shutterstock.com/pic.mhtml?id=50969467http://www.shutterstock.com/pic.mhtml?id=67739992
  3. http://www.istockphoto.com/stock-photo-15438323-taxes-due.phphttp://www.shutterstock.com/pic.mhtml?id=106229987http://www.istockphoto.com/stock-photo-11925038-nerdy-office-worker-drinking-coffee.php