QUEUE IN THE CLOUD WITH MONGODB
MONGODB LA 2013
NURI HALPERIN
QUEUE USAGE
Ordered execution
Buffering consumer/producer
Work distribution
GOALS OF PROJECT
Leverage Mongo
  • Reduce ops overhead by reusing infrastructure
  • Map queue semantics to Mongo’s strengths

Reliable
  • Durable - support long running process
  • Resilient to machine failure
  • Narrow the window of failure / data loss

Centralized, distributed:
  • Multiple producers
  • Multiple consumers
ITERATION 0
Capped collection – not the perfect choice
  • Tailing queue seems attractive, but…
  • Need external sync to avoid double-consume
  • Secondary indexes and in-place updates are anti-patterns on capped collections

Relaxing FIFO is OK
  • No guarantee that first-popped is first done
  • The benefit of multiple clients is negated if they have to sync on execution order
  • A race condition on queue insertion has the same effect

Conclusion: The project doesn’t use a capped collection and relaxes FIFO.
PARANOID BY DESIGN
 Network dies    Process dies    DB dies
 Machine dies    Poison letter   Dead letter
ITERATION 1
db.q4foo.save({v: {f: 1}})  // put

db.q4foo.findAndModify({query: {}, sort: {_id: 1}, remove: true})  // get (destructive)

Hot: Quick and simple.
Not: If the client dies or the item is lost in transit, the item is gone without a trace.
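
The data-loss window of a destructive pop can be shown without a database. This is a minimal in-memory sketch, not the talk's code: the array, pop() and consumeOnce() are hypothetical stand-ins for the collection and for findAndModify with remove: true.

```javascript
// In-memory stand-in for the q4foo collection (illustration only).
const queue = [{ _id: 1, v: { f: 1 } }];

// Destructive pop, standing in for findAndModify with remove: true.
function pop() {
  return queue.length > 0 ? queue.shift() : null;
}

// Take one item and hand its payload to the handler.
function consumeOnce(handler) {
  const item = pop(); // the item has already left the queue here
  if (item !== null) handler(item.v);
}

try {
  consumeOnce(() => {
    throw new Error("consumer crashed");
  });
} catch (err) {
  // The handler failed, yet nothing is left in the queue to retry.
}

console.log(queue.length); // 0
```

The item is removed before the work is done, so any failure after the pop loses it.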
ARE WE THERE YET?
 Network dies    Process dies    DB dies
 Machine dies    Poison letter   Dead letter
QUEUE SEMANTICS
Local / Memory    Distributed
Push              Put
Pop               Get << visibility >>
<< exception >>   Release << retry >>
                  Delete
                  << exception >>
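
The distributed column can be modeled in a few lines. This is an in-memory sketch under stated assumptions: the class name SketchQueue, the hiddenUntil field and the explicit `now` argument are all hypothetical, and the model makes an item visible again once its timeout passes, which is one reading of << visibility >>.

```javascript
// In-memory model of the distributed queue semantics (illustration only).
class SketchQueue {
  constructor(visibilityMs) {
    this.visibilityMs = visibilityMs;
    this.items = []; // insertion order stands in for sorting by _id
    this.nextId = 1;
  }

  // Put: add an item that is immediately visible.
  put(value) {
    const item = { id: this.nextId++, value: value, hiddenUntil: null };
    this.items.push(item);
    return item.id;
  }

  // Get: hand out the oldest visible item and hide it for visibilityMs.
  get(now) {
    const item = this.items.find(
      (i) => i.hiddenUntil === null || i.hiddenUntil <= now
    );
    if (!item) return null;
    item.hiddenUntil = now + this.visibilityMs;
    return { id: item.id, value: item.value };
  }

  // Release: make the item visible again so it can be retried.
  release(id) {
    const item = this.items.find((i) => i.id === id);
    if (item) item.hiddenUntil = null;
  }

  // Delete: remove the item once processing has succeeded.
  delete(id) {
    this.items = this.items.filter((i) => i.id !== id);
  }
}
```

Unlike a local Pop, Get does not remove anything; only Delete does, which is what keeps an item recoverable when a consumer dies.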
ITERATION 2
db.q4foo.save({v: {f: 1}, dq: null})  // put

db.q4foo.findAndModify({
        query: {dq: null},
        sort: {_id: 1},
        update: {$set: {dq: later(60)}}})  // get: set dq to mark the item as taken

… If processing succeeded => delete the item.

Hot: If the client dies, the item remains in the queue. Data is not lost.
Not: The index on _id is less useful at high volume.
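
The slide does not define later(60). A minimal sketch, assuming it returns a Date that many seconds in the future; the optional `now` parameter is added here only to make the helper testable. visibleFilter() is likewise hypothetical: it shows one way a consumer could also pick up items whose dq has already passed, which the dq: null query on the slide would not match.

```javascript
// Hypothetical helper: a Date `seconds` in the future.
function later(seconds, now = new Date()) {
  return new Date(now.getTime() + seconds * 1000);
}

// Hypothetical filter: never-taken items plus items whose dq has passed.
function visibleFilter(now = new Date()) {
  return { $or: [{ dq: null }, { dq: { $lt: now } }] };
}

// Possible shell usage (not run here):
// db.q4foo.findAndModify({
//   query: visibleFilter(),
//   sort: { _id: 1 },
//   update: { $set: { dq: later(60) } }
// })
```

With the expiry clause, an item held by a dead consumer becomes available again without anyone calling release.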
ARE WE THERE YET?
 Network dies    Process dies    DB dies
 Machine dies    Poison letter   Dead letter
ITERATION 3
db.q4foo.save({v: {f: 1}, dq: null, pc: 0})  // put

db.q4foo.findAndModify({
        query: {dq: null, pc: {$lt: 3}},
        sort: {_id: 1},
        update: {$set: {dq: later(60)}, $inc: {pc: 1}}})  // consume

db.q4foo.findAndModify({
        query: {_id: "..."},
        update: {$set: {dq: null}}})  // release

Hot: A released item is retried automatically until its counter (pc) reaches the limit.
         An exhausted item remains in the queue.
Not: Not strict FIFO.
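
On the client side, consume, release and delete combine into one worker step. This is a sketch, not the talk's client: processOne() and the `queue` object with get(), release(id) and remove(id) are hypothetical wrappers around the shell commands above, and the item fields _id and v follow the document shape on the slide.

```javascript
// Hypothetical worker step. `queue` is any object exposing get(),
// release(id) and remove(id).
function processOne(queue, handler) {
  const item = queue.get(); // consume: marks the item taken, increments pc
  if (item === null) return "empty";
  try {
    handler(item.v);
    queue.remove(item._id); // success: delete the item for good
    return "done";
  } catch (err) {
    queue.release(item._id); // failure: make it visible for a retry
    return "released";
  }
}
```

Because pc is incremented on every consume, a poison item that always throws is released at most until the limit is reached and then stays in the collection for inspection.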
ARE WE THERE? YES.
 Network dies    Process dies    DB dies
 Machine dies    Poison letter   Dead letter
ITERATION 4

Ensure your queue writes use applicable durability
  • db.q4foo.save() + getLastError(…)
  • db.q4foo.findAndModify() + getLastError(…)

Replica sets for durability only. No capacity or speed gain.
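
getLastError(…) is how writes were confirmed when this talk was given. Later MongoDB versions accept a writeConcern document on the write itself, so the same intent might look like the sketch below. The helper withDurability() and the chosen values (w: "majority", j: true) are assumptions for illustration, not settings from the talk.

```javascript
// Hypothetical helper: returns a copy of a findAndModify command document
// with an explicit write concern attached.
function withDurability(command, w = "majority", journal = true) {
  return { ...command, writeConcern: { w: w, j: journal } };
}

// The Iteration 3 consume command, with the dq value computed inline.
const consume = withDurability({
  query: { dq: null, pc: { $lt: 3 } },
  sort: { _id: 1 },
  update: { $set: { dq: new Date(Date.now() + 60000) }, $inc: { pc: 1 } }
});

// Possible shell usage (not run here):
// db.q4foo.findAndModify(consume)
```

Waiting for a majority of replica set members costs latency on every queue operation, which fits the point that replicas add durability rather than speed.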
OTHER THOUGHTS
Create admin jobs to monitor queues:
   • Growth
   • Retries exhausted
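
Both checks reduce to counting documents that match a filter. A sketch assuming the Iteration 3 document shape (dq, pc) and its limit of 3; the names MAX_ATTEMPTS, pendingFilter and exhaustedFilter are hypothetical.

```javascript
// Hypothetical monitoring filters for the Iteration 3 document shape.
const MAX_ATTEMPTS = 3; // mirrors pc: {$lt: 3} in the consume query

// Items currently waiting to be consumed.
const pendingFilter = { dq: null, pc: { $lt: MAX_ATTEMPTS } };

// Items released after their last allowed attempt; the consume query
// will never pick these up again.
const exhaustedFilter = { dq: null, pc: { $gte: MAX_ATTEMPTS } };

// Possible shell usage (not run here):
// db.q4foo.count(pendingFilter)    // queue depth; track over time for growth
// db.q4foo.count(exhaustedFilter)  // items that need manual attention
```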

Consider TTL risks (e.g., client failure before calling Release())


Consider idempotent operations when possible


Design clients to back off polling


Separate queue per topic vs. one queue with an extra “topic” field


Consider a dedicated DB for the queue, to limit the scope of the write lock


Capped vs. regular collection – capped collections can now have an _id index and allow in-place updates.
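
Backing off polling can be as simple as an exponential delay with a cap. A minimal sketch; the function name and the default values (100 ms base, 30 s cap) are assumptions, not numbers from the talk.

```javascript
// Hypothetical polling delay: doubles after each empty poll, capped at maxMs.
function backoffDelayMs(emptyPolls, baseMs = 100, maxMs = 30000) {
  return Math.min(maxMs, baseMs * 2 ** emptyPolls);
}
```

A client would reset emptyPolls to 0 whenever a poll returns an item, so a busy queue is drained quickly while an idle one is polled rarely.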
Q&A
Thank you!
Nuri Halperin
nuri@plusnconsulting.com

Editor's Notes

  1. Queue holds elements until they are required. Items in the queue are accessed from the head of the queue only – implied order.
  2. Add capped collection