When you're building a web application, you want to respond to every request as quickly as possible. The usual approach is to move slow tasks outside the request/response cycle into a separate 'worker' process, using an asynchronous job queue such as Sidekiq, Resque, Celery, RQ, or one of a number of other frameworks. Unfortunately, many of these frameworks either require the deployment of Redis, RabbitMQ, or some other message broker, or they resort to polling a database for new work. Chapman is a distributed task queue built on MongoDB that avoids gratuitous polling by using tailable cursors with the oplog to provide notifications of incoming work. Inspired by Celery, Chapman also supports task graphs, in which multiple tasks that depend on each other are executed asynchronously by the system. Come learn how Synapp.io is using MongoDB and Chapman to handle its core data processing needs.
11. @rick446 @synappio
What You’ll Learn
How to…
Build a task queue in MongoDB
Bring consistency to distributed systems (without transactions)
Build low-latency reactive systems
Approach 2
Worker Identity
Good
• Worker failure handled via out-of-band detection of live workers
• Handles slow workers
Bad
• Requires periodic “unlock” task
• Unlock updates can be slow
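The trade-off on this slide can be sketched in memory. This is an illustration only, not Chapman's code: the message fields (`state`, `worker`) and the helpers `reserve` and `unlock_dead_workers` are hypothetical names, and a real implementation would issue the equivalent updates against MongoDB.

```python
def reserve(messages, worker_id):
    """Reserve the first ready message for worker_id, or return None."""
    for msg in messages:
        if msg['state'] == 'ready':
            msg['state'] = 'busy'
            msg['worker'] = worker_id   # record who holds the message
            return msg
    return None


def unlock_dead_workers(messages, live_workers):
    """Periodic "unlock" task: release messages held by workers that are gone.

    live_workers is assumed to come from an out-of-band liveness check.
    Returns the number of messages released.
    """
    released = 0
    for msg in messages:
        if msg['state'] == 'busy' and msg['worker'] not in live_workers:
            msg['state'] = 'ready'
            msg['worker'] = None
            released += 1
    return released


messages = [{'_id': i, 'state': 'ready', 'worker': None} for i in range(3)]
reserve(messages, 'worker-a')
reserve(messages, 'worker-b')
# worker-a dies; only worker-b is still reported as alive
released = unlock_dead_workers(messages, live_workers={'worker-b'})
```

Because reservations are tied to a worker rather than to a timeout, a slow but live worker keeps its message. The cost is the extra sweep over busy messages, which is the periodic "unlock" update the slide lists under "Bad".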
Semaphores
• Some services perform connection-throttling (e.g. Mailchimp)
• Some services just have a hard time with 144 threads hitting them simultaneously
• Need a way to limit our concurrency
Semaphores
Semaphore
Active: msg1, msg2, msg3, …
Capacity: 16
Queued: msg17, msg18, msg19, …
• Keep active and queued messages in arrays
• Releasing the semaphore makes queued messages available for dispatch
• Use $slice (MongoDB 2.6) to keep arrays the right size
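The active/queued bookkeeping above can be sketched as a small in-memory class. This is a hypothetical stand-in for the semaphore document, not Chapman's implementation: the class name and its methods are assumptions, and it models neither the atomic MongoDB updates nor the `$slice` trimming.

```python
class Semaphore:
    """In-memory stand-in for a semaphore document with active/queued arrays."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.active = []   # message ids currently holding the semaphore
        self.queued = []   # message ids waiting for a free slot

    def acquire(self, msg_id):
        """Return True if msg_id may run now, False if it was queued."""
        if len(self.active) < self.capacity:
            self.active.append(msg_id)
            return True
        self.queued.append(msg_id)
        return False

    def release(self, msg_id):
        """Free msg_id's slot; return the message ids now ready for dispatch."""
        self.active.remove(msg_id)
        ready = []
        while self.queued and len(self.active) < self.capacity:
            nxt = self.queued.pop(0)    # oldest queued message goes first
            self.active.append(nxt)
            ready.append(nxt)
        return ready


sem = Semaphore(capacity=2)
results = [sem.acquire(m) for m in ('msg1', 'msg2', 'msg3')]
ready = sem.release('msg1')
```

With a capacity of 2, the third message is queued rather than run, and releasing `msg1` promotes it to the active array.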
Managing Latency
• Reserving messages is expensive
• Use Pub/Sub system instead
• Publish to the channel whenever a message is ready to be handled
• Each worker subscribes to the channel
• Workers only ‘poll’ when they have a chance of getting work
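The pattern in these bullets can be sketched in-process with threads and queues. Everything here is an assumption made for illustration (the `Channel` class, the `worker` function, and the `pending` list standing in for the task store); the talk builds its channel on MongoDB instead.

```python
import queue
import threading


class Channel:
    """Tiny in-process pub/sub channel: every subscriber gets every message."""

    def __init__(self):
        self._subscribers = []
        self._lock = threading.Lock()

    def subscribe(self):
        inbox = queue.Queue()
        with self._lock:
            self._subscribers.append(inbox)
        return inbox

    def publish(self, note):
        with self._lock:
            for inbox in self._subscribers:
                inbox.put(note)


def worker(inbox, pending, lock, handled):
    """Block on the channel; attempt a reservation only after a notification."""
    while True:
        note = inbox.get()       # sleeps here instead of polling the task store
        if note is None:         # sentinel: shut down
            return
        with lock:               # the (expensive) reservation attempt
            if pending:
                handled.append(pending.pop(0))


channel = Channel()
pending, handled, lock = [], [], threading.Lock()
threads = [
    threading.Thread(target=worker,
                     args=(channel.subscribe(), pending, lock, handled))
    for _ in range(2)
]
for t in threads:
    t.start()
with lock:
    pending.append('task-1')
channel.publish('ready')         # wake the workers: there is work to reserve
channel.publish(None)            # then tell them to stop
for t in threads:
    t.join()
```

Both workers wake on the notification, but only one wins the reservation; the other finds nothing and goes back to waiting.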
Getting a Tailable Cursor
    import re, time

    # Note: these are PyMongo 2.x keyword arguments; PyMongo 3+ uses
    # cursor_type=CursorType.TAILABLE_AWAIT instead.
    def get_cursor(collection, topic_re, await_data=True):
        options = {'tailable': True}       # make cursor tailable
        if await_data:
            options['await_data'] = True   # holds open cursor for a while
        cur = collection.find({'k': topic_re}, **options)
        cur = cur.hint([('$natural', 1)])  # ensure we don't use any indexes
        return cur

    while True:
        cur = get_cursor(
            db.capped_collection,
            re.compile('^foo'),
            await_data=True)
        for msg in cur:
            do_something(msg)
        # still some polling when no producer, so don't spin too fast
        time.sleep(0.1)
Building in retry...
    def get_cursor(collection, topic_re, last_id=-1, await_data=True):
        options = {'tailable': True}
        spec = {
            'id': {'$gt': last_id},  # only new messages; "id" is an integer autoincrement
            'k': topic_re}
        if await_data:
            options['await_data'] = True
        cur = collection.find(spec, **options)
        cur = cur.hint([('$natural', 1)])  # ensure we don't use any indexes
        return cur
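The role of `last_id` when a cursor dies and has to be reopened can be sketched without a database. In this illustration a plain list stands in for the capped collection, and `find_new` and `consume` are hypothetical helpers, not functions from the talk.

```python
def find_new(messages, prefix, last_id=-1):
    """Stand-in for get_cursor(): yield messages newer than last_id whose
    key starts with prefix, in insertion ($natural) order."""
    for msg in messages:
        if msg['id'] > last_id and msg['k'].startswith(prefix):
            yield msg


def consume(messages, prefix, last_id=-1):
    """Drain one cursor and return (handled ids, last id seen)."""
    handled = []
    for msg in find_new(messages, prefix, last_id):
        handled.append(msg['id'])
        last_id = msg['id']      # remember progress for the next cursor
    return handled, last_id


log = [{'id': 0, 'k': 'foo.a'}, {'id': 1, 'k': 'bar.a'}, {'id': 2, 'k': 'foo.b'}]
first, last_id = consume(log, 'foo')
log.append({'id': 3, 'k': 'foo.c'})              # arrives after the cursor died
second, last_id = consume(log, 'foo', last_id)   # retry resumes after id 2
```

Passing the last seen id into the new query means the retry neither replays messages that were already handled nor misses ones that arrived while no cursor was open.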
Ludicrous Speed
    from pymongo.cursor import _QUERY_OPTIONS

    def get_cursor(collection, topic_re, last_id=-1, await_data=True):
        options = {'tailable': True}
        spec = {
            'ts': {'$gt': last_id},  # only new messages (id ==> ts)
            'k': topic_re}
        if await_data:
            options['await_data'] = True
        cur = collection.find(spec, **options)
        cur = cur.hint([('$natural', 1)])  # ensure we don't use any indexes
        if await_data:
            # co-opt the oplog_replay option
            cur = cur.add_option(_QUERY_OPTIONS['oplog_replay'])
        return cur
The Oplog
• Capped collection that records all operations for replication
• Includes a ‘ts’ field suitable for oplog_replay
• Does not require a separate publish operation (all changes are automatically “published”)
Using the Oplog
    import bson
    from pymongo.cursor import _QUERY_OPTIONS

    def oplog_await(oplog, spec):
        '''Await the very next message on the oplog satisfying the spec'''
        # most recent oplog entry
        last = oplog.find_one(spec, sort=[('$natural', -1)])
        if last is None:
            return  # Can't await unless there is an existing message satisfying spec
        await_spec = dict(spec)
        last_ts = last['ts']
        # finds most recent plus following entries
        await_spec['ts'] = {'$gt': bson.Timestamp(last_ts.time, last_ts.inc - 1)}
        curs = oplog.find(await_spec, tailable=True, await_data=True)
        curs = curs.hint([('$natural', 1)])
        curs = curs.add_option(_QUERY_OPTIONS['oplog_replay'])
        curs.next()  # skip most recent; should always find 1 element
        try:
            return curs.next()  # return on anything new
        except StopIteration:
            return None
What We’ve Learned
How to…
Build a task queue in MongoDB
Bring consistency to distributed systems (without transactions)
Build low-latency reactive systems
Tips
• findAndModify is ideal for queues
• Atomic update + compensation brings consistency to your distributed system
• Use the oplog to build reactive, low-latency systems