Chapman: Building a Distributed Job Queue in MongoDB
Rick Copeland	

@rick446 @synappio	

rick@synapp.io

Getting to Know One Another
Rick

What You’ll Learn
How to…
• Build a task queue in MongoDB
• Bring consistency to distributed systems (without transactions)
• Build low-latency reactive systems

Why a Queue?
• Long-running tasks (or anything longer than a web request can wait)
• Farm out chunks of work for performance

Things I Worry About
• Priority
• Latency
• Unreliable workers

Queue Options
• SQS? No priority
• Redis? Can’t overflow memory (the queue has to fit in RAM)
• RabbitMQ? Lack of visibility
• ZeroMQ? Lack of persistence
• What about MongoDB?

Chapman
Graham Arthur Chapman
8 January 1941 – 4 October 1989

Roadmap
• Building a scheduled priority queue
• Handling unreliable workers
• Shared resources
• Managing latency

Building a Scheduled Priority Queue

Step 1: Simple Queue
db.message.insert({
    "_id" : NumberLong("3784707300388732067"),
    "data" : BinData(...),
    "s" : {
        "status" : "ready",
        "ts_enqueue" : ISODate("2015-03-02T15:27:29.228Z")
    }
});

// FIFO: index on status plus enqueue time
// (createIndex replaces ensureIndex in MongoDB 3.0+ shells)
db.message.createIndex({'s.status': 1, 's.ts_enqueue': 1});

// Get the earliest message for processing
db.runCommand(
    {
        findAndModify: "message",
        query: { 's.status': 'ready' },
        sort: {'s.ts_enqueue': 1},
        update: { '$set': {'s.status': 'reserved'} }
    }
);

Step 1: Simple Queue
Good
• Guaranteed FIFO
Bad
• No priority (other than FIFO)
• No handling of worker problems

Step 2: Scheduled Messages
db.message.insert({
    "_id" : NumberLong("3784707300388732067"),
    "data" : BinData(...),
    "s" : {
        "status" : "ready",
        "ts_after" : ISODate(…),  // minimum valid time: do not run before this
        "ts_enqueue" : ISODate("2015-03-02T15:27:29.228Z")
    }
});

db.message.createIndex(
    {'s.status': 1, 's.ts_enqueue': 1});

// Get the earliest message whose ts_after has passed;
// `now` is the current time as seen by the caller
db.runCommand(
    {
        findAndModify: "message",
        query: { 's.status': 'ready', 's.ts_after': {$lt: now }},
        sort: {'s.ts_enqueue': 1},
        update: { '$set': {'s.status': 'reserved'} }
    }
);

Step 2: Scheduled Messages
Good
• Easy to build periodic tasks
Bad
• Be careful with the word “now” (each client reads its own clock)
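
One way to stay careful with “now” is to read the clock once and pass that value in, so every comparison in a reservation uses the same instant. The sketch below is an in-memory Python model of the Step 2 query, not MongoDB code; `reserve_scheduled` and the list-of-dicts store are assumptions made for illustration.

```python
from datetime import datetime, timedelta

def reserve_scheduled(messages, now):
    """Reserve the earliest-enqueued ready message whose ts_after has passed.

    `messages` is a plain list of dicts standing in for the collection;
    `now` is supplied by the caller so all comparisons share one clock reading.
    """
    eligible = [m for m in messages
                if m["status"] == "ready" and m["ts_after"] < now]
    if not eligible:
        return None
    msg = min(eligible, key=lambda m: m["ts_enqueue"])  # FIFO among eligible
    msg["status"] = "reserved"
    return msg

t0 = datetime(2015, 3, 2, 15, 0, 0)
queue = [
    # Enqueued first, but scheduled to run ten minutes later.
    {"_id": 1, "status": "ready", "ts_enqueue": t0,
     "ts_after": t0 + timedelta(minutes=10)},
    # Enqueued second, runnable right away.
    {"_id": 2, "status": "ready", "ts_enqueue": t0 + timedelta(seconds=1),
     "ts_after": t0},
]

first = reserve_scheduled(queue, now=t0 + timedelta(seconds=5))
second = reserve_scheduled(queue, now=t0 + timedelta(seconds=5))
later = reserve_scheduled(queue, now=t0 + timedelta(minutes=11))
```

Here the message enqueued first is skipped until its `ts_after` has passed, which is the behavior the scheduled query is after.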

Step 3: Priority
db.message.insert({
    "_id" : NumberLong("3784707300388732067"),
    "data" : BinData(...),
    "s" : {
        "status" : "ready",
        "pri": 30128,  // add priority
        "ts_enqueue" : ISODate("2015-03-02T15:27:29.228Z")
    }
});

db.message.createIndex({'s.status': 1, 's.pri': -1, 's.ts_enqueue': 1});

db.runCommand(
    {
        findAndModify: "message",
        query: { 's.status': 'ready' },
        // highest priority first, FIFO within a priority
        sort: {'s.pri': -1, 's.ts_enqueue': 1},
        update: { '$set': {'s.status': 'reserved'} }
    }
);

Step 3: Priority
Good
• Priorities are handled
• Guaranteed FIFO within a priority
Bad
• No handling of worker problems
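
The ordering that the Step 3 sort produces can be checked without a database. This is a minimal in-memory sketch, assuming a list of dicts in place of the collection; `reserve_next` is a hypothetical helper, and the tuple key `(-pri, ts_enqueue)` plays the role of the sort on `s.pri` descending, then `s.ts_enqueue` ascending.

```python
def reserve_next(messages):
    """Pick the ready message with the highest priority, oldest first within a priority."""
    ready = [m for m in messages if m["status"] == "ready"]
    if not ready:
        return None
    # Negating pri makes min() prefer larger priorities; ts_enqueue breaks ties FIFO.
    msg = min(ready, key=lambda m: (-m["pri"], m["ts_enqueue"]))
    msg["status"] = "reserved"
    return msg

queue = [
    {"_id": "a", "status": "ready", "pri": 10, "ts_enqueue": 1},
    {"_id": "b", "status": "ready", "pri": 20, "ts_enqueue": 3},
    {"_id": "c", "status": "ready", "pri": 20, "ts_enqueue": 2},
]

order = []
while True:
    msg = reserve_next(queue)
    if msg is None:
        break
    order.append(msg["_id"])
```

The two priority-20 messages come out before the priority-10 one, and between them the one enqueued earlier goes first.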

Handling Unreliable Workers

Approach 1: Timeouts
db.message.insert({
    "_id" : NumberLong("3784707300388732067"),
    "data" : BinData(...),
    "s" : {
        "status" : "ready",
        "pri": 30128,
        "ts_enqueue" : ISODate("2015-03-02T15:27:29.228Z"),
        "ts_timeout" : ISODate("2025-01-01T00:00:00.000Z")  // far-future placeholder
    }
});

db.message.createIndex({"s.status": 1, "s.ts_timeout": 1})

Approach 1: Timeouts
// Reserve message; the client sets the timeout.
// `now` is a Date; `processing_time` is assumed to be in milliseconds.
db.runCommand(
    {
        findAndModify: "message",
        query: { 's.status': 'ready' },
        sort: {'s.pri': -1, 's.ts_enqueue': 1},
        update: { '$set': {
            's.status': 'reserved',
            's.ts_timeout': new Date(now.getTime() + processing_time) } }
    }
);

// Timeout message ("unlock")
db.message.update(
    {'s.status': 'reserved', 's.ts_timeout': {'$lt': now}},
    {'$set': {'s.status': 'ready'}},
    {'multi': true});

Approach 1: Timeouts
Good
• Worker failure handled via timeout
Bad
• Requires periodic “unlock” task
• Slow (but “live”) workers can cause spurious timeouts
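
The timeout approach boils down to two operations: stamp a deadline when reserving, and periodically sweep for reservations whose deadline has passed. Below is a small in-memory sketch of that pair, assuming plain numbers for timestamps and a list of dicts for the collection; the function names are illustrative and not part of Chapman.

```python
def reserve(messages, now, processing_time):
    """Reserve the best ready message and stamp it with a client-chosen timeout."""
    ready = [m for m in messages if m["status"] == "ready"]
    if not ready:
        return None
    msg = min(ready, key=lambda m: (-m["pri"], m["ts_enqueue"]))
    msg["status"] = "reserved"
    msg["ts_timeout"] = now + processing_time
    return msg

def unlock_timed_out(messages, now):
    """The periodic "unlock" task: return expired reservations to the queue."""
    unlocked = 0
    for m in messages:
        if m["status"] == "reserved" and m["ts_timeout"] < now:
            m["status"] = "ready"
            unlocked += 1
    return unlocked

FAR_FUTURE = float("inf")  # placeholder timeout for messages not yet reserved
queue = [{"_id": 1, "status": "ready", "pri": 5,
          "ts_enqueue": 0, "ts_timeout": FAR_FUTURE}]

reserved = reserve(queue, now=100, processing_time=30)  # worker takes it, then dies
still_locked = unlock_timed_out(queue, now=120)         # before the deadline
recovered = unlock_timed_out(queue, now=131)            # after the deadline
```

The same sweep would also unlock a worker that is merely slow, which is the spurious-timeout problem listed above.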

Approach 2: Worker Identity
db.message.insert({
    "_id" : NumberLong("3784707300388732067"),
    "data" : BinData(...),
    "s" : {
        "status" : "ready",
        "pri": 30128,
        "cli": "--------------------------",  // client / worker placeholder
        "ts_enqueue" : ISODate("2015-03-02T..."),
        "ts_timeout" : ISODate("2025-...")
    }
});

Approach 2: Worker Identity
// Reserve message, marking the worker who reserved it.
// `now` is a Date; `processing_time` is assumed to be in milliseconds.
db.runCommand(
    {
        findAndModify: "message",
        query: { 's.status': 'ready' },
        sort: {'s.pri': -1, 's.ts_enqueue': 1},
        update: { '$set': {
            's.status': 'reserved',
            's.cli': 'client_name:pid',
            's.ts_timeout': new Date(now.getTime() + processing_time) } }
    }
);

// Unlock “dead” client messages: messages reserved by workers
// that are not in the list of active clients are unlocked
db.message.update(
    {'s.status': 'reserved',
     's.cli': {'$nin': active_clients} },
    {'$set': {'s.status': 'ready'}},
    {'multi': true});

Approach 2: Worker Identity
Good
• Worker failure handled via out-of-band detection of live workers
• Handles slow workers
Bad
• Requires periodic “unlock” task
• Unlock updates can be slow
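
With worker identity, the unlock sweep no longer looks at the clock; it looks at who holds the reservation. A minimal in-memory sketch follows, assuming the set of live workers is obtained out of band (the slides do not say how) and using a hypothetical `unlock_dead_workers` helper.

```python
def unlock_dead_workers(messages, active_clients):
    """Return to 'ready' every message reserved by a worker that is no longer active."""
    unlocked = []
    for m in messages:
        if m["status"] == "reserved" and m["cli"] not in active_clients:
            m["status"] = "ready"
            unlocked.append(m["_id"])
    return unlocked

queue = [
    {"_id": 1, "status": "reserved", "cli": "host1:101"},  # slow but alive
    {"_id": 2, "status": "reserved", "cli": "host2:202"},  # worker has died
    {"_id": 3, "status": "ready", "cli": "-"},             # never reserved
]

unlocked = unlock_dead_workers(queue, active_clients={"host1:101"})
```

The slow worker on `host1:101` keeps its message however long it takes, which is the advantage over a fixed timeout.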

Shared Resources

Complex Tasks
[Diagram labels: Group, check_smtp, Analyze Results, Update Reports, Pipeline]

Semaphores
• Some services perform connection throttling (e.g. Mailchimp)
• Some services just have a hard time with 144 threads hitting them simultaneously
• Need a way to limit our concurrency

Semaphores
Semaphore
    Active: msg1, msg2, msg3, …
    Capacity: 16
    Queued: msg17, msg18, msg19, …
• Keep active and queued messages in arrays
• Releasing the semaphore makes queued messages available for dispatch
• Use $slice (MongoDB 2.6) to keep arrays the right size

Semaphores: Acquire
db.semaphore.insert({
    '_id': 'semaphore-name',
    'value': 16,
    'active': [],
    'queued': []})

def acquire(sem_id, msg_id, sem_size):
    # Pessimistic update: push onto both arrays; $slice caps 'active' at sem_size
    sem = db.semaphore.find_and_modify(
        {'_id': sem_id},
        update={'$push': {
            'active': {
                '$each': [msg_id],
                '$slice': sem_size},
            'queued': msg_id}},
        new=True)
    if msg_id in sem['active']:
        # Compensation: we got a slot, so leave the queue
        db.semaphore.update(
            {'_id': sem_id},
            {'$pull': {'queued': msg_id}})
        return True
    return False

Semaphores: Release
def release(sem_id, msg_id, sem_size):
    # Actually release
    sem = db.semaphore.find_and_modify(
        {'_id': sem_id},
        update={'$pull': {
            'active': msg_id,
            'queued': msg_id}},
        new=True)

    # Awaken queued message(s)
    while len(sem['active']) < sem_size and sem['queued']:
        wake_msg_ids = sem['queued'][:sem_size]
        updated = db.semaphore.find_and_modify(
            {'_id': sem_id},
            update={'$pullAll': {'queued': wake_msg_ids}},
            new=True)
        for msgid in wake_msg_ids:
            make_dispatchable(msgid)  # some magic (covered later)
        sem = updated
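
The acquire and release logic is easier to follow with the database taken out. The class below is an in-memory sketch of the same idea, assuming a single process and plain Python lists; truncating `active` to `capacity` stands in for `$push` with `$slice`. Unlike the release loop on the slide, this simplified version wakes only as many queued messages as there are free slots.

```python
class Semaphore:
    """In-memory model of a semaphore that tracks active and queued message ids."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.active = []
        self.queued = []

    def acquire(self, msg_id):
        # Pessimistic step: add to both lists, capping `active` at capacity.
        self.active = (self.active + [msg_id])[:self.capacity]
        self.queued.append(msg_id)
        if msg_id in self.active:
            # Compensation: we hold a slot, so we should not also be queued.
            self.queued = [m for m in self.queued if m != msg_id]
            return True
        return False

    def release(self, msg_id):
        # Drop the message from both lists, then wake queued messages.
        self.active = [m for m in self.active if m != msg_id]
        self.queued = [m for m in self.queued if m != msg_id]
        free = self.capacity - len(self.active)
        woken, self.queued = self.queued[:free], self.queued[free:]
        return woken  # the caller makes these dispatchable so they can retry

sem = Semaphore(capacity=2)
got = [sem.acquire("m1"), sem.acquire("m2"), sem.acquire("m3")]
queued_before = list(sem.queued)
woken = sem.release("m1")
retry = sem.acquire("m3")
```

A woken message does not own a slot yet; it goes back to the queue of ready work and has to call `acquire` again, as `m3` does here.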

Message States
[State diagram: ready, acquire, queued, busy]
• Reserve the message
• Acquire resources
• Process the message
• Release resources
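
The four states can be read as a small state machine. The table below is a sketch that uses only the status changes visible in the code on the slides (ready to acquire, acquire to busy or queued, queued back to ready); the event names are made up for illustration, and what happens after `busy` is left out because the slides do not show it.

```python
# (current state, event) -> next state; event names are hypothetical labels.
TRANSITIONS = {
    ("ready", "reserve"): "acquire",            # a worker reserves the message
    ("acquire", "all_acquired"): "busy",        # every semaphore was acquired
    ("acquire", "blocked"): "queued",           # a semaphore was full
    ("queued", "semaphore_released"): "ready",  # woken up to try again
}

def step(state, event):
    """Return the next state, or raise if the transition is not allowed."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError("illegal transition: %r from %r" % (event, state))

# A message that is blocked once and then succeeds on the second attempt.
path = ["ready"]
for event in ["reserve", "blocked", "semaphore_released", "reserve", "all_acquired"]:
    path.append(step(path[-1], event))
```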

Reserve a Message
msg = db.message.find_and_modify(
    {'s.status': 'ready'},
    # Prefer partially-acquired messages (highest sub_status first)
    sort=[('s.sub_status', -1), ('s.pri', -1), ('s.ts', 1)],
    update={'$set': {'s.w': worker, 's.status': 'acquire'}},
    new=True)

# Before:
message.s == {
    pri: 10,
    semaphores: ['foo'],   # required semaphores
    status: 'ready',
    sub_status: 0,         # number of semaphores acquired
    w: '----------',
    ...}

# After:
message.s == {
    pri: 10,
    semaphores: ['foo'],
    status: 'acquire',
    sub_status: 0,
    w: worker,
    ...}

Acquire Resources
def acquire_resources(msg):
    for i, sem_id in enumerate(msg['s']['semaphores']):
        if i < msg['s']['sub_status']:  # already acquired
            continue
        sem = db.semaphore.find_one({'_id': sem_id})
        if try_acquire_resource(sem_id, msg['_id'], sem['value']):
            # Save forward progress: i + 1 semaphores are now held
            db.message.update(
                {'_id': msg['_id']}, {'$set': {'s.sub_status': i + 1}})
        else:
            # Failure to acquire (already queued)
            return False
    # Resources acquired, message ready to be processed
    db.message.update(
        {'_id': msg['_id']}, {'$set': {'s.status': 'busy'}})
    return True

Acquire Resources
def try_acquire_resource(sem_id, msg_id, sem_size):
    '''Version 1 (race condition)'''
    # acquire() is the semaphore function from "Semaphores: Acquire"
    if acquire(sem_id, msg_id, sem_size):
        return True
    else:
        # Here be dragons!
        db.message.update(
            {'_id': msg_id},
            {'$set': {'s.status': 'queued'}})
        return False

Release Resources (v1): “magic”
def make_dispatchable(msg_id):
    '''Version 1 (race condition)'''
    db.message.update(
        {'_id': msg_id, 's.status': 'queued'},
        {'$set': {'s.status': 'ready'}})
But what if s.status == ‘acquire’?
That’s the dragon.

Release Resources (v2)
def make_dispatchable(msg_id):
    # Hey, something happened!
    res = db.message.update(
        {'_id': msg_id, 's.status': 'acquire'},
        {'$set': {'s.event': True}})
    if not res['updatedExisting']:
        db.message.update(
            {'_id': msg_id, 's.status': 'queued'},
            {'$set': {'s.status': 'ready'}})

Acquire Resources (v2)
def try_acquire_resource(sem_id, msg_id, sem_size):
    '''Version 2'''
    while True:
        # Nothing's happened yet!
        db.message.update(
            {'_id': msg_id}, {'$set': {'s.event': False}})
        # acquire() is the semaphore function from "Semaphores: Acquire"
        if acquire(sem_id, msg_id, sem_size):
            return True
        else:
            # Check if something happened
            res = db.message.update(
                {'_id': msg_id, 's.event': False},
                {'$set': {'s.status': 'queued'}})
            if not res['updatedExisting']:
                # Someone released this message; try again
                continue
            return False
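
The v2 pair works because both sides use conditional updates on the same document: the releaser only sets the event flag while the message is in `acquire`, and the acquirer only moves to `queued` if the flag is still clear. The single-process sketch below replays that handshake on plain dicts. `update_if` stands in for a conditional single-document update, and `race_hook` is a hypothetical test hook that lets a release land exactly in the race window.

```python
def update_if(doc, expected, changes):
    """Apply `changes` only when every field in `expected` matches; report whether it did."""
    if all(doc.get(k) == v for k, v in expected.items()):
        doc.update(changes)
        return True
    return False

def make_dispatchable(msg):
    # If the message is mid-acquire, just flag that something happened...
    if not update_if(msg, {"status": "acquire"}, {"event": True}):
        # ...otherwise wake it up if it is queued.
        update_if(msg, {"status": "queued"}, {"status": "ready"})

def try_acquire_resource(msg, acquire, race_hook=lambda: None):
    while True:
        msg["event"] = False          # nothing has happened yet
        if acquire():
            return True
        race_hook()                   # a release may land here
        if update_if(msg, {"event": False}, {"status": "queued"}):
            return False              # safely queued; a later release wakes us
        # The event flag was set in between: someone released, so try again.

# Case 1: a release arrives between the failed acquire and the queue step.
msg1 = {"status": "acquire", "event": False}
attempts = iter([False, True])
got_it = try_acquire_resource(msg1, lambda: next(attempts),
                              race_hook=lambda: make_dispatchable(msg1))

# Case 2: no release in the window, so the message is queued and woken later.
msg2 = {"status": "acquire", "event": False}
queued_result = try_acquire_resource(msg2, lambda: False)
status_after_queue = msg2["status"]
make_dispatchable(msg2)
```

In case 1 the v1 code would have parked the message in `queued` with nobody left to wake it; here the flag forces a retry instead.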

One More Race…
def release(sem_id, msg_id, sem_size):
    sem = db.semaphore.find_and_modify(
        {'_id': sem_id},
        update={'$pull': {
            'active': msg_id,
            'queued': msg_id}},
        new=True)

    while len(sem['active']) < sem_size and sem['queued']:
        wake_msg_ids = sem['queued'][:sem_size]
        updated = db.semaphore.find_and_modify(
            {'_id': sem_id},
            update={'$pullAll': {'queued': wake_msg_ids}},
            new=True)
        # If the worker dies here, these ids have left 'queued'
        # but were never made dispatchable
        for msgid in wake_msg_ids:
            make_dispatchable(msgid)
        sem = updated

Compensate!
def fixup_queued_messages():
    for msg in db.message.find({'s.status': 'queued'}):
        sem_id = msg['s']['semaphores'][msg['s']['sub_status']]
        sem = db.semaphore.find_one(
            {'_id': sem_id, 'queued': msg['_id']})
        if sem is None:
            # Queued message that its semaphore does not know about:
            # make it ready again
            db.message.update(
                {'_id': msg['_id'],
                 's.status': 'queued',
                 's.sub_status': msg['s']['sub_status']},
                {'$set': {'s.status': 'ready'}})

Managing Latency
• Reserving messages is expensive
• Use a pub/sub system instead
• Publish to the channel whenever a message is ready to be handled
• Each worker subscribes to the channel
• Workers only ‘poll’ when they have a chance of getting work
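
The point of the pub/sub layer is that a worker touches the queue only after being told there may be work. The sketch below shows that shape in a single process, assuming a hypothetical `Channel` class with callback subscribers and a `deque` standing in for the message collection; it illustrates the pattern, not Chapman's implementation.

```python
from collections import deque

class Channel:
    """Minimal in-process pub/sub channel."""
    def __init__(self):
        self.subscribers = []

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def publish(self, event):
        for callback in self.subscribers:
            callback(event)

class Worker:
    def __init__(self, name, queue, channel):
        self.name = name
        self.queue = queue
        self.polls = 0
        self.handled = []
        channel.subscribe(self.on_event)

    def on_event(self, event):
        # Poll only because a publish said there may be work.
        self.polls += 1
        if self.queue:
            self.handled.append(self.queue.popleft())

queue = deque()
channel = Channel()
w1 = Worker("w1", queue, channel)
w2 = Worker("w2", queue, channel)

queue.append("job-1")     # producer enqueues a message...
channel.publish("ready")  # ...and announces it on the channel
```

Both workers poll once for the single announcement, one of them wins the job, and neither polls again until the next publish.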

Capped Collections
• Fixed size
• Fast inserts
• “Tailable” cursors
[Diagram labels: Capped Collection, Tailable Cursor]
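
The "fixed size" property behaves like a ring buffer: once the collection is full, each insert pushes out the oldest document. Python's `collections.deque` with `maxlen` gives a rough in-memory analogy (an analogy only; a real capped collection is sized in bytes and lives on the server).

```python
from collections import deque

capped = deque(maxlen=3)  # room for three documents
for i in range(5):
    capped.append({"id": i, "k": "foo.created"})

remaining_ids = [doc["id"] for doc in capped]  # oldest two were dropped
```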

Getting a Tailable Cursor
def get_cursor(collection, topic_re, await_data=True):
    options = { 'tailable': True }     # make cursor tailable
    if await_data:
        options['await_data'] = True   # holds open cursor for a while
    cur = collection.find(
        { 'k': topic_re },
        **options)
    cur = cur.hint([('$natural', 1)])  # ensure we don't use any indexes
    return cur

import re, time
while True:
    cur = get_cursor(
        db.capped_collection,
        re.compile('^foo'),
        await_data=True)
    for msg in cur:
        do_something(msg)
    # Still some polling when no producer, so don't spin too fast
    time.sleep(0.1)
@rick446 @synappio	
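To make the moving parts concrete without a MongoDB server, here is a small pure-Python stand-in, offered only as a sketch. `CappedCollection` and its `scan` method are hypothetical names, not PyMongo APIs. A `deque` with `maxlen` imitates a capped collection dropping its oldest documents, and `scan` imitates a natural-order read filtered by the topic regex on `k`.

```python
import re
from collections import deque


class CappedCollection:
    """Hypothetical in-memory stand-in for a capped collection (not a PyMongo class)."""

    def __init__(self, max_docs):
        # A bounded deque silently drops the oldest document when full,
        # much as a capped collection does once it reaches its size limit.
        self._docs = deque(maxlen=max_docs)

    def insert(self, doc):
        self._docs.append(doc)

    def scan(self, topic_re):
        # Walk documents in insertion ("natural") order, keeping topic matches.
        return [doc for doc in self._docs if topic_re.match(doc['k'])]


channel = CappedCollection(max_docs=3)
for i, topic in enumerate(['foo.a', 'bar.a', 'foo.b', 'foo.c']):
    channel.insert({'id': i, 'k': topic})

matches = channel.scan(re.compile('^foo'))
print([m['id'] for m in matches])
```

The first message has already been pushed out by the time the scan runs, which is why a consumer of a capped collection has to keep up with its producers.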

Building in retry...
def get_cursor(collection, topic_re, last_id=-1, await_data=True):
    options = {'tailable': True}
    spec = {
        # only new messages; "id" is an integer autoincrement
        'id': {'$gt': last_id},
        'k': topic_re}
    if await_data:
        options['await_data'] = True
    cur = collection.find(spec, **options)
    cur = cur.hint([('$natural', 1)])  # ensure we don't use any indexes
    return cur
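The retry idea is that the consumer remembers the last `id` it handled and passes it back in when it has to reopen the cursor. The sketch below shows that with a plain list standing in for the collection; `open_cursor` and `consume` are hypothetical helpers for illustration, not part of PyMongo or Chapman.

```python
import re


def open_cursor(messages, topic_re, last_id=-1):
    """Hypothetical stand-in for get_cursor: yield matching messages newer than last_id."""
    for msg in messages:
        if msg['id'] > last_id and topic_re.match(msg['k']):
            yield msg


def consume(messages, topic_re, last_id, handled):
    """Drain one cursor and return the id of the last message handled."""
    for msg in open_cursor(messages, topic_re, last_id):
        handled.append(msg['id'])
        last_id = msg['id']
    return last_id


messages = [{'id': 0, 'k': 'foo.a'}, {'id': 1, 'k': 'bar.a'}]
handled = []
topic = re.compile('^foo')

last_id = consume(messages, topic, -1, handled)       # first cursor sees id 0
messages.append({'id': 2, 'k': 'foo.b'})              # producer publishes later
last_id = consume(messages, topic, last_id, handled)  # reopened cursor skips id 0
print(handled, last_id)
```

Because the second cursor starts after the remembered `id`, the message with `id` 0 is handled once rather than twice.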

Ludicrous Speed
from pymongo.cursor import _QUERY_OPTIONS

def get_cursor(collection, topic_re, last_id=-1, await_data=True):
    options = {'tailable': True}
    spec = {
        # only new messages; id ==> ts
        'ts': {'$gt': last_id},
        'k': topic_re}
    if await_data:
        options['await_data'] = True
    cur = collection.find(spec, **options)
    cur = cur.hint([('$natural', 1)])  # ensure we don't use any indexes
    if await_data:
        # Co-opt the oplog_replay option
        cur = cur.add_option(_QUERY_OPTIONS['oplog_replay'])
    return cur

The Oplog
• Capped collection that records all operations for replication
• Includes a ‘ts’ field suitable for oplog_replay
• Does not require a separate publish operation (all changes are automatically “published”)

Using the Oplog
import bson
from pymongo.cursor import _QUERY_OPTIONS

def oplog_await(oplog, spec):
    '''Await the very next message on the oplog satisfying the spec'''
    # Most recent oplog entry matching the spec
    last = oplog.find_one(spec, sort=[('$natural', -1)])
    if last is None:
        return  # Can't await unless there is an existing message satisfying spec
    await_spec = dict(spec)
    last_ts = last['ts']
    # Finds the most recent entry plus any following entries
    await_spec['ts'] = {'$gt': bson.Timestamp(last_ts.time, last_ts.inc - 1)}
    curs = oplog.find(await_spec, tailable=True, await_data=True)
    curs = curs.hint([('$natural', 1)])
    curs = curs.add_option(_QUERY_OPTIONS['oplog_replay'])
    curs.next()  # skip the most recent entry; should always find 1 element
    try:
        return curs.next()  # return on anything new
    except StopIteration:
        return None
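The `$gt` bound in `oplog_await` relies on timestamps ordering by seconds first and by the increment second. As a rough illustration, the sketch below models timestamps as `(time, inc)` tuples (an assumption made here so the example runs without a server; real code would use `bson.Timestamp`) and shows why subtracting one from `inc` returns the last known entry plus everything after it. `entries_after` is a hypothetical helper.

```python
def entries_after(oplog, bound):
    """Return entries whose (time, inc) timestamp is strictly greater than bound."""
    return [entry for entry in oplog if entry['ts'] > bound]


# Timestamps modelled as (time, inc) tuples; Python compares tuples element-wise.
oplog = [
    {'ts': (100, 1), 'op': 'i'},
    {'ts': (100, 2), 'op': 'u'},   # most recent entry when we start waiting
]
last_ts = oplog[-1]['ts']
bound = (last_ts[0], last_ts[1] - 1)

first_pass = entries_after(oplog, bound)       # only the most recent entry
oplog.append({'ts': (101, 1), 'op': 'i'})      # a new operation arrives
second_pass = entries_after(oplog, bound)      # most recent entry plus the new one
new_entries = second_pass[1:]                  # skip the entry we already knew about
print(first_pass, new_entries)
```

Always getting at least one result back is what lets the real function skip one element unconditionally and then wait for the next.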

What We’ve Learned
How to…
Build a task queue in MongoDB
Bring consistency to distributed systems (without transactions)
Build low-latency reactive systems

Tips
• findAndModify is ideal for queues
• Atomic update + compensation brings consistency to your distributed system
• Use the oplog to build reactive, low-latency systems
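The first two tips can be sketched in plain Python, with an in-memory list and a lock standing in for MongoDB. This is only an illustration under stated assumptions: `claim_next` imitates what a findAndModify call provides (find the best ready job and mark it busy in one atomic step), and `release_stale` is the compensation step that puts a job back when its worker stops responding. The function names and the `state`, `priority`, `worker`, and `claimed_at` fields are hypothetical and not taken from Chapman.

```python
import threading

_lock = threading.Lock()  # stands in for the server-side atomicity of findAndModify


def claim_next(jobs, worker, now):
    """Atomically pick the highest-priority ready job and mark it busy."""
    with _lock:
        ready = [job for job in jobs if job['state'] == 'ready']
        if not ready:
            return None
        job = max(ready, key=lambda j: j['priority'])
        job.update(state='busy', worker=worker, claimed_at=now)
        return job


def release_stale(jobs, now, timeout):
    """Compensation: return jobs held by unresponsive workers to the queue."""
    with _lock:
        for job in jobs:
            if job['state'] == 'busy' and now - job['claimed_at'] > timeout:
                job.update(state='ready', worker=None)


jobs = [
    {'name': 'low', 'priority': 1, 'state': 'ready'},
    {'name': 'high', 'priority': 9, 'state': 'ready'},
]
first = claim_next(jobs, worker='w1', now=0)     # w1 takes 'high', then crashes
release_stale(jobs, now=60, timeout=30)          # 'high' is put back as ready
second = claim_next(jobs, worker='w2', now=61)   # w2 picks 'high' up again
print(first['name'], second['name'], second['worker'])
```

Because finding and marking happen in one step, two workers cannot claim the same job, and the compensation pass covers the unreliable-worker case without needing a transaction.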
Questions?
Rick Copeland
rick@synapp.io
@rick446

MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
 

Kürzlich hochgeladen

KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCostKubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCostMatt Ray
 
9 Steps For Building Winning Founding Team
9 Steps For Building Winning Founding Team9 Steps For Building Winning Founding Team
9 Steps For Building Winning Founding TeamAdam Moalla
 
VoIP Service and Marketing using Odoo and Asterisk PBX
VoIP Service and Marketing using Odoo and Asterisk PBXVoIP Service and Marketing using Odoo and Asterisk PBX
VoIP Service and Marketing using Odoo and Asterisk PBXTarek Kalaji
 
Introduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxIntroduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxMatsuo Lab
 
All in AI: LLM Landscape & RAG in 2024 with Mark Ryan (Google) & Jerry Liu (L...
All in AI: LLM Landscape & RAG in 2024 with Mark Ryan (Google) & Jerry Liu (L...All in AI: LLM Landscape & RAG in 2024 with Mark Ryan (Google) & Jerry Liu (L...
All in AI: LLM Landscape & RAG in 2024 with Mark Ryan (Google) & Jerry Liu (L...Daniel Zivkovic
 
UiPath Clipboard AI: "A TIME Magazine Best Invention of 2023 Unveiled"
UiPath Clipboard AI: "A TIME Magazine Best Invention of 2023 Unveiled"UiPath Clipboard AI: "A TIME Magazine Best Invention of 2023 Unveiled"
UiPath Clipboard AI: "A TIME Magazine Best Invention of 2023 Unveiled"DianaGray10
 
Bird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystemBird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystemAsko Soukka
 
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...Will Schroeder
 
UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8DianaGray10
 
UiPath Studio Web workshop series - Day 5
UiPath Studio Web workshop series - Day 5UiPath Studio Web workshop series - Day 5
UiPath Studio Web workshop series - Day 5DianaGray10
 
Nanopower In Semiconductor Industry.pdf
Nanopower  In Semiconductor Industry.pdfNanopower  In Semiconductor Industry.pdf
Nanopower In Semiconductor Industry.pdfPedro Manuel
 
Building Your Own AI Instance (TBLC AI )
Building Your Own AI Instance (TBLC AI )Building Your Own AI Instance (TBLC AI )
Building Your Own AI Instance (TBLC AI )Brian Pichman
 
Valere | Digital Solutions & AI Transformation Portfolio | 2024
Valere | Digital Solutions & AI Transformation Portfolio | 2024Valere | Digital Solutions & AI Transformation Portfolio | 2024
Valere | Digital Solutions & AI Transformation Portfolio | 2024Alexander Turgeon
 
COMPUTER 10 Lesson 8 - Building a Website
COMPUTER 10 Lesson 8 - Building a WebsiteCOMPUTER 10 Lesson 8 - Building a Website
COMPUTER 10 Lesson 8 - Building a Websitedgelyza
 
activity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdf
activity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdf
activity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdfJamie (Taka) Wang
 
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...UbiTrack UK
 
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPA
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPAAnypoint Code Builder , Google Pub sub connector and MuleSoft RPA
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPAshyamraj55
 
Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve Decarbonization
Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve DecarbonizationUsing IESVE for Loads, Sizing and Heat Pump Modeling to Achieve Decarbonization
Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve DecarbonizationIES VE
 
Salesforce Miami User Group Event - 1st Quarter 2024
Salesforce Miami User Group Event - 1st Quarter 2024Salesforce Miami User Group Event - 1st Quarter 2024
Salesforce Miami User Group Event - 1st Quarter 2024SkyPlanner
 

Kürzlich hochgeladen (20)

KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCostKubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
 
9 Steps For Building Winning Founding Team
9 Steps For Building Winning Founding Team9 Steps For Building Winning Founding Team
9 Steps For Building Winning Founding Team
 
VoIP Service and Marketing using Odoo and Asterisk PBX
VoIP Service and Marketing using Odoo and Asterisk PBXVoIP Service and Marketing using Odoo and Asterisk PBX
VoIP Service and Marketing using Odoo and Asterisk PBX
 
Introduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxIntroduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptx
 
All in AI: LLM Landscape & RAG in 2024 with Mark Ryan (Google) & Jerry Liu (L...
All in AI: LLM Landscape & RAG in 2024 with Mark Ryan (Google) & Jerry Liu (L...All in AI: LLM Landscape & RAG in 2024 with Mark Ryan (Google) & Jerry Liu (L...
All in AI: LLM Landscape & RAG in 2024 with Mark Ryan (Google) & Jerry Liu (L...
 
UiPath Clipboard AI: "A TIME Magazine Best Invention of 2023 Unveiled"
UiPath Clipboard AI: "A TIME Magazine Best Invention of 2023 Unveiled"UiPath Clipboard AI: "A TIME Magazine Best Invention of 2023 Unveiled"
UiPath Clipboard AI: "A TIME Magazine Best Invention of 2023 Unveiled"
 
Bird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystemBird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystem
 
201610817 - edge part1
201610817 - edge part1201610817 - edge part1
201610817 - edge part1
 
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
 
UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8
 
UiPath Studio Web workshop series - Day 5
UiPath Studio Web workshop series - Day 5UiPath Studio Web workshop series - Day 5
UiPath Studio Web workshop series - Day 5
 
Nanopower In Semiconductor Industry.pdf
Nanopower  In Semiconductor Industry.pdfNanopower  In Semiconductor Industry.pdf
Nanopower In Semiconductor Industry.pdf
 
Building Your Own AI Instance (TBLC AI )
Building Your Own AI Instance (TBLC AI )Building Your Own AI Instance (TBLC AI )
Building Your Own AI Instance (TBLC AI )
 
Valere | Digital Solutions & AI Transformation Portfolio | 2024
Valere | Digital Solutions & AI Transformation Portfolio | 2024Valere | Digital Solutions & AI Transformation Portfolio | 2024
Valere | Digital Solutions & AI Transformation Portfolio | 2024
 
COMPUTER 10 Lesson 8 - Building a Website
COMPUTER 10 Lesson 8 - Building a WebsiteCOMPUTER 10 Lesson 8 - Building a Website
COMPUTER 10 Lesson 8 - Building a Website
 
activity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdf
activity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdf
activity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdf
 
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
 
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPA
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPAAnypoint Code Builder , Google Pub sub connector and MuleSoft RPA
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPA
 
Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve Decarbonization
Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve DecarbonizationUsing IESVE for Loads, Sizing and Heat Pump Modeling to Achieve Decarbonization
Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve Decarbonization
 
Salesforce Miami User Group Event - 1st Quarter 2024
Salesforce Miami User Group Event - 1st Quarter 2024Salesforce Miami User Group Event - 1st Quarter 2024
Salesforce Miami User Group Event - 1st Quarter 2024
 

Chapman: Building a High-Performance Distributed Task Service with MongoDB

  • 1. Chapman: Building a Distributed Job Queue in MongoDB Rick Copeland @rick446 @synappio rick@synapp.io
  • 4. Getting to Know One Another: Rick
  • 11. What You’ll Learn: How to… • Build a task queue in MongoDB • Bring consistency to distributed systems (without transactions) • Build low-latency reactive systems
  • 14. Why a Queue? • Long-running task (or longer than the web can wait) • Farm out chunks of work for performance
  • 18. Things I Worry About • Priority • Latency • Unreliable workers
  • 24. Queue Options • SQS? No priority • Redis? Can’t overflow memory • RabbitMQ? Lack of visibility • ZeroMQ? Lack of persistence • What about MongoDB?
  • 25. Chapman: Graham Arthur Chapman, 8 January 1941 – 4 October 1989
  • 30. Roadmap • Building a scheduled priority queue • Handling unreliable workers • Shared resources • Managing latency
  • 31. Building a Scheduled Priority Queue
  • 34. Step 1: Simple Queue
      db.message.insert({
        "_id" : NumberLong("3784707300388732067"),
        "data" : BinData(...),
        "s" : {
          "status" : "ready",
          "ts_enqueue" : ISODate("2015-03-02T15:27:29.228Z")
        }
      });

      // FIFO
      db.message.ensureIndex({'s.status': 1, 's.ts_enqueue': 1});

      // Get earliest message for processing
      db.runCommand(
        {
          findAndModify: "message",
          query: { 's.status': 'ready' },
          sort: {'s.ts_enqueue': 1},
          update: { '$set': {'s.status': 'reserved'} },
        }
      );
  • 40. Step 1: Simple Queue. Good: • Guaranteed FIFO. Bad: • No priority (other than FIFO) • No handling of worker problems
  • 43. Step 2: Scheduled Messages
      db.message.insert({
        "_id" : NumberLong("3784707300388732067"),
        "data" : BinData(...),
        "s" : {
          "status" : "ready",
          "ts_after" : ISODate(…),  // Min valid time
          "ts_enqueue" : ISODate("2015-03-02T15:27:29.228Z")
        }
      });

      db.message.ensureIndex(
        {'s.status': 1, 's.ts_enqueue': 1});

      // Get earliest message for processing
      db.runCommand(
        {
          findAndModify: "message",
          query: { 's.status': 'ready', 's.ts_after': {$lt: now }},
          sort: {'s.ts_enqueue': 1},
          update: { '$set': {'s.status': 'reserved'} },
        }
      );
  • 48. Step 2: Scheduled Messages. Good: • Easy to build periodic tasks. Bad: • Be careful with the word “now”
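The warning about "now" can be made concrete with a small in-memory model. This is only a sketch under stated assumptions: a plain Python list stands in for the `message` collection, and `reserve_scheduled` and `reschedule` are hypothetical helper names, not part of Chapman. One plausible reading of the warning is that `now` comes from whichever worker issues the query, so a worker with a fast clock sees messages as eligible earlier than intended.

```python
from datetime import datetime, timedelta, timezone


def reserve_scheduled(messages, now):
    """Reserve the earliest-enqueued 'ready' message whose ts_after is before `now`.

    `messages` is a list of dicts standing in for the collection (an assumption);
    `now` is passed in explicitly so the caller decides which clock is used.
    """
    eligible = [
        m for m in messages
        if m["s"]["status"] == "ready" and m["s"]["ts_after"] < now
    ]
    if not eligible:
        return None
    msg = min(eligible, key=lambda m: m["s"]["ts_enqueue"])
    msg["s"]["status"] = "reserved"
    return msg


def reschedule(msg, now, period):
    """Turn a finished message into a periodic one by moving ts_after ahead."""
    msg["s"]["status"] = "ready"
    msg["s"]["ts_after"] = now + period


t0 = datetime(2015, 3, 2, 15, 27, 29, tzinfo=timezone.utc)
queue = [
    {"_id": 1, "s": {"status": "ready", "ts_enqueue": t0,
                     "ts_after": t0 + timedelta(minutes=5)}},
    {"_id": 2, "s": {"status": "ready", "ts_enqueue": t0 + timedelta(seconds=1),
                     "ts_after": t0}},
]

# Message 1 was enqueued first but is not yet valid, so message 2 is reserved.
first = reserve_scheduled(queue, now=t0 + timedelta(seconds=10))
```

A worker whose clock ran ten minutes fast would pass a later `now` and reserve message 1 ahead of schedule, which is why it helps to decide deliberately where `now` comes from.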
  • 51. Step 3: Priority
      db.message.insert({
        "_id" : NumberLong("3784707300388732067"),
        "data" : BinData(...),
        "s" : {
          "status" : "ready",
          "pri": 30128,  // Add priority
          "ts_enqueue" : ISODate("2015-03-02T15:27:29.228Z")
        }
      });

      // Add priority
      db.message.ensureIndex({'s.status': 1, 's.pri': -1, 's.ts_enqueue': 1});

      db.runCommand(
        {
          findAndModify: "message",
          query: { 's.status': 'ready' },
          sort: {'s.pri': -1, 's.ts_enqueue': 1},
          update: { '$set': {'s.status': 'reserved'} },
        }
      );
  • 57. Step 3: Priority. Good: • Priorities are handled • Guaranteed FIFO within a priority. Bad: • No handling of worker problems
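The ordering implied by the sort `{'s.pri': -1, 's.ts_enqueue': 1}` can be checked without a database. The sketch below is an in-memory illustration only: `dispatch_order` is a hypothetical helper, and integer `ts_enqueue` values replace real timestamps for brevity.

```python
def dispatch_order(messages):
    """Return ready messages in the order repeated reservations would yield them:
    highest priority first, oldest first within the same priority."""
    ready = [m for m in messages if m["s"]["status"] == "ready"]
    return sorted(ready, key=lambda m: (-m["s"]["pri"], m["s"]["ts_enqueue"]))


msgs = [
    {"_id": "a", "s": {"status": "ready", "pri": 10, "ts_enqueue": 1}},
    {"_id": "b", "s": {"status": "ready", "pri": 30, "ts_enqueue": 3}},
    {"_id": "c", "s": {"status": "ready", "pri": 30, "ts_enqueue": 2}},
    {"_id": "d", "s": {"status": "reserved", "pri": 99, "ts_enqueue": 0}},
]

# "d" is skipped (already reserved); "c" beats "b" because it was enqueued earlier.
order = [m["_id"] for m in dispatch_order(msgs)]
```

The compound index on `s.status`, `s.pri`, `s.ts_enqueue` follows the same key order, which is what lets the equality match on status and the sort be served together.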
  • 60. Approach 1: Timeouts
      db.message.insert({
        "_id" : NumberLong("3784707300388732067"),
        "data" : BinData(...),
        "s" : {
          "status" : "ready",
          "pri": 30128,
          "ts_enqueue" : ISODate("2015-03-02T15:27:29.228Z"),
          "ts_timeout" : ISODate("2025-01-01T00:00:00.000Z")  // Far-future placeholder
        }
      });

      db.message.ensureIndex({"s.status": 1, "s.ts_timeout": 1})
  • 62. Approach 1: Timeouts
      // Reserve message
      db.runCommand(
        {
          findAndModify: "message",
          query: { 's.status': 'ready' },
          sort: {'s.pri': -1, 's.ts_enqueue': 1},
          update: { '$set': {
            's.status': 'reserved',
            's.ts_timeout': now + processing_time } }  // Client sets timeout
        }
      );

      // Timeout message ("unlock")
      db.message.update(
        {'s.status': 'reserved', 's.ts_timeout': {'$lt': now}},
        {'$set': {'s.status': 'ready'}},
        {'multi': true});
  • 68. Approach 1: Timeouts. Good: • Worker failure handled via timeout. Bad: • Requires periodic “unlock” task • Slow (but “live”) workers can cause spurious timeouts
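The spurious-timeout problem is easy to reproduce in a toy model. The following is a sketch, not the deck's implementation: a list of dicts stands in for the collection, times are plain numbers of seconds, `float("inf")` plays the role of the far-future placeholder, and `reserve` and `unlock_timed_out` are hypothetical names.

```python
FAR_FUTURE = float("inf")  # stands in for the far-future ts_timeout placeholder


def reserve(messages, now, processing_time):
    """Reserve the best ready message and stamp it with a client-chosen deadline."""
    ready = [m for m in messages if m["s"]["status"] == "ready"]
    if not ready:
        return None
    msg = min(ready, key=lambda m: (-m["s"]["pri"], m["s"]["ts_enqueue"]))
    msg["s"]["status"] = "reserved"
    msg["s"]["ts_timeout"] = now + processing_time
    return msg


def unlock_timed_out(messages, now):
    """Periodic 'unlock' sweep: return expired reservations to the ready state."""
    unlocked = 0
    for m in messages:
        if m["s"]["status"] == "reserved" and m["s"]["ts_timeout"] < now:
            m["s"]["status"] = "ready"
            unlocked += 1
    return unlocked


queue = [{"_id": 1, "s": {"status": "ready", "pri": 10,
                          "ts_enqueue": 0, "ts_timeout": FAR_FUTURE}}]

slow = reserve(queue, now=0, processing_time=30)   # a slow worker takes the message
early = unlock_timed_out(queue, now=20)            # before the deadline: no change
late = unlock_timed_out(queue, now=31)             # deadline passed, worker still busy
second = reserve(queue, now=31, processing_time=30)  # another worker gets the same message
```

Here `second` is the same message the slow worker is still processing, so the work can run twice. Choosing a generous `processing_time` reduces the risk at the cost of slower recovery from real failures.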
  • 70. Approach 2: Worker Identity
      db.message.insert({
        "_id" : NumberLong("3784707300388732067"),
        "data" : BinData(...),
        "s" : {
          "status" : "ready",
          "pri": 30128,
          "cli": "--------------------------",  // Client / worker placeholder
          "ts_enqueue" : ISODate("2015-03-02T..."),
          "ts_timeout" : ISODate("2025-...")
        }
      });
  • 73. Approach 2: Worker Identity
      // Reserve message
      db.runCommand(
        {
          findAndModify: "message",
          query: { 's.status': 'ready' },
          sort: {'s.pri': -1, 's.ts_enqueue': 1},
          update: { '$set': {
            's.status': 'reserved',
            's.cli': 'client_name:pid',  // Mark the worker who reserved the message
            's.ts_timeout': now + processing_time } }
        }
      );

      // Unlock "dead" client messages: messages reserved by dead workers are unlocked
      db.message.update(
        {'s.status': 'reserved',
         's.cli': {'$nin': active_clients} },
        {'$set': {'s.status': 'ready'}},
        {'multi': true});
  • 81. Approach 2: Worker Identity. Good: • Worker failure handled via out-of-band detection of live workers • Handles slow workers. Bad: • Requires periodic “unlock” task • Unlock updates can be slow
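The `$nin` unlock can be modeled in a few lines. This is a hedged sketch: the list of dicts stands in for the collection, `unlock_dead_workers` is a hypothetical name, and how `active_clients` is obtained (the out-of-band liveness check) is not specified in the slides and is simply passed in here.

```python
def unlock_dead_workers(messages, active_clients):
    """Return reserved messages to 'ready' when their worker is not known to be alive."""
    active = set(active_clients)
    unlocked = []
    for m in messages:
        if m["s"]["status"] == "reserved" and m["s"]["cli"] not in active:
            m["s"]["status"] = "ready"
            unlocked.append(m["_id"])
    return unlocked


queue = [
    {"_id": 1, "s": {"status": "reserved", "cli": "host1:100"}},  # slow but alive
    {"_id": 2, "s": {"status": "reserved", "cli": "host2:200"}},  # worker has died
    {"_id": 3, "s": {"status": "ready", "cli": "----------"}},    # never reserved
]

unlocked = unlock_dead_workers(queue, active_clients=["host1:100"])
```

Only message 2 is unlocked: the slow worker on `host1:100` keeps its reservation however long it takes, which is the advantage over a pure timeout.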
  • 87. Semaphores • Some services perform connection throttling (e.g. Mailchimp) • Some services just have a hard time with 144 threads hitting them simultaneously • Need a way to limit our concurrency
  • 91. Semaphores (diagram: a semaphore with Active: msg1, msg2, msg3, …; Capacity: 16; Queued: msg17, msg18, msg19, …) • Keep active and queued messages in arrays • Releasing the semaphore makes queued messages available for dispatch • Use $slice (2.6) to keep arrays the right size
  • 94. Semaphores: Acquire
      db.semaphore.insert({
          '_id': 'semaphore-name',
          'value': 16,
          'active': [],
          'queued': []})

      def acquire(sem_id, msg_id, sem_size):
          # Pessimistic update: try for a slot and join the queue in one step
          sem = db.semaphore.find_and_modify(
              {'_id': sem_id},
              update={'$push': {
                  'active': {
                      '$each': [msg_id],
                      '$slice': sem_size},
                  'queued': msg_id}},
              new=True)
          if msg_id in sem['active']:
              # Compensation: we got a slot, so leave the queue
              db.semaphore.update(
                  {'_id': sem_id},
                  {'$pull': {'queued': msg_id}})
              return True
          return False
  • 98. Semaphores: Release
      def release(sem_id, msg_id, sem_size):
          # Actually release
          sem = db.semaphore.find_and_modify(
              {'_id': sem_id},
              update={'$pull': {
                  'active': msg_id,
                  'queued': msg_id}},
              new=True)

          # Awaken queued message(s)
          while len(sem['active']) < sem_size and sem['queued']:
              wake_msg_ids = sem['queued'][:sem_size]
              updated = db.semaphore.find_and_modify(
                  {'_id': sem_id},
                  update={'$pullAll': {'queued': wake_msg_ids}},
                  new=True)
              for msgid in wake_msg_ids:
                  make_dispatchable(msgid)  # Some magic (covered later)
              sem = updated
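The acquire/release pair can be exercised against an in-memory stand-in to see the pessimistic update and its compensation at work. This is a sketch under assumptions: the semaphore is a plain dict, list slicing imitates `$push` with `$each` and `$slice`, capacity is read from `sem["value"]` rather than passed as `sem_size`, and `release` returns the woken ids instead of calling `make_dispatchable`. Unlike `find_and_modify`, these functions are not atomic.

```python
def acquire(sem, msg_id):
    """Try to take a slot; on failure the message is left in the queue."""
    # Model of $push with $each/$slice: append, then keep the first `value` entries.
    sem["active"] = (sem["active"] + [msg_id])[: sem["value"]]
    # Pessimistic step: assume we failed and join the queue.
    sem["queued"].append(msg_id)
    if msg_id in sem["active"]:
        # Compensation: we did get a slot, so leave the queue again.
        sem["queued"].remove(msg_id)
        return True
    return False


def release(sem, msg_id):
    """Free the slot held by msg_id and return the ids of messages to wake."""
    sem["active"] = [m for m in sem["active"] if m != msg_id]
    sem["queued"] = [m for m in sem["queued"] if m != msg_id]
    woken = []
    while len(sem["active"]) < sem["value"] and sem["queued"]:
        batch = sem["queued"][: sem["value"]]
        sem["queued"] = [m for m in sem["queued"] if m not in batch]
        woken.extend(batch)
    return woken


sem = {"_id": "semaphore-name", "value": 2, "active": [], "queued": []}
results = [acquire(sem, m) for m in ("msg1", "msg2", "msg3")]  # third one must wait
woken = release(sem, "msg1")                                   # frees a slot for msg3
```

Woken messages do not receive the slot directly; they become dispatchable again and must call `acquire` a second time, competing with any other message that arrives in between.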
  • 103. Message States (diagram: ready, acquire, queued, busy) • Reserve the message • Acquire resources • Process the message • Release resources
  • 107. Reserve a Message
        msg = db.message.find_and_modify(
            {'s.status': 'ready'},
            # Prefer partially-acquired messages, then higher priority, then older
            sort=[('s.sub_status', -1), ('s.pri', -1), ('s.ts', 1)],
            update={'$set': {'s.w': worker, 's.status': 'acquire'}},
            new=True)

        Before:
        message.s == {
            pri: 10,
            semaphores: ['foo'],   # required semaphores
            status: 'ready',
            sub_status: 0,         # number of semaphores acquired
            w: '----------',
            ...}

        After:
        message.s == {
            pri: 10,
            semaphores: ['foo'],
            status: 'acquire',
            sub_status: 0,
            w: worker,
            ...}
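The effect of the three-part sort can be checked on plain dictionaries. The sketch below is an in-memory model only: `pick_next` is a hypothetical helper, and unlike `find_and_modify` it is not atomic.

```python
def pick_next(messages):
    """Return the 'ready' message a worker would reserve next, or None."""
    ready = [m for m in messages if m['s']['status'] == 'ready']
    if not ready:
        return None
    # Same ordering as the slide: sub_status descending, pri descending,
    # ts ascending.
    return min(
        ready,
        key=lambda m: (-m['s']['sub_status'], -m['s']['pri'], m['s']['ts']))
```

A message that already holds some of its semaphores wins over a higher-priority message that holds none, which keeps partially acquired resources from sitting idle.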
  • 111. Acquire Resources
        def acquire_resources(msg):
            for i, sem_id in enumerate(msg['s']['semaphores']):
                if i < msg['s']['sub_status']:  # already acquired
                    continue
                sem = db.semaphore.find_one({'_id': sem_id})
                if try_acquire_resource(sem_id, msg['_id'], sem['value']):
                    # Save forward progress: i + 1 semaphores are now held
                    db.message.update(
                        {'_id': msg['_id']}, {'$set': {'s.sub_status': i + 1}})
                else:
                    # Failure to acquire (already queued)
                    return False
            # Resources acquired, message ready to be processed
            db.message.update(
                {'_id': msg['_id']}, {'$set': {'s.status': 'busy'}})
            return True
  • 113. Acquire Resources
        def try_acquire_resource(sem_id, msg_id, sem_size):
            '''Version 1 (race condition)'''
            if reserve(sem_id, msg_id, sem_size):
                return True
            else:
                # Here be dragons!
                db.message.update(
                    {'_id': msg_id},
                    {'$set': {'s.status': 'queued'}})
                return False
  • 117. Release Resources (v1): the “magic”
        def make_dispatchable(msg_id):
            '''Version 1 (race condition)'''
            db.message.update(
                {'_id': msg_id, 's.status': 'queued'},
                {'$set': {'s.status': 'ready'}})
        But what if s.status == 'acquire'? That’s the dragon.
  • 119. Release Resources (v2)
        def make_dispatchable(msg_id):
            # Hey, something happened!
            res = db.message.update(
                {'_id': msg_id, 's.status': 'acquire'},
                {'$set': {'s.event': True}})
            if not res['updatedExisting']:
                db.message.update(
                    {'_id': msg_id, 's.status': 'queued'},
                    {'$set': {'s.status': 'ready'}})
  • 122. Acquire Resources (v2)
        def try_acquire_resource(sem_id, msg_id, sem_size):
            '''Version 2'''
            while True:
                # Nothing's happened yet!
                db.message.update(
                    {'_id': msg_id}, {'$set': {'s.event': False}})
                if reserve(sem_id, msg_id, sem_size):
                    return True
                else:
                    # Check if something happened
                    res = db.message.update(
                        {'_id': msg_id, 's.event': False},
                        {'$set': {'s.status': 'queued'}})
                    if not res['updatedExisting']:
                        # Someone released this message; try again
                        continue
                    return False
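The event-flag handshake can be replayed deterministically in memory. In this sketch the message is a plain dict, `reserve` is a callable supplied by the caller, and `max_attempts` is an added safety bound; all three are assumptions for illustration. In MongoDB the "check the flag and set queued" step is a single conditional update, whereas here it is only safe because the simulation is single-threaded.

```python
def make_dispatchable(msg):
    """Releaser side: flag a message that is mid-acquire, else wake it."""
    if msg['status'] == 'acquire':
        msg['event'] = True      # "Hey, something happened!"
    elif msg['status'] == 'queued':
        msg['status'] = 'ready'


def try_acquire(msg, reserve, max_attempts=10):
    """Acquirer side: retry whenever a release landed during the attempt."""
    for _ in range(max_attempts):
        msg['event'] = False     # "Nothing's happened yet!"
        if reserve():
            return True
        if not msg['event']:     # nothing happened: safe to go to sleep
            msg['status'] = 'queued'
            return False
        # A release was signalled after reserve() failed; try again.
    raise RuntimeError('gave up after %d attempts' % max_attempts)
```

Without the flag, a release arriving between the failed `reserve` and the switch to `queued` would find nothing to wake, and the message would wait forever.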
  • 124. One More Race…
        def release(sem_id, msg_id, sem_size):
            sem = db.semaphore.find_and_modify(
                {'_id': sem_id},
                update={'$pull': {
                    'active': msg_id,
                    'queued': msg_id}},
                new=True)

            while len(sem['active']) < sem_size and sem['queued']:
                wake_msg_ids = sem['queued'][:sem_size]
                updated = db.semaphore.find_and_modify(
                    {'_id': sem_id},
                    update={'$pullAll': {'queued': wake_msg_ids}},
                    new=True)
                # Race: if the worker dies here, the messages are no longer in
                # the semaphore's queue but still have s.status == 'queued'
                for msgid in wake_msg_ids:
                    make_dispatchable(msgid)
                sem = updated
  • 125. Compensate!
        def fixup_queued_messages():
            for msg in db.message.find({'s.status': 'queued'}):
                # The semaphore this message should be waiting on
                sem_id = msg['s']['semaphores'][msg['s']['sub_status']]
                sem = db.semaphore.find_one(
                    {'_id': sem_id, 'queued': msg['_id']})
                if sem is None:
                    # Not in any queue: make it ready, unless it changed meanwhile
                    db.message.update(
                        {'_id': msg['_id'],
                         's.status': 'queued',
                         's.sub_status': msg['s']['sub_status']},
                        {'$set': {'s.status': 'ready'}})
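The compensation pass is easy to exercise on in-memory data. The sketch below assumes messages are dicts shaped like the `message.s` documents shown earlier and that `semaphores` maps a semaphore id to a dict with a `queued` list; `fixup_queued` is a hypothetical name.

```python
def fixup_queued(messages, semaphores):
    """Mark orphaned 'queued' messages as 'ready'; return their ids."""
    fixed = []
    for msg in messages:
        state = msg['s']
        if state['status'] != 'queued':
            continue
        # sub_status counts acquired semaphores, so it also indexes
        # the semaphore the message should be waiting on.
        sem_id = state['semaphores'][state['sub_status']]
        if msg['_id'] not in semaphores[sem_id]['queued']:
            state['status'] = 'ready'
            fixed.append(msg['_id'])
    return fixed
```

Running this periodically repairs the window left by the release race: a message that was pulled from the queue but never woken gets another chance to be reserved.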
  • 127. Managing Latency
        • Reserving messages is expensive
        • Use a pub/sub system instead
        • Publish to the channel whenever a message is ready to be handled
        • Each worker subscribes to the channel
        • Workers only “poll” when they have a chance of getting work
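The idea of polling only after a notification can be sketched with the standard library. Everything here is an assumption for illustration: `NotifyingQueue` is a hypothetical class, `queue.Queue` stands in for both the task store and the channel, and with several workers each notification would reach only one of them rather than being broadcast as in a real pub/sub channel.

```python
import queue
import threading


class NotifyingQueue:
    """Task store plus a notification channel (in-memory sketch)."""

    def __init__(self):
        self._tasks = queue.Queue()
        self._channel = queue.Queue()

    def publish(self, task):
        self._tasks.put(task)
        self._channel.put('ready')   # tell a worker there is work

    def close(self):
        self._channel.put(None)      # sentinel: stop the worker

    def run_worker(self, handle):
        while True:
            note = self._channel.get()   # blocks; no polling while idle
            if note is None:
                break
            try:
                task = self._tasks.get_nowait()  # the only "poll"
            except queue.Empty:
                continue                 # someone else took the task
            handle(task)
```

The expensive reserve step runs only when a notification says there is a chance of getting work, which is the point the slide makes.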
  • 133. Capped Collections
        • Fixed size
        • Fast inserts
        • “Tailable” cursors
        [Diagram: a capped collection being read by a tailable cursor]
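A capped collection behaves roughly like a ring buffer that readers can follow. The sketch below models that with `collections.deque`; `CappedLog` and its methods are hypothetical, and it bounds the number of documents, whereas a real capped collection is bounded primarily by size in bytes.

```python
from collections import deque


class CappedLog:
    """In-memory model of a capped collection (bounded by document count)."""

    def __init__(self, max_docs):
        self._docs = deque(maxlen=max_docs)  # oldest entries fall off
        self._next_id = 0

    def insert(self, doc):
        self._docs.append(dict(doc, id=self._next_id))
        self._next_id += 1

    def tail(self, last_id=-1):
        """Return documents newer than last_id, in insertion order."""
        return [d for d in self._docs if d['id'] > last_id]
```

A reader that remembers the last `id` it saw can call `tail` repeatedly and receive only new documents, which is the behaviour a tailable cursor provides on the server side.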
  • 139. Getting a Tailable Cursor
        def get_cursor(collection, topic_re, await_data=True):
            options = {'tailable': True}        # make cursor tailable
            if await_data:
                options['await_data'] = True    # holds open cursor for a while
            cur = collection.find(
                {'k': topic_re},
                **options)
            cur = cur.hint([('$natural', 1)])   # ensure we don't use any indexes
            return cur

        import re, time
        while True:
            cur = get_cursor(
                db.capped_collection,
                re.compile('^foo'),
                await_data=True)
            for msg in cur:
                do_something(msg)
            # Still some polling when no producer, so don't spin too fast
            time.sleep(0.1)
  • 141. Building in retry...
        def get_cursor(collection, topic_re, last_id=-1, await_data=True):
            options = {'tailable': True}
            spec = {
                'id': {'$gt': last_id},  # only new messages (integer autoincrement "id")
                'k': topic_re}
            if await_data:
                options['await_data'] = True
            cur = collection.find(spec, **options)
            cur = cur.hint([('$natural', 1)])  # ensure we don't use any indexes
            return cur
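The retry pattern amounts to remembering the last `id` handled and reopening the cursor from there. The sketch below shows that loop against any callable that opens a cursor; `consume`, `open_cursor`, and `max_reopens` are hypothetical names, and a real worker would loop indefinitely with a short sleep between reopens.

```python
def consume(open_cursor, handle, last_id=-1, max_reopens=3):
    """Drain the cursor, reopening it from the last handled id each time."""
    for _ in range(max_reopens):
        for msg in open_cursor(last_id):
            handle(msg)
            last_id = msg['id']   # remember progress before the cursor dies
    return last_id
```

Because each reopen filters on `id > last_id`, a cursor that dies partway through neither replays nor skips messages.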
  • 144. Ludicrous Speed
        from pymongo.cursor import _QUERY_OPTIONS

        def get_cursor(collection, topic_re, last_id=-1, await_data=True):
            options = {'tailable': True}
            spec = {
                'ts': {'$gt': last_id},  # only new messages (id ==> ts)
                'k': topic_re}
            if await_data:
                options['await_data'] = True
            cur = collection.find(spec, **options)
            cur = cur.hint([('$natural', 1)])  # ensure we don't use any indexes
            if await_data:
                # Co-opt the oplog_replay option
                cur = cur.add_option(_QUERY_OPTIONS['oplog_replay'])
            return cur
  • 145. The Oplog
        • Capped collection that records all operations for replication
        • Includes a 'ts' field suitable for oplog_replay
        • Does not require a separate publish operation (all changes are automatically “published”)
  • 150. Using the Oplog
        import bson
        from pymongo.cursor import _QUERY_OPTIONS

        def oplog_await(oplog, spec):
            '''Await the very next message on the oplog satisfying the spec'''
            # Most recent oplog entry
            last = oplog.find_one(spec, sort=[('$natural', -1)])
            if last is None:
                return  # Can't await unless there is an existing message satisfying spec
            await_spec = dict(spec)
            last_ts = last['ts']
            # Finds most recent plus following entries
            await_spec['ts'] = {'$gt': bson.Timestamp(last_ts.time, last_ts.inc - 1)}
            curs = oplog.find(await_spec, tailable=True, await_data=True)
            curs = curs.hint([('$natural', 1)])
            curs = curs.add_option(_QUERY_OPTIONS['oplog_replay'])
            curs.next()  # skip most recent (should always find 1 element)
            try:
                return curs.next()  # return on anything new
            except StopIteration:
                return None
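The `inc - 1` step is easier to see with plain tuples. In the sketch below an oplog timestamp is modelled as a `(time, inc)` pair, which orders the same way as `bson.Timestamp` (time first, then inc); `entries_after` and `await_from` are hypothetical helpers. Querying from just before the last entry guarantees the first result is that entry itself, presumably so the tailable cursor starts on an existing document.

```python
def entries_after(oplog, ts):
    """Model of {'ts': {'$gt': ts}} over (time, inc) pairs."""
    return [entry for entry in oplog if entry > ts]


def await_from(last_ts):
    """Timestamp just before last_ts, so that last_ts itself still matches."""
    time, inc = last_ts
    return (time, inc - 1)
```

The caller then discards the first result (the entry it already knew about) and treats the second as the new event.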
  • 157. What We’ve Learned
        How to…
        • Build a task queue in MongoDB
        • Bring consistency to distributed systems (without transactions)
        • Build low-latency reactive systems
  • 163. Tips
        • findAndModify is ideal for queues
        • Atomic update + compensation brings consistency to your distributed system
        • Use the oplog to build reactive, low-latency systems