At Telefonica PDI we are developing an internal messaging service to be used by our own products.
Sprayer is a low latency, reliable messaging system supporting delivery of messages to a single receiver, predefined group of receivers or specific list of receivers over different channels (SMS, HTTP, WebSockets, Email, Android, iOS and Firefox OS native push…). We are using Redis, MongoDB and RabbitMQ to implement Sprayer. In this talk we will review Sprayer’s architecture.
For each of these technologies we will see why, where and for what it is used, along with some tips.
Talk given together with Javier Arias (@javier_arilos) at NoSQL Matters Barcelona 2013.
2. who are we?
Pablo Enfedaque
@pablitoev56
Pablo is a SW R&D engineer
with a strong background
in high performance
computing, big data and
distributed systems.
Javier Arias
@javier_arilos
Javier is a Software
Architect and developer
who has worked in different
sectors such as M2M,
Telcos, Finance, Airports.
3. some context
Telefónica is the 4th largest telco in the
world
2 years ago Telefonica Digital was
established to spread our business to the
digital world
former Telefonica R&D / PDI was merged into
this new company
4. overview
we are developing an internal messaging service
to be used by our own products
we have polyglot persistence using different
NoSQL technologies
in this talk we will review Sprayer’s
architecture and, for each technology, how it is
used
5. why sprayer?
a common push messaging service. why?
➔ each project with messaging needs was
implementing its own server its own way
➔ 5 push messaging systems in the company
➔ none of them supporting a wide variety of
transports
➔ independent deployment and operations
6. the problem
cross technology push:
iOS
Android
Websockets
eMail
SMS
HTTP
FirefoxOS
point to point and pubsub:
1 to 1
1 to N
1 to Group
PaaS, multitenant
8. the proposal
SPRAYER!
Sprayer is a low latency, reliable
messaging system supporting delivery
of messages to a single receiver, to
a predefined group of receivers or
to a specific list of receivers over
different channels (WebSockets, SMS,
Email, HTTP and iOS, Android or
Firefox OS native push…)
12. server side API challenges
➔ common interface for all channels
➔ reliable, consistent, idempotent
➔ route messages efficiently
➔ simple and user oriented
◆ manage subscriptions
◆ send messages: to list or group (topic)
◆ get delivery feedback
➔ standards based (HTTP + JSON)
22. message routing challenges
routing (two-steps):
➔ API routes messages to N dispatchers
➔ Each dispatcher routes message to M
receivers (subscribers of a group)
both steps must be decoupled
The number of receivers could be thousands
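A minimal sketch of the two-step routing above, with in-process queues standing in for the real broker. Every name here (route_to_dispatchers, dispatch, the job fields, the sample subscribers) is hypothetical and only illustrates the decoupling, not Sprayer's actual code.

```python
from collections import defaultdict
from queue import Queue

# Hypothetical in-memory stand-ins: one job queue per transport and a
# table of group subscribers per channel.
dispatcher_queues = defaultdict(Queue)
subscribers = {
    ("news", "sms"): ["+34600000001", "+34600000002"],
    ("news", "email"): ["a@example.com"],
}


def route_to_dispatchers(group, channels, payload):
    """Step 1: the API enqueues one job per channel and returns at once."""
    for channel in channels:
        dispatcher_queues[channel].put({"group": group, "payload": payload})


def dispatch(channel, send):
    """Step 2: a dispatcher drains its own queue and fans out to receivers."""
    queue = dispatcher_queues[channel]
    delivered = 0
    while not queue.empty():
        job = queue.get()
        for receiver in subscribers.get((job["group"], channel), []):
            send(receiver, job["payload"])
            delivered += 1
    return delivered
```

The API never iterates over receivers itself, so a group with thousands of subscribers only costs it one enqueue per channel.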
26. async delivery feedback challenges
make msg feedback available through API
to clients
feedback must not compromise message
delivery or API
The number of updates could be millions
feedback: msg delivery, connections, push
33. redis
Redis is an open source, advanced key-value store. It is often referred to as a
data structure server (...) - (redis.io)
why redis?
- amazingly fast
- easy to use
- usage patterns: shared cache, queues,
pubsub, distributed lock, counting things
34. redis use cases
use cases in Sprayer:
➔ group subscribers x channel
➔ channels x group
➔ websockets channel queues (potentially
million receivers)
limitations for our use cases:
➔ memory bound
➔ queries and pagination
➔ high throughput queues
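One possible way to keep "group subscribers x channel" and "channels x group" in Redis is one set per combination. The sketch below assumes a client with redis-py style sadd and sscan_iter methods; the key layout and function names are hypothetical, not Sprayer's real schema.

```python
def subscribers_key(tenant, group, channel):
    """Hypothetical key layout: one Redis set per (tenant, group, channel)."""
    return f"subs:{tenant}:{group}:{channel}"


def channels_key(tenant, group):
    """Hypothetical key holding the set of channels used by a group."""
    return f"channels:{tenant}:{group}"


def subscribe(client, tenant, group, channel, receiver):
    # SADD is idempotent: subscribing twice leaves a single set member.
    client.sadd(subscribers_key(tenant, group, channel), receiver)
    client.sadd(channels_key(tenant, group), channel)


def receivers_for(client, tenant, group, channel):
    # SSCAN iterates incrementally instead of returning one huge
    # SMEMBERS reply, which matters for very large groups.
    return client.sscan_iter(subscribers_key(tenant, group, channel))
```

With redis-py, `client` would be something like `redis.Redis(...)`; any object exposing the same two methods works for testing.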
35. redis concerns
➔ what happens when dataset does not fit in
memory? two strategies
◆ partition datasets to different redis clusters
◆ sharding: based on tenant would be easy
➔ FT and HA
◆ easy way: master-slave with virtual IPs, switch
slave’s IP when master’s out. home made daemon
◆ sentinel based, some tests done, needs to be
supported by client library
◆ redis cluster being implemented; limited features
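Tenant-based sharding could be as simple as a stable hash of the tenant id. A minimal sketch, assuming a fixed shard list; the endpoint names and shard_for are made up for illustration.

```python
import zlib

# Hypothetical shard list; in practice each entry would be a Redis endpoint.
SHARDS = ["redis-a:6379", "redis-b:6379", "redis-c:6379"]


def shard_for(tenant, shards=SHARDS):
    """Pick a shard from a stable hash of the tenant id."""
    # crc32 is stable across processes, unlike the built-in hash() for str.
    return shards[zlib.crc32(tenant.encode("utf-8")) % len(shards)]
```

Note that plain modulo hashing remaps most tenants when the shard list changes, so adding a shard would require moving data.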
38. mongodb
mongoDB (from "humongous") is a
document database (...) features: full
index support, replication & HA, auto-sharding... (mongodb.org)
why mongoDB?
➔ scaling & HA
➔ great performance
➔ dynamic schemas
➔ versatile
39. mongodb use cases
use cases in Sprayer:
➔ operational DB, administrative data
➔ message delivery feedback updates
(potentially millions of records)
limitations for our use cases:
➔ operations with sets of subscribers
➔ high throughput queues
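Delivery feedback updates can be written as upserts, so a replayed update does not create a second record. The sketch below assumes a collection object with a pymongo style update_one; the field names (message_id, receiver, status, updated_at, created_at) are hypothetical.

```python
def feedback_update(message_id, receiver, status, timestamp):
    """Build the (filter, update) pair for an idempotent feedback upsert."""
    query = {"message_id": message_id, "receiver": receiver}
    update = {
        # Applying the same $set twice leaves the document unchanged.
        "$set": {"status": status, "updated_at": timestamp},
        # Only written when the upsert creates the document.
        "$setOnInsert": {"created_at": timestamp},
    }
    return query, update


def record_feedback(collection, message_id, receiver, status, timestamp):
    # `collection` is expected to behave like a pymongo Collection.
    query, update = feedback_update(message_id, receiver, status, timestamp)
    return collection.update_one(query, update, upsert=True)
```

This version lets a late, out-of-order update overwrite a newer status; guarding against that would need an extra condition in the filter.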
40. mongodb concerns
no concerns about mongodb for our
use case.
maybe, in the long term, can it handle
the huge amount of feedback write
operations without affecting the API?
43. rabbitmq
robust messaging for applications,
easy to use
(www.rabbitmq.com)
why rabbitmq?
➔ very fast
➔ reliable
➔ built-in clustering
44. rabbitmq use cases
use cases in Sprayer:
➔ jobs for dispatchers (API => dispatchers)
➔ feedback status updates: message
delivery, connections, device status
(dispatchers => API)
limitations for our use cases:
➔ not scaling well to millions of queues
(websocket receiver inboxes)
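A sketch of how the API might hand a job to the dispatchers of one transport through the broker. It assumes a channel object with a pika style basic_publish(exchange, routing_key, body); the exchange name, routing key scheme and job fields are hypothetical.

```python
import json

JOBS_EXCHANGE = "sprayer.jobs"  # hypothetical exchange name


def job_routing_key(transport):
    """Hypothetical scheme: one routing key per transport."""
    return f"jobs.{transport}"


def publish_job(amqp_channel, transport, message_id, group, payload):
    """Serialize a dispatcher job as JSON and publish it for one transport."""
    body = json.dumps(
        {"message_id": message_id, "group": group, "payload": payload}
    )
    amqp_channel.basic_publish(
        exchange=JOBS_EXCHANGE,
        routing_key=job_routing_key(transport),
        body=body,
    )
    return body
```

Feedback updates could travel the opposite way (dispatchers => API) with the same shape and a different routing key.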
49. design threats
related data in different places:
redis, rabbitmq and mongo
we are not transactional; our
components remain sane in case of a DB
failure, and idempotent operations help
here
light implementation of Unit of Work
architectural pattern
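What a light Unit of Work could look like without transactions: operations against the different stores are collected, applied together, and the ones that fail stay pending for a retry. This is a sketch under that assumption, not Sprayer's implementation; the class and method names are hypothetical, and it relies on the operations being idempotent.

```python
class UnitOfWork:
    """Collects operations and applies them together on commit."""

    def __init__(self):
        self._operations = []

    def register(self, description, operation):
        """Queue a zero-argument callable; it should be idempotent."""
        self._operations.append((description, operation))

    def commit(self):
        """Run pending operations in order.

        Failed operations stay registered so a later commit retries them.
        Returns the descriptions of the operations still pending.
        """
        pending, self._operations = self._operations, []
        for description, operation in pending:
            try:
                operation()
            except Exception:
                self._operations.append((description, operation))
        return [description for description, _ in self._operations]
```

Nothing is rolled back here: stores that succeeded keep their writes, and only the failed operations are retried on the next commit.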
51. architecture guidelines
➔ asynchronous processing / queues everywhere
➔ dedicated dispatchers for each transport
➔ common API interface
➔ use the best tool for each responsibility:
polyglot persistence
➔ processes as stateless as possible