This is an introduction to building our services in a different way, where state is moved out of the database and into the services (as opposed to mainstream stateless servers).
It also describes one particular proof-of-concept tool that Cabify built during its annual offsite.
18. Cachopo. The core ideas:
- move data to where it is used
- protect your database
- operate in memory
19. Cachopo. Use cases:
- full dataset in memory
  - allows full in-memory queries
  - core dataset
- hot dataset in memory
  - allows only “by id” queries
  - authorisations / online users / …
26. Erlang’s :ets module
Access to C code that stores Erlang terms, so there is minimal overhead (no serialization to or deserialization from, e.g., JSON).
At the top level, it stores tuples. One position of each tuple is the primary key.
Out of heap
ETS data is stored outside Erlang processes, so it does not impact GC.
Key-Value store
Fast access to “rows” by key. Also allows scans and some matching syntax, not yet used by Cachopo.
No mailbox
ETS is designed for highly concurrent access. Unlike a process mailbox, requests are not serialised (queued and handled one at a time).
No network
No roundtrips, no network errors, no format conversions…
ETS. The S is for Speed.
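The points above can be illustrated with a minimal sketch of storing and reading a document in ETS. The table name `:documents_sketch` and the document contents are made up for illustration; how Cachopo actually lays out its tables is not shown in these slides.

```elixir
# Create a public, named set table. By default the first tuple element is the key.
table = :ets.new(:documents_sketch, [:set, :public, :named_table, read_concurrency: true])

# Store the Erlang/Elixir term as-is: no JSON encoding on write, no decoding on read.
true = :ets.insert(table, {"doc-1", %{"name" => "Ada", "online" => true}})

# Fast access to a "row" by key, from any process, without going through a mailbox.
[{"doc-1", doc}] = :ets.lookup(table, "doc-1")
IO.inspect(doc["name"])

# A missing key simply returns an empty list.
[] = :ets.lookup(table, "missing")
```

The `read_concurrency: true` option is a standard ETS tuning flag for read-heavy tables; whether Cachopo sets it is an assumption.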
34. CouchDB adapter.
Cabify <3 Couch.
Live changes feed
Clients can receive new versions of documents as they are written, either globally for a database or filtered. Cachopo uses this on start.
Replay changes
Clients can “rewind” the feed to a given change. Cachopo uses this to recover from disconnections so that no change is lost.
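As a rough sketch of how "rewinding" can be expressed, CouchDB's `_changes` endpoint accepts a `since` parameter (a sequence value, or `now` for only new changes) and a `feed=continuous` mode. The module name `ChangesUrlSketch` and the choice of `include_docs` are assumptions for illustration; the parameters the real adapter sends are not shown in these slides.

```elixir
defmodule ChangesUrlSketch do
  # Hypothetical helper: builds a continuous _changes URL that starts at `last_seq`.
  # Passing the last sequence seen before a disconnection replays everything after it.
  def changes_url_from(db_url, last_seq \\ "now") do
    query = URI.encode_query(feed: "continuous", include_docs: "true", since: last_seq)
    String.trim_trailing(db_url, "/") <> "/_changes?" <> query
  end
end

IO.puts(ChangesUrlSketch.changes_url_from("http://localhost:5984/test"))
IO.puts(ChangesUrlSketch.changes_url_from("http://localhost:5984/test", "42"))
```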
35. CouchDB adapter. Three implementation highlights.
1. HTTPoison streaming mode
The changes feed may be infinite, so Cachopo parses it line by line.
2. Stream API for changes feed
Store changes in the Changes Consumer and read them using familiar iteration.
3. Connection behaviour
Simplifies connect/disconnect management and backoffs in the Changes Consumer. Built on top of GenServer.
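Parsing a streamed response line by line means coping with chunks that end in the middle of a line. The following is a minimal sketch of that buffering, not Cachopo's actual parser; the module name `LineBufferSketch` is hypothetical, and treating blank lines as heartbeats to be skipped is an assumption.

```elixir
defmodule LineBufferSketch do
  # Appends `chunk` to the leftover `buffer` and returns
  # {complete_lines, unfinished_remainder}.
  def feed(buffer, chunk) do
    parts = String.split(buffer <> chunk, "\n")
    # The last part is whatever came after the final newline (possibly "").
    {lines, [rest]} = Enum.split(parts, -1)
    # Drop blank lines (assumed here to be heartbeats).
    {Enum.reject(lines, &(&1 == "")), rest}
  end
end

# The first chunk ends mid-line, so the partial line is kept as the remainder.
{lines, rest} = LineBufferSketch.feed("", "{\"seq\":1}\n{\"se")
IO.inspect({lines, rest})

# The next chunk completes the line; the trailing blank line is skipped.
{lines, rest} = LineBufferSketch.feed(rest, "q\":2}\n\n")
IO.inspect({lines, rest})
```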
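36. Highlight 1: HTTPoison streaming mode
    defmodule Cachopo.Adapter.Couchdb.Client do
      # Starts following the changes feed from `last_seq` ("now" by default).
      def follow(%{url: url,
                   http_client: http_client,
                   http_client_options: opts},
                 last_seq \\ "now",
                 handler \\ self()) do
        url = changes_url_from(url, last_seq)
        # stream_to: response chunks are sent to `handler` as messages.
        # recv_timeout: :infinity keeps the long-lived feed from timing out.
        opts = opts ++ [stream_to: handler,
                        recv_timeout: :infinity]
        http_client.get(url, [], opts)
      end
    end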
38. Highlight 1: HTTPoison streaming mode
    defmodule Cachopo.Adapter.Couchdb.Changes.Consumer do
      def handle_info(message, %{client: client} = state) do
        case client.handle_async_message(message) do
          ...
          # parses and buffers or delivers changes
          ...
        end
      end
    end
39. Stream API for changes feed. Stream.resource/3
    Stream.resource(
      start_fun,
      next_fun,
      after_fun
    )
start_fun
Starts a new consumer process that reads and buffers changes.
next_fun
Retrieves a batch of buffered changes from the consumer process. If the buffer is empty, the requester is left waiting and the next change is delivered to it as soon as it arrives.
after_fun
No-op.
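The three functions can be seen working together in a small, self-contained sketch. Here an `Agent` stands in for the consumer process and the module name `BufferedStreamSketch` is hypothetical. Unlike the real feed, this sketch halts when the buffer is empty instead of waiting, and its `after_fun` stops the buffer process rather than being a no-op.

```elixir
defmodule BufferedStreamSketch do
  # Returns a stream over `changes`, handed out in batches by a buffer process.
  def stream(changes, batch_size \\ 2) do
    Stream.resource(
      # start_fun: start the process that holds the buffered changes.
      fn ->
        {:ok, pid} = Agent.start_link(fn -> changes end)
        pid
      end,
      # next_fun: pop a batch from the buffer; halt once nothing is left.
      fn pid ->
        case Agent.get_and_update(pid, &Enum.split(&1, batch_size)) do
          [] -> {:halt, pid}
          batch -> {batch, pid}
        end
      end,
      # after_fun: clean up the buffer process.
      fn pid -> Agent.stop(pid) end
    )
  end
end

# Consumers use ordinary Enum/Stream functions and never see the batching.
BufferedStreamSketch.stream([1, 2, 3, 4, 5])
|> Enum.map(&(&1 * 10))
|> IO.inspect()
```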
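40. Highlight 2: Stream API for changes feed
    defmodule Cachopo.Adapter.Couchdb.Changes do
      # start_fun: starts a consumer process; its pid becomes the stream accumulator.
      defp start_fun(sup_mod, sup, config) do
        fn ->
          {:ok, pid} = sup_mod.start_changes_consumer(sup, config)
          pid
        end
      end
      # next_fun: pops the buffered changes and emits them as the next elements.
      defp next_changes_fun(consumer_mod) do
        fn pid ->
          changes = consumer_mod.pop_many(pid)
          {changes, pid}
        end
      end
    end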
41. Highlight 2: Stream API for changes feed
    defmodule Cachopo.Adapter.Couchdb.Changes.Consumer do
      def handle_call(:pop_many, from, state) do
        case State.pop_all_changes(state) do
          {:ok, changes, new_state} ->
            {:reply, changes, new_state}
          :empty ->
            # Keep the caller waiting: store `from` so that it gets a fast
            # response when data arrives.
            {:noreply, State.set_waiting(state, from)}
        end
      end
    end
42. Highlight 2: Stream API for changes feed
    defmodule Cachopo.Adapter.Couchdb.Changes.Consumer do
      defp process(change, seq_num, state) do
        # Store the sequence number to allow recovery.
        state = State.set_seq_num(state, seq_num)
        case State.pop_waiting(state) do
          {:ok, from, new_state} ->
            # If a client was waiting, reply to it.
            GenServer.reply(from, [change])
            new_state
          :empty ->
            # Otherwise, just buffer the change.
            State.push_change(state, change)
        end
      end
    end
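The "keep the caller waiting, reply later" pattern from the last two slides can be sketched as one runnable GenServer. The module `WaitingBufferSketch` and its plain-map state are hypothetical stand-ins for the Consumer and its `State` module, and the sketch supports only a single waiting caller.

```elixir
defmodule WaitingBufferSketch do
  use GenServer

  def start_link(), do: GenServer.start_link(__MODULE__, :ok)
  def pop_many(pid), do: GenServer.call(pid, :pop_many, 5_000)
  def push(pid, change), do: GenServer.cast(pid, {:push, change})

  @impl true
  def init(:ok), do: {:ok, %{changes: [], waiting: nil}}

  @impl true
  # Nothing buffered: remember the caller and do not reply yet.
  def handle_call(:pop_many, from, %{changes: []} = state),
    do: {:noreply, %{state | waiting: from}}

  # Something buffered: reply with everything, oldest first.
  def handle_call(:pop_many, _from, %{changes: changes} = state),
    do: {:reply, Enum.reverse(changes), %{state | changes: []}}

  @impl true
  # No caller is waiting: buffer the change.
  def handle_cast({:push, change}, %{waiting: nil} = state),
    do: {:noreply, %{state | changes: [change | state.changes]}}

  # A caller is waiting: answer it immediately with the new change.
  def handle_cast({:push, change}, %{waiting: from} = state) do
    GenServer.reply(from, [change])
    {:noreply, %{state | waiting: nil}}
  end
end

{:ok, pid} = WaitingBufferSketch.start_link()
WaitingBufferSketch.push(pid, :a)
WaitingBufferSketch.push(pid, :b)
IO.inspect(WaitingBufferSketch.pop_many(pid))
```

Returning `{:noreply, state}` from `handle_call/3` and answering later with `GenServer.reply/2` is what lets the buffer stay responsive while a caller blocks.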
43. Highlight 3: Connection behaviour
    defmodule Cachopo.Adapter.Couchdb.Changes.Consumer do
      use Connection
      # Returning {:connect, ...} from init makes Connection call connect/2.
      def init({config, client}),
        do: {:connect, :init, State.new(config, client)}
      # Follow the changes feed; on failure, retry after `backoff` milliseconds.
      def connect(_source, %{backoff: backoff} = state) do
        case follow(state) do
          {:ok, _info} -> {:ok, state}
          {:error, _reason} -> {:backoff, backoff, state}
        end
      end
    end
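To show roughly what the Connection behaviour takes care of, here is a sketch of the same retry-with-backoff idea written with a plain GenServer and `Process.send_after/3`. It is not how the Connection library is implemented; `BackoffSketch` and the `connect_fun` argument are hypothetical.

```elixir
defmodule BackoffSketch do
  use GenServer

  # `connect_fun` is a hypothetical zero-arity function returning :ok or {:error, reason}.
  def start_link(connect_fun, backoff_ms) do
    state = %{connect: connect_fun, backoff: backoff_ms, connected: false}
    GenServer.start_link(__MODULE__, state)
  end

  def connected?(pid), do: GenServer.call(pid, :connected?)

  @impl true
  # Try to connect right after init, without blocking start_link.
  def init(state), do: {:ok, state, {:continue, :connect}}

  @impl true
  def handle_continue(:connect, state), do: {:noreply, try_connect(state)}

  @impl true
  def handle_info(:connect, state), do: {:noreply, try_connect(state)}

  @impl true
  def handle_call(:connected?, _from, state), do: {:reply, state.connected, state}

  defp try_connect(%{connect: connect, backoff: backoff} = state) do
    case connect.() do
      :ok ->
        %{state | connected: true}

      {:error, _reason} ->
        # Schedule another attempt after the backoff interval.
        Process.send_after(self(), :connect, backoff)
        state
    end
  end
end

# A fake connection that fails twice and then succeeds.
{:ok, attempts} = Agent.start_link(fn -> 0 end)

connect = fn ->
  n = Agent.get_and_update(attempts, &{&1 + 1, &1 + 1})
  if n < 3, do: {:error, :econnrefused}, else: :ok
end

{:ok, conn} = BackoffSketch.start_link(connect, 10)
Process.sleep(100)
IO.inspect(BackoffSketch.connected?(conn))
```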
45. Request times. CouchDB on localhost, no load, small documents.
Couchdb endpoint
[info] GET /couch/test
[info] Sent 200 in 25ms
[info] GET /couch/test
[info] Sent 200 in 1ms
[info] GET /couch/test
[info] Sent 200 in 1ms
[info] GET /couch/test
[info] Sent 200 in 1ms
Cachopo endpoint
[info] GET /document/test
[info] Sent 200 in 11ms
[info] GET /document/test
[info] Sent 200 in 67µs
[info] GET /document/test
[info] Sent 200 in 65µs
[info] GET /document/test
[info] Sent 200 in 66µs