ArangoDB recently integrated its cluster management with Apache Mesos. This makes it possible to launch an ArangoDB cluster on a Mesos cluster with a single, albeit complex, shell command. In a DCOS-enabled Mesosphere cluster this is even easier, because one can use the dcos subcommand for ArangoDB. DCOS essentially turns a Mesosphere cluster into a single, large computer.
In this talk I explain the whole setup and show (live on stage) how to deploy ArangoDB clusters on Google Compute Engine, and how we used this to scale ArangoDB up until it could sustain 1,000,000 document writes per second.
3. What did we just see?
The future of distributed computing!
I used a CLI on my laptop,
to start up an ArangoDB cluster (a distributed application)
on a bunch of AWS instances in the cloud,
running the Mesosphere Data Center Operating System (DCOS).
4. Data Center Operating Systems
Single machine  | Cluster
multiple cores  | multiple nodes
shared memory   | distributed memory
local disks     | distributed local disks
sockets         | switched network
systemd         | Marathon
crond           | Chronos
processes       | Frameworks/Services
Why?
more convenient cloud/distributed computing
better resource utilization (industry standard is 12%-15%)
5. Features
ArangoDB
is a multi-model database (document store & graph database),
offers convenient queries (via HTTP/REST and AQL),
including joins between different collections,
has configurable consistency guarantees using transactions,
and has an API extensible by JS code in the Foxx Microservice Framework.
6. Replication and Sharding — horizontal scalability
ArangoDB provides
easy setup of (asynchronous) replication,
sharding with automatic data distribution
MongoDB-style replication in the cluster,
full integration with Apache Mesos and Mesosphere.
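To make "sharding with automatic data distribution" concrete, here is a minimal sketch of hash-based sharding. It is an illustration only, not ArangoDB's actual algorithm: the function names, the use of MD5, and the modulo placement are all assumptions chosen for brevity.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a document key to a shard index using a stable hash (illustrative)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    # Use the first 8 bytes as an integer so the result is stable across runs.
    return int.from_bytes(digest[:8], "big") % num_shards


def distribute(keys, num_shards):
    """Group keys by the shard they hash to."""
    shards = {i: [] for i in range(num_shards)}
    for key in keys:
        shards[shard_for(key, num_shards)].append(key)
    return shards


# Hypothetical document keys spread over 4 shards.
placement = distribute([f"doc{i}" for i in range(1000)], 4)
for shard, docs in placement.items():
    print(shard, len(docs))
```

Because the shard is derived from the key alone, any coordinator can compute where a document lives without a lookup table.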
Work in progress:
synchronous replication in cluster mode,
fault tolerance by automatic failover and
zero administration by a self-repairing and self-balancing cluster architecture,
all based on the Apache Mesos infrastructure.
10. (Diagram)
Components: dcos CLI, Marathon, Mesos Master, Mesos Agent, Zookeeper, Framework, Task.
Arrow labels: starts (this is a lie); schedules frameworks; registers; stores state.
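11. (Diagram)
Components: dcos CLI, Marathon, Mesos Master, Mesos Agent, Zookeeper, Framework, Task.
Resource offer cycle:
1. The Mesos Agent reports free resources to the Mesos Master.
2. The Mesos Master makes resource offers to the Framework.
3. The Framework accepts or declines them.
4. The Mesos Master tells the Mesos Agent to execute the Task.
Other arrow labels: starts (this is a big lie); schedules frameworks; executes (this is a small lie).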
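12. (Diagram)
Components: dcos CLI, Marathon, Mesos Master, Mesos Agent, Zookeeper, Framework, Task.
Resource offer cycle:
1. The Mesos Agent reports free resources to the Mesos Master.
2. The Mesos Master makes resource offers to the Framework.
3. The Framework accepts or declines them.
4. The Mesos Master tells the Mesos Agent to execute the Task.
Other arrow labels:
starts (this is a big lie): actually, Marathon is a framework.
executes (this is a small lie): actually, it uses an "executor".
schedules frameworks.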
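The resource offer cycle in the diagram (report free resources, make offers, accept or decline, tell the agent to execute) can be sketched as a toy simulation. This is not the Mesos API: the classes `Agent`, `TaskRequest`, `Framework` and `Master`, and the CPU-only resource model, are hypothetical simplifications.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    name: str
    free_cpus: float
    running: list = field(default_factory=list)


@dataclass
class TaskRequest:
    name: str
    cpus: float


class Framework:
    """Hypothetical scheduler: launches pending tasks that fit into an offer."""

    def __init__(self, pending):
        self.pending = list(pending)

    def consider(self, offered_cpus):
        """Step 3: return the tasks to launch; an empty list declines the offer."""
        accepted = []
        for task in list(self.pending):
            if task.cpus <= offered_cpus:
                accepted.append(task)
                offered_cpus -= task.cpus
                self.pending.remove(task)
        return accepted


class Master:
    def __init__(self, agents):
        self.agents = agents

    def offer_round(self, framework):
        for agent in self.agents:
            # Steps 1 and 2: the agent's free resources become an offer.
            tasks = framework.consider(agent.free_cpus)
            # Step 4: tell the agent to execute whatever was accepted.
            for task in tasks:
                agent.free_cpus -= task.cpus
                agent.running.append(task.name)


agents = [Agent("agent-1", 2.0), Agent("agent-2", 4.0)]
framework = Framework([TaskRequest("dbserver", 2.0), TaskRequest("coordinator", 1.0)])
Master(agents).offer_round(framework)
for agent in agents:
    print(agent.name, agent.running, agent.free_cpus)
```

The point of the design is that the master never decides placement itself; it only brokers offers, and each framework applies its own policy in `consider`.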
15. (Diagram)
Components: dcos CLI, Marathon, Mesos Master, Mesos Agent, Zookeeper, Framework, Task.
Arrow labels: schedules frameworks; restarts; gets state and reconciles; reconnects.
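The "gets state and reconciles" step can be pictured as comparing the task states a restarted framework had persisted with the states reported after it reconnects. The sketch below is an assumption about what such a comparison might look like; the function `reconcile`, the plain-string states and the task names are hypothetical and not taken from the ArangoDB framework.

```python
def reconcile(persisted, reported):
    """Compare persisted task states with freshly reported ones.

    persisted: task id -> state the framework stored before it failed.
    reported:  task id -> state reported after the framework reconnects.
    Returns (tasks to relaunch, tasks the framework does not know about).
    """
    to_relaunch = sorted(
        task_id
        for task_id, state in persisted.items()
        if state == "RUNNING" and reported.get(task_id) != "RUNNING"
    )
    unknown = sorted(task_id for task_id in reported if task_id not in persisted)
    return to_relaunch, unknown


# Illustrative state only; the task names are made up.
persisted = {"dbserver-1": "RUNNING", "dbserver-2": "RUNNING", "coordinator-1": "RUNNING"}
reported = {"dbserver-1": "RUNNING", "coordinator-1": "LOST", "stray-7": "RUNNING"}
print(reconcile(persisted, reported))
```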
16. Deployment
Docker and GitHub
One container image, arangodb/arangodb-mesos, is used to run
the ArangoDB framework (C++ executable),
all ArangoDB instances in the cluster,
the Agency (etcd).
The dcos CLI by Mesosphere is a Python program (virtualenv, pip).
ArangoDB subcommand: a Python program that talks JSON/REST with the
framework, plugs into dcos, and is deployed from a GitHub repository.
The GitHub repository mesosphere/universe has all certified frameworks.
17. Scaling ArangoDB
Ultimate aim with a distributed database: horizontal scalability.
Devise a test to show linear scaling:
use N = 8, 16, 24, 32, 40, 48, 56, 64, 72, 80 nodes with 8 vCPUs each,
run N/2 DBServers, N/2 asynchronous replicas and N/2 Coordinators,
use single-document reads, writes, and a 50%/50% mix,
from N/2 load servers in the same Mesosphere cluster.
With up to 640 vCPUs, we want to write as many k docs/(s * vCPU) as possible.
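The test layout and the normalised metric k docs/(s * vCPU) follow directly from the numbers above. The sketch below computes them; the function names are hypothetical and the throughput value in the usage lines is a made-up placeholder, not a measured result.

```python
VCPUS_PER_NODE = 8
NODE_COUNTS = range(8, 81, 8)  # N = 8, 16, ..., 80


def layout(n):
    """Instance counts for a cluster of n nodes, as described in the test plan."""
    half = n // 2
    return {
        "nodes": n,
        "vcpus": n * VCPUS_PER_NODE,
        "dbservers": half,
        "replicas": half,
        "coordinators": half,
        "load_servers": half,
    }


def k_docs_per_second_per_vcpu(docs_per_second, n):
    """Normalise raw throughput to thousands of documents per second per vCPU."""
    return docs_per_second / 1000 / (n * VCPUS_PER_NODE)


for n in NODE_COUNTS:
    print(layout(n))

# Placeholder throughput, for illustration only.
print(k_docs_per_second_per_vcpu(640_000, 80))
```

Linear scaling means this per-vCPU figure stays roughly constant as N grows.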
18. Deployment of load servers
Docker and ArangoDB
Use a central ArangoDB instance to
collect results,
evaluate them,
and synchronise load servers.
Each load server runs the Waiter in a Docker container.
The Waiter waits, most of the time,
observes a collection and notices new "work" documents,
fires up load processes,
reports termination as a "done" document.
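The Waiter's behaviour (wait, notice new "work" documents, run the load, report a "done" document) can be sketched as a polling loop. The real Waiter is not shown in the slides, so everything here is an assumption: a plain Python list stands in for the ArangoDB collection, and `waiter_step`, `waiter_loop`, `run_load` and the document fields are hypothetical.

```python
import time


def waiter_step(collection, seen, run_load):
    """One polling pass: process unseen "work" documents, append "done" documents."""
    new_work = [
        doc for doc in collection
        if doc["type"] == "work" and doc["id"] not in seen
    ]
    for doc in new_work:
        seen.add(doc["id"])
        result = run_load(doc)  # fire up the load process for this job
        collection.append({"type": "done", "work_id": doc["id"], "result": result})
    return len(new_work)


def waiter_loop(collection, run_load, poll_seconds=1.0, max_polls=None):
    """Poll the collection; sleep whenever there is nothing new to do."""
    seen = set()
    polls = 0
    while max_polls is None or polls < max_polls:
        if waiter_step(collection, seen, run_load) == 0:
            time.sleep(poll_seconds)
        polls += 1


# Usage with an in-memory stand-in for the central collection.
jobs = [{"type": "work", "id": 1, "requests": 10}]
waiter_loop(jobs, run_load=lambda doc: doc["requests"], poll_seconds=0.01, max_polls=2)
print(jobs[-1])
```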
A single JavaScript program directs the whole experiment.
We deploy the Waiter using Marathon.
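A Marathon deployment is described by a JSON app definition submitted to Marathon's REST endpoint `/v2/apps`. The sketch below only builds such a definition. The app id, image name and resource figures are hypothetical placeholders, not the values used in the experiment.

```python
import json


def waiter_app(instances, image="example/waiter", cpus=1.0, mem=512):
    """Build a Marathon app definition for the Waiter (all values are placeholders)."""
    return {
        "id": "/waiter",
        "instances": instances,  # one Waiter per load server, i.e. N/2
        "cpus": cpus,
        "mem": mem,              # memory in MB
        "container": {
            "type": "DOCKER",
            "docker": {"image": image, "network": "BRIDGE"},
        },
    }


# For N = 8 nodes the test plan calls for N/2 = 4 load servers.
print(json.dumps(waiter_app(instances=4), indent=2))
```

Scaling the number of load servers between runs then amounts to changing `instances` and resubmitting the definition.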