Couchbase BJ meetup
1. USING COUCHBASE FOR SOCIAL GAME SCALING AND SPEED
Steven Mih, Couchbase Inc.
Bin Cui, Couchbase Inc.
2. Agenda
• Introduction
• What is Couchbase Server?
– Simple, Fast, Elastic
– Technology Overview (Architecture, data flow, rebalancing)
• Tribal Crossing Inc: Animal Party
– Challenges before Couchbase
• Original Architecture
– Why Couchbase?
• Simplicity
• Performance
• Flexibility
– Deploying Couchbase
• New Architecture
• EC2
• Data Model
• Accessing data in Couchbase
• Product Roadmap
• Q&A
3. Couchbase Inc.
• Membase and CouchOne have merged to form Couchbase
Inc. (headquartered in Silicon Valley)
• Team
– Brings together the creators and core contributors of Memcached,
Membase and CouchDB technologies
– Doubles technical team size, accelerates roadmaps by over a year
• Products
– Couchbase Server (Formerly Membase)
– Couchbase Single Server
– Mobile Couchbase (iPhone and Android)
• Technology
– Most mature, reliable and widely deployed NoSQL technologies
– Fully featured, open source document datastore
– First complete, end-to-end NoSQL database product
4. Modern Interactive Web Application Architecture
Application Scales Out
Just add more commodity web servers
www.facebook.com/animalparty
Load Balancer
Web Servers
Relational Database Scales Up
Get a bigger, more complex server
Database
- Expensive and disruptive sharding
- Doesn’t perform at Web Scale
5. Couchbase Server is a distributed database
Couchbase Web Console
Application user
Web application server
Couchbase Servers
6. Couchbase data layer scales like application logic tier
Data layer now scales with linear cost and constant performance.
Application Scales Out
Just add more commodity web servers
www.facebook.com/animalparty
Load Balancer
Web Servers
Couchbase Servers
Database Scales Out
Just add more commodity data servers
Horizontally scalable, schema-less,
auto-sharding, high-performance at
Web Scale
Scaling out flattens the cost and performance curves.
7. Couchbase Demonstration
• Membase ServerTemplate Demo
– Starting with four database nodes under load
– Dynamically scaling to eight database nodes
– Easy management and monitoring
– Not possible with any other database technology
[Diagram: Application user, Web application server, Couchbase Servers; in EC2 or the datacenter]
8. Couchbase Server is Simple, Fast, Elastic
• Five minutes or less to a working cluster
– Downloads for Windows, Linux and OSX
– Start with a single node
– One button press joins nodes to a cluster
• Easy to develop against
– Just SET and GET – no schema required
– Drop it in. 10,000+ existing applications
already “speak Couchbase” (via memcached)
– Practically every language and application
framework is supported, out of the box
• Easy to manage
– One-click failover and cluster rebalancing
– Graphical and programmatic interfaces
– Configurable alerting
9. Couchbase Server is Simple, Fast, Elastic
• Predictable
– “Never keep an application waiting”
– Quasi-deterministic latency and throughput
• Low latency
– Built-in Memcached technology
– Auto-migration of hot data to lowest latency
storage technology (RAM, SSD, Disk)
– Selectable write behavior – asynchronous,
synchronous (on replication, persistence)
• High throughput
– Multi-threaded
– Low lock contention
– Asynchronous wherever possible
– Automatic write de-duplication
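A minimal sketch of what "automatic write de-duplication" could look like, assuming it means that repeated SETs to the same key are collapsed into one disk write when they arrive before the pending write is flushed. The `WriteQueue` class and its fields are hypothetical and only illustrate the idea; they are not Couchbase internals.

```python
class WriteQueue:
    """Toy persistence queue: keeps only the latest pending value per key."""

    def __init__(self):
        self._pending = {}    # key -> latest value awaiting persistence
        self.disk = {}        # stand-in for on-disk storage
        self.disk_writes = 0  # number of physical writes performed

    def set(self, key, value):
        # A newer SET replaces any value still waiting to be flushed.
        self._pending[key] = value

    def flush(self):
        # Persist each pending key once, then clear the queue.
        for key, value in self._pending.items():
            self.disk[key] = value
            self.disk_writes += 1
        self._pending.clear()


queue = WriteQueue()
for score in (10, 20, 30):
    queue.set("Player1_Score", score)  # three SETs to the same key
queue.flush()
print(queue.disk_writes, queue.disk["Player1_Score"])
```

Under this assumption, a key that is updated many times between flushes costs a single disk write, which is one way a write-heavy game workload can stay cheap on the persistence side.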
10. Couchbase Server is Simple, Fast, Elastic
• Zero-downtime elasticity
– Spread I/O and data across commodity
servers (or VMs)
– Consistent performance with linear cost
– Dynamic rebalancing of a live cluster
• All nodes are created equal
– No special case nodes
– Clone to grow
• Extensible
– Change feeds
– Real-time map-reduce
– RESTful interface for management
11. Proven at Small, and Extra Large Scale
Heroku
• Leading cloud service (PaaS) provider
• Over 150,000 hosted applications
• Couchbase Server serving over 6,200 Heroku customers
Zynga
• Social game leader – FarmVille, Mafia Wars, Empires and Allies, Café World, FishVille
• Over 230 million monthly users
• Couchbase Server is the primary database behind key Zynga properties
13. Couchbase Server Architecture
[Diagram: components of a Couchbase Server node]
• Data Manager
– moxi, port 11211 (memcapable 1.0)
– memcached, port 11210 (memcapable 2.0): protocol listener/sender, engine interface
– Couchbase Storage Engine
• Cluster Manager (Erlang/OTP)
– REST management API/Web UI (http)
– Heartbeat, Process monitor, Configuration manager
– Global singleton supervisor, Rebalance orchestrator, Node health monitor, vBucket state and replication manager
– Diagram labels: "on each node" and "one per cluster"
• Ports: HTTP 8091, Erlang port mapper 4369, distributed Erlang 21100 – 21199
15. Couchbase “write” Data Flow – application view
1. User action results in the need to change the VALUE of KEY
2. Application updates key’s VALUE, performs SET operation
3. Couchbase client hashes KEY, identifies KEY’s master server
4. SET request sent over network to master server
5. Couchbase replicates KEY-VALUE pair, caches it in memory and stores it to disk
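The hashing step of the write flow can be sketched as follows. This is an illustration only: the CRC32 hash, the vBucket count of 1024, the round-robin map, and the node addresses are all assumptions made for the example, not details taken from the slides.

```python
import zlib

NUM_VBUCKETS = 1024  # assumed count, chosen for illustration


def vbucket_for_key(key, num_vbuckets=NUM_VBUCKETS):
    """Hash the key to a vBucket id (CRC32 is an illustrative choice)."""
    return zlib.crc32(key.encode("utf-8")) % num_vbuckets


def build_vbucket_map(servers, num_vbuckets=NUM_VBUCKETS):
    """Assign vBuckets to servers round-robin (hypothetical layout)."""
    return [servers[i % len(servers)] for i in range(num_vbuckets)]


def master_for_key(key, vbucket_map):
    """Look up the master server that owns the key's vBucket."""
    return vbucket_map[vbucket_for_key(key, len(vbucket_map))]


servers = ["node1:11210", "node2:11210"]  # hypothetical node addresses
vbucket_map = build_vbucket_map(servers)
master = master_for_key("Player1_PlantList", vbucket_map)
print(master)
```

Because the key maps to a vBucket rather than directly to a server, the same key always lands on the same vBucket, and only the vBucket-to-server map has to change when the cluster grows.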
16. Couchbase Data Flow – under the hood
1. SET request arrives at KEY’s master server
3. SET acknowledgement returned to application
(Steps 2 and 4 are marked on the diagram without captions.)
[Diagram: Replica Server 1 for KEY, Master server for KEY, and Replica Server 2 for KEY, each with a Listener-Sender, RAM*, Couchbase storage engine, SSD, and Disk]
17. Elasticity - Rebalancing
Before
• Adding Node 3
• Node 3 is in pending state
• Clients talk to Node 1,2 only
[Diagram: Node 1 holds vBuckets 1 to 6, Node 2 holds vBuckets 7 to 12, Node 3 is in pending state]
During
• Rebalancing orchestrator recalculates the vBucket map (including replicas)
• Migrate vBuckets to the new server
• Finalize migration
[Diagram: rebalancing in progress, with two vBucket migrators and the client]
After
• Node 3 is balanced
• Clients are reconfigured to talk to Node 3
[Diagram: Node 1 holds vBuckets 1 to 4, Node 2 holds vBuckets 7 to 10, Node 3 holds vBuckets 5, 6, 11, 12]
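The before/after vBucket layout on this slide can be reproduced with a small sketch. The `rebalance` function below is hypothetical and ignores replicas; it only shows one simple way to recompute a vBucket map so that existing nodes hand over just enough vBuckets to even out the new node.

```python
def rebalance(vbucket_map, new_node):
    """Return (new_map, moves) after adding new_node.

    vbucket_map maps vBucket id -> owning node. Only the vBuckets that
    have to move are reassigned; all others stay where they are.
    """
    owners = {}
    for vb, node in sorted(vbucket_map.items()):
        owners.setdefault(node, []).append(vb)
    owners.setdefault(new_node, [])  # new node starts empty

    target = len(vbucket_map) // len(owners)  # vBuckets per node when even
    new_map = dict(vbucket_map)
    moves = []
    for node in list(owners):
        if node == new_node:
            continue
        while len(owners[node]) > target and len(owners[new_node]) < target:
            vb = owners[node].pop()  # donate the highest-numbered vBucket
            owners[new_node].append(vb)
            new_map[vb] = new_node
            moves.append((vb, node, new_node))
    return new_map, moves


before = {vb: "Node 1" for vb in range(1, 7)}
before.update({vb: "Node 2" for vb in range(7, 13)})
after, moves = rebalance(before, "Node 3")
node3 = sorted(vb for vb, node in after.items() if node == "Node 3")
print(node3)
```

With twelve vBuckets spread over two nodes, adding a third node moves four vBuckets in total, and the other eight never leave their original node.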
18. Data buckets are secure Couchbase “slices”
Application user
Web application server
Bucket 1
Bucket 2
Aggregate Cluster Memory and Disk Capacity
Couchbase data servers
In the data center On the administrator console
19. Couchbase and Hadoop Integration
• Support large-scale analytics on application data by streaming data
from Couchbase to Hadoop
– Real-time integration using Flume
– Batch integration using Sqoop
• Examples
– Various game statistics (e.g., monthly / daily / hourly rankings)
– Analyze game patterns from users to enhance various game metrics
[Diagram: Flume and Sqoop connect to the TAP protocol listener/sender in memcached, which sits on the engine interface and the Couchbase Storage Engine]
21. Tribal Crossing: Challenges
Common steps for scaling up a database:
● Tune queries (indexing, explain query)
● Denormalization
● Cache data (APC / Memcache)
● Tune MySQL configuration
● Replication (read slaves)
Where do we go from here to prepare for the scale of a
successful social game?
22. Tribal Crossing: Challenges
● Write-heavy requests
– Caching does not help
– MySQL / InnoDB limitation (Percona)
● Need to scale drastically overnight
– My Polls – 100 to 1m users over a weekend
● Small team, no dedicated sysadmin
– Focus on what we do best – making games
● Keeping cost down
23. Tribal Crossing: “Old” Architecture and Options
● MySQL with master-to-master replication and
sharding
– Complex to set up, high administration cost
– Requires application level changes
● Cassandra
– High write, but low read throughput
– Live cluster reconfiguration and rebalance is quite complicated
– Eventual consistency puts too much burden on application developers
● MongoDB
– High read/write, but unpredictable latency
– Live cluster rebalance for existing nodes only
– Eventual consistency with slave nodes
24. Tribal Crossing: Why Couchbase Server?
● SPEED, SPEED, SPEED
● Immediate consistency
● Interface is dead simple to use
– We are already using Memcache
● Low sysadmin overhead
● Schema-less data store
● Used and Proven by big guys like Zynga
● … and lastly, because Tribal CAN
– Bigger firms with legacy code base = hard to adapt
– Small team = ability to get on the cutting edge
25. Tribal Crossing: New Challenges With Couchbase
● But, there are some different challenges in
using Couchbase (currently 1.7) to handle the game
data:
– No easy way to query data
– No transaction / rollback
➔ Couchbase Server 2.0 resolves them by using
CouchDB as the underlying database engine
● Can this work for an online game?
– Break out of the old ORM / relational paradigm!
– We are not handling bank transactions
26. Tribal Crossing: Deploying Couchbase in EC2
● Basic production environment setup
● Dev/Stage environment – feel free to install Couchbase on your web server
[Diagram: Web Server (Apache, Client-side Moxi) sends Cluster Mgmt. Requests through a DNS Entry to the Couchbase Cluster (Couchbase … Couchbase)]
27. Tribal Crossing: Deploying Couchbase in EC2
● Amazon Linux AMI, 64-bit, EBS backed instance
● Set up swap space
● Install Couchbase’s Membase Server 1.7
● Access web console http://<hostname>:8091
● Start the new cluster with a single node
● Add the other nodes to the cluster and rebalance
[Diagram: Web Server (Apache, Client-side Moxi) sends Cluster Mgmt. Requests through a DNS Entry to the Couchbase Cluster (Couchbase … Couchbase)]
28. Tribal Crossing: Deploying Couchbase in EC2
Moxi figures out which node in the cluster holds data for a given key.
● On each web server, install Moxi proxy
● Start Moxi by pointing it to the DNS entry you created
● Web apps connect to Moxi that is running locally
$memcache->addServer('localhost', 11211);
[Diagram: Web Server (Apache, Client-side Moxi) sends Cluster Mgmt. Requests through a DNS Entry to the Couchbase Cluster (Couchbase … Couchbase)]
29. Tribal Crossing: Representing Game Data in Couchbase
Use case - simple farming game:
● A player can have a variety of plants on their farm.
● A player can add or remove plants from their farm.
● A player can see what plants are on another player's farm.
30. Tribal Crossing: Representing Game Data in Couchbase
Representing Objects
● Simply treat an object as an associative array
● Determine the key for an object using the class name (or type) of the object and a unique ID
Representing Object Lists
● Denormalization
● Save a comma separated list or an array of object IDs
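A minimal sketch of the two conventions above, using a plain dictionary as a stand-in for a Couchbase bucket. The helper names (`object_key`, `save_object`, `load_object`) are hypothetical, and JSON serialization is an assumption made for the example.

```python
import json

store = {}  # in-memory stand-in for a Couchbase bucket (assumption)


def object_key(type_name, object_id):
    """Build a key from the object's type name and its unique ID."""
    return f"{type_name}{object_id}"


def save_object(type_name, obj):
    """Store the object as a JSON document under its type+ID key."""
    store[object_key(type_name, obj["id"])] = json.dumps(obj)


def load_object(type_name, object_id):
    """Load an object by type and ID, or return None if it is missing."""
    raw = store.get(object_key(type_name, object_id))
    return json.loads(raw) if raw is not None else None


# An object is just an associative array; a list is an array of object IDs.
save_object("Plant", {"id": 100, "name": "Mushroom"})
store["Player1_PlantList"] = json.dumps([100])

plant_ids = json.loads(store["Player1_PlantList"])
plants = [load_object("Plant", plant_id) for plant_id in plant_ids]
print(plants)
```

Keeping the list of IDs in its own document is the denormalization step: the player's farm can be read with one lookup for the list plus one lookup per plant, with no join.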
32. Tribal Crossing: Schema-less Game Data
● No need to “ALTER TABLE”
● Add new “fields” to all objects at any time
– Specify default value for missing fields
– Increased development speed
● Using JSON for data objects though, owing to the
ability to query on arbitrary fields in Couchbase 2.0
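One way to "specify default value for missing fields" is to merge each stored object over a table of defaults when it is loaded. The sketch below assumes that approach; the field names `growth_stage` and `watered` are made up for illustration.

```python
# Hypothetical defaults for fields added after older plants were saved.
PLANT_DEFAULTS = {"name": "Unknown", "growth_stage": 0, "watered": False}


def with_defaults(stored, defaults):
    """Return a copy of the stored object with missing fields filled in."""
    merged = dict(defaults)
    merged.update(stored)  # stored values win over defaults
    return merged


old_plant = {"id": 100, "name": "Mushroom"}  # saved before the new fields existed
plant = with_defaults(old_plant, PLANT_DEFAULTS)
print(plant)
```

Old documents never need to be rewritten in bulk: they pick up the new fields the next time they are read, and can be saved back in the new shape whenever they are next modified.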
33. Tribal Crossing: Accessing Game Data in Couchbase
Get all plants belonging to a given player
Request: GET /player/1/farm
$plant_ids = $couchbase->get('Player1_PlantList');
$response = array();
foreach ($plant_ids as $plant_id)
{
    $plant = $couchbase->get('Plant' . $plant_id);
    $response[] = $plant;
}
echo json_encode($response);
34. Tribal Crossing: Modifying Game Data in Couchbase
Give a player a new plant
// Create the new plant
$new_plant = array (
'id' => 100,
'name' => 'Mushroom'
);
$couchbase->set('Plant100', $new_plant);
// Update the player plant list
$plant_ids = $couchbase->get('Player1_PlantList');
$plant_ids[] = $new_plant['id'];
$couchbase->set('Player1_PlantList', $plant_ids);
35. Tribal Crossing: Concurrency
Concurrency issues can occur when multiple requests are working with the same piece of data.
Solution:
● CAS (check-and-set)
– Client can tell whether someone else has modified the data while it is trying to update
– Implements optimistic concurrency control
● Locking (try/wait cycle)
– GETL (get with lock + timeout) operations
– Pessimistic concurrency control
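The CAS approach can be sketched as a read-modify-write loop that retries when another writer got in first. `InMemoryStore` below is a toy stand-in written for this example, not a real client; its `gets` and `cas` methods only mimic the general shape of check-and-set operations.

```python
class InMemoryStore:
    """Toy key-value store with a per-key CAS version (stand-in for a client)."""

    def __init__(self):
        self._data = {}  # key -> (value, cas_version)

    def gets(self, key):
        """Return (value, cas_version); a missing key has version 0."""
        return self._data.get(key, (None, 0))

    def cas(self, key, value, expected_cas):
        """Write only if the key's version still equals expected_cas."""
        _, current = self._data.get(key, (None, 0))
        if current != expected_cas:
            return False  # someone else modified the key in the meantime
        self._data[key] = (value, current + 1)
        return True


def append_plant(store, list_key, plant_id, max_retries=5):
    """Optimistically append plant_id to the list stored under list_key."""
    for _ in range(max_retries):
        plant_ids, cas = store.gets(list_key)
        updated = list(plant_ids or []) + [plant_id]
        if store.cas(list_key, updated, cas):
            return updated
    raise RuntimeError("too many concurrent updates")


store = InMemoryStore()
append_plant(store, "Player1_PlantList", 100)

# A second request reads the list, then a competing request updates it first.
stale_ids, stale_cas = store.gets("Player1_PlantList")
append_plant(store, "Player1_PlantList", 101)
stale_write_ok = store.cas("Player1_PlantList", stale_ids + [102], stale_cas)
print(stale_write_ok, store.gets("Player1_PlantList")[0])
```

The stale write is rejected instead of silently overwriting plant 101, which is the lost-update problem that a plain get-then-set on the plant list would be exposed to.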
36. Tribal Crossing: Data Relationship
● Record object relationships both ways
– Example: Plots and Plants
● Plot object stores id of the plant that it hosts
● Plant object stores id of the plot that it grows on
– Resolution in case of mismatch
● Don't sweat the extra calls to load data in a one-to-
many relationship
– Use multiGet
– We can still cache aggregated results in a Memcache
bucket if needed
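Loading a one-to-many relationship with a multi-get can be sketched like this. The `get_multi` function is a stand-in written for this example that mimics fetching many keys in one call; the stored data is invented for illustration.

```python
# In-memory stand-in for a bucket; "Plant102" is deliberately missing.
store = {
    "Player1_PlantList": [100, 101, 102],
    "Plant100": {"id": 100, "name": "Mushroom"},
    "Plant101": {"id": 101, "name": "Carrot"},
}


def get_multi(keys):
    """Fetch many keys in one call; keys that do not exist are omitted."""
    return {key: store[key] for key in keys if key in store}


plant_keys = [f"Plant{plant_id}" for plant_id in store["Player1_PlantList"]]
found = get_multi(plant_keys)  # one round trip instead of one per plant

# Preserve list order and skip IDs whose object no longer exists.
plants = [found[key] for key in plant_keys if key in found]
print(plants)
```

Skipping missing keys is one simple way to handle a mismatch between a list and its objects; a real game would also decide whether to repair the list at that point.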
37. Tribal Crossing: Migrating to Couchbase Servers
First migrated large or slow performing tables and frequently updated fields from MySQL to Couchbase
[Diagram: Web Server (Apache + PHP, Client-side Moxi) talks to MySQL and to Couchbase (memcached with TAP protocol listener/sender, engine interface, Couchbase Storage Engine); a TAP Client feeds Reporting Applications]
40. Tribal Crossing: Conclusion
• Significantly reduced the cost incurred by scaling up
database servers and managing them.
• Achieved significant improvements in various performance metrics (e.g., read, write, latency)
• Allowed them to focus more on game development and
optimizing key metrics
• Plan to use real-time MapReduce, querying, and
indexing abilities provided by the upcoming Elastic
Couchbase 2.0
42. Product Roadmap: Couchbase Server 2.0
• Mobile to cloud data synchronization
• Cross data center replication
[Diagram: a US West Coast Data Center and a US East Coast Data Center, each running Couchbase Server, linked by CouchSync; Couchbase Single Server instances connect to each data center through CouchSync, with further CouchSync links below them (…)]
43. Product Roadmap: Couchbase Server 2.0
• Replace SQLite-based storage engine with CouchDB
• Support indexing and querying on values
• Integrate real-time MapReduce into Couchbase server
• SDK for Couchbase server
– Membase Server 1.7: The world’s leading caching and clustering technology
– CouchDB 1.1: The most reliable and full-featured document database
– Couchbase Server 2.0: The fastest, most complete and most reliable database on the planet
44. Couchbase Product Download
• Community Edition
– Open source build
– Free forum support
• Enterprise Edition
– Free for non-production use
– Certified, QA tested version of open source
– Case tracking and guaranteed SLA for production
environments
• Community Sponsors in China
– Interested in being Couchbase Ambassador for Beijing?
45. Q&A
Steven Mih, Couchbase Inc.
(steven@couchbase.com)
Bin Cui, Couchbase Inc.
(bin@couchbase.com)