4. Chartbeat: real-time analytics service
● 18-person startup in New York
● part of Betaworks
● peaking at just under 5M concurrents daily
○ up from 1M in July 2010
5. What Chartbeat Provides
● real-time view of site performance
○ top pages
○ new/returning visitors
○ traffic flow
■ where are people coming from
■ where are people going to
● historical replay for the last 30 days
7. Architecture, Browser
Part 1:
<head>
<script type="text/javascript">var _sf_startpt=(new Date()).getTime()</script>
...
Part 2:
...
<script type="text/javascript">
function loadChartbeat() {
  // create a script element pointing at chartbeat.js and append it,
  // so the tracker loads asynchronously after the page has rendered
}
window.onload = loadChartbeat;
</script>
</body>
(highly simplified)
The ping itself is standard beacon logic, i.e. loading a 1x1 image with the data encoded in its query string.
8. Architecture, Backend
● custom libevent-based C backend
○ real-time collection and aggregation
● real-time system in-memory only
● background queue jobs snapshot every x minutes
○ Gearman (worker sketch below)
● historical data
○ mostly in MongoDB
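A minimal sketch of what one of these snapshot jobs could look like, assuming the python-gearman worker API plus pymongo; the host names, task name, and fetch_live_aggregates() helper are made up for illustration:

import gearman
from pymongo import MongoClient

mongo = MongoClient('mongo-historical')  # hypothetical MongoDB host

def fetch_live_aggregates():
    # placeholder: would really pull the current aggregates out of
    # the in-memory C backend
    return [{'key': 'example.com', 'concurrents': 1234}]

def snapshot(worker, job):
    # called every time a snapshot job is queued
    docs = fetch_live_aggregates()
    if docs:
        mongo.chartbeat.snapshots.insert_many(docs)
    return 'ok'

worker = gearman.GearmanWorker(['localhost:4730'])
worker.register_task('snapshot_aggregates', snapshot)
worker.work()  # block, handling snapshot jobs as they arrive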
9. Why Chartbeat uses MongoDB
● Pure JSON end to end
○ Live API
○ Historical data
○ No mapping back and forth
● Fast Inserts (fire and forget; sketch below)
● Flexible Schema
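"Fire and forget" maps to unacknowledged writes. The pymongo of the era defaulted to this (safe=False); in current pymongo the same behavior is opt-in via a w=0 write concern, roughly:

from pymongo import MongoClient
from pymongo.write_concern import WriteConcern

client = MongoClient('mongodb://localhost')
pings = client.chartbeat.get_collection(
    'pings', write_concern=WriteConcern(w=0))

# returns as soon as the message is buffered; no ack round-trip
pings.insert_one({'host': 'example.com', 'path': '/', 'ts': 1302000000})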
10. Why Chartbeat uses EC2
● Elastic Capacity
● No trips to the datacenter
● EBS snapshots
11. Chartbeat & MongoDB & EC2 (1)
● 3 Clusters
○ 1 for each product
○ 1 as a caching layer
○ 2-4 instances per cluster
● m2.2xlarge
○ 34.2 GB memory
○ Ubuntu 10.04
○ RAID0 across 4 x 1 TB EBS volumes
● Dedicated Snapshot Server
○ Shared among clusters
○ Serves as an arbiter as well
13. MongoDB & EC2 Challenges
● Instances disappear
○ MongoDB can have long recovery operations
○ MongoDB is (was) not ACID compliant; an unclean shutdown could corrupt your data
● Poor IO performance on EBS
○ MongoDB has a global read/write lock
● Variable IO performance on EBS
○ Could cause replication issues
17. Instances Disappearing - Replica Sets
● No downtime :) yay!
● Automatic failover on writes
● Eventual failover on reads
● No code change
18. Instances Disappearing - Replica Sets (caveats)
● stock pymongo driver reads from and writes to the primary only
○ pymongo 2.1 will fix this
● Chartbeat's custom pymongo driver
○ based on MasterSlaveConnection
○ writes to the primary
○ distributes reads among secondaries
○ automatic failover
○ eventual read re-distribution
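The replica-set-aware connections and read preferences that later landed in pymongo cover the same ground; a rough modern equivalent of the custom driver's behavior (host names assumed):

from pymongo import MongoClient, ReadPreference

# the driver discovers the set, sends writes to the primary, and
# fails over automatically when the primary changes
client = MongoClient('mongodb://node1,node2,node3/?replicaSet=rs0')

# spread reads across secondaries, falling back to the primary
events = client.chartbeat.get_collection(
    'events', read_preference=ReadPreference.SECONDARY_PREFERRED)
events.find_one({'key': 'example.com'})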
19. Instances Disappearing - Fact of Life
● Accept this fact of life
● Always snapshot
○ Dedicated snapshot server
○ Hidden, i.e. no reads
● Automate everything
○ puppet
■ New instance from scratch within a minute
○ python-boto
■ Script all EC2 interaction
■ new_instance.py
■ mount_volumes_from_snap.py -o iid -n iid
■ snapshot_mongo.py
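A sketch of what a script like snapshot_mongo.py might do with the old boto API: lock the hidden snapshot member, snapshot each EBS volume behind the RAID0 set, then unlock. Volume IDs, region, and host name are placeholders:

import boto.ec2
from pymongo import MongoClient

VOLUMES = ['vol-aaaa', 'vol-bbbb', 'vol-cccc', 'vol-dddd']  # RAID0 members

mongo = MongoClient('snapshot-server')    # the hidden, read-free member
mongo.admin.command('fsync', lock=True)   # flush to disk and block writes
try:
    ec2 = boto.ec2.connect_to_region('us-east-1')
    for vol_id in VOLUMES:
        ec2.create_snapshot(vol_id, description='mongo raid0 snapshot')
finally:
    # fsyncUnlock is the MongoDB 3.2+ spelling; pymongo 2.x had unlock()
    mongo.admin.command('fsyncUnlock')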
20. Instances Disappearing - Caveats
● New volumes - slow!!!
○ EBS loads blocks lazily
● Warm up EBS & File Cache before use
○ Options
■ Slowly direct the reads (app by app)
■ Run cache warm-up scripts
○ Not automated currently
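One shape a warm-up script could take: sequentially read the entire device so EBS is forced to fetch every lazily-loaded block, the Python equivalent of dd if=/dev/xvdf of=/dev/null. The device path is an example:

import sys

CHUNK = 4 * 1024 * 1024  # 4 MB sequential reads

def warm(device):
    # touch every block so EBS pages it in before production reads
    with open(device, 'rb') as f:
        while f.read(CHUNK):
            pass

if __name__ == '__main__':
    warm(sys.argv[1])  # e.g. python warm_ebs.py /dev/xvdf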
22. Poor IO Performance on EBS
● XFS & RAIDing helps, but:
● Disk IO varies over time
● MongoDB holds global lock on writes
● Query of death
○ Can grind everything to a halt if not careful
23. Case Study: Historical Data
● For historical data, we store time series:
{
  key: <key>,
  ts: <timestamp>,
  values: {metric1: int1, metric2: int2},
  meta: {}
}
● High Insert Rate vs Fast Historical Read
○ Optimize reads or writes?
● Fast inserts: ~1 MB/sec (append-only)
○ No disk seeks
● Historical reads: painfully slow
24. Faster Reads Through Cache DB
● Avoid reading from disk
● Favor reads over writes
● Aim for disk & memory locality
{
  day_tskey: <key>,
  values: {metric1: list(int), metric2: list(int)}
}
● Data for historical reads resides together
● Appending to a list ($push) could cause disk fragmentation
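For contrast, the naive append is a $push per sample; every push grows the document, and the mmap-era storage engine relocates documents that outgrow their slot, fragmenting the data files (names illustrative):

from pymongo import MongoClient

cache = MongoClient().chartbeat.cache

# one $push per sample: simple, but the document grows on every write
cache.update_one(
    {'_id': '20110405_example.com'},
    {'$push': {'values.metric1': 42}},
    upsert=True)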
25. Avoid Fragmentation w/ Preallocation
● Fragmentation causes:
○ Inefficient disk usage
○ Slower writes (due to block allocation)
● Preallocate daily arrays instead
○ Pros:
■ No fragmentation
■ Writes cause no change in document size
○ Cons:
■ Wasteful (we don't know keys ahead of time)
■ Requires heavy disk IO, ~7 MB/sec (~60 Mbit/sec on EBS)
● Conclusion: spread preallocation over 1 hour
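A sketch of the preallocation pattern under these assumptions (one document per key per day, one array slot per minute; all names illustrative). Writes become in-place $set operations, so documents never grow or move:

from pymongo import MongoClient

SLOTS = 24 * 60  # one slot per minute of the day
cache = MongoClient().chartbeat.cache

def preallocate(day, key, metrics=('metric1', 'metric2')):
    # insert the full-size document up front; spreading these inserts
    # over the hour smooths out the heavy preallocation IO
    cache.insert_one({
        '_id': '%s_%s' % (day, key),
        'values': {m: [0] * SLOTS for m in metrics},
    })

def record(day, key, minute, metric, value):
    # positional $set: same-size overwrite, no growth, no document move
    cache.update_one(
        {'_id': '%s_%s' % (day, key)},
        {'$set': {'values.%s.%d' % (metric, minute): value}})

preallocate('20110405', 'example.com')
record('20110405', 'example.com', 600, 'metric1', 42)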
27. EC2 Unpredictability - Challenges
● Resource contention in virtualized environment
● EBS and Network IO performance varies drastically
● RAID0 over 4 disks = 4 x risk
28. Heavy Monitoring (1)
● Track individual disk performance over time
● Create a new instance if a disk isn't getting better
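Per-disk behavior can be sampled straight from /proc/diskstats; a minimal sketch that estimates how busy each RAID member was over an interval:

import time

def io_ms():
    # field 13 of /proc/diskstats: ms the device has spent doing IO
    out = {}
    with open('/proc/diskstats') as f:
        for line in f:
            fields = line.split()
            out[fields[2]] = int(fields[12])
    return out

INTERVAL = 10
before = io_ms()
time.sleep(INTERVAL)
after = io_ms()
for dev, ms in sorted(after.items()):
    busy = 100.0 * (ms - before.get(dev, 0)) / (INTERVAL * 1000)
    print('%-10s %5.1f%% busy' % (dev, busy))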
29. Heavy Monitoring (2)
● Monitor replication lag
● Remove from read mix if lag gets too high
○ Incorrect data
○ Strain on primary
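Lag can be computed from replSetGetStatus by comparing each secondary's optime to the primary's; roughly (hosts assumed):

from pymongo import MongoClient

client = MongoClient('mongodb://node1,node2,node3/?replicaSet=rs0')
status = client.admin.command('replSetGetStatus')

primary = next(m for m in status['members'] if m['stateStr'] == 'PRIMARY')
for m in status['members']:
    if m['stateStr'] == 'SECONDARY':
        lag = (primary['optimeDate'] - m['optimeDate']).total_seconds()
        # past some threshold, pull the node out of the read mix
        print('%s lagging %.0fs' % (m['name'], lag))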
30. Heavy Monitoring (3)
● Track slow queries, opcounts, page faults, and IO volume
○ Tweak indexes accordingly
○ Limit requested data size if you can
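Opcounts and page faults are both exposed by the serverStatus command; a small sampling sketch:

from pymongo import MongoClient

client = MongoClient()
s = client.admin.command('serverStatus')
print(s['opcounters'])                 # insert/query/update/delete/getmore
print(s['extra_info']['page_faults'])  # page faults since startup

# with profiling enabled, slow queries accumulate in system.profile
print(client.chartbeat.system.profile.count_documents({}))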
31. Open Issues
● More granular page-fault / memory usage information
○ Difficult due to mmap
● Multi-datacenter usage
● Burn-in scripts
● Sharding
○ Tipping point will be insert volume
○ Or inefficient read memory usage
● Better understand replication failures