Go & Uber’s time series database M3
Rob Skillington, Senior Engineer, Observability
April 26, 2016

Rob Skillington on Go and Uber's time series database M3 at the NYC Golang Meetup, April 26, 2016


  1. Go & Uber’s time series database M3. Rob Skillington, Senior Engineer, Observability. April 26, 2016
  2. What is M3?
     ● M3 is Uber’s proprietary metrics platform, built entirely in NYC
     ● Graphite was the metrics system used at Uber
       ○ https://github.com/graphite-project/graphite-web
     ● Graphite was used for a long time, however it had:
       ○ Poor resiliency, clustering and efficiency, and no replication
       ○ Extremely high operational cost to expand capacity
     ● M3 provides a Graphite-compatible query interface and its own query interface, M3QL
       ○ Limitations of Graphite expressions:
         ■ Read inside-out, opposite to the flow of execution
         ■ Grouping by position in path, which is not flexible enough
       ○ M3QL is a pipe-based computation language for filtering and grouping by tags
  3. What do Graphite and M3QL queries look like?
     ● Graphite
       ○ stats.counts.cn.requests.arbiter
       ○ movingAverage(transformNull(stats.counts.cn.requests.arbiter, 0), '5min')
     ● M3QL
       ○ fetch name:requests caller:cn target:arbiter
       ○ fetch name:requests caller:cn target:arbiter | transformNull 0 | movingAverage 5min
       ○ fetch name:requests caller:cn | sum target | sort max | head 5
       ○ fetch name:errors caller:cn | sum target | asPercent (fetch name:requests caller:cn | sum target)
  4. What is M3 used for: monitoring
  5. What is M3 used for: alerting
  6. M3 services and storage
     ● Needs to support extremely high write throughput (millions of writes/s)
     ● Needs to support subsecond queries (millions of reads/s)
     ● Storage is a hybrid of an in-house, in-memory replicated TSDB and Cassandra with DateTieredCompactionStrategy
     ● Query execution is resolve and execute
       ○ Resolve: look up the list of time series required by the query in the Elasticsearch index
       ○ Execute: fetch the series from hybrid storage and execute functions
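The resolve and execute split can be sketched with in-memory stand-ins. Everything here is hypothetical: `index` plays the role of the Elasticsearch tag index, `storage` plays the role of the hybrid store, and the executed function is a simple pointwise sum. The sketch assumes each series ID appears at most once per tag.

```go
package main

import (
	"fmt"
	"sort"
)

// index maps a tag pair such as "caller:cn" to the IDs of series carrying it.
type index map[string][]string

// storage maps a series ID to its datapoints.
type storage map[string][]float64

// resolve returns the sorted IDs of series that carry every requested tag.
func resolve(idx index, tags ...string) []string {
	counts := map[string]int{}
	for _, t := range tags {
		for _, id := range idx[t] {
			counts[id]++
		}
	}
	var ids []string
	for id, n := range counts {
		if n == len(tags) {
			ids = append(ids, id)
		}
	}
	sort.Strings(ids)
	return ids
}

// execute fetches the resolved series and applies a function, here a pointwise sum.
func execute(st storage, ids []string) []float64 {
	var out []float64
	for _, id := range ids {
		for i, v := range st[id] {
			if i == len(out) {
				out = append(out, 0)
			}
			out[i] += v
		}
	}
	return out
}

func main() {
	idx := index{
		"name:requests": {"a", "b", "c"},
		"caller:cn":     {"a", "b"},
	}
	st := storage{"a": {1, 2}, "b": {10, 20}, "c": {100, 200}}
	ids := resolve(idx, "name:requests", "caller:cn")
	fmt.Println(ids, execute(st, ids))
}
```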
  7. M3 high level architecture (diagram). Components shown: three Hosts, Collector (agent running on fleet), Aggregation Tier, Ingestion Services & Pipeline, Hybrid Storage (in-memory & Cassandra), Index (Elasticsearch), Query Service
  8. M3 and Go instrumentation: pprof and go-torch
     ● The Go profiling tool pprof and github.com/uber/go-torch are good for CPU profiling
     ● Once the pprof HTTP endpoints are installed, profiling is as simple as:
       ○ go-torch --time=15 --file "torch.svg" --url http://localhost:8080
  9. M3 and Go instrumentation: pprof and go-torch
  10. M3 and Go instrumentation: Linux perf tools
      ● On Ubuntu install with
        ○ sudo apt-get install linux-tools-$(uname -r)
      ● Run with
        ○ sudo perf top -p <pid>
  11. M3 and Go instrumentation: Linux perf tools
  12. M3 and Go instrumentation: shadow traffic and load testing
      ● Capture traffic for a box using libpcap and then either:
        ○ Forward live traffic to staging (shadow traffic)
        ○ Save events with timestamps to disk for replay and amplification
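  13. M3 and Go scheduling: non-blocking, upstream retries

          // WorkerPool runs work on a bounded number of goroutines.
          type WorkerPool interface {
              // GoIfAvailable runs f on a free worker and returns true,
              // or returns false immediately if every worker is busy.
              GoIfAvailable(f func()) bool
          }

          type workerPool struct {
              ch chan struct{}
          }

          func NewWorkerPool(size int) WorkerPool {
              pool := &workerPool{
                  ch: make(chan struct{}, size),
              }
              // Fill the channel with one token per worker.
              for i := 0; i < size; i++ {
                  pool.ch <- struct{}{}
              }
              return pool
          }

          func (p *workerPool) GoIfAvailable(f func()) bool {
              select {
              case s := <-p.ch:
                  go func() {
                      f()
                      p.ch <- s // return the token
                  }()
                  return true
              default:
                  return false
              }
          }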
  14. M3 and Go scheduling: non-blocking, upstream retries
      ● Good for nodes with a fixed-size capacity
        ○ e.g., trying to perform more than a fixed amount of work on a node causes the node to thrash and/or degrades overall throughput and latency
      ● Good when fronted by a smart load balancer
        ○ e.g., when a full node returns 500s and HAProxy is configured with “redispatch”, HAProxy will take the node out of rotation and re-attempt on a node that is not “full”
  15. M3 and Go scheduling: blocking, upstream hangs

          // Go blocks until a worker is free, then runs f on it.
          func (p *workerPool) Go(f func()) {
              s := <-p.ch
              go func() {
                  f()
                  p.ch <- s // return the token
              }()
          }
  16. M3 and Go scheduling: blocking, upstream hangs
      ● Good for “in order” stream-based work
        ○ e.g., when the worker pool is full, the application stops reading data from the incoming TCP socket, causing backpressure on upstream
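The backpressure effect can be sketched with a read loop that hands each message to the blocking pool. The `consume` and `peakConcurrency` helpers are hypothetical, and a `strings.Reader` stands in for the socket. With a real `net.Conn`, a loop that stops calling Read lets the kernel receive buffer fill, and TCP flow control then slows the sender.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
	"sync"
	"sync/atomic"
	"time"
)

type workerPool struct{ ch chan struct{} }

func newWorkerPool(size int) *workerPool {
	p := &workerPool{ch: make(chan struct{}, size)}
	for i := 0; i < size; i++ {
		p.ch <- struct{}{}
	}
	return p
}

// Go blocks until a worker token is free, then runs f on a new goroutine.
func (p *workerPool) Go(f func()) {
	s := <-p.ch
	go func() {
		defer func() { p.ch <- s }()
		f()
	}()
}

// consume reads newline-delimited messages and hands each one to the pool.
// While the pool is full, Go blocks, so the loop stops reading from r.
func consume(r io.Reader, p *workerPool, handle func(string)) {
	var wg sync.WaitGroup
	sc := bufio.NewScanner(r)
	for sc.Scan() {
		line := sc.Text()
		wg.Add(1)
		p.Go(func() {
			defer wg.Done()
			handle(line)
		})
	}
	wg.Wait()
}

// peakConcurrency pushes n messages through a pool of the given size and
// reports the highest number of handlers observed running at once.
func peakConcurrency(n, size int) int64 {
	var inFlight, peak int64
	input := strings.Repeat("m\n", n)
	consume(strings.NewReader(input), newWorkerPool(size), func(string) {
		cur := atomic.AddInt64(&inFlight, 1)
		for {
			old := atomic.LoadInt64(&peak)
			if cur <= old || atomic.CompareAndSwapInt64(&peak, old, cur) {
				break
			}
		}
		time.Sleep(time.Millisecond) // simulate work
		atomic.AddInt64(&inFlight, -1)
	})
	return peak
}

func main() {
	fmt.Println("peak in-flight handlers:", peakConcurrency(20, 3))
}
```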
  17. M3 and Go pooling: why
      ● A lot of the overhead of working in a memory-managed environment is waiting for the garbage collector to run
      ● M3 has very heavy write and read paths, so we need to minimize allocations; otherwise we spend a lot of time allocating and collecting memory instead of using CPU cycles for real work
  18. M3 and Go pooling: spillover pools, fixed size but with elasticity

          // Requires imports: sync, sync/atomic

          type NewFunc func() interface{}

          type ObjectPool interface {
              Get() interface{}
              Put(interface{})
          }

          type spilloverPool struct {
              core  chan interface{} // fixed-size pool, never purged
              spill sync.Pool        // elastic overflow
              avail int64            // number of objects available in core
          }

          func NewSpilloverPool(size int, f NewFunc) ObjectPool {
              core := make(chan interface{}, size)
              for i := 0; i < size; i++ {
                  core <- f()
              }
              sz := int64(size)
              return &spilloverPool{core, sync.Pool{New: f}, sz}
          }

          func (s *spilloverPool) Get() interface{} {
              left := atomic.AddInt64(&s.avail, -1)
              if left >= 0 {
                  return <-s.core
              }
              // Core exhausted: undo the reservation and use the spill pool.
              atomic.AddInt64(&s.avail, 1)
              return s.spill.Get()
          }

          func (s *spilloverPool) Put(obj interface{}) {
              left := atomic.AddInt64(&s.avail, 1)
              if left <= int64(cap(s.core)) {
                  s.core <- obj
                  return
              }
              // Core already full: undo the reservation and use the spill pool.
              atomic.AddInt64(&s.avail, -1)
              s.spill.Put(obj)
          }
  19. M3 and Go pooling: spillover pools, fixed size with elasticity
      ● sync.Pool will purge pooled objects during stop-the-world garbage collection
      ● Spillover pools are good for steady-state execution, as objects are never released as long as the fixed-size pool is not exhausted
      ● Spillover pools are also good for bursty traffic, as they revert to short-lived pooling with sync.Pool when the fixed-size pool is exhausted
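The steady-state and bursty cases can be observed by counting allocations. The pool below repeats the slide's Get and Put logic so that the sketch is self-contained; the `allocations` helper and the 1 KiB object are hypothetical choices for the demonstration.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

type spilloverPool struct {
	core  chan interface{}
	spill sync.Pool
	avail int64
}

func newSpilloverPool(size int, f func() interface{}) *spilloverPool {
	p := &spilloverPool{core: make(chan interface{}, size), avail: int64(size)}
	p.spill.New = f
	for i := 0; i < size; i++ {
		p.core <- f()
	}
	return p
}

func (s *spilloverPool) Get() interface{} {
	if atomic.AddInt64(&s.avail, -1) >= 0 {
		return <-s.core
	}
	atomic.AddInt64(&s.avail, 1)
	return s.spill.Get()
}

func (s *spilloverPool) Put(obj interface{}) {
	if atomic.AddInt64(&s.avail, 1) <= int64(cap(s.core)) {
		s.core <- obj
		return
	}
	atomic.AddInt64(&s.avail, -1)
	s.spill.Put(obj)
}

// allocations builds a pool of the given size, then three times takes `gets`
// objects and returns them, and reports how often the constructor ran.
func allocations(size, gets int) int64 {
	var n int64
	p := newSpilloverPool(size, func() interface{} {
		atomic.AddInt64(&n, 1)
		return new([1024]byte)
	})
	for round := 0; round < 3; round++ {
		objs := make([]interface{}, gets)
		for i := range objs {
			objs[i] = p.Get()
		}
		for _, o := range objs {
			p.Put(o)
		}
	}
	return atomic.LoadInt64(&n)
}

func main() {
	fmt.Println("steady state:", allocations(4, 4)) // demand fits the core pool
	fmt.Println("burst:", allocations(4, 5))        // one Get per round spills over
}
```

When demand fits the core pool, only the initial fill allocates. In the burst case the extra object comes from `sync.Pool`, which may or may not still hold it on the next round.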
  20. M3 and Go pooling: array pooling with buckets

          type Value interface { /* ... */ }

          var (
              valuesPoolsBuckets = []int{
                  128, 512, 1024, 8192, 16384, 32768, 65536,
              }
              valuePools []pool.ObjectPool
          )

          func newValues(ctx Context, capacity int) []Value {
              var values []Value
              if idx := findPoolIndex(capacity); idx != -1 {
                  values = valuePools[idx].Get().([]Value)
                  // As on the slide; a []Value is not itself a Closer, so this
                  // presumably stands for registering a wrapper that returns
                  // the slice to its pool.
                  ctx.RegisterCloser(values)
                  values = values[:0]
              } else {
                  values = make([]Value, 0, capacity)
              }
              return values
          }
  21. M3 and Go pooling: array pooling with buckets
      ● Helpful when dealing with large contiguous arrays that are expensive to allocate and are needed up to a specific capacity
      ● Because newValues returns a zero-length []Value slice backed by the pooled array, callers simply use x = append(x, value), and append will grow beyond the pooled capacity in the rare case that it is required
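A self-contained version of bucketed slice pooling could look like this. It is a sketch under assumptions: `Value` is reduced to a float64, each bucket is backed by a `sync.Pool` rather than M3's `pool.ObjectPool`, `findPoolIndex` is guessed to pick the smallest bucket that fits, and `releaseValues` is a hypothetical stand-in for the Closer that returns the slice.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// Value is a stand-in for the slide's Value interface.
type Value float64

var buckets = []int{128, 512, 1024, 8192, 16384, 32768, 65536}

// pools holds one pool per bucket; each hands out slices of that capacity.
var pools = func() []*sync.Pool {
	ps := make([]*sync.Pool, len(buckets))
	for i, size := range buckets {
		size := size // capture per iteration
		ps[i] = &sync.Pool{New: func() interface{} {
			return make([]Value, 0, size)
		}}
	}
	return ps
}()

// findPoolIndex returns the smallest bucket that fits capacity, or -1.
func findPoolIndex(capacity int) int {
	idx := sort.SearchInts(buckets, capacity)
	if idx == len(buckets) {
		return -1
	}
	return idx
}

// newValues returns an empty slice with at least the requested capacity.
func newValues(capacity int) []Value {
	if idx := findPoolIndex(capacity); idx != -1 {
		return pools[idx].Get().([]Value)[:0]
	}
	return make([]Value, 0, capacity) // too large to pool
}

// releaseValues returns a slice to its bucket. Slices that append grew past
// their bucket, or that were never pooled, are left to the garbage collector.
func releaseValues(values []Value) {
	idx := findPoolIndex(cap(values))
	if idx == -1 || buckets[idx] != cap(values) {
		return
	}
	pools[idx].Put(values[:0])
}

func main() {
	x := newValues(100)
	fmt.Println(len(x), cap(x))
	for i := 0; i < 200; i++ {
		x = append(x, Value(i)) // append grows past the bucket if needed
	}
	fmt.Println(len(x), cap(x) >= 200)
	releaseValues(x)
}
```

Storing a slice directly in a `sync.Pool` allocates a small header on each Put; production code often pools a pointer to the slice instead.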
  22. M3 and Go pooling: associating pooled resources with contexts

          // Closer is an interface implemented by objects that
          // should be closed when a context completes.
          type Closer interface {
              Close() error
          }

          type RequestContext interface {
              Closer

              // RegisterCloser registers an object that should be closed when this
              // context is closed. Can be used to clean up per-request objects.
              RegisterCloser(closer Closer)

              // AddAsyncTasks allows asynchronous tasks to be enqueued that will
              // ensure this context does not call its registered closers until
              // the tasks are all complete.
              AddAsyncTasks(count int)

              // DoneAsyncTask signals that an asynchronous task is complete. When
              // all asynchronous tasks are complete, and the context has been closed
              // but has held off calling its registered closers, it finally calls them.
              DoneAsyncTask()
          }
  23. M3 and Go pooling: associating pooled resources with contexts
      ● Ensure clean and uniform return of pooled resources to their respective pools by registering Closers with the context
      ● Ensure Closers are not called until all pending AsyncTasks are finished
        ○ Helpful when a timeout is hit waiting for a resource and the request finishes early: the timed-out downstream request might otherwise unsafely modify pooled resources belonging to the now closed upstream request before noticing that the context was cancelled
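One way to implement the deferred closers is a mutex-guarded counter, as in the sketch below. This is an assumption about the mechanics rather than M3's implementation; `closerFunc` and `takeClosersLocked` are helpers invented for the example, and errors from closers are ignored.

```go
package main

import (
	"fmt"
	"sync"
)

type Closer interface {
	Close() error
}

// closerFunc adapts a plain function to Closer (a helper for this sketch).
type closerFunc func() error

func (f closerFunc) Close() error { return f() }

type requestContext struct {
	mu      sync.Mutex
	closers []Closer
	pending int  // async tasks still running
	closed  bool // Close has been called
}

func (c *requestContext) RegisterCloser(cl Closer) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.closers = append(c.closers, cl)
}

func (c *requestContext) AddAsyncTasks(count int) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.pending += count
}

func (c *requestContext) DoneAsyncTask() {
	c.mu.Lock()
	c.pending--
	closers := c.takeClosersLocked()
	c.mu.Unlock()
	closeAll(closers)
}

func (c *requestContext) Close() error {
	c.mu.Lock()
	c.closed = true
	closers := c.takeClosersLocked()
	c.mu.Unlock()
	closeAll(closers)
	return nil
}

// takeClosersLocked hands back the closers only once the context is closed
// and no async task is pending. The caller must hold c.mu.
func (c *requestContext) takeClosersLocked() []Closer {
	if !c.closed || c.pending > 0 {
		return nil
	}
	closers := c.closers
	c.closers = nil
	return closers
}

func closeAll(closers []Closer) {
	for _, cl := range closers {
		_ = cl.Close() // errors are ignored in this sketch
	}
}

func main() {
	ctx := &requestContext{}
	released := false
	ctx.RegisterCloser(closerFunc(func() error { released = true; return nil }))
	ctx.AddAsyncTasks(1)
	_ = ctx.Close()
	fmt.Println("released after Close:", released)
	ctx.DoneAsyncTask()
	fmt.Println("released after DoneAsyncTask:", released)
}
```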
  24. M3 and Go cancellation: cancelling expensive downstream work
      ● The heaviest work can cause timeouts during queries. When this happens without cancellation, the request correctly returns an error but continues to perform heavy operations in the background
      ● You can use golang.org/x/net/context to propagate timeouts and cancellations to child goroutines
  25. M3 and Go retries and circuit breaking: failing successfully

          type Retrier interface {
              // Attempt to perform a method with configured retrier options.
              Attempt(f func() error) error

              // AttemptWhile to perform a method with configured retrier options
              // while condition evaluates true.
              AttemptWhile(condition func() bool, f func() error) error
          }

          type Circuit interface {
              // Attempt to perform a method with configured circuit options,
              // when circuit broken immediately return error.
              Attempt(f func() error) error
          }

          func NewIndexer() Indexer {
              return &indexer{
                  retrier: retry.NewRetrier(retry.Options().
                      Initial(500 * time.Millisecond).
                      Max(2).
                      Jitter(true)),
                  circuit: circuit.NewCircuit(circuit.Options().
                      RollingPeriod(10 * time.Second).
                      ThresholdPercent(0.1)),
              }
          }
  26. M3 and Go retries and circuit breaking: failing successfully

          func (w *indexer) Write(ctx RequestContext, id string, tags []Tag) (bool, error) {
              var newEntry bool
              // Retry only while the request has not been cancelled.
              err := w.retrier.AttemptWhile(func() bool {
                  return !ctx.IsCancelled()
              }, func() error {
                  // Each attempt goes through the circuit breaker.
                  return w.circuit.Attempt(func() error {
                      if result, indexErr := w.client.Index(id, tags); indexErr != nil {
                          return indexErr
                      }
                      newEntry = result
                      return nil
                  })
              })
              return newEntry, err
          }
  27. M3 and Go retries and circuit breaking: failing successfully
      ● Jitter is important to avoid stampeding herds after downstream recovers from a failure
      ● Also important to use a worker pool with the desired upstream backpressure
        ○ This ensures that when downstream does recover, the number of in-flight operations is not high enough to thrash the downstream
