2. 2015-12-03
We Are Hiring!
We’d love to connect! Our current open positions are:
Engineering Director, UI Engineer and
Distributed Systems Engineer.
We always have positions opening up so feel free to
connect with Sarah Carter (our Head of Recruiting) for
future openings - sarah.carter@metamarkets.com.
3. 2015-12-03
What is Real-Time Bidding?
Real-Time Bidding is resolving advertising
supply and demand at the moment of supply.
+Best suited for systems with internet
connectivity.
4. 2015-12-03
For the sake of this conversation, Real-Time
Bidding (RTB) is the general method by which
digital media supply and demand is commonly
reconciled using programmatic methodologies
over very short time frames.
5. 2015-12-03
What Happens in Real-Time
Bidding?
1. User loads resources which contain ad space
(supply is created by a Publisher)
6. 2015-12-03
What Happens in Real-Time
Bidding?
2. Information / notification is generated and
distributed to interested parties
Avail (a unit of supply of audience attention) is
handled by an Exchange
7. 2015-12-03
What Happens in Real-Time
Bidding?
3. Information on the avail is distributed to
potentially interested parties
We now have an auction
8. 2015-12-03
What Happens in Real-Time
Bidding?
4. Potentially interested parties judge the avail
and either bid on the auction, or they do not.
9. 2015-12-03
What Happens in Real-Time
Bidding?
5. The winner of the auction is determined by
the exchange.
5b. 100 ms has passed
If a human can perceive that an auction took
place, YOU ARE TOO SLOW
10. 2015-12-03
What Happens in Real-Time
Bidding?
6. An attempt is made to serve the winning ad
as an impression
11. 2015-12-03
What Happens in Real-Time
Bidding?
7. The impression hopefully turns into a click
or conversion
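Steps 3 to 5b can be sketched in a few lines. This is only an illustration, assuming a highest-bid-wins rule (the slides do not say which pricing rule an exchange uses) and hypothetical bidders named `fast-low`, `fast-high`, `no-bid` and `too-slow`. The one thing taken from the slides is the 100 ms budget: whoever has not answered by then is simply ignored.

```python
import asyncio

DEADLINE_SECONDS = 0.100  # the 100 ms budget from step 5b


async def bidder(name, price, latency):
    """Hypothetical bidder: judges the avail for `latency` seconds.

    A price of None means the bidder chose not to bid.
    """
    await asyncio.sleep(latency)
    return name, price


async def run_auction(bidder_coroutines, deadline=DEADLINE_SECONDS):
    """Collect bids until the deadline, then pick the highest one."""
    tasks = [asyncio.create_task(c) for c in bidder_coroutines]
    done, pending = await asyncio.wait(tasks, timeout=deadline)
    for task in pending:
        task.cancel()  # too slow: the exchange does not wait
    bids = [task.result() for task in done]
    bids = [(name, price) for name, price in bids if price is not None]
    # Assumption: highest bid wins; real exchanges may use other rules.
    return max(bids, key=lambda bid: bid[1]) if bids else None


def demo():
    return asyncio.run(run_auction([
        bidder("fast-low", 1.20, 0.010),
        bidder("fast-high", 2.50, 0.020),
        bidder("no-bid", None, 0.005),
        bidder("too-slow", 9.99, 0.500),
    ]))


if __name__ == "__main__":
    print(demo())
```

The `too-slow` bidder offers the most money and still loses, which is the point of step 5b.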
14. 2015-12-03
CERN - LHC
The LHC produces about 1 GB/s on average
http://home.cern/about/updates/2015/06/lhc-season-2-cern-computing-ready-data-torrent
MMX raw incoming stream data regularly
exceeds this
* 1hr average
17. 2015-12-03
Druid for Queries!... But what is Druid?
Official - Druid is a fast column-oriented
distributed data store
Me - Druid is a highly available Data Store
designed for interactive, ad-hoc, OLAP style
queries on time-series, denormalized data.
18. 2015-12-03
Key points for BEST use cases
Highly Available - No downtime for maintenance since 2011
Interactive - FAST
OLAP - Insightful
Ad-hoc - Dynamic
Time-series - Sequential
Denormalized - Flat
* By the way, it works
on Streams
(aka Real Real-Time)
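To make "interactive, ad-hoc, OLAP style queries on time-series, denormalized data" concrete, here is a minimal sketch of what such a query can look like. It builds a request body in the shape of Druid's native JSON `timeseries` query and only prints it. The data source `rtb_auctions`, the dimension `exchange`, its value `example-exchange` and the metric `bid_price` are hypothetical names invented for the example, not anything from the MMX cluster.

```python
import json


def build_timeseries_query(data_source, interval, dimension, value, metric):
    """Build an hourly rollup filtered on one dimension value."""
    return {
        "queryType": "timeseries",          # time-series: bucketed by time
        "dataSource": data_source,          # one flat, denormalized table
        "granularity": "hour",
        "intervals": [interval],            # ISO-8601 start/end
        "filter": {                         # ad-hoc: any dimension, any value
            "type": "selector",
            "dimension": dimension,
            "value": value,
        },
        "aggregations": [                   # OLAP: aggregate, do not fetch rows
            {"type": "count", "name": "rows"},
            {"type": "doubleSum", "name": "total_" + metric, "fieldName": metric},
        ],
    }


query = build_timeseries_query(
    data_source="rtb_auctions",             # hypothetical
    interval="2015-12-01T00:00:00Z/2015-12-02T00:00:00Z",
    dimension="exchange",                   # hypothetical
    value="example-exchange",               # hypothetical
    metric="bid_price",                     # hypothetical
)

if __name__ == "__main__":
    print(json.dumps(query, indent=2))
```

Swapping the filter or the aggregations needs no schema change, which is what "Ad-hoc - Dynamic" is about.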
25. 2015-12-03
Lifecycle of a Real-Time Datum
In-Memory Write-Optimized Store
Persist (Time or Size)
Memory Mapped Read-Only Store
26. 2015-12-03
Lifecycle of a Real-Time Datum
Memory Mapped Read-Only Store (x3)
Merge
Memory Mapped Read-Only Store
* Segment
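The two slides above (write-optimized buffer, persist on time or size, merge into a segment) can be modelled as a toy. This is a sketch under loud assumptions: the class `RealTimeStore` and its thresholds are made up for illustration, rows are plain `(timestamp, value)` tuples, and immutable tuples stand in for the memory mapped read-only stores. It is not Druid's actual code.

```python
import time


class RealTimeStore:
    """Toy model: mutable buffer -> read-only spills -> one merged segment."""

    def __init__(self, max_rows=3, max_age_seconds=60.0, clock=time.monotonic):
        self.max_rows = max_rows                # "Size" trigger (hypothetical)
        self.max_age_seconds = max_age_seconds  # "Time" trigger (hypothetical)
        self.clock = clock
        self.buffer = []                        # in-memory, write-optimized
        self.spills = []                        # read-only stores
        self.buffer_started = clock()

    def add(self, row):
        """Append a row; persist when the buffer is too big or too old."""
        self.buffer.append(row)
        too_big = len(self.buffer) >= self.max_rows
        too_old = self.clock() - self.buffer_started >= self.max_age_seconds
        if too_big or too_old:
            self.persist()

    def persist(self):
        """Freeze the current buffer into an immutable, sorted spill."""
        if self.buffer:
            self.spills.append(tuple(sorted(self.buffer)))
        self.buffer = []
        self.buffer_started = self.clock()

    def merge(self):
        """Merge every spill into a single read-only segment."""
        self.persist()
        segment = tuple(sorted(row for spill in self.spills for row in spill))
        self.spills = []
        return segment


store = RealTimeStore(max_rows=2)
for row in [(3, "c"), (1, "a"), (2, "b"), (5, "e"), (4, "d")]:
    store.add(row)
spills_before_merge = len(store.spills)  # two full spills, one row still buffered
segment = store.merge()
```

Writes only ever touch the small mutable buffer; everything a query reads after a persist is immutable, which is what makes memory mapping it safe.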
27. 2015-12-03
Lifecycle of a Real-Time Datum
Handoff
Memory Mapped Read-Only Store
Druid RT 0
Druid Historical
Deep Storage (S3, HDFS, Azure, Cassandra)
28. 2015-12-03
Lifecycle of a Real-Time Datum
Druid RT 0
Druid Historical
Deep Storage (S3, HDFS, Azure, Cassandra)
Memory Mapped Read-Only Store
* Orchestrated by Coordinator
29. 2015-12-03
Lifecycle of a Real-Time Datum
Druid Historical tiers (each holds Memory Mapped Read-Only Stores):
Druid - Hot: Very Little Paging
Druid - Cold: Some Paging
Druid - Icy: Lots of Paging
38. 2015-12-03
Lifecycle of a Query
Memory Mapped Read-Only Store
Column Dictionary
Dimension Value Bitmap (x3)
Metric Column (x4)
39. 2015-12-03
Lifecycle of a Query
Memory Mapped Read-Only Store
Column Dictionary
Dimension Value Bitmap (x3)
Metric Column (x4)
* ByteBuffer slices
40. 2015-12-03
Lifecycle of a Query
Dimension Value Bitmap (x2)
Metric Column (x3)
Iterator
Aggregator (x3)
Ready, set… GO!
41. 2015-12-03
Lifecycle of a Query
Iterator
Aggregator (x3)
“Take 0, take 1, take 7, take 10”
Scan columns ONCE
Metrics
Dimensions
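The "Take 0, take 1, take 7, take 10" idea can be sketched as follows: a bitmap per dimension value says which rows match, an iterator walks those row offsets, and every aggregator reads its metric column at each offset in the same single pass. Everything here is an assumption made for illustration: Python integers stand in for real compressed bitmaps, lists stand in for ByteBuffer slices, and the columns `country`, `impressions` and `clicks` are invented.

```python
def build_dimension(values):
    """Dictionary-encode a dimension and build one bitmap per value."""
    dictionary = {v: i for i, v in enumerate(sorted(set(values)))}
    bitmaps = [0] * len(dictionary)
    for row, value in enumerate(values):
        bitmaps[dictionary[value]] |= 1 << row  # set bit `row`
    return dictionary, bitmaps


def iterate_rows(bitmap):
    """Yield the row offsets whose bit is set, in ascending order."""
    row = 0
    while bitmap:
        if bitmap & 1:
            yield row
        bitmap >>= 1
        row += 1


class SumAggregator:
    """Adds up one metric column at the offsets it is told to take."""

    def __init__(self, column):
        self.column = column
        self.total = 0

    def take(self, row):
        self.total += self.column[row]


def run_query(bitmap, aggregators):
    """Walk the matching rows once, feeding every aggregator per row."""
    for row in iterate_rows(bitmap):
        for aggregator in aggregators.values():
            aggregator.take(row)
    return {name: agg.total for name, agg in aggregators.items()}


# Hypothetical segment with 11 rows; "US" sits at rows 0, 1, 7 and 10.
country = ["US", "US", "DE", "DE", "FR", "DE", "FR", "US", "DE", "FR", "US"]
impressions = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
clicks = [0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1]

dictionary, bitmaps = build_dimension(country)
us_bitmap = bitmaps[dictionary["US"]]
taken = list(iterate_rows(us_bitmap))
result = run_query(us_bitmap, {
    "impressions": SumAggregator(impressions),
    "clicks": SumAggregator(clicks),
})
```

Filtering on several dimension values becomes bitwise AND or OR on the bitmaps before the scan starts, so the metric columns are still only visited once.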
42. 2015-12-03
Lifecycle of a Query
Iterator
Aggregator (x3)
Metrics
Dimensions
Memory Mapped Byte Buffers (Kernel disk cache)
43. 2015-12-03
Lifecycle of a Query
Iterator
Aggregator (x3)
Metrics
Dimensions
JVM managed memory
44. 2015-12-03
Lifecycle of a Query
Intermediate Results (x2)
Merge
Cache (x2)
Druid Historical
XYZ Result
45. 2015-12-03
Lifecycle of a Query
Druid Historical: XYZ Result
Druid RT: DEF Result
Druid Historical: ABC Result
Merge
Broker
Router
UI*
Done! Bubble up to UI
* Technically bubbles up to Business Logic layer
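The broker's merge step can be sketched as combining per-node partial results bucket by bucket. This is a minimal sketch, assuming every metric is additive (counts and sums) and that each node returns a mapping from time bucket to metrics. The function name `merge_partial_results` and the sample numbers are hypothetical.

```python
def merge_partial_results(partials):
    """Sum metrics that share a time bucket across all node results."""
    merged = {}
    for partial in partials:
        for bucket, metrics in partial.items():
            totals = merged.setdefault(bucket, {})
            for name, value in metrics.items():
                # Assumption: additive metrics only (count, sum).
                totals[name] = totals.get(name, 0) + value
    return dict(sorted(merged.items()))  # ordered by time bucket


# Hypothetical partial results, one per node in the slide.
historical_xyz = {"2015-12-03T00": {"rows": 10, "bids": 4}}
realtime_def = {"2015-12-03T01": {"rows": 3, "bids": 1}}
historical_abc = {"2015-12-03T00": {"rows": 5, "bids": 2}}

merged = merge_partial_results([historical_xyz, realtime_def, historical_abc])
```

Non-additive aggregates (averages, distinct counts) would need each node to return a mergeable intermediate form instead of a final number, which this sketch does not attempt.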