Programmatic Bidding Data Streams & Druid


  1. 2015-12-03: Programmatic Bidding Data Streams & Druid. Charles Allen
  2. We Are Hiring! We’d love to connect! Our current open positions are: Engineering Director, UI Engineer, and Distributed Systems Engineer. We always have positions opening up, so feel free to connect with Sarah Carter (our Head of Recruiting) about future openings: sarah.carter@metamarkets.com.
  3. What is Real-Time Bidding? Real-Time Bidding is resolving advertising supply and demand at the moment of supply. It is best suited for systems with internet connectivity.
  4. For the sake of this conversation, Real-Time Bidding (RTB) is the general method by which digital media supply and demand are commonly reconciled using programmatic methodologies over very short time frames.
  5. What Happens in Real-Time Bidding? Step 1: A user loads resources that contain ad space (supply is created by a Publisher).
  6. What Happens in Real-Time Bidding? Step 2: Information / notification is generated and distributed to interested parties. The avail (a unit of supply of audience attention) is handled by an Exchange.
  7. What Happens in Real-Time Bidding? Step 3: Information on the avail is distributed to potentially interested parties. We now have an auction.
  8. What Happens in Real-Time Bidding? Step 4: Potentially interested parties judge the avail and either bid on the auction or do not.
  9. What Happens in Real-Time Bidding? Step 5: The winner of the auction is determined by the exchange. Step 5b: 100 ms have passed. If a human can perceive that an auction took place, YOU ARE TOO SLOW.
  10. What Happens in Real-Time Bidding? Step 6: An attempt is made to serve the winning ad as an impression.
  11. What Happens in Real-Time Bidding? Step 7: The impression hopefully turns into a click or conversion.
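
The auction steps above can be sketched in a few lines of Python. This is a toy model, not how any particular exchange works: the `Avail` and `Response` types, the bidder functions, and the "highest on-time bid wins" rule are all assumptions made for illustration (the slides do not say how the exchange prices or ranks bids). Only the 100 ms budget comes from the deck.

```python
from dataclasses import dataclass
from typing import Optional

DEADLINE_MS = 100  # budget from the slides; later responses are ignored


@dataclass
class Avail:
    """One unit of supply: an ad slot on a page being loaded (hypothetical shape)."""
    publisher: str
    slot: str


@dataclass
class Response:
    """A bidder's answer to an avail (hypothetical shape)."""
    bidder: str
    price: Optional[float]  # None means "no bid"
    latency_ms: float


def run_auction(avail, bidders):
    """Send the avail to every bidder and return the highest on-time bid.

    Returns None when nobody bids within the deadline.
    """
    responses = [bid(avail) for bid in bidders]
    valid = [r for r in responses
             if r.price is not None and r.latency_ms <= DEADLINE_MS]
    return max(valid, key=lambda r: r.price, default=None)


# Three hypothetical bidders with fixed behaviour.
def eager(avail):
    return Response("eager", 2.50, latency_ms=40)


def picky(avail):
    # Only bids on one publisher; otherwise declines.
    price = 4.00 if avail.publisher == "news.example" else None
    return Response("picky", price, latency_ms=60)


def slow(avail):
    # Bids the most, but answers after the deadline.
    return Response("slow", 9.00, latency_ms=250)


winner = run_auction(Avail("blog.example", "top-banner"), [eager, picky, slow])
print(winner)
```

In this run `picky` declines and `slow` misses the deadline, so the impression goes to `eager` even though `slow` offered more.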
  12. Avail / Auction, Bid, Impression, Click / Conversion ?? ?
  13. Programmatic data is 100x larger than Wall Street
  14. CERN - LHC: The LHC produces about 1 GB/s on average (http://home.cern/about/updates/2015/06/lhc-season-2-cern-computing-ready-data-torrent). MMX raw incoming stream data regularly exceeds this (* 1-hour average).
  15. Avail / Auction, Bid, Impression, Click / Conversion ?? ?
  16. General Architecture: Kafka, Samza/Kafka, Druid RT, Tranquility, Raw (S3), Hadoop / Spark, Deep Storage (S3), Druid Historical, UI / User
  17. Druid for Queries! But what is Druid? Official: Druid is a fast column-oriented distributed data store. Me: Druid is a highly available data store designed for interactive, ad-hoc, OLAP-style queries on time-series, denormalized data.
  18. Key points for BEST use cases: Highly Available (no downtime for maintenance since 2011); Interactive (FAST); OLAP (insightful); Ad-hoc (dynamic); Time-series (sequential); Denormalized (flat). * By the way, it works on streams (aka Real Real-Time).
  19. Lifecycle of a Real-Time Datum: Mr. Charlie Event
  20. Lifecycle of a Real-Time Datum: Firehose (x2); Druid RT Peon 0; Druid RT Peon 1 (* launched by the Overlord by way of a Middle Manager)
  21. Lifecycle of a Real-Time Datum: Firehose; Druid RT 0; In-Memory Write-Optimized Store; Parser
  22. Lifecycle of a Real-Time Datum: Druid RT 0; In-Memory Write-Optimized Store
  23. Lifecycle of a Real-Time Datum: Druid RT 0; In-Memory Write-Optimized Store; Rollup
  24. Lifecycle of a Real-Time Datum: Druid RT 0; In-Memory Write-Optimized Store
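
To make the "Rollup" step concrete, here is a minimal sketch of what rolling up events at ingestion time means: events that share a truncated timestamp and the same dimension values collapse into one row whose metrics are summed. The event fields (`publisher`, `country`, `price`), the minute granularity, and the function names are assumptions for illustration; this is not Druid's actual in-memory index.

```python
from collections import defaultdict
from datetime import datetime, timezone


def truncate_to_minute(ts):
    """Drop seconds and microseconds so events bucket by minute."""
    return ts.replace(second=0, microsecond=0)


def rollup(events, dimensions, metrics):
    """Aggregate events by (minute bucket, dimension values).

    Each output row carries an event count plus the sum of every metric.
    """
    table = defaultdict(lambda: {"count": 0, **{m: 0 for m in metrics}})
    for event in events:
        key = (truncate_to_minute(event["timestamp"]),
               *(event[d] for d in dimensions))
        row = table[key]
        row["count"] += 1
        for m in metrics:
            row[m] += event[m]
    return dict(table)


def at(minute, second):
    return datetime(2015, 12, 3, 10, minute, second, tzinfo=timezone.utc)


events = [
    {"timestamp": at(0, 5), "publisher": "a", "country": "US", "price": 3},
    {"timestamp": at(0, 40), "publisher": "a", "country": "US", "price": 5},
    {"timestamp": at(1, 10), "publisher": "a", "country": "US", "price": 2},
    {"timestamp": at(0, 59), "publisher": "b", "country": "DE", "price": 7},
]

rolled = rollup(events, dimensions=["publisher", "country"], metrics=["price"])
for key, row in sorted(rolled.items()):
    print(key, row)
```

Four input events become three stored rows here. The saving grows with the number of events that share a bucket and dimension values, at the cost of losing the individual events.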
  25. Lifecycle of a Real-Time Datum: In-Memory Write-Optimized Store; Persist (by time or size); Memory-Mapped Read-Only Store
  26. Lifecycle of a Real-Time Datum: Memory-Mapped Read-Only Store (x3); Merge; Memory-Mapped Read-Only Store (* Segment)
  27. Lifecycle of a Real-Time Datum: Handoff. Memory-Mapped Read-Only Store; Druid RT 0; Druid Historical; Deep Storage (S3, HDFS, Azure, Cassandra)
  28. Lifecycle of a Real-Time Datum: Druid RT 0; Druid Historical; Deep Storage (S3, HDFS, Azure, Cassandra); Memory-Mapped Read-Only Store (* orchestrated by the Coordinator)
  29. Lifecycle of a Real-Time Datum: Druid Historical; Memory-Mapped Read-Only Store (x2); tiers: Druid - Hot (very little paging), Druid - Cold (some paging), Druid - Icy (lots of paging)
  30. Lifecycle of a Real-Time Datum: Druid Historical; Memory-Mapped Read-Only Store (x3); tiers: Druid - Hot (very little paging), Druid - Cold (some paging), Druid - Icy (lots of paging)
  31. Lifecycle of a Real-Time Datum: Druid Historical; Memory-Mapped Read-Only Store (x2); tiers: Druid - Hot (very little paging), Druid - Cold (some paging), Druid - Icy (lots of paging)
  32. Lifecycle of a Real-Time Datum: Druid Historical; Memory-Mapped Read-Only Store; tiers: Druid - Hot (very little paging), Druid - Cold (some paging), Druid - Icy (lots of paging)
  33. Lifecycle of a Real-Time Datum: Lifecycle rules tunable by datasource
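
One way to picture per-datasource lifecycle rules is as an ordered list where the first matching rule decides which tier holds a segment. The sketch below borrows the shape of Druid's coordinator rules (`loadByPeriod`, `loadForever`, `dropForever`, `tieredReplicants`) as I understand them, but the tier names (`hot`, `cold`, `icy`), the periods, the replica counts, and the matching logic are all assumptions; the real coordinator matches on segment intervals and supports general ISO 8601 periods, which this toy evaluator does not.

```python
import re
from datetime import datetime, timedelta, timezone

# Hypothetical rules for one datasource, evaluated top to bottom.
RULES = [
    {"type": "loadByPeriod", "period": "P7D", "tieredReplicants": {"hot": 2}},
    {"type": "loadByPeriod", "period": "P90D", "tieredReplicants": {"cold": 2}},
    {"type": "loadByPeriod", "period": "P365D", "tieredReplicants": {"icy": 1}},
    {"type": "dropForever"},
]


def period_days(period):
    """Parse the 'P<n>D' subset of ISO 8601 periods used in this sketch."""
    match = re.fullmatch(r"P(\d+)D", period)
    if match is None:
        raise ValueError(f"unsupported period in this sketch: {period!r}")
    return int(match.group(1))


def tiers_for(segment_end, now, rules):
    """Return {tier: replicas} from the first matching rule, or {} if dropped."""
    for rule in rules:
        if rule["type"] == "dropForever":
            return {}
        if rule["type"] == "loadForever":
            return rule["tieredReplicants"]
        if rule["type"] == "loadByPeriod":
            cutoff = now - timedelta(days=period_days(rule["period"]))
            if segment_end > cutoff:
                return rule["tieredReplicants"]
    return {}


now = datetime(2015, 12, 3, tzinfo=timezone.utc)
for age_days in (1, 30, 200, 400):
    print(age_days, tiers_for(now - timedelta(days=age_days), now, RULES))
```

With these made-up rules a day-old segment lives on the hot tier, a month-old one on cold, a 200-day-old one on icy, and anything older than a year is dropped from the Historicals.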
  34. Canary / Metrics cluster: Coordinator Console
  35. Lifecycle of a Query: Query Router; Cold - Broker XOR Hot - Broker
  36. Lifecycle of a Query: Broker; Druid RT (Peon); Druid Historical Hot; Druid Historical Cold; Druid Historical Icy; Cache
  37. Lifecycle of a Query: Define Stream Hooks; Cache; Druid Historical XYZ; Memory-Mapped Read-Only Store (x2)
  38. Lifecycle of a Query: Memory-Mapped Read-Only Store: Column Dictionary; Dimension Value Bitmap (x3); Metric Column (x4)
  39. Lifecycle of a Query: Memory-Mapped Read-Only Store: Column Dictionary; Dimension Value Bitmap (x3); Metric Column (x4) (* ByteBuffer slices)
  40. Lifecycle of a Query: Dimension Value Bitmap (x2); Metric Column (x3); Iterator; Aggregator (x3). Ready, set… GO!
  41. Lifecycle of a Query: Iterator; Aggregator (x3); “Take 0, take 1, take 7, take 10”; Scan columns ONCE; Metrics; Dimensions
  42. Lifecycle of a Query: Iterator; Aggregator (x3); Metrics; Dimensions; Memory-Mapped Byte Buffers (kernel disk cache)
  43. Lifecycle of a Query: Iterator; Aggregator (x3); Metrics; Dimensions; JVM-managed memory
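
The bitmap and iterator slides above can be illustrated with a small sketch: each dimension value gets a bitmap of the rows that contain it, a filter is a bitwise AND of bitmaps, and the aggregators then visit only the surviving rows in a single pass. Plain Python integers stand in for bitmaps here; Druid itself uses compressed bitmaps over ByteBuffer slices. The column names and data are invented, chosen so that the filter selects rows 0, 1, 7, and 10 as in the slide's "Take 0, take 1, take 7, take 10".

```python
def build_bitmaps(column):
    """Map each distinct value to an int whose bit i is set when row i holds it."""
    bitmaps = {}
    for row, value in enumerate(column):
        bitmaps[value] = bitmaps.get(value, 0) | (1 << row)
    return bitmaps


def matching_rows(bitmap):
    """Yield the row numbers whose bits are set, in ascending order."""
    row = 0
    while bitmap:
        if bitmap & 1:
            yield row
        bitmap >>= 1
        row += 1


# Dimension columns (hypothetical data, 11 rows).
country = ["US", "US", "DE", "US", "FR", "DE", "US", "US", "FR", "DE", "US"]
device = ["mobile", "mobile", "mobile", "desktop", "mobile", "desktop",
          "desktop", "mobile", "mobile", "mobile", "mobile"]

# Metric columns, stored separately from the dimensions.
impressions = [5, 3, 8, 1, 2, 9, 4, 6, 7, 2, 10]
spend = [0.5, 0.25, 1.0, 0.125, 0.25, 1.5, 0.5, 0.75, 1.0, 0.25, 1.25]

country_idx = build_bitmaps(country)
device_idx = build_bitmaps(device)

# Filter: country == "US" AND device == "mobile", resolved on bitmaps only.
selected = country_idx["US"] & device_idx["mobile"]
rows = list(matching_rows(selected))

# One pass over the selected rows feeds every aggregator at once.
totals = {"rows": 0, "impressions": 0, "spend": 0.0}
for row in rows:
    totals["rows"] += 1
    totals["impressions"] += impressions[row]
    totals["spend"] += spend[row]

print(rows)
print(totals)
```

The filter never touches the metric columns, and the metric columns are read only at the selected offsets, which is the point of keeping dimensions and metrics in separate columns.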
  44. Lifecycle of a Query: Intermediate Results (x2); Merge; Cache (x2); Druid Historical XYZ Result
  45. Lifecycle of a Query: Druid Historical XYZ Result; Druid RT DEF Result; Druid Historical ABC Result; Merge; Broker. Done! Bubble up to UI: Router, UI*. (* Technically bubbles up to the Business Logic layer.)
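
The merge step at the Broker can be sketched as combining per-node partial results bucket by bucket. The sketch assumes every aggregate is additive (counts and sums), which is what makes a simple merge possible; the node names echo the slide, while the result shape and numbers are made up.

```python
from collections import defaultdict


def merge_timeseries(partials):
    """Sum additive metrics across nodes, bucket by bucket, in time order."""
    merged = defaultdict(lambda: defaultdict(int))
    for partial in partials:
        for bucket, metrics in partial.items():
            for name, value in metrics.items():
                merged[bucket][name] += value
    return {bucket: dict(merged[bucket]) for bucket in sorted(merged)}


# Hypothetical partial results, one dict per node that served the query.
historical_xyz = {
    "2015-12-03T10:00": {"count": 10, "value": 100},
    "2015-12-03T10:01": {"count": 4, "value": 40},
}
rt_def = {
    "2015-12-03T10:01": {"count": 1, "value": 5},
    "2015-12-03T10:02": {"count": 2, "value": 9},
}
historical_abc = {
    "2015-12-03T10:00": {"count": 3, "value": 30},
}

result = merge_timeseries([historical_xyz, rt_def, historical_abc])
for bucket, metrics in result.items():
    print(bucket, metrics)
```

Non-additive aggregates (for example, distinct counts) need mergeable intermediate state rather than plain numbers, which this sketch does not cover.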
  46. Demo!
  47. What was in the Demo?
  48. Actual Druid Usage Data: Query load is about ½ million per day.
  49. Actual Druid Indexing Data: Only 2.8M streaming events/sec yesterday during peak hour. It was a slow day.
  50. Druid OSS Clients! Official: R (https://github.com/druid-io/RDruid), Python (https://github.com/druid-io/pydruid). Community: Spark (https://github.com/SparklineData/spark-druid-olap), SQL (https://github.com/srikalyc/Sql4D). Many more (http://druid.io/docs/latest/development/libraries.html): JavaScript, Node.js, Clojure, Ruby, (other) SQL, TypeScript.
  51. R Example

       library(RDruid)

       # druid_query_url and hosts are assumed to be defined elsewhere.

       # Query window: the last 24 hours, aligned to the minute, in UTC.
       start_time <- as.POSIXlt(Sys.time(), "UTC", origin = "1970-01-01")
       start_time$sec <- 0
       end_time <- start_time
       start_time$hour <- start_time$hour - 24
       intvl <- interval(start_time, end_time)

       # Per-minute sums of the "query/segment/time" metric for the given hosts.
       segment_times <- druid.query.timeseries(
         url = druid_query_url,  # bard endpoint
         intervals = intvl,
         dataSource = "mmx_metrics_druid",
         aggregations = list(count = longSum(metric("count")),
                             value = longSum(metric("value"))),
         filter = dimension("host") %=% hosts &
                  dimension("metric") %=% "query/segment/time",
         granularity = "minute",
         context = list(useCache = TRUE, populateCache = TRUE)
       )
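
For comparison, the same request can be written as a native Druid JSON timeseries query. The sketch below only builds the query body in Python and does not send it. The field names follow Druid's native query format as I know it; treating `hosts` as a list matched by an OR of selector filters is my reading of the R snippet, and the host names in the usage lines are placeholders.

```python
import json
from datetime import datetime, timedelta, timezone


def segment_time_query(hosts, now=None):
    """Build a timeseries query for the last 24 hours, aligned to the minute."""
    now = now or datetime.now(timezone.utc)
    end = now.astimezone(timezone.utc).replace(second=0, microsecond=0)
    start = end - timedelta(hours=24)
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    return {
        "queryType": "timeseries",
        "dataSource": "mmx_metrics_druid",
        "granularity": "minute",
        "intervals": [f"{start.strftime(fmt)}/{end.strftime(fmt)}"],
        "filter": {
            "type": "and",
            "fields": [
                # Assumption: any of the given hosts may match.
                {"type": "or",
                 "fields": [{"type": "selector", "dimension": "host", "value": h}
                            for h in hosts]},
                {"type": "selector", "dimension": "metric",
                 "value": "query/segment/time"},
            ],
        },
        "aggregations": [
            {"type": "longSum", "name": "count", "fieldName": "count"},
            {"type": "longSum", "name": "value", "fieldName": "value"},
        ],
        "context": {"useCache": True, "populateCache": True},
    }


# Placeholder host names and a fixed "now" so the output is reproducible.
query = segment_time_query(
    ["historical-1.example", "historical-2.example"],
    now=datetime(2015, 12, 3, 12, 34, 56, tzinfo=timezone.utc),
)
print(json.dumps(query, indent=2))
```

A body like this would typically be sent as an HTTP POST with a JSON content type to a Broker's `/druid/v2/` endpoint; the client libraries listed on the previous slide wrap that call.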
  52. UI - Panoramix: https://github.com/mistercrunch/panoramix
  53. UI - Grafana: https://github.com/Quantiply/grafana-plugins/tree/master/features/druid
  54. UI - Pivot: https://github.com/implydata/pivot
  55. Druid Speed: https://www.linkedin.com/pulse/combining-druid-spark-interactive-flexible-analytics-scale-butani and http://druid.io/blog/2014/03/17/benchmarking-druid.html. We’re always getting faster! A very common question in PRs is “How does this affect speed?”, followed by PROVE IT. Micro-benchmarks live in the druid-io master branch (https://github.com/druid-io/druid/tree/master/benchmarks). Macro-benchmarks are done at scale (see your metrics console for answers).
