Epiphany: Connecting Millions of Events to Thirty Billion Data Points in Real-Time

Hadoop Summit 2015


  1. Epiphany: Connecting Millions Of Events To 50 Billion Data Points In Real-time. Anirban Banerjee (abanerjee@rocketfuel.com), Shahansad Kp (skp@rocketfuel.com)
  2. 01 Online Advertising Ecosystem: In a nutshell
  3. [Diagram] Participants: Advertisers, Publishers, Users, Exchanges, Observers. Numbered steps: 1 Page Request, 2 Ad Request, 3 Bid Request, 4 Bid Response, 5 Bid Win, 6 Ad served, 7 Ad served. Other labels: 0 Set preferences, Click & Visit, Serve Impression, Convert (e.g. buy a product).
  4. 02 Attribution: Mapping effects to causes
  5. How Was This "Conversion" Achieved? Identify the effect of every single impression across every medium during the customer’s journey. Needed by modeling, reporting, analysts, customers.
  6. Last Touch Attribution: Whole credit to a single event.
  7. Multi Touch Attribution Across Multiple Devices: Partial credits across impressions.
  8. Multiple Algorithms: Algorithm 1, Algorithm 2, Algorithm 3.
  9. Attribution using Advertiser Data
  10. 03 Epiphany: Requirements
  11. [Diagram] Labels: Action by Impression day; Action by Conversion day; Rocket Fuel Attribution; Previous Day; Advertiser Data; Current Day; Rocket Fuel Conversion Data; Reattribution; Data Flow & Data Democracy; Analysts; Downstream ETL; Rocket Fuel Impression History Data.
  12. Batch and Realtime
  13. Scale: O(|Conversions| * |Impressions| * |Algorithms|). Impressions: tens of billions. Advertiser reports: thousands. Conversions: hundreds of millions. Algorithms: hundreds.
  14. 04 Epiphany: Rocket Fuel attribution platform
  15. Rocket Fuel Attribution
  16. HBase-backed object lookup of impression table. Powered by Blackbird Collections. Lookup in milliseconds.
  17. Rocket Fuel Attribution
  18. Advertiser Attribution
  19. Skipping filter on column qualifiers
  20. Point updates to Hive: point updates to an intermediate HBase table, periodically pulled to Hive.
  21. Epiphany Tables: Action keyed by Impression day; Action keyed by Conversion day; Action keyed by User Id; HBase Table; Hive Table; INDEX; Action by Conversion day; Action by Impression day.
  22. Intermediate Table Data Flow: Action keyed by User Id; records with deltaTimestamp based scan; Action keyed by Impression day; Action keyed by Conversion day; old state of records with delta; point reads; computed “changes”.
  23. Idempotency: Reducer_1_attempt_1, Reducer_2_attempt_1, Reducer_1_attempt_2; Job; HBase. Idempotency at record level is necessary for correctness.
  24. Hive Table Data Flow: HBase intermediate table; Hive table; Snapshot; Scan with prefix filter; Snapshot using HBase admin.
  25. Epiphany Architecture
  26. Test releases with HBase snapshots. Monitor health of HBase instance. Use WAL (write-ahead log).
  27. Generic solution at scale ("One ring to rule them all"): multiple attribution algorithms; cross-device scenario; advertiser attribution data. Faster availability, faster experiments. More accessible data, e.g. point-readable actions.
  28. Anirban Banerjee (abanerjee@rocketfuel.com), Shahansad Kp (skp@rocketfuel.com). Major contributors: Abhijit Pol, Savin Goyal, Zhan Yuan. WE ARE HIRING!!!
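
Slides 6 and 7 contrast last-touch attribution (whole credit to a single event) with multi-touch attribution (partial credits across impressions). The deck does not show the algorithms themselves, so the following is only a minimal sketch under stated assumptions: the `Impression` type, its field names, and the equal-split weighting in `multi_touch_even` are hypothetical and chosen for illustration, not taken from Epiphany.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Impression:
    impression_id: str  # hypothetical identifier
    timestamp: int      # hypothetical: epoch seconds when the ad was served


def _eligible(impressions, conversion_ts):
    """Keep only impressions served at or before the conversion."""
    return [i for i in impressions if i.timestamp <= conversion_ts]


def last_touch(impressions, conversion_ts):
    """Give the whole credit (1.0) to the latest eligible impression."""
    eligible = _eligible(impressions, conversion_ts)
    if not eligible:
        return {}
    winner = max(eligible, key=lambda i: i.timestamp)
    return {winner.impression_id: 1.0}


def multi_touch_even(impressions, conversion_ts):
    """Split credit equally across eligible impressions.

    Equal weighting is an assumption; the slides only say that
    credits are partial.
    """
    eligible = _eligible(impressions, conversion_ts)
    if not eligible:
        return {}
    share = 1.0 / len(eligible)
    return {i.impression_id: share for i in eligible}
```

Both functions return a mapping from impression id to credit, so several algorithms can be run over the same impression history and compared side by side, which is the situation slide 8 describes.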
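
Slide 23 states that idempotency at record level is necessary for correctness, and shows a second attempt of the same reducer (Reducer_1_attempt_2) writing to HBase. One way to picture this, as a rough sketch only: a plain Python dict stands in for the HBase table, and the row-key layout and record fields are hypothetical. The point is that an overwrite keyed by a deterministic row key survives a replayed attempt, while an increment does not.

```python
def row_key(user_id, conversion_id, algorithm):
    """Deterministic key: the same record always maps to the same row.

    The key layout is an assumption made for this sketch.
    """
    return f"{user_id}|{conversion_id}|{algorithm}"


def put_records(table, records):
    """Idempotent write: each record overwrites its own row."""
    for rec in records:
        key = row_key(rec["user_id"], rec["conversion_id"], rec["algorithm"])
        table[key] = rec["credit"]


def increment_records(table, records):
    """Non-idempotent counter-example: a replay adds the credit twice."""
    for rec in records:
        key = row_key(rec["user_id"], rec["conversion_id"], rec["algorithm"])
        table[key] = table.get(key, 0.0) + rec["credit"]
```

Calling `put_records` twice with the same batch, as a retried reducer attempt would, leaves the table in the same state as calling it once. `increment_records` would double the stored credit in that case.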


    Jul. 14, 2015
