SlideShare ist ein Scribd-Unternehmen logo
1 von 29
Epiphany:
Connecting Millions Of Events To 50 Billion
Data Points In Real-time
Anirban Banerjee
abanerjee@rocketfuel.com
Shahansad Kp
skp@rocketfuel.com
01
ONLINE ADVERTISING ECOSYSTEM
In a nutshell
Advertisers
Publishers
Users
ZZZX
Exchanges
Page Request
1
Ad Request
2
Bid Request
3
Bid Response
4
Bid Win
5
Ad served
6
Ad served
7
0
Set preferences
Click & Visit
Serve Impression
Convert (e.g. buy a product)
Observers
02 Attribution
Mapping effects to causes
How Was This "Conversion" Achieved?
- Identify the effect of every single impression across
every medium during the customer’s journey
- Needed by modeling, reporting, analysts, customers.
Last Touch Attribution
Whole credit to a single event
Multi Touch Attribution Across Multiple Devices
Partial credits across impressions.
Multiple Algorithms
Algorithm 1.
Algorithm 2.
Algorithm 3.
Attribution using Advertiser Data
03 Epiphany
Requirements
Action by
Impression day
Action by
Conversion day
Rocket Fuel
Attribution
Previous Day
Advertiser Data
Current Day
Rocket Fuel Conversion
Data
Reattribution
Data Flow & Data Democracy
Analysts
Downstream
ETL
Rocket Fuel Impression
History Data
Batch and Realtime
(|Conversions| * |Impressions| *
|Algorithms|)
Impressions
Tens of Billions
Advertiser reports
Thousands
Conversions
Hundreds
of millions
Algorithms
Hundreds
O
04 Epiphany
Rocket Fuel attribution platform
Rocket Fuel Attribution
HBase backed object lookup of impression table
Powered by Blackbird Collections
Lookup in milliseconds
Rocket Fuel Attribution
Advertiser Attribution
Skipping filter on column qualifiers
Point updates to hive
Point updates to a intermediate HBase table
Periodically pulled to Hive
Epiphany Tables
Action keyed by
Impression day
Action keyed by
Conversion day
Action keyed by
User Id
HBase Table
Hive Table
INDEXAction by
Conversion day
Action by
Impression day
Intermediate Table Data Flow
Action keyed by
User Id
Records with deltaTimestamp
based scan
Action keyed by
Impression day
Action keyed by
Conversion day
Old state of records
with delta
Point reads
Computed “changes”
Idempotency
Reducer_1_attempt_1
Reducer_2_attempt_1
Reducer_1_attempt_2
Job
Hbase
Idempotency at record level is necessary for correctness
Hive Table Data Flow
Hbase intermediate table
Hive table
Snapshot
Scan with
prefix filter
Snapshot using
HBase admin
Epiphany Architecture
Test releases with HBase snapshots
Monitor health of HBase instance
Use WAL (Write ahead log)
Generic solution at scale
One ring to rule them all
- Multiple attribution algorithms
- Cross-device scenario
- Advertiser attribution data
Faster availability, faster experiments
More accessible data
- e.g. point-readable actions
Anirban Banerjee
abanerjee@rocketfuel.com
Shahansad Kp
skp@rocketfuel.com
[Major Contributors]
Abhijit Pol
Savin Goyal
Zhan Yuan
WE ARE HIRING!!!

Weitere ähnliche Inhalte

Was ist angesagt?

Jonathan Weber - All Things DATA 2017
Jonathan Weber - All Things DATA 2017Jonathan Weber - All Things DATA 2017
Jonathan Weber - All Things DATA 2017Shuki Mann
 
Google’s new analytics features
Google’s new analytics featuresGoogle’s new analytics features
Google’s new analytics featuresJon Adam
 
Customer Keynote: PIXNET Media Inc.- Business Intelligent and Analysis: Empir...
Customer Keynote: PIXNET Media Inc.- Business Intelligent and Analysis: Empir...Customer Keynote: PIXNET Media Inc.- Business Intelligent and Analysis: Empir...
Customer Keynote: PIXNET Media Inc.- Business Intelligent and Analysis: Empir...Amazon Web Services
 
[Webinar] Interacting with BigQuery and Working with Advanced Queries
[Webinar] Interacting with BigQuery and Working with Advanced Queries[Webinar] Interacting with BigQuery and Working with Advanced Queries
[Webinar] Interacting with BigQuery and Working with Advanced QueriesTatvic Analytics
 
[Webinar Deck] Google Data Studio for Mastering the Art of Data Visualizations
[Webinar Deck] Google Data Studio for Mastering the Art of Data Visualizations[Webinar Deck] Google Data Studio for Mastering the Art of Data Visualizations
[Webinar Deck] Google Data Studio for Mastering the Art of Data VisualizationsTatvic Analytics
 
IC tomorrow digital test bed
IC tomorrow digital test bedIC tomorrow digital test bed
IC tomorrow digital test bedSteveJPrice
 
Firebase Analytics - Best Practices To Attract, Engage, Convert & Measure You...
Firebase Analytics - Best Practices To Attract, Engage, Convert & Measure You...Firebase Analytics - Best Practices To Attract, Engage, Convert & Measure You...
Firebase Analytics - Best Practices To Attract, Engage, Convert & Measure You...Tatvic Analytics
 
SMX Advanced - When to use Machine Learning for Search Campaigns
SMX Advanced - When to use Machine Learning for Search CampaignsSMX Advanced - When to use Machine Learning for Search Campaigns
SMX Advanced - When to use Machine Learning for Search CampaignsChristopher Gutknecht
 

Was ist angesagt? (10)

Jonathan Weber - All Things DATA 2017
Jonathan Weber - All Things DATA 2017Jonathan Weber - All Things DATA 2017
Jonathan Weber - All Things DATA 2017
 
Google’s new analytics features
Google’s new analytics featuresGoogle’s new analytics features
Google’s new analytics features
 
Customer Keynote: PIXNET Media Inc.- Business Intelligent and Analysis: Empir...
Customer Keynote: PIXNET Media Inc.- Business Intelligent and Analysis: Empir...Customer Keynote: PIXNET Media Inc.- Business Intelligent and Analysis: Empir...
Customer Keynote: PIXNET Media Inc.- Business Intelligent and Analysis: Empir...
 
[Webinar] Interacting with BigQuery and Working with Advanced Queries
[Webinar] Interacting with BigQuery and Working with Advanced Queries[Webinar] Interacting with BigQuery and Working with Advanced Queries
[Webinar] Interacting with BigQuery and Working with Advanced Queries
 
[Webinar Deck] Google Data Studio for Mastering the Art of Data Visualizations
[Webinar Deck] Google Data Studio for Mastering the Art of Data Visualizations[Webinar Deck] Google Data Studio for Mastering the Art of Data Visualizations
[Webinar Deck] Google Data Studio for Mastering the Art of Data Visualizations
 
Shape.io - PPC Budget Management Suite
Shape.io - PPC Budget Management SuiteShape.io - PPC Budget Management Suite
Shape.io - PPC Budget Management Suite
 
IC tomorrow digital test bed
IC tomorrow digital test bedIC tomorrow digital test bed
IC tomorrow digital test bed
 
Firebase Analytics - Best Practices To Attract, Engage, Convert & Measure You...
Firebase Analytics - Best Practices To Attract, Engage, Convert & Measure You...Firebase Analytics - Best Practices To Attract, Engage, Convert & Measure You...
Firebase Analytics - Best Practices To Attract, Engage, Convert & Measure You...
 
SMX Advanced - When to use Machine Learning for Search Campaigns
SMX Advanced - When to use Machine Learning for Search CampaignsSMX Advanced - When to use Machine Learning for Search Campaigns
SMX Advanced - When to use Machine Learning for Search Campaigns
 
CPXi projects
CPXi projectsCPXi projects
CPXi projects
 

Andere mochten auch

Seda an architecture for well-conditioned scalable internet services
Seda   an architecture for well-conditioned scalable internet servicesSeda   an architecture for well-conditioned scalable internet services
Seda an architecture for well-conditioned scalable internet servicesbdemchak
 
Facebook's TAO & Unicorn data storage and search platforms
Facebook's TAO & Unicorn data storage and search platformsFacebook's TAO & Unicorn data storage and search platforms
Facebook's TAO & Unicorn data storage and search platformsNitish Upreti
 
Presto: Distributed Machine Learning and Graph Processing with Sparse Matrices
Presto: Distributed Machine Learning and Graph Processing with Sparse MatricesPresto: Distributed Machine Learning and Graph Processing with Sparse Matrices
Presto: Distributed Machine Learning and Graph Processing with Sparse MatricesQian Lin
 
Cassandra Summit - What's New In Apache TinkerPop?
Cassandra Summit - What's New In Apache TinkerPop?Cassandra Summit - What's New In Apache TinkerPop?
Cassandra Summit - What's New In Apache TinkerPop?Stephen Mallette
 
Real World Tales of Repair (Alexander Dejanovski, The Last Pickle) | Cassandr...
Real World Tales of Repair (Alexander Dejanovski, The Last Pickle) | Cassandr...Real World Tales of Repair (Alexander Dejanovski, The Last Pickle) | Cassandr...
Real World Tales of Repair (Alexander Dejanovski, The Last Pickle) | Cassandr...DataStax
 
Configurando o Geany para Python - 03/2012
Configurando o Geany para Python - 03/2012Configurando o Geany para Python - 03/2012
Configurando o Geany para Python - 03/2012Marco Mendes
 
Configurando o geany_para_python
Configurando o geany_para_pythonConfigurando o geany_para_python
Configurando o geany_para_pythonMarco Mendes
 
Introduction to cassandra 2014
Introduction to cassandra 2014Introduction to cassandra 2014
Introduction to cassandra 2014Patrick McFadin
 
The Gremlin Graph Traversal Language
The Gremlin Graph Traversal LanguageThe Gremlin Graph Traversal Language
The Gremlin Graph Traversal LanguageMarko Rodriguez
 
Quantum Processes in Graph Computing
Quantum Processes in Graph ComputingQuantum Processes in Graph Computing
Quantum Processes in Graph ComputingMarko Rodriguez
 
Logical Data Warehouse: How to Build a Virtualized Data Services Layer
Logical Data Warehouse: How to Build a Virtualized Data Services LayerLogical Data Warehouse: How to Build a Virtualized Data Services Layer
Logical Data Warehouse: How to Build a Virtualized Data Services LayerDataWorks Summit
 
Deletes Without Tombstones or TTLs (Eric Stevens, ProtectWise) | Cassandra Su...
Deletes Without Tombstones or TTLs (Eric Stevens, ProtectWise) | Cassandra Su...Deletes Without Tombstones or TTLs (Eric Stevens, ProtectWise) | Cassandra Su...
Deletes Without Tombstones or TTLs (Eric Stevens, ProtectWise) | Cassandra Su...DataStax
 
Cassandra Virtual Node talk
Cassandra Virtual Node talkCassandra Virtual Node talk
Cassandra Virtual Node talkPatrick McFadin
 
Gremlin's Graph Traversal Machinery
Gremlin's Graph Traversal MachineryGremlin's Graph Traversal Machinery
Gremlin's Graph Traversal MachineryMarko Rodriguez
 
HBaseCon 2015 General Session: Zen - A Graph Data Model on HBase
HBaseCon 2015 General Session: Zen - A Graph Data Model on HBaseHBaseCon 2015 General Session: Zen - A Graph Data Model on HBase
HBaseCon 2015 General Session: Zen - A Graph Data Model on HBaseHBaseCon
 
HBaseCon 2015: S2Graph - A Large-scale Graph Database with HBase
HBaseCon 2015: S2Graph - A Large-scale Graph Database with HBaseHBaseCon 2015: S2Graph - A Large-scale Graph Database with HBase
HBaseCon 2015: S2Graph - A Large-scale Graph Database with HBaseHBaseCon
 
Clock Skew and Other Annoying Realities in Distributed Systems (Donny Nadolny...
Clock Skew and Other Annoying Realities in Distributed Systems (Donny Nadolny...Clock Skew and Other Annoying Realities in Distributed Systems (Donny Nadolny...
Clock Skew and Other Annoying Realities in Distributed Systems (Donny Nadolny...DataStax
 

Andere mochten auch (20)

F8 tech talk_pinterest_v4
F8 tech talk_pinterest_v4F8 tech talk_pinterest_v4
F8 tech talk_pinterest_v4
 
Seda an architecture for well-conditioned scalable internet services
Seda   an architecture for well-conditioned scalable internet servicesSeda   an architecture for well-conditioned scalable internet services
Seda an architecture for well-conditioned scalable internet services
 
Facebook's TAO & Unicorn data storage and search platforms
Facebook's TAO & Unicorn data storage and search platformsFacebook's TAO & Unicorn data storage and search platforms
Facebook's TAO & Unicorn data storage and search platforms
 
Data Driven Growth
Data Driven GrowthData Driven Growth
Data Driven Growth
 
IDEs y Frameworks mas utilizados
IDEs y Frameworks mas utilizadosIDEs y Frameworks mas utilizados
IDEs y Frameworks mas utilizados
 
Presto: Distributed Machine Learning and Graph Processing with Sparse Matrices
Presto: Distributed Machine Learning and Graph Processing with Sparse MatricesPresto: Distributed Machine Learning and Graph Processing with Sparse Matrices
Presto: Distributed Machine Learning and Graph Processing with Sparse Matrices
 
Cassandra Summit - What's New In Apache TinkerPop?
Cassandra Summit - What's New In Apache TinkerPop?Cassandra Summit - What's New In Apache TinkerPop?
Cassandra Summit - What's New In Apache TinkerPop?
 
Real World Tales of Repair (Alexander Dejanovski, The Last Pickle) | Cassandr...
Real World Tales of Repair (Alexander Dejanovski, The Last Pickle) | Cassandr...Real World Tales of Repair (Alexander Dejanovski, The Last Pickle) | Cassandr...
Real World Tales of Repair (Alexander Dejanovski, The Last Pickle) | Cassandr...
 
Configurando o Geany para Python - 03/2012
Configurando o Geany para Python - 03/2012Configurando o Geany para Python - 03/2012
Configurando o Geany para Python - 03/2012
 
Configurando o geany_para_python
Configurando o geany_para_pythonConfigurando o geany_para_python
Configurando o geany_para_python
 
Introduction to cassandra 2014
Introduction to cassandra 2014Introduction to cassandra 2014
Introduction to cassandra 2014
 
The Gremlin Graph Traversal Language
The Gremlin Graph Traversal LanguageThe Gremlin Graph Traversal Language
The Gremlin Graph Traversal Language
 
Quantum Processes in Graph Computing
Quantum Processes in Graph ComputingQuantum Processes in Graph Computing
Quantum Processes in Graph Computing
 
Logical Data Warehouse: How to Build a Virtualized Data Services Layer
Logical Data Warehouse: How to Build a Virtualized Data Services LayerLogical Data Warehouse: How to Build a Virtualized Data Services Layer
Logical Data Warehouse: How to Build a Virtualized Data Services Layer
 
Deletes Without Tombstones or TTLs (Eric Stevens, ProtectWise) | Cassandra Su...
Deletes Without Tombstones or TTLs (Eric Stevens, ProtectWise) | Cassandra Su...Deletes Without Tombstones or TTLs (Eric Stevens, ProtectWise) | Cassandra Su...
Deletes Without Tombstones or TTLs (Eric Stevens, ProtectWise) | Cassandra Su...
 
Cassandra Virtual Node talk
Cassandra Virtual Node talkCassandra Virtual Node talk
Cassandra Virtual Node talk
 
Gremlin's Graph Traversal Machinery
Gremlin's Graph Traversal MachineryGremlin's Graph Traversal Machinery
Gremlin's Graph Traversal Machinery
 
HBaseCon 2015 General Session: Zen - A Graph Data Model on HBase
HBaseCon 2015 General Session: Zen - A Graph Data Model on HBaseHBaseCon 2015 General Session: Zen - A Graph Data Model on HBase
HBaseCon 2015 General Session: Zen - A Graph Data Model on HBase
 
HBaseCon 2015: S2Graph - A Large-scale Graph Database with HBase
HBaseCon 2015: S2Graph - A Large-scale Graph Database with HBaseHBaseCon 2015: S2Graph - A Large-scale Graph Database with HBase
HBaseCon 2015: S2Graph - A Large-scale Graph Database with HBase
 
Clock Skew and Other Annoying Realities in Distributed Systems (Donny Nadolny...
Clock Skew and Other Annoying Realities in Distributed Systems (Donny Nadolny...Clock Skew and Other Annoying Realities in Distributed Systems (Donny Nadolny...
Clock Skew and Other Annoying Realities in Distributed Systems (Donny Nadolny...
 

Ähnlich wie Connecting 50B Data Points to Millions of Events in Real-Time

Taming the Big Data Beast to Drive More Internet Sales
Taming the Big Data Beast to Drive More Internet SalesTaming the Big Data Beast to Drive More Internet Sales
Taming the Big Data Beast to Drive More Internet SalesVickie Gibbs
 
2015 Advance Visibility Summit - Attribution Strategy and How to Make it Work
2015 Advance Visibility Summit - Attribution Strategy and How to Make it Work2015 Advance Visibility Summit - Attribution Strategy and How to Make it Work
2015 Advance Visibility Summit - Attribution Strategy and How to Make it WorkKevin Bekker, MBA
 
Invite media playbook report
Invite media playbook reportInvite media playbook report
Invite media playbook reportAdCMO
 
Invite media playbook
Invite media playbookInvite media playbook
Invite media playbookAdCMO
 
Digital marketing strategy playbook
Digital marketing strategy playbookDigital marketing strategy playbook
Digital marketing strategy playbookAdCMO
 
Vpon - 廣告效果導向為基礎的行動廣告系統
Vpon - 廣告效果導向為基礎的行動廣告系統Vpon - 廣告效果導向為基礎的行動廣告系統
Vpon - 廣告效果導向為基礎的行動廣告系統Vpon
 
Epam BI - Near Realtime Marketing Support System
Epam BI - Near Realtime Marketing Support SystemEpam BI - Near Realtime Marketing Support System
Epam BI - Near Realtime Marketing Support SystemDmitry Tolpeko
 
TripleLift: Preparing for a New Programmatic Ad-Tech World
TripleLift: Preparing for a New Programmatic Ad-Tech WorldTripleLift: Preparing for a New Programmatic Ad-Tech World
TripleLift: Preparing for a New Programmatic Ad-Tech WorldVoltDB
 
Nicholas Gorski: Real-time revenue science at Twitter
Nicholas Gorski: Real-time revenue science at TwitterNicholas Gorski: Real-time revenue science at Twitter
Nicholas Gorski: Real-time revenue science at TwitterDavid Garrison
 
Part2 Conversion Optimizer
Part2  Conversion OptimizerPart2  Conversion Optimizer
Part2 Conversion Optimizerguest1c6349
 
Computational Advertising in Yelp Local Ads
Computational Advertising in Yelp Local AdsComputational Advertising in Yelp Local Ads
Computational Advertising in Yelp Local Adssoupsranjan
 
Web Marketing Week1
Web Marketing Week1Web Marketing Week1
Web Marketing Week1cghb1210
 
Cross channel attribution overview feb 2010
Cross channel attribution overview feb 2010Cross channel attribution overview feb 2010
Cross channel attribution overview feb 2010xplusone
 
Top 10 Social Gaming Metrics
Top 10 Social Gaming MetricsTop 10 Social Gaming Metrics
Top 10 Social Gaming Metricsjefftee
 
Display Advertising Basics
Display Advertising BasicsDisplay Advertising Basics
Display Advertising BasicsBidGear Inc.
 
Mobile Ad Monetization for Games | Christian Calderon
Mobile Ad Monetization for Games | Christian CalderonMobile Ad Monetization for Games | Christian Calderon
Mobile Ad Monetization for Games | Christian CalderonJessica Tams
 
Digital Travel Summit LAS 2015 - Confronting The Challenges Of Attribution Mo...
Digital Travel Summit LAS 2015 - Confronting The Challenges Of Attribution Mo...Digital Travel Summit LAS 2015 - Confronting The Challenges Of Attribution Mo...
Digital Travel Summit LAS 2015 - Confronting The Challenges Of Attribution Mo...Jonathan Isernhagen
 
Personalized Retargeting
Personalized RetargetingPersonalized Retargeting
Personalized RetargetingNoraGravity
 

Ähnlich wie Connecting 50B Data Points to Millions of Events in Real-Time (20)

Taming the Big Data Beast to Drive More Internet Sales
Taming the Big Data Beast to Drive More Internet SalesTaming the Big Data Beast to Drive More Internet Sales
Taming the Big Data Beast to Drive More Internet Sales
 
2015 Advance Visibility Summit - Attribution Strategy and How to Make it Work
2015 Advance Visibility Summit - Attribution Strategy and How to Make it Work2015 Advance Visibility Summit - Attribution Strategy and How to Make it Work
2015 Advance Visibility Summit - Attribution Strategy and How to Make it Work
 
Invite media playbook report
Invite media playbook reportInvite media playbook report
Invite media playbook report
 
Invite media playbook
Invite media playbookInvite media playbook
Invite media playbook
 
Digital marketing strategy playbook
Digital marketing strategy playbookDigital marketing strategy playbook
Digital marketing strategy playbook
 
Vpon - 廣告效果導向為基礎的行動廣告系統
Vpon - 廣告效果導向為基礎的行動廣告系統Vpon - 廣告效果導向為基礎的行動廣告系統
Vpon - 廣告效果導向為基礎的行動廣告系統
 
Epam BI - Near Realtime Marketing Support System
Epam BI - Near Realtime Marketing Support SystemEpam BI - Near Realtime Marketing Support System
Epam BI - Near Realtime Marketing Support System
 
TripleLift: Preparing for a New Programmatic Ad-Tech World
TripleLift: Preparing for a New Programmatic Ad-Tech WorldTripleLift: Preparing for a New Programmatic Ad-Tech World
TripleLift: Preparing for a New Programmatic Ad-Tech World
 
Nicholas Gorski: Real-time revenue science at Twitter
Nicholas Gorski: Real-time revenue science at TwitterNicholas Gorski: Real-time revenue science at Twitter
Nicholas Gorski: Real-time revenue science at Twitter
 
Part2 Conversion Optimizer
Part2  Conversion OptimizerPart2  Conversion Optimizer
Part2 Conversion Optimizer
 
Computational Advertising in Yelp Local Ads
Computational Advertising in Yelp Local AdsComputational Advertising in Yelp Local Ads
Computational Advertising in Yelp Local Ads
 
Web Marketing Week1
Web Marketing Week1Web Marketing Week1
Web Marketing Week1
 
Enliven cem clickstream solution
Enliven cem clickstream solutionEnliven cem clickstream solution
Enliven cem clickstream solution
 
Cross channel attribution overview feb 2010
Cross channel attribution overview feb 2010Cross channel attribution overview feb 2010
Cross channel attribution overview feb 2010
 
Top 10 Social Gaming Metrics
Top 10 Social Gaming MetricsTop 10 Social Gaming Metrics
Top 10 Social Gaming Metrics
 
Display Advertising Basics
Display Advertising BasicsDisplay Advertising Basics
Display Advertising Basics
 
Cheat sheetmonetization1
Cheat sheetmonetization1Cheat sheetmonetization1
Cheat sheetmonetization1
 
Mobile Ad Monetization for Games | Christian Calderon
Mobile Ad Monetization for Games | Christian CalderonMobile Ad Monetization for Games | Christian Calderon
Mobile Ad Monetization for Games | Christian Calderon
 
Digital Travel Summit LAS 2015 - Confronting The Challenges Of Attribution Mo...
Digital Travel Summit LAS 2015 - Confronting The Challenges Of Attribution Mo...Digital Travel Summit LAS 2015 - Confronting The Challenges Of Attribution Mo...
Digital Travel Summit LAS 2015 - Confronting The Challenges Of Attribution Mo...
 
Personalized Retargeting
Personalized RetargetingPersonalized Retargeting
Personalized Retargeting
 

Mehr von DataWorks Summit

Floating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisFloating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisDataWorks Summit
 
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFiTracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFiDataWorks Summit
 
HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...DataWorks Summit
 
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...DataWorks Summit
 
Managing the Dewey Decimal System
Managing the Dewey Decimal SystemManaging the Dewey Decimal System
Managing the Dewey Decimal SystemDataWorks Summit
 
Practical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExamplePractical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExampleDataWorks Summit
 
HBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberHBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberDataWorks Summit
 
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixScaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixDataWorks Summit
 
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiBuilding the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiDataWorks Summit
 
Supporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsSupporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsDataWorks Summit
 
Security Framework for Multitenant Architecture
Security Framework for Multitenant ArchitectureSecurity Framework for Multitenant Architecture
Security Framework for Multitenant ArchitectureDataWorks Summit
 
Presto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EnginePresto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EngineDataWorks Summit
 
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...DataWorks Summit
 
Extending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google CloudExtending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google CloudDataWorks Summit
 
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFiEvent-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFiDataWorks Summit
 
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache RangerSecuring Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache RangerDataWorks Summit
 
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...DataWorks Summit
 
Computer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouComputer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouDataWorks Summit
 
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkBig Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkDataWorks Summit
 

Mehr von DataWorks Summit (20)

Data Science Crash Course
Data Science Crash CourseData Science Crash Course
Data Science Crash Course
 
Floating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisFloating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache Ratis
 
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFiTracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
 
HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...
 
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
 
Managing the Dewey Decimal System
Managing the Dewey Decimal SystemManaging the Dewey Decimal System
Managing the Dewey Decimal System
 
Practical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExamplePractical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist Example
 
HBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberHBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at Uber
 
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixScaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
 
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiBuilding the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
 
Supporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsSupporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability Improvements
 
Security Framework for Multitenant Architecture
Security Framework for Multitenant ArchitectureSecurity Framework for Multitenant Architecture
Security Framework for Multitenant Architecture
 
Presto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EnginePresto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything Engine
 
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
 
Extending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google CloudExtending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google Cloud
 
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFiEvent-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
 
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache RangerSecuring Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
 
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
 
Computer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouComputer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near You
 
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkBig Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
 

Kürzlich hochgeladen

Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfRankYa
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesZilliz
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 

Kürzlich hochgeladen (20)

Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector Databases
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 

Connecting 50B Data Points to Millions of Events in Real-Time