SlideShare ist ein Scribd-Unternehmen logo
1 von 72
Downloaden Sie, um offline zu lesen
@ItaiYaffe
●
○
○
○
○
@ItaiYaffe
●
○
○
○
○
●
○
○
○
@ItaiYaffe
●
○
@ItaiYaffe
● 😉
@ItaiYaffe
●
●
●
@ItaiYaffe
…
●
●
@ItaiYaffe
●
○
○
●
○
○
○
●
@ItaiYaffe
●
●
●
●
●
@ItaiYaffe
>10B events/day >20TB/day
S3
1000’s nodes/day 10’s of TB
ingested/day
druid
$100K’s/month
@ItaiYaffe
@ItaiYaffe
@ItaiYaffe
@ItaiYaffe
@ItaiYaffe
●
○
●
●
●
○
●
●
@ItaiYaffe
●
○
●
●
●
○
○
○
@ItaiYaffe
●
○
●
●
●
○
○
○
@ItaiYaffe
Awareness
Exposed to
campaign (e.g
via online ad)
Consideration
Interest is
expressed (e.g
clicked ad)
Intent
Steps taken towards
making a purchase (e.g
added product to cart)
Purchase
@ItaiYaffe
Awareness
Exposed to
campaign (e.g
via online ad)
Consideration
Interest is
expressed (e.g
clicked ad)
Intent
Steps taken towards
making a purchase (e.g
added product to cart)
Purchase
Tactic Stages
@ItaiYaffe
Awareness Consideration Intent Purchase
Drop-
off
Drop-
off
Drop-
off
@ItaiYaffe
PRODUCT PAGE
10M UUs
CHECKOUT
3M UUs
HOMEPAGE
15M UUs
7M
Drop-off
5M
Drop-off
AD EXPOSURE
100M UUs
85M
Drop-off
* UUs = Unique Users
@ItaiYaffe
2 Users
7 Views
2 Purchases $$$ $$$
@ItaiYaffe
PRODUCT PAGE
10M
CHECKOUT
3M
HOMEPAGE
15M
7M
Drop-off
5M
Drop-off
AD EXPOSURE
100M
85M
Drop-off
@ItaiYaffe
●
●
○
●
@ItaiYaffe
…
@ItaiYaffe
@ItaiYaffe
@ItaiYaffe
●
●
●
●
@ItaiYaffe
@ItaiYaffe
@ItaiYaffe
●
●
○
○
●
@ItaiYaffe
…
@ItaiYaffe
@ItaiYaffe
@ItaiYaffe
@ItaiYaffe
@ItaiYaffe
{event_time=2020-01-28T..., userid=uid1, attribute=online_ad}
{event_time=2020-01-28T..., userid=uid1, attribute=homepage}
{event_time=2020-01-28T..., userid=uid1, attribute=productX_page}
....
@ItaiYaffe
{event_time=2020-01-28T... , userid=uid1, attribute=online_ad, type=Tactic}
{event_time=2020-01-28T... , userid=uid1, attribute=homepage, type=Stage}
{event_time=2020-01-28T... , userid=uid1, attribute=productX_page , type=Stage}
....
@ItaiYaffe
{event_date=2020-01-28, userid=uid1, tactic=online_ad, stage=homepage}
{event_date=2020-01-28, userid=uid1, tactic=online_ad, stage=productX_page }
....
....
@ItaiYaffe
"type": "index_hadoop",
"spec": {
"dataSchema": {
"dataSource": "campaign_1472",
"granularitySpec": {
"queryGranularity": "day",
"segmentGranularity": "day",
"type": "uniform",
"intervals": ["2020-01-01/2020-01-29"]
...
@ItaiYaffe
"timestampSpec": {
"column": "event_date", "format": "yyyy-MM-dd"
},
"dimensionsSpec": {
"dimensions": ["tactic", "stage"]
},
"metricsSpec": [{
"fieldName": "userid", "type": "thetaSketch",
"name": "user_id_sketch", "size": 65536}],
...
@ItaiYaffe
"inputSpec": {"type": " multi",
"children": [
{"type": " dataSource",
"ingestionSpec": {
"intervals": ["2020-01-01/2020-01-29"],
"dataSource": "campaign_1472", ...}},
{"type": " static",
"Paths": "s3://<BUCKET_NAME>/date=2020-01-28/campaign=1472",
...},
...
@ItaiYaffe
{__time=2020-01-28, tactic=online_ad, stage=homepage, user_id_sketch=<Object>}
{__time=2020-01-28, tactic=online_ad, stage=productX_page , user_id_sketch=<Object>}
....
....
@ItaiYaffe
{"filter":{"type":"and","fields":[{"type":"or","fields":[
{"type":"selector","dimension":"stage","value":"homepage"
}]},{"type":"or","fields":[{"type":"selector","dimension"
:"tactic","value":"online_ad"}]}]},"intervals":["2020-01-
01T00:00:00.000/2020-01-29T23:59:59.000"],"granularity":"
ALL","dataSource":"campaign_1472","aggregations":[{"filte
r":{"type":"selector","dimension":"stage","value":"homepa
ge"},"aggregator":{"fieldName":"user_id_sketch","size":65
536,"name":"homepage_sketch","type":"thetaSketch"},"type"
:"filtered"}],"queryType":"groupBy","dimensions":[]}
@ItaiYaffe
SELECT
APPROX_COUNT_DISTINCT_DS_THETA(user_id_sketch,65536)
as homepage_sketch
FROM campaign_1472
WHERE (("tactic" = 'online_ad')
AND ("stage" = 'homepage'))
AND __time BETWEEN '2020-01-01T00:00:00.000'
AND '2020-01-29T23:59:59.000'
@ItaiYaffe
@ItaiYaffe
@ItaiYaffe
@ItaiYaffe
@ItaiYaffe
PRODUCT PAGE
1K UUs
...
HOMEPAGE
3.1K UUs
2.5K
Drop-off
ONLINE AD
8.1M UUs
* UUs = Unique Users
@ItaiYaffe
PRODUCT PAGE
1K UUs
...
HOMEPAGE
3.1K UUs
2.5K
Drop-off
ONLINE AD
8.1M UUs
* UUs = Unique Users
@ItaiYaffe
@ItaiYaffe
●
○
● …
○
●
@ItaiYaffe
@ItaiYaffe
{event_time=2020-01-28T09:15, userid=uid1, attribute=productX_page}
{event_time=2020-01-28T10:10, userid=uid1, attribute=online_ad}
{event_time=2020-01-28T10:11, userid=uid1, attribute=homepage}
....
@ItaiYaffe
{event_time=2020-01-28T09:15 , userid=uid1, attribute=productX_page , type=Stage}
{event_time=2020-01-28T10:10 , userid=uid1, attribute=online_ad, type=Tactic}
{event_time=2020-01-28T10:11 , userid=uid1, attribute=homepage, type=Stage}
....
@ItaiYaffe
{event_date=2020-01-28, userid=uid1, tactic=online_ad, stage=productX_page }
{event_date=2020-01-28, userid=uid1, tactic=online_ad, stage=homepage}
....
....
@ItaiYaffe
{event_date=2020-01-28, userid=uid1, tactic=online_ad, stage=productX_page }
{event_date=2020-01-28, userid=uid1, tactic=online_ad, stage=homepage}
....
....
@ItaiYaffe
{event_date=2020-01-28, userid=uid1, tactic=online_ad, stage=homepage}
....
....
@ItaiYaffe
{"filter":{"type":"and","fields":[{"type":"or","fields":[{"type":"selector","dimension":"stage","value":"homepage"},{"type":"selecto
r","dimension":"stage","value":"productX_page"},{"type":"selector","dimension":"stage","value":"add_to_cart"},{"type":"selector","di
mension":"stage","value":"checkout"}]},{"type":"or","fields":[{"type":"selector","dimension":"tactic","value":"online_ad"}]}]},"inte
rvals":["2018-12-06T00:00:00.000/2020-01-29T23:59:59.000"],"granularity":"ALL","dataSource":"campaign_974","aggregations":[{"filter"
:{"type":"selector","dimension":"stage","value":"homepage"},"aggregator":{"fieldName":"user_id_sketch","size":65536,"name":"A","type
":"thetaSketch"},"type":"filtered"},{"filter":{"type":"selector","dimension":"tactic","value":"online_ad"},"aggregator":{"fieldName"
:"user_id_sketch","size":65536,"name":"B","type":"thetaSketch"},"type":"filtered"},{"filter":{"type":"selector","dimension":"stage",
"value":"productX_page"},"aggregator":{"fieldName":"user_id_sketch","size":65536,"name":"C","type":"thetaSketch"},"type":"filtered"}
,{"filter":{"type":"selector","dimension":"stage","value":"add_to_cart"},"aggregator":{"fieldName":"user_id_sketch","size":65536,"na
me":"D","type":"thetaSketch"},"type":"filtered"},{"filter":{"type":"selector","dimension":"stage","value":"checkout"},"aggregator":{
"fieldName":"user_id_sketch","size":65536,"name":"E","type":"thetaSketch"},"type":"filtered"}],"postAggregations":[{"field":{"func":
"NOT","size":65536,"name":"(homepage AND online_ad AND ( NOT (productX_page OR add_to_cart OR
checkout)))","type":"thetaSketchSetOp","fields":[{"func":"INTERSECT","size":65536,"name":"(homepage AND online_ad AND ( NOT
(productX_page OR add_to_cart OR
checkout)))","type":"thetaSketchSetOp","fields":[{"fieldName":"A","type":"fieldAccess"},{"fieldName":"B","type":"fieldAccess"}]},{"f
unc":"UNION","size":65536,"name":"(productX_page OR add_to_cart OR
checkout)","type":"thetaSketchSetOp","fields":[{"fieldName":"C","type":"fieldAccess"},{"fieldName":"D","type":"fieldAccess"},{"field
Name":"E","type":"fieldAccess"}]}]},"name":"online_ad_596","type":"thetaSketchEstimate"}],"queryType":"groupBy","dimensions":[]}
@ItaiYaffe
SELECT THETA_SKETCH_NOT(65536,
THETA_SKETCH_INTERSECT(65536,a,b), THETA_SKETCH_UNION(65536,c,d,e)
) as online_ad_596
FROM (
SELECT
DS_THETA("user_id_sketch") FILTER (WHERE stage = 'homepage') as a,
DS_THETA("user_id_sketch") FILTER (WHERE tactic = 'online_ad') as b,
DS_THETA("user_id_sketch") FILTER (WHERE stage = 'productX_page') as c,
DS_THETA("user_id_sketch") FILTER (WHERE stage = 'add_to_cart') as d,
DS_THETA("user_id_sketch") FILTER (WHERE stage = 'checkout') as e
FROM campaign_1472
WHERE stage in ('homepage','productX_page','checkout','add_to_cart')
AND tactic = 'online_ad') subquery
@ItaiYaffe
@ItaiYaffe
@ItaiYaffe
@ItaiYaffe
@ItaiYaffe
PRODUCT PAGE
0.6K UUs
...
HOMEPAGE
3.1K UUs
2.5K
Drop-off
ONLINE AD
8.1M UUs
* UUs = Unique Users
@ItaiYaffe
PRODUCT PAGE
0.6K UUs
...
HOMEPAGE
3.1K UUs
2.5K
Drop-off
ONLINE AD
8.1M UUs
* UUs = Unique Users
@ItaiYaffe
●
○
●
○
○
●
○
○
@ItaiYaffe
●
○
○
●
○
○
●
○
○
@ItaiYaffe
●
○
■
■
○
○
●
○
○
●
○
Funnel Analysis with Spark and Druid
Funnel Analysis with Spark and Druid

Weitere ähnliche Inhalte

Was ist angesagt?

Shortening the Feedback Loop: How Spotify’s Big Data Ecosystem has evolved to...
Shortening the Feedback Loop: How Spotify’s Big Data Ecosystem has evolved to...Shortening the Feedback Loop: How Spotify’s Big Data Ecosystem has evolved to...
Shortening the Feedback Loop: How Spotify’s Big Data Ecosystem has evolved to...Big Data Spain
 
Petabytes to Personalization - Data Analytics with Qubit and Looker
Petabytes to Personalization - Data Analytics with Qubit and LookerPetabytes to Personalization - Data Analytics with Qubit and Looker
Petabytes to Personalization - Data Analytics with Qubit and LookerRittman Analytics
 
Refactoring your EDW with Mobile Analytics Products
Refactoring your EDW with Mobile Analytics ProductsRefactoring your EDW with Mobile Analytics Products
Refactoring your EDW with Mobile Analytics ProductsLuke Han
 
30 days of google cloud event
30 days of google cloud event30 days of google cloud event
30 days of google cloud eventPreetyKhatkar
 
Graph+AI for Fin. Services
Graph+AI for Fin. ServicesGraph+AI for Fin. Services
Graph+AI for Fin. ServicesTigerGraph
 
The Convergence of Data Science and Software Development
The Convergence of Data Science and Software DevelopmentThe Convergence of Data Science and Software Development
The Convergence of Data Science and Software DevelopmentMargriet Groenendijk
 
Journey to Creating a 360 View of the Customer: Implementing Big Data Strateg...
Journey to Creating a 360 View of the Customer: Implementing Big Data Strateg...Journey to Creating a 360 View of the Customer: Implementing Big Data Strateg...
Journey to Creating a 360 View of the Customer: Implementing Big Data Strateg...Databricks
 
Better Together: How Graph database enables easy data integration with Spark ...
Better Together: How Graph database enables easy data integration with Spark ...Better Together: How Graph database enables easy data integration with Spark ...
Better Together: How Graph database enables easy data integration with Spark ...TigerGraph
 
Dataworks | 2018-06-20 | Gimel data platform
Dataworks | 2018-06-20 | Gimel data platformDataworks | 2018-06-20 | Gimel data platform
Dataworks | 2018-06-20 | Gimel data platformDeepak Chandramouli
 
Net conf uy v2018 real time analytics
Net conf uy v2018 real time analyticsNet conf uy v2018 real time analytics
Net conf uy v2018 real time analyticsGaston Cruz
 
Applied Machine Learning for Ranking Products in an Ecommerce Setting
Applied Machine Learning for Ranking Products in an Ecommerce SettingApplied Machine Learning for Ranking Products in an Ecommerce Setting
Applied Machine Learning for Ranking Products in an Ecommerce SettingDatabricks
 
Graph Gurus 15: Introducing TigerGraph 2.4
Graph Gurus 15: Introducing TigerGraph 2.4 Graph Gurus 15: Introducing TigerGraph 2.4
Graph Gurus 15: Introducing TigerGraph 2.4 TigerGraph
 
Netflix Data Engineering @ Uber Engineering Meetup
Netflix Data Engineering @ Uber Engineering MeetupNetflix Data Engineering @ Uber Engineering Meetup
Netflix Data Engineering @ Uber Engineering MeetupBlake Irvine
 
Don't build a data science team
Don't build a data science teamDon't build a data science team
Don't build a data science teamLars Albertsson
 
Viewbix tracking journey
Viewbix tracking journeyViewbix tracking journey
Viewbix tracking journeyidan_by
 
“Building consistent and highly available distributed systems with Apache Ign...
“Building consistent and highly available distributed systems with Apache Ign...“Building consistent and highly available distributed systems with Apache Ign...
“Building consistent and highly available distributed systems with Apache Ign...Tom Diederich
 
Graph Gurus Episode 35: No Code Graph Analytics to Get Insights from Petabyte...
Graph Gurus Episode 35: No Code Graph Analytics to Get Insights from Petabyte...Graph Gurus Episode 35: No Code Graph Analytics to Get Insights from Petabyte...
Graph Gurus Episode 35: No Code Graph Analytics to Get Insights from Petabyte...TigerGraph
 

Was ist angesagt? (20)

Big query
Big queryBig query
Big query
 
Shortening the Feedback Loop: How Spotify’s Big Data Ecosystem has evolved to...
Shortening the Feedback Loop: How Spotify’s Big Data Ecosystem has evolved to...Shortening the Feedback Loop: How Spotify’s Big Data Ecosystem has evolved to...
Shortening the Feedback Loop: How Spotify’s Big Data Ecosystem has evolved to...
 
Petabytes to Personalization - Data Analytics with Qubit and Looker
Petabytes to Personalization - Data Analytics with Qubit and LookerPetabytes to Personalization - Data Analytics with Qubit and Looker
Petabytes to Personalization - Data Analytics with Qubit and Looker
 
Refactoring your EDW with Mobile Analytics Products
Refactoring your EDW with Mobile Analytics ProductsRefactoring your EDW with Mobile Analytics Products
Refactoring your EDW with Mobile Analytics Products
 
30 days of google cloud event
30 days of google cloud event30 days of google cloud event
30 days of google cloud event
 
Graph+AI for Fin. Services
Graph+AI for Fin. ServicesGraph+AI for Fin. Services
Graph+AI for Fin. Services
 
The Convergence of Data Science and Software Development
The Convergence of Data Science and Software DevelopmentThe Convergence of Data Science and Software Development
The Convergence of Data Science and Software Development
 
Journey to Creating a 360 View of the Customer: Implementing Big Data Strateg...
Journey to Creating a 360 View of the Customer: Implementing Big Data Strateg...Journey to Creating a 360 View of the Customer: Implementing Big Data Strateg...
Journey to Creating a 360 View of the Customer: Implementing Big Data Strateg...
 
Better Together: How Graph database enables easy data integration with Spark ...
Better Together: How Graph database enables easy data integration with Spark ...Better Together: How Graph database enables easy data integration with Spark ...
Better Together: How Graph database enables easy data integration with Spark ...
 
Dataworks | 2018-06-20 | Gimel data platform
Dataworks | 2018-06-20 | Gimel data platformDataworks | 2018-06-20 | Gimel data platform
Dataworks | 2018-06-20 | Gimel data platform
 
Net conf uy v2018 real time analytics
Net conf uy v2018 real time analyticsNet conf uy v2018 real time analytics
Net conf uy v2018 real time analytics
 
Applied Machine Learning for Ranking Products in an Ecommerce Setting
Applied Machine Learning for Ranking Products in an Ecommerce SettingApplied Machine Learning for Ranking Products in an Ecommerce Setting
Applied Machine Learning for Ranking Products in an Ecommerce Setting
 
Graph Gurus 15: Introducing TigerGraph 2.4
Graph Gurus 15: Introducing TigerGraph 2.4 Graph Gurus 15: Introducing TigerGraph 2.4
Graph Gurus 15: Introducing TigerGraph 2.4
 
TechTuesdays Session 2
TechTuesdays Session 2TechTuesdays Session 2
TechTuesdays Session 2
 
Netflix Data Engineering @ Uber Engineering Meetup
Netflix Data Engineering @ Uber Engineering MeetupNetflix Data Engineering @ Uber Engineering Meetup
Netflix Data Engineering @ Uber Engineering Meetup
 
Real-Time Machine Learning at Industrial scale (University of Oxford, 9th Oct...
Real-Time Machine Learning at Industrial scale (University of Oxford, 9th Oct...Real-Time Machine Learning at Industrial scale (University of Oxford, 9th Oct...
Real-Time Machine Learning at Industrial scale (University of Oxford, 9th Oct...
 
Don't build a data science team
Don't build a data science teamDon't build a data science team
Don't build a data science team
 
Viewbix tracking journey
Viewbix tracking journeyViewbix tracking journey
Viewbix tracking journey
 
“Building consistent and highly available distributed systems with Apache Ign...
“Building consistent and highly available distributed systems with Apache Ign...“Building consistent and highly available distributed systems with Apache Ign...
“Building consistent and highly available distributed systems with Apache Ign...
 
Graph Gurus Episode 35: No Code Graph Analytics to Get Insights from Petabyte...
Graph Gurus Episode 35: No Code Graph Analytics to Get Insights from Petabyte...Graph Gurus Episode 35: No Code Graph Analytics to Get Insights from Petabyte...
Graph Gurus Episode 35: No Code Graph Analytics to Get Insights from Petabyte...
 

Ähnlich wie Funnel Analysis with Spark and Druid

DevTalks Reimagined 2020 - Funnel Analysis with Spark and Druid
DevTalks Reimagined 2020 - Funnel Analysis with Spark and DruidDevTalks Reimagined 2020 - Funnel Analysis with Spark and Druid
DevTalks Reimagined 2020 - Funnel Analysis with Spark and DruidItai Yaffe
 
Funnel Analysis with Apache Spark and Druid
Funnel Analysis with Apache Spark and DruidFunnel Analysis with Apache Spark and Druid
Funnel Analysis with Apache Spark and DruidDatabricks
 
Airflow Summit 2020 - Migrating airflow based spark jobs to kubernetes - the ...
Airflow Summit 2020 - Migrating airflow based spark jobs to kubernetes - the ...Airflow Summit 2020 - Migrating airflow based spark jobs to kubernetes - the ...
Airflow Summit 2020 - Migrating airflow based spark jobs to kubernetes - the ...Itai Yaffe
 
Nielsen: Casting the Spell - Druid in Practice
Nielsen: Casting the Spell - Druid in PracticeNielsen: Casting the Spell - Druid in Practice
Nielsen: Casting the Spell - Druid in PracticeImply
 
Ambitious Analytics: Google Analytics Customisation
Ambitious Analytics: Google Analytics CustomisationAmbitious Analytics: Google Analytics Customisation
Ambitious Analytics: Google Analytics CustomisationiLive Conference
 
Intro to Segment & Tracking for Live Streaming by Livestorm
 Intro to Segment & Tracking for Live Streaming by Livestorm Intro to Segment & Tracking for Live Streaming by Livestorm
Intro to Segment & Tracking for Live Streaming by LivestormLivestorm
 
EECS 441 Company Presentation (Lewisant-Twitch)
EECS 441 Company Presentation (Lewisant-Twitch)EECS 441 Company Presentation (Lewisant-Twitch)
EECS 441 Company Presentation (Lewisant-Twitch)Anthony Lewis
 
The next level of social integration
The next level of social integrationThe next level of social integration
The next level of social integrationMat Clayton
 
Google Grants for Nonprofits
Google Grants for NonprofitsGoogle Grants for Nonprofits
Google Grants for Nonprofitsprotocol 80
 

Ähnlich wie Funnel Analysis with Spark and Druid (9)

DevTalks Reimagined 2020 - Funnel Analysis with Spark and Druid
DevTalks Reimagined 2020 - Funnel Analysis with Spark and DruidDevTalks Reimagined 2020 - Funnel Analysis with Spark and Druid
DevTalks Reimagined 2020 - Funnel Analysis with Spark and Druid
 
Funnel Analysis with Apache Spark and Druid
Funnel Analysis with Apache Spark and DruidFunnel Analysis with Apache Spark and Druid
Funnel Analysis with Apache Spark and Druid
 
Airflow Summit 2020 - Migrating airflow based spark jobs to kubernetes - the ...
Airflow Summit 2020 - Migrating airflow based spark jobs to kubernetes - the ...Airflow Summit 2020 - Migrating airflow based spark jobs to kubernetes - the ...
Airflow Summit 2020 - Migrating airflow based spark jobs to kubernetes - the ...
 
Nielsen: Casting the Spell - Druid in Practice
Nielsen: Casting the Spell - Druid in PracticeNielsen: Casting the Spell - Druid in Practice
Nielsen: Casting the Spell - Druid in Practice
 
Ambitious Analytics: Google Analytics Customisation
Ambitious Analytics: Google Analytics CustomisationAmbitious Analytics: Google Analytics Customisation
Ambitious Analytics: Google Analytics Customisation
 
Intro to Segment & Tracking for Live Streaming by Livestorm
 Intro to Segment & Tracking for Live Streaming by Livestorm Intro to Segment & Tracking for Live Streaming by Livestorm
Intro to Segment & Tracking for Live Streaming by Livestorm
 
EECS 441 Company Presentation (Lewisant-Twitch)
EECS 441 Company Presentation (Lewisant-Twitch)EECS 441 Company Presentation (Lewisant-Twitch)
EECS 441 Company Presentation (Lewisant-Twitch)
 
The next level of social integration
The next level of social integrationThe next level of social integration
The next level of social integration
 
Google Grants for Nonprofits
Google Grants for NonprofitsGoogle Grants for Nonprofits
Google Grants for Nonprofits
 

Mehr von Itai Yaffe

Mastering Partitioning for High-Volume Data Processing
Mastering Partitioning for High-Volume Data ProcessingMastering Partitioning for High-Volume Data Processing
Mastering Partitioning for High-Volume Data ProcessingItai Yaffe
 
Solving Data Engineers Velocity - Wix's Data Warehouse Automation
Solving Data Engineers Velocity - Wix's Data Warehouse AutomationSolving Data Engineers Velocity - Wix's Data Warehouse Automation
Solving Data Engineers Velocity - Wix's Data Warehouse AutomationItai Yaffe
 
Lessons Learnt from Running Thousands of On-demand Spark Applications
Lessons Learnt from Running Thousands of On-demand Spark ApplicationsLessons Learnt from Running Thousands of On-demand Spark Applications
Lessons Learnt from Running Thousands of On-demand Spark ApplicationsItai Yaffe
 
Why do the majority of Data Science projects never make it to production?
Why do the majority of Data Science projects never make it to production?Why do the majority of Data Science projects never make it to production?
Why do the majority of Data Science projects never make it to production?Itai Yaffe
 
Planning a data solution - "By Failing to prepare, you are preparing to fail"
Planning a data solution - "By Failing to prepare, you are preparing to fail"Planning a data solution - "By Failing to prepare, you are preparing to fail"
Planning a data solution - "By Failing to prepare, you are preparing to fail"Itai Yaffe
 
Evaluating Big Data & ML Solutions - Opening Notes
Evaluating Big Data & ML Solutions - Opening NotesEvaluating Big Data & ML Solutions - Opening Notes
Evaluating Big Data & ML Solutions - Opening NotesItai Yaffe
 
Big data serving: Processing and inference at scale in real time
Big data serving: Processing and inference at scale in real timeBig data serving: Processing and inference at scale in real time
Big data serving: Processing and inference at scale in real timeItai Yaffe
 
Data Lakes on Public Cloud: Breaking Data Management Monoliths
Data Lakes on Public Cloud: Breaking Data Management MonolithsData Lakes on Public Cloud: Breaking Data Management Monoliths
Data Lakes on Public Cloud: Breaking Data Management MonolithsItai Yaffe
 
Unleashing the Power of your Data
Unleashing the Power of your DataUnleashing the Power of your Data
Unleashing the Power of your DataItai Yaffe
 
Data Lake on Public Cloud - Opening Notes
Data Lake on Public Cloud - Opening NotesData Lake on Public Cloud - Opening Notes
Data Lake on Public Cloud - Opening NotesItai Yaffe
 
Virtual Apache Druid Meetup: AIADA (Ask Itai and David Anything)
Virtual Apache Druid Meetup: AIADA (Ask Itai and David Anything)Virtual Apache Druid Meetup: AIADA (Ask Itai and David Anything)
Virtual Apache Druid Meetup: AIADA (Ask Itai and David Anything)Itai Yaffe
 
Introducing Kafka Connect and Implementing Custom Connectors
Introducing Kafka Connect and Implementing Custom ConnectorsIntroducing Kafka Connect and Implementing Custom Connectors
Introducing Kafka Connect and Implementing Custom ConnectorsItai Yaffe
 
A Day in the Life of a Druid Implementor and Druid's Roadmap
A Day in the Life of a Druid Implementor and Druid's RoadmapA Day in the Life of a Druid Implementor and Druid's Roadmap
A Day in the Life of a Druid Implementor and Druid's RoadmapItai Yaffe
 
Scalable Incremental Index for Druid
Scalable Incremental Index for DruidScalable Incremental Index for Druid
Scalable Incremental Index for DruidItai Yaffe
 
The benefits of running Spark on your own Docker
The benefits of running Spark on your own DockerThe benefits of running Spark on your own Docker
The benefits of running Spark on your own DockerItai Yaffe
 
Optimizing Spark-based data pipelines - are you up for it?
Optimizing Spark-based data pipelines - are you up for it?Optimizing Spark-based data pipelines - are you up for it?
Optimizing Spark-based data pipelines - are you up for it?Itai Yaffe
 
Scheduling big data workloads on serverless infrastructure
Scheduling big data workloads on serverless infrastructureScheduling big data workloads on serverless infrastructure
Scheduling big data workloads on serverless infrastructureItai Yaffe
 
GraphQL API on a Serverless Environment
GraphQL API on a Serverless EnvironmentGraphQL API on a Serverless Environment
GraphQL API on a Serverless EnvironmentItai Yaffe
 
Serverless data processing built for internet SCALE
Serverless data processing built for internet SCALEServerless data processing built for internet SCALE
Serverless data processing built for internet SCALEItai Yaffe
 
Ask me anything - Women in Big Data Israel
Ask me anything - Women in Big Data IsraelAsk me anything - Women in Big Data Israel
Ask me anything - Women in Big Data IsraelItai Yaffe
 

Mehr von Itai Yaffe (20)

Mastering Partitioning for High-Volume Data Processing
Mastering Partitioning for High-Volume Data ProcessingMastering Partitioning for High-Volume Data Processing
Mastering Partitioning for High-Volume Data Processing
 
Solving Data Engineers Velocity - Wix's Data Warehouse Automation
Solving Data Engineers Velocity - Wix's Data Warehouse AutomationSolving Data Engineers Velocity - Wix's Data Warehouse Automation
Solving Data Engineers Velocity - Wix's Data Warehouse Automation
 
Lessons Learnt from Running Thousands of On-demand Spark Applications
Lessons Learnt from Running Thousands of On-demand Spark ApplicationsLessons Learnt from Running Thousands of On-demand Spark Applications
Lessons Learnt from Running Thousands of On-demand Spark Applications
 
Why do the majority of Data Science projects never make it to production?
Why do the majority of Data Science projects never make it to production?Why do the majority of Data Science projects never make it to production?
Why do the majority of Data Science projects never make it to production?
 
Planning a data solution - "By Failing to prepare, you are preparing to fail"
Planning a data solution - "By Failing to prepare, you are preparing to fail"Planning a data solution - "By Failing to prepare, you are preparing to fail"
Planning a data solution - "By Failing to prepare, you are preparing to fail"
 
Evaluating Big Data & ML Solutions - Opening Notes
Evaluating Big Data & ML Solutions - Opening NotesEvaluating Big Data & ML Solutions - Opening Notes
Evaluating Big Data & ML Solutions - Opening Notes
 
Big data serving: Processing and inference at scale in real time
Big data serving: Processing and inference at scale in real timeBig data serving: Processing and inference at scale in real time
Big data serving: Processing and inference at scale in real time
 
Data Lakes on Public Cloud: Breaking Data Management Monoliths
Data Lakes on Public Cloud: Breaking Data Management MonolithsData Lakes on Public Cloud: Breaking Data Management Monoliths
Data Lakes on Public Cloud: Breaking Data Management Monoliths
 
Unleashing the Power of your Data
Unleashing the Power of your DataUnleashing the Power of your Data
Unleashing the Power of your Data
 
Data Lake on Public Cloud - Opening Notes
Data Lake on Public Cloud - Opening NotesData Lake on Public Cloud - Opening Notes
Data Lake on Public Cloud - Opening Notes
 
Virtual Apache Druid Meetup: AIADA (Ask Itai and David Anything)
Virtual Apache Druid Meetup: AIADA (Ask Itai and David Anything)Virtual Apache Druid Meetup: AIADA (Ask Itai and David Anything)
Virtual Apache Druid Meetup: AIADA (Ask Itai and David Anything)
 
Introducing Kafka Connect and Implementing Custom Connectors
Introducing Kafka Connect and Implementing Custom ConnectorsIntroducing Kafka Connect and Implementing Custom Connectors
Introducing Kafka Connect and Implementing Custom Connectors
 
A Day in the Life of a Druid Implementor and Druid's Roadmap
A Day in the Life of a Druid Implementor and Druid's RoadmapA Day in the Life of a Druid Implementor and Druid's Roadmap
A Day in the Life of a Druid Implementor and Druid's Roadmap
 
Scalable Incremental Index for Druid
Scalable Incremental Index for DruidScalable Incremental Index for Druid
Scalable Incremental Index for Druid
 
The benefits of running Spark on your own Docker
The benefits of running Spark on your own DockerThe benefits of running Spark on your own Docker
The benefits of running Spark on your own Docker
 
Optimizing Spark-based data pipelines - are you up for it?
Optimizing Spark-based data pipelines - are you up for it?Optimizing Spark-based data pipelines - are you up for it?
Optimizing Spark-based data pipelines - are you up for it?
 
Scheduling big data workloads on serverless infrastructure
Scheduling big data workloads on serverless infrastructureScheduling big data workloads on serverless infrastructure
Scheduling big data workloads on serverless infrastructure
 
GraphQL API on a Serverless Environment
GraphQL API on a Serverless EnvironmentGraphQL API on a Serverless Environment
GraphQL API on a Serverless Environment
 
Serverless data processing built for internet SCALE
Serverless data processing built for internet SCALEServerless data processing built for internet SCALE
Serverless data processing built for internet SCALE
 
Ask me anything - Women in Big Data Israel
Ask me anything - Women in Big Data IsraelAsk me anything - Women in Big Data Israel
Ask me anything - Women in Big Data Israel
 

Kürzlich hochgeladen

Aspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - AlmoraAspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - AlmoraGovindSinghDasila
 
SR-101-01012024-EN.docx Federal Constitution of the Swiss Confederation
SR-101-01012024-EN.docx  Federal Constitution  of the Swiss ConfederationSR-101-01012024-EN.docx  Federal Constitution  of the Swiss Confederation
SR-101-01012024-EN.docx Federal Constitution of the Swiss ConfederationEfruzAsilolu
 
The-boAt-Story-Navigating-the-Waves-of-Innovation.pptx
The-boAt-Story-Navigating-the-Waves-of-Innovation.pptxThe-boAt-Story-Navigating-the-Waves-of-Innovation.pptx
The-boAt-Story-Navigating-the-Waves-of-Innovation.pptxVivek487417
 
Digital Transformation Playbook by Graham Ware
Digital Transformation Playbook by Graham WareDigital Transformation Playbook by Graham Ware
Digital Transformation Playbook by Graham WareGraham Ware
 
怎样办理纽约州立大学宾汉姆顿分校毕业证(SUNY-Bin毕业证书)成绩单学校原版复制
怎样办理纽约州立大学宾汉姆顿分校毕业证(SUNY-Bin毕业证书)成绩单学校原版复制怎样办理纽约州立大学宾汉姆顿分校毕业证(SUNY-Bin毕业证书)成绩单学校原版复制
怎样办理纽约州立大学宾汉姆顿分校毕业证(SUNY-Bin毕业证书)成绩单学校原版复制vexqp
 
Reconciling Conflicting Data Curation Actions: Transparency Through Argument...
Reconciling Conflicting Data Curation Actions:  Transparency Through Argument...Reconciling Conflicting Data Curation Actions:  Transparency Through Argument...
Reconciling Conflicting Data Curation Actions: Transparency Through Argument...Bertram Ludäscher
 
Capstone in Interprofessional Informatic // IMPACT OF COVID 19 ON EDUCATION
Capstone in Interprofessional Informatic  // IMPACT OF COVID 19 ON EDUCATIONCapstone in Interprofessional Informatic  // IMPACT OF COVID 19 ON EDUCATION
Capstone in Interprofessional Informatic // IMPACT OF COVID 19 ON EDUCATIONLakpaYanziSherpa
 
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样wsppdmt
 
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...nirzagarg
 
Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...
Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...
Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...gajnagarg
 
Jual Cytotec Asli Obat Aborsi No. 1 Paling Manjur
Jual Cytotec Asli Obat Aborsi No. 1 Paling ManjurJual Cytotec Asli Obat Aborsi No. 1 Paling Manjur
Jual Cytotec Asli Obat Aborsi No. 1 Paling Manjurptikerjasaptiker
 
怎样办理旧金山城市学院毕业证(CCSF毕业证书)成绩单学校原版复制
怎样办理旧金山城市学院毕业证(CCSF毕业证书)成绩单学校原版复制怎样办理旧金山城市学院毕业证(CCSF毕业证书)成绩单学校原版复制
怎样办理旧金山城市学院毕业证(CCSF毕业证书)成绩单学校原版复制vexqp
 
7. Epi of Chronic respiratory diseases.ppt
7. Epi of Chronic respiratory diseases.ppt7. Epi of Chronic respiratory diseases.ppt
7. Epi of Chronic respiratory diseases.pptibrahimabdi22
 
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...nirzagarg
 
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteedamy56318795
 
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...gajnagarg
 
Ranking and Scoring Exercises for Research
Ranking and Scoring Exercises for ResearchRanking and Scoring Exercises for Research
Ranking and Scoring Exercises for ResearchRajesh Mondal
 

Kürzlich hochgeladen (20)

Aspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - AlmoraAspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - Almora
 
SR-101-01012024-EN.docx Federal Constitution of the Swiss Confederation
SR-101-01012024-EN.docx  Federal Constitution  of the Swiss ConfederationSR-101-01012024-EN.docx  Federal Constitution  of the Swiss Confederation
SR-101-01012024-EN.docx Federal Constitution of the Swiss Confederation
 
Cytotec in Jeddah+966572737505) get unwanted pregnancy kit Riyadh
Cytotec in Jeddah+966572737505) get unwanted pregnancy kit RiyadhCytotec in Jeddah+966572737505) get unwanted pregnancy kit Riyadh
Cytotec in Jeddah+966572737505) get unwanted pregnancy kit Riyadh
 
The-boAt-Story-Navigating-the-Waves-of-Innovation.pptx
The-boAt-Story-Navigating-the-Waves-of-Innovation.pptxThe-boAt-Story-Navigating-the-Waves-of-Innovation.pptx
The-boAt-Story-Navigating-the-Waves-of-Innovation.pptx
 
Digital Transformation Playbook by Graham Ware
Digital Transformation Playbook by Graham WareDigital Transformation Playbook by Graham Ware
Digital Transformation Playbook by Graham Ware
 
Sequential and reinforcement learning for demand side management by Margaux B...
Sequential and reinforcement learning for demand side management by Margaux B...Sequential and reinforcement learning for demand side management by Margaux B...
Sequential and reinforcement learning for demand side management by Margaux B...
 
怎样办理纽约州立大学宾汉姆顿分校毕业证(SUNY-Bin毕业证书)成绩单学校原版复制
怎样办理纽约州立大学宾汉姆顿分校毕业证(SUNY-Bin毕业证书)成绩单学校原版复制怎样办理纽约州立大学宾汉姆顿分校毕业证(SUNY-Bin毕业证书)成绩单学校原版复制
怎样办理纽约州立大学宾汉姆顿分校毕业证(SUNY-Bin毕业证书)成绩单学校原版复制
 
Reconciling Conflicting Data Curation Actions: Transparency Through Argument...
Reconciling Conflicting Data Curation Actions:  Transparency Through Argument...Reconciling Conflicting Data Curation Actions:  Transparency Through Argument...
Reconciling Conflicting Data Curation Actions: Transparency Through Argument...
 
Capstone in Interprofessional Informatic // IMPACT OF COVID 19 ON EDUCATION
Capstone in Interprofessional Informatic  // IMPACT OF COVID 19 ON EDUCATIONCapstone in Interprofessional Informatic  // IMPACT OF COVID 19 ON EDUCATION
Capstone in Interprofessional Informatic // IMPACT OF COVID 19 ON EDUCATION
 
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get CytotecAbortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
 
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样
 
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
 
Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...
Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...
Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...
 
Jual Cytotec Asli Obat Aborsi No. 1 Paling Manjur
Jual Cytotec Asli Obat Aborsi No. 1 Paling ManjurJual Cytotec Asli Obat Aborsi No. 1 Paling Manjur
Jual Cytotec Asli Obat Aborsi No. 1 Paling Manjur
 
怎样办理旧金山城市学院毕业证(CCSF毕业证书)成绩单学校原版复制
怎样办理旧金山城市学院毕业证(CCSF毕业证书)成绩单学校原版复制怎样办理旧金山城市学院毕业证(CCSF毕业证书)成绩单学校原版复制
怎样办理旧金山城市学院毕业证(CCSF毕业证书)成绩单学校原版复制
 
7. Epi of Chronic respiratory diseases.ppt
7. Epi of Chronic respiratory diseases.ppt7. Epi of Chronic respiratory diseases.ppt
7. Epi of Chronic respiratory diseases.ppt
 
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
 
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
 
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
 
Ranking and Scoring Exercises for Research
Ranking and Scoring Exercises for ResearchRanking and Scoring Exercises for Research
Ranking and Scoring Exercises for Research
 

Funnel Analysis with Spark and Druid