SlideShare ist ein Scribd-Unternehmen logo
1 von 19
Downloaden Sie, um offline zu lesen
How a media data platform drives
real-time insights & analytics using Spark
Bart Van Der Vurst
Partner at element61
The media landscape is in its biggest transformation
Ongoing market changes
▪ rise of the “Tech Giants”
▪ death of anonymous tracking and
third-party cookies
▪ GDPR / privacy
▪ digital transformation
▪ growing paid content, digital subscriptions
▪ search for personalisation
Data is flooding the media landscape
Enormous amount of data is tracked continuously
▪ Online behavioural or clickstream data
▪ Reading behaviour & attention time
▪ Interests & cross-brand tracking
▪ Digital subscription data
▪ Digital newspaper
▪ Weekly magazines
▪ Programmatic advertising & marketing
▪ Impressions
▪ Clicks
VALUE?
Media companies need a data & digital turnaround
Challenge to tackle
▪ From tackling small to huge data volumes
▪ Subscription data vs. clickstream data
▪ From reporting to data-driven (media) services
▪ From classical BI to a modern data platform
2 years ago,
Roularta & element61 embarked this challenge
Who is Roularta Media Group?
▪ Roularta Media Group is a Belgian multimedia
group, market leader in the field of magazines,
local media and business television
since 1954
▪ 1300+ employees, EUR 295+ mln revenue
Who is element61?
▪ Analytics & AI consultancy team (70 people)
& Databricks partner in Belgium
Roularta build a vision focused on its own data
Roularta build a data strategy with 4 pillars
A data analytics platform serving as foundation
of reporting, real-time personalization & insight services
The available BI stack couldn’t deliver on requirements
BI Stack vs. Modern Requirements
▪ One Analytical Data Hub
▪ Manage big volumes of data – 2,5 TB/month
▪ High performance platform – 20 mio trans/day
▪ True real-time dependencies
(content recommendation, advertising audiences, etc.)
▪ Process structured & unstructured data
▪ GDPR compliant
▪ Advanced scoring, modeling and AI/ML capabilities
▪ Data democratization/self-service
▪ Dashboards in different departments
We’ve built this modern data analytics platform
with Azure & Databricks
Approach taken
▪ Leverage latest
best-practices
(e.g. Delta Lake)
▪ First use-case live in
<6 months
▪ Built iteratively
& use-case driven
▪ First focused on
reporting, now added
AI services
We use Delta Lake end-to-end
What it means
▪ Generally available 2019Q2
Introduced at Roularta 2019Q3
▪ Used across all data lake layers
and for both real-time & batch
data processing
▪ Cornerstone for the GDPR solution
& compliance put into place
Tangible results quickly: We started with editorial
dashboards serving real-time insights
Newsletter dashboards for optimizing marketing
We use predictive article quality scoring
for editorial tuning & paywall decisions
Traffic score Engagement score Conversion score
(Predicted) article quality score
Data of every article
Want to know more?
(Re-)watch the Data & AI session “Building an ML Tool to predict Article Quality Scores using Delta & MLFlow”
Quality scores as self-service insight cockpits
The best is yet to come !
• Enhanced content classification algorithms
for traffic, engagement & conversion
• Dynamic content-tagging for advertising
• Consumer segmentation & profiling
• Publisher dashboards
• Marketeer dashboards
• Further use of (Artificial) Intelligence for Dynamic Paywall
• Newsfeed curation based on behaviour & content potential
Roularta has a media data platform capable to scale
… and best if yet to come
• Enhanced content classification algorithms
for traffic, engagement & conversion
• Publisher dashboards
• Dynamic content-tagging for advertising
• Consumer segmentation
• Marketeer dashboards
• Further AI for Dynamic Paywall
• Newsfeed curation based on behaviour &
content potential
Questions?
Feedback
Your feedback is important to us.
Don’t forget to rate
and review the sessions.

Weitere ähnliche Inhalte

Was ist angesagt?

Denodo DataFest 2017: Outpace Your Competition with Real-Time Responses
Denodo DataFest 2017: Outpace Your Competition with Real-Time ResponsesDenodo DataFest 2017: Outpace Your Competition with Real-Time Responses
Denodo DataFest 2017: Outpace Your Competition with Real-Time ResponsesDenodo
 
How to Build Fast Data Applications: Evaluating the Top Contenders
How to Build Fast Data Applications: Evaluating the Top ContendersHow to Build Fast Data Applications: Evaluating the Top Contenders
How to Build Fast Data Applications: Evaluating the Top ContendersVoltDB
 
Sudhir Rawat, Sr Techonology Evangelist at Microsoft SQL Business Intelligenc...
Sudhir Rawat, Sr Techonology Evangelist at Microsoft SQL Business Intelligenc...Sudhir Rawat, Sr Techonology Evangelist at Microsoft SQL Business Intelligenc...
Sudhir Rawat, Sr Techonology Evangelist at Microsoft SQL Business Intelligenc...Dataconomy Media
 
Denodo DataFest 2017: Edge Computing: Collecting vs. Connecting to Streaming ...
Denodo DataFest 2017: Edge Computing: Collecting vs. Connecting to Streaming ...Denodo DataFest 2017: Edge Computing: Collecting vs. Connecting to Streaming ...
Denodo DataFest 2017: Edge Computing: Collecting vs. Connecting to Streaming ...Denodo
 
Real-time Microservices and In-Memory Data Grids
Real-time Microservices and In-Memory Data GridsReal-time Microservices and In-Memory Data Grids
Real-time Microservices and In-Memory Data GridsAli Hodroj
 
Data in Motion vs Data at Rest
Data in Motion vs Data at RestData in Motion vs Data at Rest
Data in Motion vs Data at RestInternap
 
Accelerate and modernize your data pipelines
Accelerate and modernize your data pipelinesAccelerate and modernize your data pipelines
Accelerate and modernize your data pipelinesPaul Van Siclen
 
apidays LIVE Singapore - Democratising data access with APIs by Tarush Aggarw...
apidays LIVE Singapore - Democratising data access with APIs by Tarush Aggarw...apidays LIVE Singapore - Democratising data access with APIs by Tarush Aggarw...
apidays LIVE Singapore - Democratising data access with APIs by Tarush Aggarw...apidays
 
Denodo DataFest 2017: Integrating Big Data and Streaming Data with Enterprise...
Denodo DataFest 2017: Integrating Big Data and Streaming Data with Enterprise...Denodo DataFest 2017: Integrating Big Data and Streaming Data with Enterprise...
Denodo DataFest 2017: Integrating Big Data and Streaming Data with Enterprise...Denodo
 
From ingest to insights with AWS
From ingest to insights with AWSFrom ingest to insights with AWS
From ingest to insights with AWSPaul Van Siclen
 
Hybrid Transactional/Analytics Processing: Beyond the Big Database Hype
Hybrid Transactional/Analytics Processing: Beyond the Big Database HypeHybrid Transactional/Analytics Processing: Beyond the Big Database Hype
Hybrid Transactional/Analytics Processing: Beyond the Big Database HypeAli Hodroj
 
Why Finance Should Consider Agile Modern Data Delivery Platform
Why Finance Should Consider Agile Modern Data Delivery PlatformWhy Finance Should Consider Agile Modern Data Delivery Platform
Why Finance Should Consider Agile Modern Data Delivery Platformsyed_javed
 
Building a Modern FinTech Big Data Infrastructure
Building a Modern FinTech Big Data InfrastructureBuilding a Modern FinTech Big Data Infrastructure
Building a Modern FinTech Big Data InfrastructureDatabricks
 
Altis Webinar: Use Cases For The Modern Data Platform
Altis Webinar: Use Cases For The Modern Data PlatformAltis Webinar: Use Cases For The Modern Data Platform
Altis Webinar: Use Cases For The Modern Data PlatformAltis Consulting
 
Advanced analytics integration with python
Advanced analytics integration with pythonAdvanced analytics integration with python
Advanced analytics integration with pythonPaul Van Siclen
 
Using Kafka in Your Organization with Real-Time User Insights for a Customer ...
Using Kafka in Your Organization with Real-Time User Insights for a Customer ...Using Kafka in Your Organization with Real-Time User Insights for a Customer ...
Using Kafka in Your Organization with Real-Time User Insights for a Customer ...confluent
 
Driving Business Transformation with Real-Time Analytics Using Apache Kafka a...
Driving Business Transformation with Real-Time Analytics Using Apache Kafka a...Driving Business Transformation with Real-Time Analytics Using Apache Kafka a...
Driving Business Transformation with Real-Time Analytics Using Apache Kafka a...confluent
 
In-Memory Computing Webcast. Market Predictions 2017
In-Memory Computing Webcast. Market Predictions 2017In-Memory Computing Webcast. Market Predictions 2017
In-Memory Computing Webcast. Market Predictions 2017SingleStore
 
Getting Started with Big Data Analytics
Getting Started with Big Data AnalyticsGetting Started with Big Data Analytics
Getting Started with Big Data AnalyticsRob Winters
 
HP Discover: Real Time Insights from Big Data
HP Discover: Real Time Insights from Big DataHP Discover: Real Time Insights from Big Data
HP Discover: Real Time Insights from Big DataRob Winters
 

Was ist angesagt? (20)

Denodo DataFest 2017: Outpace Your Competition with Real-Time Responses
Denodo DataFest 2017: Outpace Your Competition with Real-Time ResponsesDenodo DataFest 2017: Outpace Your Competition with Real-Time Responses
Denodo DataFest 2017: Outpace Your Competition with Real-Time Responses
 
How to Build Fast Data Applications: Evaluating the Top Contenders
How to Build Fast Data Applications: Evaluating the Top ContendersHow to Build Fast Data Applications: Evaluating the Top Contenders
How to Build Fast Data Applications: Evaluating the Top Contenders
 
Sudhir Rawat, Sr Techonology Evangelist at Microsoft SQL Business Intelligenc...
Sudhir Rawat, Sr Techonology Evangelist at Microsoft SQL Business Intelligenc...Sudhir Rawat, Sr Techonology Evangelist at Microsoft SQL Business Intelligenc...
Sudhir Rawat, Sr Techonology Evangelist at Microsoft SQL Business Intelligenc...
 
Denodo DataFest 2017: Edge Computing: Collecting vs. Connecting to Streaming ...
Denodo DataFest 2017: Edge Computing: Collecting vs. Connecting to Streaming ...Denodo DataFest 2017: Edge Computing: Collecting vs. Connecting to Streaming ...
Denodo DataFest 2017: Edge Computing: Collecting vs. Connecting to Streaming ...
 
Real-time Microservices and In-Memory Data Grids
Real-time Microservices and In-Memory Data GridsReal-time Microservices and In-Memory Data Grids
Real-time Microservices and In-Memory Data Grids
 
Data in Motion vs Data at Rest
Data in Motion vs Data at RestData in Motion vs Data at Rest
Data in Motion vs Data at Rest
 
Accelerate and modernize your data pipelines
Accelerate and modernize your data pipelinesAccelerate and modernize your data pipelines
Accelerate and modernize your data pipelines
 
apidays LIVE Singapore - Democratising data access with APIs by Tarush Aggarw...
apidays LIVE Singapore - Democratising data access with APIs by Tarush Aggarw...apidays LIVE Singapore - Democratising data access with APIs by Tarush Aggarw...
apidays LIVE Singapore - Democratising data access with APIs by Tarush Aggarw...
 
Denodo DataFest 2017: Integrating Big Data and Streaming Data with Enterprise...
Denodo DataFest 2017: Integrating Big Data and Streaming Data with Enterprise...Denodo DataFest 2017: Integrating Big Data and Streaming Data with Enterprise...
Denodo DataFest 2017: Integrating Big Data and Streaming Data with Enterprise...
 
From ingest to insights with AWS
From ingest to insights with AWSFrom ingest to insights with AWS
From ingest to insights with AWS
 
Hybrid Transactional/Analytics Processing: Beyond the Big Database Hype
Hybrid Transactional/Analytics Processing: Beyond the Big Database HypeHybrid Transactional/Analytics Processing: Beyond the Big Database Hype
Hybrid Transactional/Analytics Processing: Beyond the Big Database Hype
 
Why Finance Should Consider Agile Modern Data Delivery Platform
Why Finance Should Consider Agile Modern Data Delivery PlatformWhy Finance Should Consider Agile Modern Data Delivery Platform
Why Finance Should Consider Agile Modern Data Delivery Platform
 
Building a Modern FinTech Big Data Infrastructure
Building a Modern FinTech Big Data InfrastructureBuilding a Modern FinTech Big Data Infrastructure
Building a Modern FinTech Big Data Infrastructure
 
Altis Webinar: Use Cases For The Modern Data Platform
Altis Webinar: Use Cases For The Modern Data PlatformAltis Webinar: Use Cases For The Modern Data Platform
Altis Webinar: Use Cases For The Modern Data Platform
 
Advanced analytics integration with python
Advanced analytics integration with pythonAdvanced analytics integration with python
Advanced analytics integration with python
 
Using Kafka in Your Organization with Real-Time User Insights for a Customer ...
Using Kafka in Your Organization with Real-Time User Insights for a Customer ...Using Kafka in Your Organization with Real-Time User Insights for a Customer ...
Using Kafka in Your Organization with Real-Time User Insights for a Customer ...
 
Driving Business Transformation with Real-Time Analytics Using Apache Kafka a...
Driving Business Transformation with Real-Time Analytics Using Apache Kafka a...Driving Business Transformation with Real-Time Analytics Using Apache Kafka a...
Driving Business Transformation with Real-Time Analytics Using Apache Kafka a...
 
In-Memory Computing Webcast. Market Predictions 2017
In-Memory Computing Webcast. Market Predictions 2017In-Memory Computing Webcast. Market Predictions 2017
In-Memory Computing Webcast. Market Predictions 2017
 
Getting Started with Big Data Analytics
Getting Started with Big Data AnalyticsGetting Started with Big Data Analytics
Getting Started with Big Data Analytics
 
HP Discover: Real Time Insights from Big Data
HP Discover: Real Time Insights from Big DataHP Discover: Real Time Insights from Big Data
HP Discover: Real Time Insights from Big Data
 

Ähnlich wie How a Media Data Platform Drives Real-time Insights & Analytics using Apache Spark

Data Mining Services in various types
Data Mining Services in various typesData Mining Services in various types
Data Mining Services in various typesloginworks software
 
2023 Trends in Enterprise Analytics
2023 Trends in Enterprise Analytics2023 Trends in Enterprise Analytics
2023 Trends in Enterprise AnalyticsDATAVERSITY
 
Concurrency - Modern BI
Concurrency - Modern BIConcurrency - Modern BI
Concurrency - Modern BIJake Borzym
 
Business Intelligence Meets Big Data Variety
Business Intelligence Meets Big Data VarietyBusiness Intelligence Meets Big Data Variety
Business Intelligence Meets Big Data Varietywww.panorama.com
 
Building a Business Case for Innovation: Project Considerations for Cloud, Mo...
Building a Business Case for Innovation: Project Considerations for Cloud, Mo...Building a Business Case for Innovation: Project Considerations for Cloud, Mo...
Building a Business Case for Innovation: Project Considerations for Cloud, Mo...Fred Isbell
 
Five Trends in Real Time Applications
Five Trends in Real Time ApplicationsFive Trends in Real Time Applications
Five Trends in Real Time Applicationsconfluent
 
Data Mining Services in various types
Data Mining Services in various typesData Mining Services in various types
Data Mining Services in various typesloginworks software
 
Analytics trends report 2017
Analytics trends report 2017Analytics trends report 2017
Analytics trends report 2017Robert Sibo
 
Data Visualization Trends - Next Steps for Tableau
Data Visualization Trends - Next Steps for TableauData Visualization Trends - Next Steps for Tableau
Data Visualization Trends - Next Steps for TableauArunima Gupta
 
Strategyzing big data in telco industry
Strategyzing big data in telco industryStrategyzing big data in telco industry
Strategyzing big data in telco industryParviz Iskhakov
 
Company Profile - NPC with TIBCO Spotfire solution
Company Profile - NPC with TIBCO Spotfire solution  Company Profile - NPC with TIBCO Spotfire solution
Company Profile - NPC with TIBCO Spotfire solution Sirinporn Setworaya
 
Cloud computing: Stan Freck
Cloud computing: Stan FreckCloud computing: Stan Freck
Cloud computing: Stan FreckLisa Malone
 
Gartner IT Infrastructure & Operations Management Summit 2014 - Trip Report
Gartner IT Infrastructure & Operations Management Summit 2014 - Trip ReportGartner IT Infrastructure & Operations Management Summit 2014 - Trip Report
Gartner IT Infrastructure & Operations Management Summit 2014 - Trip ReportPaul Woudstra
 
Data Integration Trends Businesses Should Watch for in 2021
Data Integration Trends Businesses Should Watch for in 2021Data Integration Trends Businesses Should Watch for in 2021
Data Integration Trends Businesses Should Watch for in 2021Safe Software
 
Hybrid Cloud Streaming and Modernising Payments at Lloyds Banking Group
Hybrid Cloud Streaming and Modernising Payments at Lloyds Banking GroupHybrid Cloud Streaming and Modernising Payments at Lloyds Banking Group
Hybrid Cloud Streaming and Modernising Payments at Lloyds Banking GroupHostedbyConfluent
 
Di in the age of digital disruptions v1.0
Di in the age of digital disruptions v1.0Di in the age of digital disruptions v1.0
Di in the age of digital disruptions v1.0Amar Roy
 
Transformacion del Negocio Financiero por medio de Tecnologias Cloud
Transformacion del Negocio Financiero por medio de Tecnologias CloudTransformacion del Negocio Financiero por medio de Tecnologias Cloud
Transformacion del Negocio Financiero por medio de Tecnologias CloudRaul Goycoolea Seoane
 
Big Data LDN 2018: DATA MANAGEMENT AUTOMATION AND THE INFORMATION SUPPLY CHAI...
Big Data LDN 2018: DATA MANAGEMENT AUTOMATION AND THE INFORMATION SUPPLY CHAI...Big Data LDN 2018: DATA MANAGEMENT AUTOMATION AND THE INFORMATION SUPPLY CHAI...
Big Data LDN 2018: DATA MANAGEMENT AUTOMATION AND THE INFORMATION SUPPLY CHAI...Matt Stubbs
 

Ähnlich wie How a Media Data Platform Drives Real-time Insights & Analytics using Apache Spark (20)

Data Mining Services in various types
Data Mining Services in various typesData Mining Services in various types
Data Mining Services in various types
 
Taming Big Data With Modern Software Architecture
Taming Big Data  With Modern Software ArchitectureTaming Big Data  With Modern Software Architecture
Taming Big Data With Modern Software Architecture
 
2023 Trends in Enterprise Analytics
2023 Trends in Enterprise Analytics2023 Trends in Enterprise Analytics
2023 Trends in Enterprise Analytics
 
Concurrency - Modern BI
Concurrency - Modern BIConcurrency - Modern BI
Concurrency - Modern BI
 
Business Intelligence Meets Big Data Variety
Business Intelligence Meets Big Data VarietyBusiness Intelligence Meets Big Data Variety
Business Intelligence Meets Big Data Variety
 
Building a Business Case for Innovation: Project Considerations for Cloud, Mo...
Building a Business Case for Innovation: Project Considerations for Cloud, Mo...Building a Business Case for Innovation: Project Considerations for Cloud, Mo...
Building a Business Case for Innovation: Project Considerations for Cloud, Mo...
 
Five Trends in Real Time Applications
Five Trends in Real Time ApplicationsFive Trends in Real Time Applications
Five Trends in Real Time Applications
 
Data Mining Services in various types
Data Mining Services in various typesData Mining Services in various types
Data Mining Services in various types
 
Analytics trends report 2017
Analytics trends report 2017Analytics trends report 2017
Analytics trends report 2017
 
Data Visualization Trends - Next Steps for Tableau
Data Visualization Trends - Next Steps for TableauData Visualization Trends - Next Steps for Tableau
Data Visualization Trends - Next Steps for Tableau
 
Strategyzing big data in telco industry
Strategyzing big data in telco industryStrategyzing big data in telco industry
Strategyzing big data in telco industry
 
Company Profile - NPC with TIBCO Spotfire solution
Company Profile - NPC with TIBCO Spotfire solution  Company Profile - NPC with TIBCO Spotfire solution
Company Profile - NPC with TIBCO Spotfire solution
 
Cloud computing: Stan Freck
Cloud computing: Stan FreckCloud computing: Stan Freck
Cloud computing: Stan Freck
 
Gartner IT Infrastructure & Operations Management Summit 2014 - Trip Report
Gartner IT Infrastructure & Operations Management Summit 2014 - Trip ReportGartner IT Infrastructure & Operations Management Summit 2014 - Trip Report
Gartner IT Infrastructure & Operations Management Summit 2014 - Trip Report
 
Data Integration Trends Businesses Should Watch for in 2021
Data Integration Trends Businesses Should Watch for in 2021Data Integration Trends Businesses Should Watch for in 2021
Data Integration Trends Businesses Should Watch for in 2021
 
Hybrid Cloud Streaming and Modernising Payments at Lloyds Banking Group
Hybrid Cloud Streaming and Modernising Payments at Lloyds Banking GroupHybrid Cloud Streaming and Modernising Payments at Lloyds Banking Group
Hybrid Cloud Streaming and Modernising Payments at Lloyds Banking Group
 
Di in the age of digital disruptions v1.0
Di in the age of digital disruptions v1.0Di in the age of digital disruptions v1.0
Di in the age of digital disruptions v1.0
 
Transformacion del Negocio Financiero por medio de Tecnologias Cloud
Transformacion del Negocio Financiero por medio de Tecnologias CloudTransformacion del Negocio Financiero por medio de Tecnologias Cloud
Transformacion del Negocio Financiero por medio de Tecnologias Cloud
 
Big Data LDN 2018: DATA MANAGEMENT AUTOMATION AND THE INFORMATION SUPPLY CHAI...
Big Data LDN 2018: DATA MANAGEMENT AUTOMATION AND THE INFORMATION SUPPLY CHAI...Big Data LDN 2018: DATA MANAGEMENT AUTOMATION AND THE INFORMATION SUPPLY CHAI...
Big Data LDN 2018: DATA MANAGEMENT AUTOMATION AND THE INFORMATION SUPPLY CHAI...
 
The Cloud for SMEs
The Cloud for SMEsThe Cloud for SMEs
The Cloud for SMEs
 

Mehr von Databricks

DW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDatabricks
 
Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Databricks
 
Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Databricks
 
Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2Databricks
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Databricks
 
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of HadoopDatabricks
 
Democratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDemocratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDatabricks
 
Learn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceLearn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceDatabricks
 
Why APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringWhy APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringDatabricks
 
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch FixThe Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch FixDatabricks
 
Stage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationStage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationDatabricks
 
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchSimplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchDatabricks
 
Scaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesScaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesDatabricks
 
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark PipelinesScaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark PipelinesDatabricks
 
Sawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature AggregationsSawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature AggregationsDatabricks
 
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkRedis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkDatabricks
 
Re-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and SparkRe-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and SparkDatabricks
 
Raven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesRaven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesDatabricks
 
Processing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkProcessing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkDatabricks
 
Massive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeMassive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeDatabricks
 

Mehr von Databricks (20)

DW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptx
 
Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1
 
Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2
 
Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4
 
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
 
Democratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDemocratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized Platform
 
Learn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceLearn to Use Databricks for Data Science
Learn to Use Databricks for Data Science
 
Why APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringWhy APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML Monitoring
 
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch FixThe Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
 
Stage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationStage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI Integration
 
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchSimplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorch
 
Scaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesScaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on Kubernetes
 
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark PipelinesScaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
 
Sawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature AggregationsSawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature Aggregations
 
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkRedis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
 
Re-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and SparkRe-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and Spark
 
Raven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesRaven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction Queries
 
Processing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkProcessing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache Spark
 
Massive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeMassive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta Lake
 

Kürzlich hochgeladen

科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理e4aez8ss
 
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样vhwb25kk
 
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...Boston Institute of Analytics
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFAAndrei Kaleshka
 
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default  Presentation : Data Analysis Project PPTPredictive Analysis for Loan Default  Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPTBoston Institute of Analytics
 
Identifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population MeanIdentifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population MeanMYRABACSAFRA2
 
Business Analytics using Microsoft Excel
Business Analytics using Microsoft ExcelBusiness Analytics using Microsoft Excel
Business Analytics using Microsoft Excelysmaelreyes
 
Thiophen Mechanism khhjjjjjjjhhhhhhhhhhh
Thiophen Mechanism khhjjjjjjjhhhhhhhhhhhThiophen Mechanism khhjjjjjjjhhhhhhhhhhh
Thiophen Mechanism khhjjjjjjjhhhhhhhhhhhYasamin16
 
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Thomas Poetter
 
LLMs, LMMs, their Improvement Suggestions and the Path towards AGI
LLMs, LMMs, their Improvement Suggestions and the Path towards AGILLMs, LMMs, their Improvement Suggestions and the Path towards AGI
LLMs, LMMs, their Improvement Suggestions and the Path towards AGIThomas Poetter
 
Real-Time AI Streaming - AI Max Princeton
Real-Time AI  Streaming - AI Max PrincetonReal-Time AI  Streaming - AI Max Princeton
Real-Time AI Streaming - AI Max PrincetonTimothy Spann
 
Defining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryDefining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryJeremy Anderson
 
Heart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectHeart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectBoston Institute of Analytics
 
Advanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsAdvanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsVICTOR MAESTRE RAMIREZ
 
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort servicejennyeacort
 
Multiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdfMultiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdfchwongval
 
Conf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
Conf42-LLM_Adding Generative AI to Real-Time Streaming PipelinesConf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
Conf42-LLM_Adding Generative AI to Real-Time Streaming PipelinesTimothy Spann
 
办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一
办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一
办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一F La
 
办理学位证加利福尼亚大学洛杉矶分校毕业证,UCLA成绩单原版一比一
办理学位证加利福尼亚大学洛杉矶分校毕业证,UCLA成绩单原版一比一办理学位证加利福尼亚大学洛杉矶分校毕业证,UCLA成绩单原版一比一
办理学位证加利福尼亚大学洛杉矶分校毕业证,UCLA成绩单原版一比一F sss
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDRafezzaman
 

Kürzlich hochgeladen (20)

科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
 
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
 
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFA
 
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default  Presentation : Data Analysis Project PPTPredictive Analysis for Loan Default  Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
 
Identifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population MeanIdentifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population Mean
 
Business Analytics using Microsoft Excel
Business Analytics using Microsoft ExcelBusiness Analytics using Microsoft Excel
Business Analytics using Microsoft Excel
 
Thiophen Mechanism khhjjjjjjjhhhhhhhhhhh
Thiophen Mechanism khhjjjjjjjhhhhhhhhhhhThiophen Mechanism khhjjjjjjjhhhhhhhhhhh
Thiophen Mechanism khhjjjjjjjhhhhhhhhhhh
 
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
 
LLMs, LMMs, their Improvement Suggestions and the Path towards AGI
LLMs, LMMs, their Improvement Suggestions and the Path towards AGILLMs, LMMs, their Improvement Suggestions and the Path towards AGI
LLMs, LMMs, their Improvement Suggestions and the Path towards AGI
 
Real-Time AI Streaming - AI Max Princeton
Real-Time AI  Streaming - AI Max PrincetonReal-Time AI  Streaming - AI Max Princeton
Real-Time AI Streaming - AI Max Princeton
 
Defining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryDefining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data Story
 
Heart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectHeart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis Project
 
Advanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsAdvanced Machine Learning for Business Professionals
Advanced Machine Learning for Business Professionals
 
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
 
Multiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdfMultiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdf
 
Conf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
Conf42-LLM_Adding Generative AI to Real-Time Streaming PipelinesConf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
Conf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
 
办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一
办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一
办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一
 
办理学位证加利福尼亚大学洛杉矶分校毕业证,UCLA成绩单原版一比一
办理学位证加利福尼亚大学洛杉矶分校毕业证,UCLA成绩单原版一比一办理学位证加利福尼亚大学洛杉矶分校毕业证,UCLA成绩单原版一比一
办理学位证加利福尼亚大学洛杉矶分校毕业证,UCLA成绩单原版一比一
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
 

How a Media Data Platform Drives Real-time Insights & Analytics using Apache Spark

  • 1. How a media data platform drives real-time insights & analytics using Spark Bart Van Der Vurst Partner at element61
  • 2. The media landscape is in its biggest transformation Ongoing market changes ▪ rise of the “Tech Giants” ▪ death of anonymous tracking and third-party cookies ▪ GDPR / privacy ▪ digital transformation ▪ growing paid content, digital subscriptions ▪ search for personalisation
  • 3. Data is flooding the media landscape Enormous amount of data is tracked continuously ▪ Online behavioural or clickstream data ▪ Reading behaviour & attention time ▪ Interests & cross-brand tracking ▪ Digital subscription data ▪ Digital newspaper ▪ Weekly magazines ▪ Programmatic advertising & marketing ▪ Impressions ▪ Clicks VALUE?
  • 4. Media companies need a data & digital turnaround Challenge to tackle ▪ From tackling small to huge data volumes ▪ Subscription data vs. clickstream data ▪ From reporting to data-driven (media) services ▪ From classical BI to a modern data platform
  • 5. 2 years ago, Roularta & element61 embarked this challenge Who is Roularta Media Group? ▪ Roularta Media Group is a Belgian multimedia group, market leader in the field of magazines, local media and business television since 1954 ▪ 1300+ employees, EUR 295+ mln revenue Who is element61? ▪ Analytics & AI consultancy team (70 people) & Databricks partner in Belgium
  • 6. Roularta build a vision focused on its own data
  • 7. Roularta build a data strategy with 4 pillars
  • 8. A data analytics platform serving as foundation of reporting, real-time personalization & insight services
  • 9. The available BI stack couldn’t deliver on requirements BI Stack vs. Modern Requirements ▪ One Analytical Data Hub ▪ Manage big volumes of data – 2,5 TB/month ▪ High performance platform – 20 mio trans/day ▪ True real-time dependencies (content recommendation, advertising audiences, etc.) ▪ Process structured & unstructured data ▪ GDPR compliant ▪ Advanced scoring, modeling and AI/ML capabilities ▪ Data democratization/self-service ▪ Dashboards in different departments
  • 10. We’ve built this modern data analytics platform with Azure & Databricks Approach taken ▪ Leverage latest best-practices (e.g. Delta Lake) ▪ First use-case live in <6 months ▪ Built iteratively & use-case driven ▪ First focused on reporting, now added AI services
  • 11. We use Delta Lake end-to-end What it means ▪ Generally available 2019Q2 Introduced at Roularta 2019Q3 ▪ Used across all data lake layers and for both real-time & batch data processing ▪ Cornerstone for the GDPR solution & compliance put into place
  • 12. Tangible results quickly: We started with editorial dashboards serving real-time insights
  • 13. Newsletter dashboards for optimizing marketing
  • 14. We use predictive article quality scoring for editorial tuning & paywall decisions Traffic score Engagement score Conversion score (Predicted) article quality score Data of every article Want to know more? (Re-)watch the Data & AI session “Building an ML Tool to predict Article Quality Scores using Delta & MLFlow”
  • 15. Quality scores as self-service insight cockpits
  • 16. The best is yet to come ! • Enhanced content classification algorithms for traffic, engagement & conversion • Dynamic content-tagging for advertising • Consumer segmentation & profiling • Publisher dashboards • Marketeer dashboards • Further use of (Artificial) Intelligence for Dynamic Paywall • Newsfeed curation based on behaviour & content potential
  • 17. Roularta has a media data platform capable to scale … and best if yet to come • Enhanced content classification algorithms for traffic, engagement & conversion • Publisher dashboards • Dynamic content-tagging for advertising • Consumer segmentation • Marketeer dashboards • Further AI for Dynamic Paywall • Newsfeed curation based on behaviour & content potential
  • 19. Feedback Your feedback is important to us. Don’t forget to rate and review the sessions.