Apache Spark streaming and HBase
Carol McDonald
Overview of Apache Spark Streaming with HBase
Apache Spark streaming and HBase
1.
© 2015 MapR Technologies
Overview of Apache Spark Streaming
Carol McDonald
2.
Agenda
• Why Apache Spark Streaming?
• What is Apache Spark Streaming?
  – Key concepts and architecture
• How it works by example
3.
Why Spark Streaming?
• Process time series data:
  – Results in near-real-time
• Use cases
  – Social network trends
  – Website statistics, monitoring
  – Fraud detection
  – Advertising click monetization
Time-stamped data
• Sensor, system metrics, events, log files
• Stock ticker, user activity
• High-volume, high-velocity data
Data for real-time monitoring
4.
What is time series data?
• Stuff with timestamps
  – Sensor data
  – Log files
  – Phones
Examples shown: credit card transactions, web user behaviour, social media, log files, geodata, sensors
5.
Why Spark Streaming?
What if?
• You want to analyze data as it arrives?
For example, time series data: sensors, clicks, logs, stats
6.
Batch Processing
It's 6:01 and 72 degrees
It's 6:02 and 75 degrees
It's 6:03 and 77 degrees
It's 6:04 and 85 degrees
It's 6:05 and 90 degrees
It's 6:06 and 85 degrees
It's 6:07 and 77 degrees
It's 6:08 and 75 degrees
It was hot at 6:05 yesterday!
Batch processing may be too late for some events
7.
Event Processing
It's 6:05 and 90 degrees
Someone should open a window!
Streaming
It's becoming important to process events as they arrive
8.
What is Spark Streaming?
• Extension of the core Spark API
• Enables scalable, high-throughput, fault-tolerant stream processing of live data
Diagram labels: Data Sources, Data Sinks
9.
Stream Processing Architecture
Diagram labels: Streaming Sources/Apps; Data Ingest (MapR-FS, Topics); Stream Processing; Data Storage (MapR-DB, MapR-FS); Apps
10.
Key Concepts
• Data Sources:
  – File based: HDFS
  – Network based: TCP sockets, Twitter, Kafka, Flume, ZeroMQ, Akka Actor
• Transformations
• Output Operations
Diagram labels: MapR-FS, Topics
11.
Spark Streaming Architecture
• Divide data stream into batches of X seconds
  – Called DStream = sequence of RDDs
Diagram: the input data stream enters Spark Streaming and leaves as a DStream of RDD batches, one per batch interval: data from time 0 to 1 becomes RDD @ time 1, data from time 1 to 2 becomes RDD @ time 2, data from time 2 to 3 becomes RDD @ time 3.
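The batching idea on this slide can be sketched without Spark. The following is only an illustration with plain Scala collections, not how Spark Streaming is implemented; `Event`, `toBatches`, and the millisecond timestamps are hypothetical names and values chosen for the sketch.

```scala
// Plain-Scala illustration of micro-batching; no Spark dependency.
// Event and toBatches are hypothetical names used only for this sketch.
object MicroBatchSketch {
  final case class Event(timeMs: Long, payload: String)

  // Assign each event to the batch whose interval contains its timestamp,
  // then return the batches ordered by batch index.
  def toBatches(events: Seq[Event], batchIntervalMs: Long): Seq[(Long, Seq[String])] =
    events
      .groupBy(e => e.timeMs / batchIntervalMs)
      .toSeq
      .sortBy(_._1)
      .map { case (idx, es) => (idx, es.sortBy(_.timeMs).map(_.payload)) }

  def main(args: Array[String]): Unit = {
    val events = Seq(Event(200L, "a"), Event(900L, "b"), Event(1500L, "c"), Event(2100L, "d"))
    // With a 1-second interval, each group plays the role of one RDD in the DStream.
    toBatches(events, batchIntervalMs = 1000L).foreach { case (idx, payloads) =>
      println(s"batch $idx: ${payloads.mkString(", ")}")
    }
  }
}
```

Each `(index, payloads)` pair stands in for one RDD of the DStream; in Spark Streaming the batch interval is the `Seconds(...)` value passed to the `StreamingContext`.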
12.
Resilient Distributed Datasets (RDD)
Spark revolves around RDDs
• Read-only collection of elements
13.
Resilient Distributed Datasets (RDD)
Spark revolves around RDDs
• Read-only collection of elements
• Operated on in parallel
• Cached in memory
  – Or on disk
• Fault tolerant
14.
Working With RDDs
Diagram: RDD, transformations producing new RDDs, then an action producing a value

    # create an RDD from a text file
    linesRDD = sc.textFile("SomeFile.txt")
    # transformation: returns a new RDD
    linesWithErrorRDD = linesRDD.filter(lambda line: "ERROR" in line)
    # actions: return a value
    linesWithErrorRDD.count()
    6
    linesWithErrorRDD.first()
    # Error line
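The split between transformations and actions matters because Spark transformations are lazy: nothing is computed until an action asks for a value. The sketch below mimics that behaviour with a lazy view over a plain Scala collection, written in Scala to match the rest of the deck. It is an analogy only, not Spark code; `errorLines` and the `onInspect` hook are hypothetical.

```scala
// Plain-Scala analogy for lazy transformations and eager actions; no Spark dependency.
object LazyTransformSketch {
  // "Transformation": builds a lazy view; the predicate is not evaluated here.
  // onInspect is a hypothetical hook that lets the caller observe evaluation.
  def errorLines(lines: Seq[String], onInspect: () => Unit = () => ()) =
    lines.view.filter { line => onInspect(); line.contains("ERROR") }

  def main(args: Array[String]): Unit = {
    val lines = Seq("INFO start", "ERROR disk full", "WARN slow", "ERROR timeout")
    var inspected = 0
    val linesWithError = errorLines(lines, () => inspected += 1)
    println(s"lines inspected before any action: $inspected")
    // "Actions": force evaluation and return a value.
    println(s"count: ${linesWithError.size}")
    println(s"first: ${linesWithError.head}")
  }
}
```

Unlike this view, an RDD is also partitioned across a cluster and can be cached, so the analogy covers only the evaluation order.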
15.
Process DStream
• Process using transformations
  – Creates new RDDs
Diagram: each RDD of the input DStream (RDD @ time 1, 2, 3, built from data from time 0 to 1, 1 to 2, 2 to 3) passes through a transform step (map, reduceByValue, count) and yields the corresponding RDD of a new DStream.
16.
Key Concepts
• Data Sources
• Transformations: create new DStream
  – Standard RDD operations: map, filter, union, reduce, join, …
  – Stateful operations: updateStateByKey(function), countByValueAndWindow, …
• Output Operations
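For the stateful operations, the function handed to `updateStateByKey` takes the new values seen for a key in the current batch plus the previous state, and returns the new state. The sketch below shows a running count with that function shape and then drives it with plain Scala maps; `applyBatch` is a hypothetical stand-in for what the streaming engine does per batch, not a Spark API.

```scala
// Running count per key, written in the (Seq[V], Option[S]) => Option[S] shape.
object RunningCountSketch {
  // New values for a key in this batch, plus the previous state, give the new state.
  def updateCount(newValues: Seq[Int], previous: Option[Int]): Option[Int] =
    Some(previous.getOrElse(0) + newValues.sum)

  // Hypothetical stand-in for the engine: apply the update function to every key
  // that has state or new data in this batch.
  def applyBatch(state: Map[String, Int], batch: Seq[(String, Int)]): Map[String, Int] = {
    val grouped = batch.groupBy(_._1).map { case (k, kvs) => k -> kvs.map(_._2) }
    val keys = state.keySet ++ grouped.keySet
    keys.flatMap { k =>
      updateCount(grouped.getOrElse(k, Seq.empty), state.get(k)).map(count => k -> count)
    }.toMap
  }

  def main(args: Array[String]): Unit = {
    val afterBatch1 = applyBatch(Map.empty, Seq("a" -> 1, "b" -> 1, "a" -> 1))
    val afterBatch2 = applyBatch(afterBatch1, Seq("b" -> 3))
    println(afterBatch1)
    println(afterBatch2)
  }
}
```

Returning `None` from the update function would drop the key from the state, which is why the return type is an `Option`.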
17.
Spark Streaming Architecture
• Processed results are pushed out in batches
Diagram: the input data stream enters Spark Streaming and becomes a DStream of RDD batches (data from time 0 to 1 as RDD @ time 1, data from time 1 to 2 as RDD @ time 2, data from time 2 to 3 as RDD @ time 3); Spark processes them and emits batches of processed results.
18.
Key Concepts
• Data Sources
• Transformations
• Output Operations: trigger computation
  – saveAsHadoopFiles: save to HDFS
  – saveAsHadoopDataset: save to HBase
  – saveAsTextFiles
  – foreachRDD: do anything with each batch of RDDs
Diagram labels: MapR-DB, MapR-FS
19.
Learning Goals
• How it works by example
20.
Use Case: Time Series Data
Diagram labels: oil pump sensor data; read; Spark Streaming; Spark processing; data for real-time monitoring
21.
Convert Line of CSV data to Sensor Object

    case class Sensor(resid: String, date: String, time: String, hz: Double, disp: Double,
                      flo: Double, sedPPM: Double, psi: Double, chlPPM: Double)

    def parseSensor(str: String): Sensor = {
      val p = str.split(",")
      Sensor(p(0), p(1), p(2), p(3).toDouble, p(4).toDouble, p(5).toDouble,
             p(6).toDouble, p(7).toDouble, p(8).toDouble)
    }
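A quick way to try the parser is to feed it one line. The sketch below repeats the slide's `Sensor` and `parseSensor` so it runs on its own; the sample line uses made-up values, and `parseSensorOpt` is a hypothetical addition that returns `None` for malformed lines instead of throwing.

```scala
import scala.util.Try

object SensorParseSketch {
  case class Sensor(resid: String, date: String, time: String, hz: Double, disp: Double,
                    flo: Double, sedPPM: Double, psi: Double, chlPPM: Double)

  // Same parsing logic as on the slide: nine comma-separated fields.
  def parseSensor(str: String): Sensor = {
    val p = str.split(",")
    Sensor(p(0), p(1), p(2), p(3).toDouble, p(4).toDouble, p(5).toDouble,
           p(6).toDouble, p(7).toDouble, p(8).toDouble)
  }

  // Hypothetical safe variant: None when fields are missing or not numeric.
  def parseSensorOpt(str: String): Option[Sensor] = Try(parseSensor(str)).toOption

  def main(args: Array[String]): Unit = {
    // Illustrative line only; the numeric values are invented for this sketch.
    val line = "COHUTTA,3/10/14,1:01,10.37,1.73,881,1.56,85,1.94"
    println(parseSensor(line))
    println(parseSensorOpt("COHUTTA,3/10/14"))
  }
}
```

In a streaming job a single bad line would otherwise fail the whole batch, so a variant like `parseSensorOpt` combined with `flatMap` may be worth considering.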
22.
Schema
• All events stored; the data CF could be set to expire data
• Filtered alerts put in the alerts CF
• Daily summaries put in the stats CF

    Row key               | CF data         | CF alerts | CF stats
                          | hz     …   psi  | psi   …   | hz_avg  …  psi_min
    COHUTTA_3/10/14_1:01  | 10.37      84   | 0         |
    COHUTTA_3/10/14       |                 |           | 10         0
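The two row-key shapes in the table can be sketched as small helpers. This is an assumption based only on the example keys shown on the slide (`COHUTTA_3/10/14_1:01` and `COHUTTA_3/10/14`); `eventRowKey` and `dailyRowKey` are hypothetical names, and the separator used in the actual application may differ.

```scala
// Row-key helpers matching the two example keys in the schema table.
object RowKeySketch {
  // Per-event key for the data and alerts column families: resid_date_time.
  def eventRowKey(resid: String, date: String, time: String): String =
    s"${resid}_${date}_${time}"

  // Per-day key for the stats column family: resid_date.
  def dailyRowKey(resid: String, date: String): String =
    s"${resid}_${date}"

  def main(args: Array[String]): Unit = {
    println(eventRowKey("COHUTTA", "3/10/14", "1:01"))
    println(dailyRowKey("COHUTTA", "3/10/14"))
  }
}
```

Because HBase keeps rows sorted by row key, every event key for one pump and day starts with that day's summary key, so the events and their daily summary sit next to each other in the table.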
23.
Basic Steps for Spark Streaming code
These are the basic steps for Spark Streaming code:
1. Create a DStream
   1. Apply transformations
   2. Apply output operations
2. Start receiving data and processing it
   – using streamingContext.start()
3. Wait for the processing to be stopped
   – using streamingContext.awaitTermination()
24.
Create a DStream

    val ssc = new StreamingContext(sparkConf, Seconds(2))
    val linesDStream = ssc.textFileStream("/mapr/stream")

Diagram: linesDStream as a series of batches (batch time 0-1, batch time 1-2, batch time 2-3), each stored in memory as an RDD
DStream: a sequence of RDDs representing a stream of data
25.
Process DStream

    val linesDStream = ssc.textFileStream("directory path")
    val sensorDStream = linesDStream.map(parseSensor)

Diagram: for each batch (batch time 0-1, 1-2, 2-3), map turns the RDD of linesDStream into the matching RDD of sensorDStream
New RDDs are created for every batch
26.
Process DStream

    // for each RDD
    sensorDStream.foreachRDD { rdd =>
      // filter sensor data for low psi
      val alertRDD = rdd.filter(sensor => sensor.psi < 5.0)
      . . .
    }
27.
DataFrame and SQL Operations

    // for each RDD: parse into Sensor objects, then filter
    sensorDStream.foreachRDD { rdd =>
      . . .
      alertRDD.toDF().registerTempTable("alert")
      // join alert data with pump maintenance info
      val res = sqlContext.sql(
        "select s.resid, s.psi, p.pumpType from alert s join pump p on s.resid = p.resid join maint m on p.resid = m.resid")
      . . .
    }
28.
Save to HBase

    // for each RDD: parse into Sensor objects, then filter
    sensorDStream.foreachRDD { rdd =>
      . . .
      // convert each alert to a Put object and write it to the HBase alerts column family
      alertRDD.map(Sensor.convertToPutAlert)
        .saveAsHadoopDataset(jobConfig)
    }
29.
Save to HBase

    rdd.map(Sensor.convertToPut).saveAsHadoopDataset(jobConfig)

Diagram: for each batch (batch time 0-1, 1-2, 2-3), map turns the linesRDD of the DStream into a sensorRDD, and save writes the resulting Put objects to HBase
Output operation: persist data to external storage
30.
Start Receiving Data

    sensorDStream.foreachRDD { rdd =>
      . . .
    }
    // Start the computation
    ssc.start()
    // Wait for the computation to terminate
    ssc.awaitTermination()
31.
Using HBase as a Source and Sink
Diagram: a Spark application reads from and writes to an HBase database
Example: calculate and store summaries (pre-computed, materialized view)
32.
HBase Read and Write

    val hBaseRDD = sc.newAPIHadoopRDD(
      conf, classOf[TableInputFormat],
      classOf[org.apache.hadoop.hbase.io.ImmutableBytesWritable],
      classOf[org.apache.hadoop.hbase.client.Result])

    keyStatsRDD.map { case (k, v) => convertToPut(k, v) }.saveAsHadoopDataset(jobConfig)

Diagram: newAPIHadoopRDD reads HBase scan results as (row key, Result) pairs; saveAsHadoopDataset writes (key, Put) pairs to HBase
33.
Read HBase

    // Load an RDD of (rowkey, Result) tuples from the HBase table
    val hBaseRDD = sc.newAPIHadoopRDD(conf, classOf[TableInputFormat],
      classOf[org.apache.hadoop.hbase.io.ImmutableBytesWritable],
      classOf[org.apache.hadoop.hbase.client.Result])

    // get the Result from each tuple
    val resultRDD = hBaseRDD.map(tuple => tuple._2)

    // transform into an RDD of (RowKey, ColumnValue) pairs
    val keyValueRDD = resultRDD.map(result =>
      (Bytes.toString(result.getRow()).split(" ")(0), Bytes.toDouble(result.value)))

    // group by row key, get statistics for the column value
    val keyStatsRDD = keyValueRDD.groupByKey().mapValues(list => StatCounter(list))
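The last step, grouping values by key and summarizing each group, can be tried without a cluster. The sketch below is a plain-Scala stand-in for `groupByKey().mapValues(list => StatCounter(list))`; `Stats` and `statsByKey` are hypothetical, and Spark's `StatCounter` reports further figures such as the standard deviation. The key `PUMP_B_3/10/14` and all values are invented for illustration.

```scala
// Per-key summary statistics over (key, value) pairs; no Spark dependency.
object KeyStatsSketch {
  final case class Stats(count: Long, min: Double, max: Double, mean: Double)

  // Group the pairs by key, then summarize the values of each group.
  def statsByKey(pairs: Seq[(String, Double)]): Map[String, Stats] =
    pairs.groupBy(_._1).map { case (key, kvs) =>
      val values = kvs.map(_._2)
      key -> Stats(
        count = values.size.toLong,
        min = values.reduce((a, b) => math.min(a, b)),
        max = values.reduce((a, b) => math.max(a, b)),
        mean = values.sum / values.size
      )
    }

  def main(args: Array[String]): Unit = {
    val psiByDay = Seq(
      "COHUTTA_3/10/14" -> 80.0,
      "COHUTTA_3/10/14" -> 90.0,
      "PUMP_B_3/10/14" -> 70.0
    )
    statsByKey(psiByDay).foreach { case (key, stats) => println(s"$key: $stats") }
  }
}
```

On large inputs `groupByKey` ships every value for a key to one place before summarizing, so an aggregation that combines partial results per partition may scale better.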
34.
Write HBase

    // configure the job that saves to the HBase table
    val jobConfig: JobConf = new JobConf(conf, this.getClass)
    jobConfig.setOutputFormat(classOf[TableOutputFormat])
    jobConfig.set(TableOutputFormat.OUTPUT_TABLE, tableName)

    // convert psi stats to Put objects and write them to the stats column family of the HBase table
    keyStatsRDD.map { case (k, v) => convertToPut(k, v) }.saveAsHadoopDataset(jobConfig)
35.
MapR Blog: Using Apache Spark DataFrames for Processing of Tabular Data
• https://www.mapr.com/blog/spark-streaming-hbase
36.
Free HBase On Demand Training (includes Hive and MapReduce with HBase)
• https://www.mapr.com/services/mapr-academy/big-data-hadoop-online-training
37.
Soon to Come
• Spark On Demand Training
  – https://www.mapr.com/services/mapr-academy/
38.
References
• Spark web site: http://spark.apache.org/
• https://databricks.com/
• Spark on MapR:
  – http://www.mapr.com/products/apache-spark
• Spark SQL and DataFrame Guide
• Apache Spark vs. MapReduce - Whiteboard Walkthrough
• Learning Spark - O'Reilly Book
• Apache Spark
39.
Q&A
Engage with us!
@mapr, maprtech, MapR, maprtech, mapr-technologies