© Hortonworks Inc. 2011–2018. All rights reserved.
Thiago Santiago
Solution Engineer – Latam
Why Customers Choose Hortonworks
Global Data Management
• Hybrid
• Multi-cloud
• End-to-end security and governance
100% Open Source –
“We are the Linux of Big Data”
• Innovation
• Interoperability
• No vendor lock-in
• Rapid community innovation
Proven Business Model:
• 1,300 enterprise customers
• First to IPO
• Fastest to $100M
• First to profitability
Most Comprehensive Platform
• Data at Rest and Data in Motion
• Any style of workload
• Centralized management, security, governance
Powering the Modern Data Architecture
DATA AT REST / DATA IN MOTION
ACTIONABLE INTELLIGENCE
COMPLETE DATA LIFECYCLE MANAGEMENT
RUN CONTAINERIZED APPLICATIONS CONCURRENTLY
EDGE / CLOUD / ON-PREMISES
HOLISTIC MANAGEMENT, GOVERNANCE AND SECURITY
MULTI-WORKLOADS / MULTI-TYPE / MULTI-TIER
Data Science / SQL Query Engine
The Datalake
Data Science
IT Systems & Ops
HDP
HDF
• Capture streaming data
• Deliver perishable insights
• Combine new & old data
• Store data forever
• Access a multi-tenant data lake
• Model with artificial intelligence
DATA AT REST / DATA IN MOTION
ACTIONABLE INTELLIGENCE
Perishable Insights / Historical Insights
HORTONWORKS DATA FLOW
Hortonworks Data Flow 3.1
Ongoing Innovation in OpenSource
[Figure: matrix of component versions per HDF release, grouped under STREAMING & INTEGRATION, OPERATIONS and SECURITY]
Releases shown: HDF 1.0 (Dec 2014), HDF 1.2 (Oct 2015), HDF 2.0 (Mar 2016), HDF 2.1 (Aug 2016), HDF 3.0 (Jul 2017), HDF 3.1* (Jan 2018)
HDF 3.1 component versions:
• NiFi 1.5.0
• NiFi Registry 0.1.0
• Ranger 0.7.0
• Ambari 2.6.1
• Kafka 1.0
• Zookeeper 3.4.6
• Storm 1.1.1
• SAM 0.6.0
• Schema Registry 0.5.0
• MiNiFi 0.4.0
* HDF 3.1 – Shows current Apache branches being used. Final component version subject to change based on Apache release process.
HORTONWORKS DATA PLATFORM
Hortonworks Data Platform 2.6
Ongoing Innovation in OpenSource
[Figure: matrix of component versions per HDP release, grouped under DATA MGMT, DATA ACCESS, GOVERNANCE & INTEGRATION, OPERATIONS and SECURITY]
Releases shown: HDP 2.0 (Oct 2013), HDP 2.1 (April 2014), HDP 2.2 (Dec 2014), HDP 2.3 (Oct 2015), HDP 2.4 (Mar 2016), HDP 2.5 (Aug 2016), HDP 2.6* (2017)
Components: Hadoop & YARN, Sqoop, Druid, Knox, Ranger, Ambari, Kafka, Zookeeper, Flume, Solr****, Slider, Atlas, Accumulo, Phoenix, Storm, Falcon, Tez, Hive, Pig, Oozie, Spark, HBase, Zeppelin
* HDP 2.6 – Shows current Apache branches being used. Final component version subject to change based on Apache release process.
** Spark 1.6.3+ Spark 2.1 – HDP 2.6 supports both Spark 1.6.3 and Spark 2.1 as GA.
*** Hive 2.1 is GA within HDP 2.6.
**** Apache Solr is available as an add-on product HDP Search.
Lambda Architecture in Hortonworks complementing IBM investments
[Diagram labels]
• Layers: Batch Layer, Speed Layer, Serving Layer
• Source groups: Clients, Applications, Legacy On Premises
• Data sources: Relational Bases, Social Networks, WebSites, Mobile Apps, CDR - Network, OOT, Adwords/adserver, Beacon, TWW/Smart Focus, CRM, …
• Platform components: Kafka, SAM, HDFS, Hive, Druid, SuperSet, Zeppelin, Atlas/Ranger
• IBM components: IBM DB2 Big SQL, IBM Spectrum Scale, IBM Stream Computing, …
• Functions: Ingest; All Data; Tooling; Model Building; Model Pre-processing; Data Science, Machine Learning; Data Exploration; Complex Event Processing; Batch Views; Real Time Views; Analytics, BI, Ad-hoc Exploration; Visualization & Reporting; Custom Applications; Dashboards; Marketing
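The three layers named on this slide can be illustrated with a toy example. The sketch below is only an assumption about how the pattern is usually described, not code from the presentation: a batch view is recomputed from all stored data, a real-time view covers events the batch run has not absorbed yet, and the serving layer merges the two. The function names and the `hashtag` field are hypothetical.

```python
from collections import Counter

def batch_view(all_events):
    """Batch layer: recompute per-hashtag counts from the full history."""
    return Counter(e["hashtag"] for e in all_events)

def realtime_view(recent_events):
    """Speed layer: counts for events not yet absorbed by a batch run."""
    return Counter(e["hashtag"] for e in recent_events)

def serve(batch, realtime):
    """Serving layer: answer queries by merging both views."""
    return batch + realtime

# Hypothetical sample data standing in for stored and freshly streamed events.
history = [{"hashtag": "nifi"}, {"hashtag": "druid"}, {"hashtag": "nifi"}]
recent = [{"hashtag": "nifi"}, {"hashtag": "superset"}]

merged = serve(batch_view(history), realtime_view(recent))
print(merged.most_common())
```

In the architecture on the slide, HDFS and Hive would play the batch role and Kafka, SAM and Druid the speed role; the in-memory counters here only stand in for those systems.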
[Diagram: Source, JSON Parsing, NiFi Druid Processor]
Social media analysis is a great use case to show how we can build a dashboard with streaming analytics using NiFi, Druid, and Superset.
This processing flow has these steps:
1) Tweet ingestion using Apache NiFi
2) OLAP database storage using Druid
3) Visualization using Apache Superset
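The JSON parsing step between ingestion and storage can be sketched as follows. This is a minimal illustration in plain Python, not the NiFi flow from the demo: it assumes a tweet payload with `timestamp_ms`, `text`, `lang`, `user.screen_name` and `entities.hashtags` fields, and flattens it into one event with a timestamp plus a few dimensions, which is the general shape an OLAP store such as Druid ingests. The function name and the output field names are hypothetical.

```python
import json
from datetime import datetime, timezone

def flatten_tweet(raw_json):
    """Turn one nested tweet document into a flat event record."""
    tweet = json.loads(raw_json)
    user = tweet.get("user", {})
    hashtags = [h["text"] for h in tweet.get("entities", {}).get("hashtags", [])]
    # Assumed input: epoch milliseconds as a string, converted to ISO 8601 UTC.
    ts = datetime.fromtimestamp(int(tweet["timestamp_ms"]) / 1000, tz=timezone.utc)
    return {
        "timestamp": ts.isoformat(),
        "user": user.get("screen_name", "unknown"),
        "lang": tweet.get("lang", "und"),
        "hashtags": hashtags,
        "text_length": len(tweet.get("text", "")),
    }

# Hypothetical sample payload for illustration only.
sample = json.dumps({
    "timestamp_ms": "1520000000000",
    "text": "Streaming analytics with #NiFi and #Druid",
    "lang": "en",
    "user": {"screen_name": "example_user"},
    "entities": {"hashtags": [{"text": "NiFi"}, {"text": "Druid"}]},
})

event = flatten_tweet(sample)
print(json.dumps(event, indent=2))
```

Keeping the event flat (one timestamp, a handful of string dimensions, one numeric metric) is a design choice that makes it easy to slice later in a dashboard.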
The Social Stalker
[Diagram labels: Social Networks, Ingest, Speed Layer, Druid, SuperSet (Tooling), HDFS, Atlas/Ranger, Data Science / Machine Learning, Model Pre-processing]
NiFi makes data ingestion fast, easy and secure
Druid is a data store designed for business intelligence (OLAP) queries on event data.
Superset's main goal is to make it easy to slice, dice and visualize data
https://community.hortonworks.com/
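To make the "OLAP queries on event data" idea concrete, the sketch below builds the JSON body of a Druid native topN query, the kind of request a dashboard tool issues when it ranks a dimension by a metric. It only constructs and prints the query; nothing is sent to a server. The datasource name `tweets`, the dimension `lang`, the metric name and the interval are assumptions for illustration, not values from the demo.

```python
import json

def top_n_query(datasource, dimension, interval, threshold=10):
    """Build a Druid native topN query ranking one dimension by row count."""
    return {
        "queryType": "topN",
        "dataSource": datasource,
        "intervals": [interval],
        "granularity": "all",
        "dimension": dimension,
        "metric": "tweets",          # rank by the aggregation defined below
        "threshold": threshold,      # number of top values to return
        "aggregations": [{"type": "count", "name": "tweets"}],
    }

# Hypothetical datasource, dimension and interval.
query = top_n_query("tweets", "lang", "2018-03-01/2018-03-02", threshold=5)
print(json.dumps(query, indent=2))
```

Such a body would typically be POSTed to a Druid broker's query endpoint. Note that the `count` aggregator counts stored Druid rows, which can differ from the number of raw events if rollup is enabled at ingestion.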
Article Demo
https://community.hortonworks.com/articles/177561/streaming-tweets-with-nifi-kafka-tranquility-druid.html
Thank you

Social Media Monitoring with NiFi, Druid and Superset
