Social Media Monitoring with NiFi, Druid and Superset
- 1. © Hortonworks Inc. 2011–2018. All rights reserved.
Thiago Santiago
Solution Engineer – Latam
- 2.
Why Customers Choose Hortonworks
Global Data Management
• Hybrid
• Multi-cloud
• End-to-end security and governance
100% Open Source –
“We are the Linux of Big Data”
• Innovation
• Interoperability
• No vendor lock-in
• Rapid community innovation
Proven Business Model:
• 1,300 enterprise customers
• First to IPO
• Fastest to $100M
• First to profitability
Most Comprehensive Platform
• Data at Rest and Data in Motion
• Any style of workload
• Centralized management, security, governance
- 3.
Powering the Modern Data Architecture
DATA AT REST
DATA IN MOTION
ACTIONABLE
INTELLIGENCE
COMPLETE DATA
LIFECYCLE
MANAGEMENT
RUN CONTAINERIZED
APPLICATIONS
CONCURRENTLY
EDGE
CLOUD
HOLISTIC MANAGEMENT, GOVERNANCE AND SECURITY
ON-PREMISES
MULTI-WORKLOADS MULTI-TYPE MULTI-TIER
Data Science SQL Query Engine
- 4.
The Datalake
Data Science
IT Systems & Ops
HDP
HDF
- 5.
Capture
streaming data
Deliver
perishable insights
Combine
new & old data
Store
data forever
Access
a multi-tenant data lake
Model
with artificial intelligence
DATA AT REST
DATA IN MOTION
ACTIONABLE
INTELLIGENCE
Perishable Insights Historical Insights
- 6.
HORTONWORKS DATA FLOW
NIFI
1.2.0
HDF 3.0
Jul 2017
1.0.0
HDF 2.0
Mar 2016
* HDF 3.1 – Shows current Apache branches being used. Final component version subject to change based on Apache release process.
1.1.0
NiFi Registry
Ranger
0.7.0
0.5.0
0.6.0
Ambari
2.5.1
2.4.0
2.4.2
Kafka
0.10.1.0
0.9.0
0.10.0
Zookeeper
3.4.6
3.4.6
3.4.6
Storm
1.1.0
1.0.1
1.0.2
SAM
0.5.0
Schema Registry
0.3.0
HDF 2.1
Aug 2016
Ongoing Innovation in Apache
HDF 1.0
Dec 2014
0.3.0
0.6.1
HDF 1.2
Oct 2015
MiNiFi
0.2.0
Ongoing Innovation in Open Source
1.0.0
0.0.1
0.10.0
HDF 3.1
Jan 2018 1.5.0 0.1.0 0.7.0 2.6.1 1.0 3.4.6 1.1.1 0.6.0 0.5.0 0.4.0
SECURITY STREAMING & INTEGRATION OPERATIONS
Hortonworks Data Flow 3.1
- 7.
HORTONWORKS DATA PLATFORM
Hadoop & YARN
HDP 2.2
Dec 2014
2.2.0
2.4.0
2.6.0
2.7.1
HDP 2.3
Oct 2015
2.7.3
HDP 2.6*
2017
2.7.1
HDP 2.4
Mar 2016
* HDP 2.6 – Shows current Apache branches being used. Final component version subject to change based on Apache release process.
** Spark 1.6.3+ Spark 2.1 – HDP 2.6 supports both Spark 1.6.3 and Spark 2.1 as GA.
*** Hive 2.1 is GA within HDP 2.6.
**** Apache Solr is available as an add-on product HDP Search.
2.7.3
Sqoop
1.4.4
1.4.5
1.4.4
1.4.6
1.4.6
1.4.6
1.4.6
Druid
0.9.2
Knox
0.4.0
0.5.0
0.6.0
0.11.0
0.6.0
0.9.0
Ranger
0.4.0
0.5.0
0.7.0
0.5.0
0.6.0
Ambari
1.4.4
2.0.0
1.5.1
2.1.0
2.5.0
2.2.1
2.4.0
Kafka
0.8.2
0.8.1
0.10.1.0
0.9.0
0.10.0
Zookeeper
3.4.5
3.4.6
3.4.5
3.4.6
3.4.6
3.4.6
3.4.6
Flume
1.5.2
1.4.0
1.3.1
1.5.2
1.5.2
1.5.2
1.5.2
Solr
4.10.2
4.7.2
5.2.1
5.5.1
****
5.2.1
5.5.1
Slider
0.60.0
0.80.0
0.91.0
0.80.0
0.91.0
Atlas
0.5.0
0.8.0
0.5.0
0.7.0
Accumulo
1.6.1
1.5.1
1.7.0
1.7.0
1.7.0
1.7.0
Phoenix
4.0.0
4.2.0
4.4.0
4.7.0
4.4.0
4.7.0
Storm
0.9.3
0.10.0
0.9.1
1.1.0
0.10.0
1.0.1
Falcon
0.5.0
0.6.0
0.6.1
0.10.0
0.6.1
0.10.0
Tez
0.4.0
0.5.2
0.7.0
0.7.0
0.7.0
0.7.0
Hive
0.12.0
0.13.0
0.14.0
1.2.1
1.2.1+
2.1***
1.2.1
1.2.1+
2.1***
Pig
0.12.0
0.12.1
0.14.0
0.15.0
0.16.0
0.15.0
0.16.0
HDP 2.5
Aug 2016
Oozie
3.3.2
4.1.0
4.0.0
4.2.0
4.2.0
4.2.0
4.2.0
Spark
1.2.1
1.4.1
1.6.3+
2.1**
1.6.0
1.6.2+
2.0**
HBase
0.98.4
0.96.1
0.98.0
1.1.2
1.1.2
1.1.2
1.1.2
Zeppelin
0.7.0
0.6.0
HDP 2.1
April 2014
HDP 2.0
Oct 2013
Ongoing Innovation in Open Source
Hortonworks Data Platform 2.6
DATA MGMT DATA ACCESS GOVERNANCE & INTEGRATION OPERATIONS SECURITY
- 8.
Clients
Applications
Legacy On Premises
Lambda Architecture in Hortonworks complementing IBM investments
Tooling Data Science, Machine
Learning
Model Pre-processing
Analytics, BI, Ad-hoc
Exploration
Data
Exploration
Complex
Event
Processing
Kafka SAM
Analytics, BI, Ad-hoc
Exploration
Visualization
& Reporting
All Data
HDFS
Tooling
Hive
Batch Views
Tooling
Superset
Real Time Views
Custom Applications
Dashboards
Batch Layer
Speed Layer
Serving Layer
Ingest
Atlas/Ranger
Model
Building
IBM DB2
Big SQL
Druid
Marketing
Zeppelin
Relational Databases
Social Networks
WebSites
Mobile Apps
CDR - Network
OOT
Adwords/adserver
Beacon
TWW/Smart Focus
CRM
…
IBM Spectrum
Scale
…
…
IBM Stream
Computing
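The diagram above separates Batch Views from Real Time Views and puts a Serving Layer in front of both. A minimal sketch of that idea, assuming each view is just a dictionary of hashtag counts (the hashtags, the numbers, and the `merge_views` helper are hypothetical and not taken from the deck), could look like this:

```python
from collections import Counter


def merge_views(batch_view, realtime_view):
    """Combine a batch view with a real-time view into one serving view.

    Assumption: the batch view holds counts up to the last batch run and the
    real-time view holds only counts for events that arrived after it, so the
    two can simply be summed per key.
    """
    merged = Counter(batch_view)
    merged.update(realtime_view)  # adds counts key by key
    return dict(merged)


# Hypothetical counts: batch layer (e.g. Hive over HDFS) and speed layer (e.g. Druid).
batch_view = {"#nifi": 120, "#druid": 80}
realtime_view = {"#druid": 5, "#superset": 3}

serving_view = merge_views(batch_view, realtime_view)
print(serving_view)
```

In a real deployment the two views would come from different stores and the cutoff between them would have to be tracked explicitly; the sketch only shows the merge step.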
- 10.
Source
JSON Parsing
NiFi Druid Processor
Social media analysis is a great use case for showing how to build a streaming analytics dashboard with NiFi, Druid, and Superset.
This processing flow has these steps:
1) Tweet ingestion using Apache NiFi
2) OLAP database storage using Druid
3) Visualization using Apache Superset
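The JSON parsing step between ingestion and storage can be sketched in plain Python. This is not the NiFi flow from the demo: it only illustrates, under stated assumptions, how a nested tweet might be flattened into one flat event with a timestamp before it is stored in Druid. The input field names follow the classic Twitter tweet object (`created_at`, `lang`, `text`, `user.screen_name`, `user.followers_count`); the output column names and the sample tweet are hypothetical.

```python
import json
from datetime import datetime

# Timestamp layout of the "created_at" field in classic Twitter tweet objects,
# e.g. "Wed Oct 10 20:19:24 +0000 2018".
TWITTER_TIME_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def flatten_tweet(raw_json):
    """Turn one raw tweet (JSON string) into a flat event dictionary.

    The output keys are illustrative choices, not the schema used in the demo.
    """
    tweet = json.loads(raw_json)
    user = tweet.get("user", {})
    created = datetime.strptime(tweet["created_at"], TWITTER_TIME_FORMAT)
    return {
        "timestamp": created.isoformat(),  # ISO 8601, a format Druid can parse
        "lang": tweet.get("lang"),
        "screen_name": user.get("screen_name"),
        "followers_count": user.get("followers_count", 0),
        "text": tweet.get("text", ""),
    }


# Hypothetical sample tweet, reduced to the fields used above.
sample = json.dumps({
    "created_at": "Wed Oct 10 20:19:24 +0000 2018",
    "lang": "en",
    "text": "Streaming analytics with NiFi, Druid and Superset",
    "user": {"screen_name": "example_user", "followers_count": 42},
})

event = flatten_tweet(sample)
print(event)
```

In the flow described above, this kind of flattening would be done with NiFi processors rather than custom code; the sketch only makes the shape of the resulting event concrete.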
- 11.
Lambda Architecture in Hortonworks complementing IBM investments
- 12.
The Social Stalker
Tooling
Superset
Speed Layer
Ingest
Atlas/Ranger
Druid
Social Networks
HDFS
Data Science, Machine
Learning
Model Pre-processing
- 13.
NiFi makes data ingestion fast, easy and secure
Druid is a data store designed for business intelligence (OLAP) queries on event data.
Superset's main goal is to make it easy to slice, dice and visualize data
https://community.hortonworks.com/
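To make "OLAP queries on event data" concrete, here is a hedged sketch that builds a Druid native topN query as a Python dictionary. The datasource name `twitter_demo`, the `lang` dimension, the `tweets` metric name, and the interval are assumptions for illustration; the deck does not show the actual schema. The query is only printed, not sent: in a running cluster it would typically be POSTed as JSON to a Druid broker's `/druid/v2` endpoint, or issued indirectly by Superset.

```python
import json


def build_topn_query(datasource, dimension, interval, threshold=5):
    """Build a Druid native topN query: the top `threshold` values of `dimension`."""
    return {
        "queryType": "topN",
        "dataSource": datasource,
        "intervals": [interval],  # ISO 8601 interval, "start/end"
        "granularity": "all",     # one result bucket for the whole interval
        "dimension": dimension,
        "metric": "tweets",       # rank by the aggregation defined below
        "threshold": threshold,
        # The count aggregator counts stored rows; that equals the number of
        # tweets only if roll-up was disabled at ingestion time.
        "aggregations": [{"type": "count", "name": "tweets"}],
    }


# Hypothetical datasource and dimension names.
query = build_topn_query("twitter_demo", "lang", "2018-01-01/2018-02-01")
print(json.dumps(query, indent=2))
```

A dashboard chart such as "tweets per language" in Superset corresponds to a query of roughly this shape.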
- 14.
Article Demo
https://community.hortonworks.com/articles/177561/streaming-tweets-with-nifi-kafka-tranquility-druid.html