Real-time Big Data Applications 
with Hadoop Ecosystem 
Chris Huang 
Sr. Manager, Core Tech 
2014/9/24 
9/25/2014 Confidential | Copyright 2013 TrendMicro Inc.
About – Chris Huang 
• Chris Huang 
– SPN Solution Developer Manager 
– SPN Hadoop Architect 
– Hadoop.TW Active Member 
• Believes cloud, services, software, and big data are critical factors for Taiwan’s future economic development 
Conference Talks 
Hot Keywords in Hadoop Community 
Real-time 
• Impala, Stinger 
Computing Framework 
• YARN, Tez 
In Memory 
• Spark 
Streaming 
• Kafka, Storm 
Big Data Applications 
• Operational 
– Real-time 
– Near Real-time 
• Analytical 
– Batch 
– Interactive 
– Near Real-time 
– Streaming 
An Online Music Example 
• Operational 
– Recent N login times (listen duration) 
– Recent N albums/artists the user browsed 
– Recent N keywords the user searched 
– Recent N songs/albums/artists the user listened to (bought) 
– Recent N months of the user’s purchase amounts 
• Analytical 
– Recommend the right song/album/artist to the right user at the right time 
– Correlate similar songs/albums/artists (CDDB or user behavior) 
– Know seasonal music trends (X’mas, Valentine’s Day, New Year) 
– Know regional music trends 
– Calculate regional leaderboards 
– Connect user with social network 
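The operational "Recent N" lookups above are typically served by a short scan over rows that are already sorted newest-first. The following is a minimal sketch of one common way to get that ordering, a reverse-timestamp row key. It is an illustration only: the slides do not prescribe this key design, and `row_key`, `SortedTable`, and `recent_n` are hypothetical names, with an in-memory sorted list standing in for a key-sorted store such as HBase.

```python
from bisect import insort

MAX_TS = 2**63 - 1  # largest signed 64-bit value, used to reverse timestamps


def row_key(user_id: str, event_ts_ms: int) -> str:
    """Build '<user>#<reverse timestamp>' so that newer events sort first."""
    reverse_ts = MAX_TS - event_ts_ms
    return f"{user_id}#{reverse_ts:019d}"  # zero-padded so string order = numeric order


class SortedTable:
    """Toy stand-in (assumption) for a table that keeps rows sorted by key."""

    def __init__(self):
        self._rows = []  # list of (key, value), kept sorted by key

    def put(self, key: str, value: str) -> None:
        insort(self._rows, (key, value))

    def scan_prefix(self, prefix: str, limit: int):
        """Return up to `limit` values whose key starts with `prefix`, in key order."""
        out = []
        for key, value in self._rows:
            if key.startswith(prefix):
                out.append(value)
                if len(out) == limit:
                    break
        return out


def recent_n(table: SortedTable, user_id: str, n: int):
    """'Recent N' becomes a short scan over the user's key prefix."""
    return table.scan_prefix(f"{user_id}#", n)


table = SortedTable()
for ts, song in [(1000, "song-a"), (3000, "song-c"), (2000, "song-b")]:
    table.put(row_key("alice", ts), song)
table.put(row_key("bob", 5000), "song-z")

print(recent_n(table, "alice", 2))  # ['song-c', 'song-b']
```

Because the newest event has the smallest reversed timestamp, the scan can stop after N rows instead of reading the user's whole history.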
An Online Banking Example 
• Operational 
– Recent N login times / frequency 
– Recent N items purchased by credit card 
– Recent N months of balance amounts 
– Recent N transfer in/out amounts 
– Recent N investment events 
– Recent N months of investment balances 
• Analytical 
– Learn more about the user’s profile (assets/debts/shopping habits/family) 
– Recommend the right product to the right user (investment, credit card, loan) 
– Know seasonal trends (tax month/year end/back to school/X’mas) 
– Know regional investment product leaderboards (by age group) 
– Recommend products by similar user profile 
Building Your Big Data Applications 
• Think about your data 
– Entity or Event? 
• Think about your use case 
– Operational or Analytical? 
• Think about your data user 
– External or Internal? 
Think About Your Data 
Slides from “Apache HBase Application Archetypes”, HBaseCon 2014 
You can replace HBase with similar alternatives; the concepts are the same 
Think About Your Use Case 
Operational Use Case 1 
[Diagram: components HDFS and MR / Spark; paths labeled Batch and Real-time]
HBase: No Secondary Index (yet) 
• Search index building (row key) 
• Use Solr to make text data searchable 
– Snapshot & clone table 
– Index column qualifier text 
– Record row-key in Solr document 
– Use HBase client to fetch data 
• Usually takes less than a few seconds 
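The Solr-based lookup described above is a two-step flow: query the index to get row keys, then fetch the full rows from the table by key. The sketch below illustrates that flow with plain Python dictionaries. It is not Solr or HBase client code: `rows`, `build_index`, `search`, and `fetch` are hypothetical stand-ins, and the sample data is made up for illustration.

```python
from collections import defaultdict

# Toy stand-in (assumption): `rows` plays the role of the HBase table.
rows = {
    "user1#001": {"title": "Blue Moon", "artist": "Ann Lee"},
    "user1#002": {"title": "Red Moon Rising", "artist": "Bo Chen"},
    "user2#001": {"title": "Green River", "artist": "Ann Lee"},
}


def build_index(table, column):
    """Index the text of one column; each posting records only the row key."""
    index = defaultdict(set)
    for row_key, columns in table.items():
        for token in columns[column].lower().split():
            index[token].add(row_key)
    return index


def search(index, word):
    """Step 1: query the index and get back the matching row keys."""
    return sorted(index.get(word.lower(), set()))


def fetch(table, row_keys):
    """Step 2: fetch the full rows from the table by row key."""
    return [table[key] for key in row_keys]


title_index = build_index(rows, "title")
hits = search(title_index, "moon")
print(hits)  # ['user1#001', 'user1#002']
print(fetch(rows, hits))
```

Keeping only the row key in the index document keeps the index small; the table remains the single source of the full record.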
Operational Use Case 2 (SPN) 
[Diagram: components Feed App, Flume, HDFS, MapReduce, Pig, Solr Client; operations Get, Scan, Index Query; paths labeled Real-time and Batch; annotations low latency and high throughput]
Operational Use Case 3 (Mixed) 
[Diagram: components Feed App, Flume, HDFS, MapReduce, Pig, MR / Spark, HBase Client, HBase Replication, Solr, Solr Client; operations Put, Incr, Append, Get, Scan, Gets, Short scan, Index Query, Bulk Import; paths labeled Real-time and Batch; annotations low latency and high throughput]
HBase or HDFS? 
• Depends on what your data is 
– Entity or Event? 
• Depends on your workload 
– Low latency? 
– Random read/write? 
– Short/full scan? 
– Sequential read/write? 
– Update? 
Wait… Batch for Operational? 
Yes, why not? 
Operational: Batch + Real-time 
• Bridge the gap between batch and now 
• 80/20 rule 
– HDFS/MapReduce/Spark solve 80% easily 
– The remaining 20% takes 80% of the effort 
• Get as close as possible, but don’t overdo it! 
What is Real-time? 
• Real-time is NOT always “faster than batch” 
– If you have really BIG DATA 
• Most of the time, we want Timely Information 
• Minimize the gap between scheduled batch jobs 
[Timeline: three consecutive hourly jobs]
How do you get a result at 1:33?
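One way to answer the 1:33 question is to combine the last completed hourly result with whatever has arrived since that job's cutoff. The sketch below shows that idea under stated assumptions: `BatchPlusRealtimeCounter` and its methods are hypothetical names, the counts are made up, and in-memory fields stand in for the batch output and the real-time store.

```python
from datetime import datetime


class BatchPlusRealtimeCounter:
    """Serve a count as: last batch result + events seen since the batch cutoff."""

    def __init__(self):
        self.batch_total = 0  # result of the last completed hourly job
        self.recent = []      # (timestamp, count) events not yet covered by a batch

    def record(self, ts, count=1):
        """Real-time path: remember an event as soon as it arrives."""
        self.recent.append((ts, count))

    def load_batch(self, total, covers_until):
        """Batch path: install a new result and drop the events it already covers."""
        self.batch_total = total
        self.recent = [(ts, c) for ts, c in self.recent if ts >= covers_until]

    def query(self, as_of):
        """Timely answer between jobs: batch total plus the recent delta."""
        delta = sum(c for ts, c in self.recent if ts <= as_of)
        return self.batch_total + delta


counter = BatchPlusRealtimeCounter()
counter.load_batch(100, covers_until=datetime(2014, 7, 20, 1, 0))  # the 1:00 job
counter.record(datetime(2014, 7, 20, 1, 10), 3)
counter.record(datetime(2014, 7, 20, 1, 30), 2)
counter.record(datetime(2014, 7, 20, 1, 40), 5)

result_at_133 = counter.query(datetime(2014, 7, 20, 1, 33))
print(result_at_133)  # 105
```

The real-time side only has to hold the events since the last batch cutoff, which is the small part of the problem; the batch side keeps doing the heavy lifting.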
Analytical Use Case 
Batch/streaming computation 
Near real-time/interactive delivery 
Near Real-time Interactive 
Recommendation System 
The Online Music Example 
• Operational 
– Recent N login times (listen duration) 
– Recent N albums/artists the user browsed 
– Recent N keywords the user searched 
– Recent N songs/albums/artists the user listened to (bought) 
– Recent N months of the user’s purchase amounts 
• Analytical 
– Recommend the right song/album/artist to the right user at the right time 
– Correlate similar songs/albums/artists (CDDB or user behavior) 
– Know seasonal music trends (X’mas, Valentine’s Day, New Year) 
– Know regional music trends 
– Calculate regional leaderboards 
– Connect user with social network 
Do you really want an analytical result (recommendation) EVERY 50 milliseconds? 
Analytical Use Case 1 
[Diagram: components HDFS and Solr Client; operation Index Query; paths labeled Batch and Real-time]
Analytical Use Case 2 (SPN) 
“A Graph Service for Global Web Entities Traversal and Reputation Evaluation Based on HBase”, 
HBaseCon 2014 
You Need an Interactive Analytic Engine 
Stinger 
Impala Architecture 
[Diagram: active and standby master nodes (NN, JT, HM); four worker nodes, each running Datanode, Tasktracker, Regionserver, and an impala daemon; State store; Catalog; Hive Metastore]
Apache Pig (MapReduce) 
• Do an hourly count on the Akamai log 
– A = load 'date://2014/07/20/00' using AkamaiRCLoader(); 
  B = foreach (group A all) generate COUNT_STAR(A); 
  dump B; 
– … 
  0% complete 
  100% complete 
  (194202349) 
Too Slow for Interactive
Using Impala 
• No memory cache 
– > select count(*) from akafast where day=20140720 and hour=0 
– 194202349 
• With OS cache 
• Do a further query: 
– select count(*) from akafast where day=20140720 and hour=00 and c='US'; 
– 41118019 
Makes Sense Now
Don’t Connect Analytic Engine with Operational Use Case 
Analytical Use Case 3 
[Diagram: components Feed App, Flume, HDFS, MR / Spark, HBase Client, Impala/Stinger; operations Put, Incr, Append, Gets, Short scan, Bulk Import; paths labeled Real-time, Batch, and Interactive; users Customer and Analyst; annotations low latency and high throughput]
Streaming Use Cases 
TME – Trend Message Exchange 
http://trendmicro.github.io/tme/ 
Streaming Operational Use Case 
[Diagram: components Kafka/Storm, HDFS, HBase Client, Solr Client; operations Put, Incr, Append, Gets, Short scan, Index Query; paths labeled Streaming and Real-time; annotations low latency and high throughput]
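The streaming operational flow named on this slide (events arrive through Kafka/Storm, are written with Put/Incr/Append, and are read back with Gets and short scans) can be sketched as a small consumer loop. This is an illustration only, not Kafka, Storm, or HBase client code: the `deque` stands in for a topic, the `Counter` for a table of counters, and `to_increments`, `process_available`, and `get` are hypothetical names with a made-up event shape.

```python
from collections import Counter, deque

queue = deque()    # stand-in (assumption) for a message queue topic
store = Counter()  # stand-in (assumption) for a table of counters


def to_increments(event):
    """Map one event to (row key, column, amount) increments."""
    user, action = event["user"], event["action"]
    return [(f"user#{user}", action, 1), (f"action#{action}", "total", 1)]


def process_available(queue, store):
    """Drain the queue and apply each increment to the store as it arrives."""
    processed = 0
    while queue:
        event = queue.popleft()
        for row, column, amount in to_increments(event):
            store[(row, column)] += amount
        processed += 1
    return processed


def get(store, row, column):
    """Low-latency read of a single counter; missing counters read as 0."""
    return store[(row, column)]


queue.extend([
    {"user": "alice", "action": "listen"},
    {"user": "alice", "action": "listen"},
    {"user": "bob", "action": "buy"},
])
processed = process_available(queue, store)

print(processed)                              # 3
print(get(store, "user#alice", "listen"))     # 2
print(get(store, "action#listen", "total"))   # 2
```

Since each event only increments counters, a reader sees an up-to-date value without waiting for any batch job.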
Streaming Analytical Use Case 
[Diagram: components Feed App, Flume, Kafka/Storm, HDFS, HBase Client, Impala/Stinger; operations Put, Incr, Append, Gets, Short scan; paths labeled Streaming, Real-time, and Interactive; users Customer and Analyst; annotations low latency and high throughput]
Think About Your Data User 
Data User 
• External 
– Customer 
– Partner 
• Internal 
– Business report user 
– Data researcher 
– Data analyst 
– Algorithm developer 
• They want an instant response 
• They don’t know (and don’t care) whether the recommendation was computed 1 hour ago or 50 ms ago 
• Interactive or near real-time is enough 
• Sometimes they can even wait for batch (make the data small, then analyze) 
• Of course, everyone wants results faster, but it depends on your investment $$ 
No Silver Bullet for Real-time or Big Data Applications 
Q&A 
重構—改善既有程式的設計(chapter 4,5)Chris Huang
 
重構—改善既有程式的設計(chapter 2,3)
重構—改善既有程式的設計(chapter 2,3)重構—改善既有程式的設計(chapter 2,3)
重構—改善既有程式的設計(chapter 2,3)Chris Huang
 
重構—改善既有程式的設計(chapter 1)
重構—改善既有程式的設計(chapter 1)重構—改善既有程式的設計(chapter 1)
重構—改善既有程式的設計(chapter 1)Chris Huang
 
Designs, Lessons and Advice from Building Large Distributed Systems
Designs, Lessons and Advice from Building Large Distributed SystemsDesigns, Lessons and Advice from Building Large Distributed Systems
Designs, Lessons and Advice from Building Large Distributed SystemsChris Huang
 
Hw5 my house in yong he
Hw5 my house in yong heHw5 my house in yong he
Hw5 my house in yong heChris Huang
 
Social English Class HW4
Social English Class HW4Social English Class HW4
Social English Class HW4Chris Huang
 

Mehr von Chris Huang (20)

Data compression, data security, and machine learning
Data compression, data security, and machine learningData compression, data security, and machine learning
Data compression, data security, and machine learning
 
Kks sre book_ch10
Kks sre book_ch10Kks sre book_ch10
Kks sre book_ch10
 
Kks sre book_ch1,2
Kks sre book_ch1,2Kks sre book_ch1,2
Kks sre book_ch1,2
 
20130310 solr tuorial
20130310 solr tuorial20130310 solr tuorial
20130310 solr tuorial
 
Applying Media Content Analysis to the Production of Musical Videos as Summar...
Applying Media Content Analysis to the Production of Musical Videos as Summar...Applying Media Content Analysis to the Production of Musical Videos as Summar...
Applying Media Content Analysis to the Production of Musical Videos as Summar...
 
Wissbi osdc pdf
Wissbi osdc pdfWissbi osdc pdf
Wissbi osdc pdf
 
Hbase status quo apache-con europe - nov 2012
Hbase status quo   apache-con europe - nov 2012Hbase status quo   apache-con europe - nov 2012
Hbase status quo apache-con europe - nov 2012
 
Hbase schema design and sizing apache-con europe - nov 2012
Hbase schema design and sizing   apache-con europe - nov 2012Hbase schema design and sizing   apache-con europe - nov 2012
Hbase schema design and sizing apache-con europe - nov 2012
 
重構—改善既有程式的設計(chapter 12,13)
重構—改善既有程式的設計(chapter 12,13)重構—改善既有程式的設計(chapter 12,13)
重構—改善既有程式的設計(chapter 12,13)
 
重構—改善既有程式的設計(chapter 10)
重構—改善既有程式的設計(chapter 10)重構—改善既有程式的設計(chapter 10)
重構—改善既有程式的設計(chapter 10)
 
重構—改善既有程式的設計(chapter 9)
重構—改善既有程式的設計(chapter 9)重構—改善既有程式的設計(chapter 9)
重構—改善既有程式的設計(chapter 9)
 
重構—改善既有程式的設計(chapter 8)part 1
重構—改善既有程式的設計(chapter 8)part 1重構—改善既有程式的設計(chapter 8)part 1
重構—改善既有程式的設計(chapter 8)part 1
 
重構—改善既有程式的設計(chapter 7)
重構—改善既有程式的設計(chapter 7)重構—改善既有程式的設計(chapter 7)
重構—改善既有程式的設計(chapter 7)
 
重構—改善既有程式的設計(chapter 6)
重構—改善既有程式的設計(chapter 6)重構—改善既有程式的設計(chapter 6)
重構—改善既有程式的設計(chapter 6)
 
重構—改善既有程式的設計(chapter 4,5)
重構—改善既有程式的設計(chapter 4,5)重構—改善既有程式的設計(chapter 4,5)
重構—改善既有程式的設計(chapter 4,5)
 
重構—改善既有程式的設計(chapter 2,3)
重構—改善既有程式的設計(chapter 2,3)重構—改善既有程式的設計(chapter 2,3)
重構—改善既有程式的設計(chapter 2,3)
 
重構—改善既有程式的設計(chapter 1)
重構—改善既有程式的設計(chapter 1)重構—改善既有程式的設計(chapter 1)
重構—改善既有程式的設計(chapter 1)
 
Designs, Lessons and Advice from Building Large Distributed Systems
Designs, Lessons and Advice from Building Large Distributed SystemsDesigns, Lessons and Advice from Building Large Distributed Systems
Designs, Lessons and Advice from Building Large Distributed Systems
 
Hw5 my house in yong he
Hw5 my house in yong heHw5 my house in yong he
Hw5 my house in yong he
 
Social English Class HW4
Social English Class HW4Social English Class HW4
Social English Class HW4
 

Kürzlich hochgeladen

Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphNeo4j
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksSoftradix Technologies
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAndikSusilo4
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions
 

Kürzlich hochgeladen (20)

Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other Frameworks
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & Application
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food Manufacturing
 

Real-time Big Data Applications with Hadoop Ecosystem

  • 1. Real-time Big Data Applications with Hadoop Ecosystem. Chris Huang, Sr. Manager, Core Tech, 2014/9/24. 9/25/2014 Confidential | Copyright 2013 TrendMicro Inc.
  • 2. About – Chris Huang
      • Chris Huang
          – SPN Solution Developer Manager
          – SPN Hadoop Architect
          – Hadoop.TW Active Member
      • Believes Cloud, Service, Software, and Big Data are critical factors for Taiwan’s future economic development
  • 3. Conference Talks
  • 4. Conference Talks
  • 5. Hot Keywords in Hadoop Community
      • Real-time: Impala, Stinger
      • Computing Framework: YARN, Tez
      • In Memory: Spark
      • Streaming: Kafka, Storm
  • 6. Big Data Applications
      • Operational
          – Real-time
          – Near Real-time
      • Analytical
          – Batch
          – Interactive
          – Near Real-time
          – Streaming
  • 7. An Online Music Example
      • Operational
          – Recent N login time (listen duration)
          – Recent N album/artist user browses
          – Recent N keyword user searches
          – Recent N song/album/artist user listens (buys)
          – Recent N month user’s purchase amount
      • Analytical
          – Recommend right song/album/artist to right user at right time
          – Correlate similar song/album/artist (CDDB or user behavior)
          – Know seasonal music trending (X’mas, Valentine’s Day, New Year)
          – Know regional music trending
          – Calculate regional leaderboard
          – Connect user with social network
  • 8. An Online Banking Example
      • Operational
          – Recent N login time / frequency
          – Recent N items purchased by credit card
          – Recent N month balance amount
          – Recent N transfer in/out amount
          – Recent N investment event
          – Recent N month investment balance
      • Analytical
          – Know user’s profile more (assets/debts/shopping habits/family)
          – Recommend right product to right user (investment, credit card, loan)
          – Know seasonal trending (tax month/year end/back to school/X’mas)
          – Know regional investment product leaderboard (by different age)
          – Recommend product by similar user profile
  • 9. Building Your Big Data Applications
      • Think about your data
          – Entity or Event?
      • Think about your use case
          – Operational or Analytic?
      • Think about your data user
          – External or Internal?
  • 10. Think About Your Data
      • Slides from “Apache HBase Application Archetypes”, HBaseCon 2014
      • You can replace HBase with similar alternatives, but the concepts are the same
  • 19. Think About Your Use Case
  • 20. Operational Use Case 1 (diagram labels: MR / Spark, HDFS; Batch, Real-time)
  • 21. HBase: No Secondary Index (yet)
      • Search index building (row key)
      • Use Solr to make text data searchable
          – Snapshot & clone table
          – Index column qualifier text
          – Record row-key in Solr document
          – Use HBase client to fetch data
      • Usually less than a few seconds
  • 22. Operational Use Case 2 (SPN) (diagram labels: Feed App, Flume, HDFS, MapReduce, Pig, Solr Client, Index, Query; Get, Scan; low latency, high throughput; Real-time, Batch)
  • 23. Operational Use Case 3 (Mixed) (diagram labels: Feed App, Flume, HDFS, MapReduce, Pig, MR / Spark, Bulk Import, HBase Client, HBase Replication, Solr, Solr Client, Index, Query; Put, Incr, Append; Gets, Short scan; Get, Scan; low latency, high throughput; Real-time, Batch)
  • 24. HBase or HDFS?
      • Depends on what your data is
          – Entity or Event?
      • Depends on your workload
          – Low latency?
          – Random read/write?
          – Short/full scan?
          – Sequential read/write?
          – Update?
  • 25. Wait… Batch for Operational?
  • 26. Yes, Why not?
  • 30. Operational: Batch + Real-time
      • Bridge the gap between batch and now
      • 80/20 rule
          – HDFS/MapReduce/Spark solves 80% easily
          – Remaining 20% takes 80% of the effort
      • Go as close as possible, don’t overdo it!
  • 31. What is Real-time?
      • Real-time is NOT always “faster than batch”
          – If you have really BIG DATA
      • Most of the time, we want Timely Information
      • Minimize the gap between scheduled batch jobs
      • (diagram: three consecutive Hourly Jobs) How to get a result at 1:33?
  • 32. Analytical Use Case
      • Batch/streaming compute
      • Near real-time/interactive deliver
  • 33. Near Real-time Interactive
  • 34. Recommendation System
  • 36. The Online Music Example
      • Operational
          – Recent N login time (listen duration)
          – Recent N album/artist user browses
          – Recent N keyword user searches
          – Recent N song/album/artist user listens (buys)
          – Recent N month user’s purchase amount
      • Analytical (recommendation)
          – Recommend right song/album/artist to right user at right time
          – Correlate similar song/album/artist (CDDB or user behavior)
          – Know seasonal music trending (X’mas, Valentine’s Day, New Year)
          – Know regional music trending
          – Calculate regional leaderboard
          – Connect user with social network
      • Do you really want analytical results EVERY 50 milliseconds?
  • 37. Analytical Use Case 1 (diagram labels: HDFS, Batch, Index, Query, Solr Client, Real-time)
  • 38. Analytical Use Case 2 (SPN): “A Graph Service for Global Web Entities Traversal and Reputation Evaluation Based on HBase”, HBaseCon 2014
  • 41. You Need an Interactive Analytic Engine
  • 42. Stinger
  • 44. Impala Architecture (diagram labels: NN, JT, HM Active; NN, JT, HM Standby; State store; Catalog; Hive Metastore; four worker nodes, each labeled Datanode, Tasktracker, Regionserver, impala daemon)
  • 49. Apache Pig (MapReduce)
      • Do hourly count on Akamai log
          – A = load 'date://2014/07/20/00' using AkamaiRCLoader(); B = foreach (group A all) generate COUNT_STAR(A); dump B;
          – … 0% complete 100% complete (194202349)
      • Too Slow for Interactive
  • 50. Using Impala
      • No memory cache
          – > select count(*) from akafast where day=20140720 and hour=0
          – 194202349
      • With OS cache
      • Do a further query:
          – select count(*) from akafast where day=20140720 and hour=00 and c='US';
          – 41118019
      • Make Sense Now
  • 51. Don’t Connect Analytic Engine with Operational Use Case
  • 52. Analytical Use Case 3 (diagram labels: Feed App, Flume, HDFS, MR / Spark, Bulk Import, HBase Client, Impala/Stinger; Put, Incr, Append; Gets, Short scan; low latency, high throughput; Real-time, Interactive, Batch; Customer, Analyst)
  • 53. Streaming Use Cases
  • 58. TME – Trend Message Exchange: http://trendmicro.github.io/tme/
  • 59. Streaming Operational Use Case (diagram labels: Kafka/Storm, HDFS, HBase Client, Solr Client, Index, Query; Put, Incr, Append; Gets, Short scan; low latency, high throughput; Real-time, Streaming)
  • 60. Streaming Analytical Use Case (diagram labels: Feed App, Flume, Kafka/Storm, HDFS, HBase Client, Impala/Stinger; Put, Incr, Append; Gets, Short scan; low latency, high throughput; Real-time, Interactive, Streaming; Customer, Analyst)
  • 61. Think About Your Data User
  • 62. Data User
      • External
          – Customer
          – Partner
      • Internal
          – Business report user
          – Data researcher
          – Data analyst
          – Algorithm developer
      • They want instant response
      • They don’t know (and don’t care) if the recommendation was computed 1 hour ago or 50 ms ago
      • Interactive or near real-time is enough
      • Sometimes they can even wait for batch (make data small and analyze)
      • Of course, everyone wants results faster, but it depends on your investment $$
  • 63. No Silver Bullet For Real-time, Or Big Data Application
  • 64. Q&A
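
The Solr-plus-HBase lookup outlined on slide 21 (index the text of a column qualifier, record the row key in the index document, then fetch the row by key) can be sketched roughly as follows. This is a minimal illustration, not the deck's implementation: the dictionary `table` stands in for an HBase table, `index` stands in for a Solr index, and the row keys, the `d:title` qualifier, and the function names are all hypothetical. No real Solr or HBase client is used.

```python
from collections import defaultdict

# In-memory stand-ins (assumptions): `table` plays the HBase table and the
# index built below plays the Solr index. No real client library is involved.
table = {
    "user42#20140720": {"d:title": "hadoop real time analytics"},
    "user42#20140721": {"d:title": "impala query engine"},
    "user77#20140720": {"d:title": "real time streaming with storm"},
}


def build_index(rows, qualifier):
    """Index the text of one column qualifier; each posting records the row key."""
    index = defaultdict(set)
    for row_key, columns in rows.items():
        for token in columns.get(qualifier, "").lower().split():
            index[token].add(row_key)
    return index


def search(index, rows, query):
    """Ask the index for matching row keys, then fetch the full rows by key."""
    tokens = query.lower().split()
    if not tokens:
        return {}
    keys = set.intersection(*(index.get(t, set()) for t in tokens))
    return {k: rows[k] for k in sorted(keys)}


index = build_index(table, "d:title")
hits = search(index, table, "real time")
print(list(hits))
```

The point of the design is that the index holds only searchable text plus row keys, so the store remains the single source of the full record and every hit costs one keyed fetch.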

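
One way to read "bridge the gap between batch and now" (slides 30 and 31) is to add a small real-time delta to the result of the last completed hourly job. The sketch below assumes that reading; the event data, function names, and the 1:00 batch boundary are made up for illustration, and in a real deployment the batch view would be precomputed rather than recomputed per query.

```python
from collections import Counter
from datetime import datetime


def batch_view(events, cutoff):
    """What the last scheduled hourly job saw: events strictly before `cutoff`."""
    return Counter(key for ts, key in events if ts < cutoff)


def realtime_view(events, cutoff, now):
    """Increments collected since the last batch run, up to `now`."""
    return Counter(key for ts, key in events if cutoff <= ts <= now)


def query(events, last_batch_end, now):
    """Timely answer = batch result + real-time delta."""
    return batch_view(events, last_batch_end) + realtime_view(events, last_batch_end, now)


# Hypothetical play events as (timestamp, song) pairs.
events = [
    (datetime(2014, 9, 24, 0, 15), "song_a"),
    (datetime(2014, 9, 24, 0, 50), "song_b"),
    (datetime(2014, 9, 24, 1, 5), "song_a"),
    (datetime(2014, 9, 24, 1, 30), "song_a"),
    (datetime(2014, 9, 24, 1, 45), "song_b"),  # after 1:33, so not counted yet
]

# The hourly job finished at 1:00; the question arrives at 1:33.
result = query(events, datetime(2014, 9, 24, 1, 0), datetime(2014, 9, 24, 1, 33))
print(dict(result))
```

Only the events since the last batch boundary need low-latency handling, which is why the real-time part can stay small.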
Editor's notes

  1. Group sum at datanode; group sum at coordinator
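
The note is terse, but it appears to describe a two-phase aggregation: each data node sums its local rows per group, and a coordinator merges the partial sums. A minimal sketch under that assumption follows; the node contents and function names are hypothetical, and plain lists stand in for the data held on each node.

```python
from collections import defaultdict


def partial_group_sum(rows):
    """Phase 1: one node sums its local rows per group key."""
    sums = defaultdict(int)
    for key, value in rows:
        sums[key] += value
    return dict(sums)


def merge_group_sums(partials):
    """Phase 2: the coordinator adds up the partial sums from every node."""
    total = defaultdict(int)
    for partial in partials:
        for key, value in partial.items():
            total[key] += value
    return dict(total)


# Hypothetical (country, count) rows held on three data nodes.
node_rows = [
    [("US", 3), ("TW", 1), ("US", 2)],
    [("TW", 4), ("JP", 5)],
    [("US", 1)],
]

partials = [partial_group_sum(rows) for rows in node_rows]
totals = merge_group_sums(partials)
print(totals)
```

Because sums are associative, each node ships at most one value per group to the coordinator instead of every row.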