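Speaker: Eric Chen (陳育杰), Senior AVP, Etu
Overview: Over the past two years, the application architecture for Big Data in the enterprise has gradually taken shape. Different industries have started using Hadoop to solve different problems, yet the IT architectures behind these deployments share a number of common traits. Through these shared architectures, we explore the concrete enterprise applications of Big Data / Hadoop.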
Big Data Taiwan 2014 Keynote 4: Monetize Enterprise Data - Classic Big Data Applications and Actions in Taiwan
1. Monetize Enterprise Data
Classic Big Data Applications and Actions in Taiwan
Eric Chen (陳育杰)
Senior AVP, Etu Business Development
eric_chen@etusolution.com
2. Characteristics
Traditional parallel computing architecture
The Old Way: Bringing Data to Compute
Hadoop architecture = parallel computing + distributed storage
The New Way: Bringing Compute to Data
Compute: MapReduce
Storage: HDFS
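The "bringing compute to data" idea on this slide can be sketched in plain Python. This is only an illustration of the MapReduce pattern, not Hadoop code: the `blocks` list stands in for HDFS blocks held on different nodes, and the function names (`map_block`, `shuffle`, `reduce_counts`) are hypothetical.

```python
from collections import defaultdict

# Hypothetical data blocks, standing in for HDFS blocks stored on different nodes.
blocks = [
    ["web click", "web log"],
    ["click event", "web click"],
]

def map_block(lines):
    """Map phase: runs against one block and emits (word, 1) pairs."""
    return [(word, 1) for line in lines for word in line.split()]

def shuffle(mapped_blocks):
    """Group the emitted values by key across all blocks."""
    grouped = defaultdict(list)
    for pairs in mapped_blocks:
        for key, value in pairs:
            grouped[key].append(value)
    return grouped

def reduce_counts(grouped):
    """Reduce phase: sum the values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}

# Each block is mapped independently; only the small (key, value) pairs move.
word_counts = reduce_counts(shuffle(map_block(b) for b in blocks))
```

In a real cluster the map function would be shipped to the node holding each block, so only the intermediate key/value pairs cross the network rather than the raw data.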
3. New Big Data Application Architecture
Data sources: voice files, video files, image files, doc files, txt files, XML files, web logs, click events, social networks, associated maps, news feeds, sensors, embedded devices, RFID tags, geographic / GPS data, events, others
Hadoop: MapReduce, HDFS, HBase, Hive, Impala, Mahout, Pig
Existing stack: ETL, RDB, Business Intelligence, Business Analytics
4. New Big Data Application Architecture
Hadoop as a “Data Store”
5. New Big Data Application Architecture
Hadoop as a “Data Pre-processing Platform”
Hive QL
Join, aggregation, filter, sorting, correlation, ...
Data integration
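The pre-processing operations named on this slide (filter, join, aggregation, sorting) can be illustrated with a small Python sketch. The records, field names, and the `customers` lookup table are hypothetical; on a cluster the same steps would typically be expressed in Hive QL, Pig, or MapReduce jobs.

```python
from collections import defaultdict

# Hypothetical raw click events and a hypothetical customer lookup table.
clicks = [
    {"user": "u1", "page": "/home", "ms": 120},
    {"user": "u2", "page": "/cart", "ms": 300},
    {"user": "u1", "page": "/cart", "ms": 80},
    {"user": "u3", "page": "/home", "ms": 0},  # treated as invalid below
]
customers = {"u1": "north", "u2": "south"}

# Filter: drop events with a non-positive duration.
valid = [c for c in clicks if c["ms"] > 0]

# Join: attach each user's region (inner join, unknown users are dropped).
joined = [
    dict(c, region=customers[c["user"]])
    for c in valid
    if c["user"] in customers
]

# Aggregation: total duration per region.
totals = defaultdict(int)
for row in joined:
    totals[row["region"]] += row["ms"]

# Sorting: regions ordered by total duration, largest first.
summary = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
```

The reduced `summary` is the kind of compact result that would then be loaded into the downstream RDB or BI layer instead of the raw events.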
6. New Big Data Application Architecture
Hadoop as a “DB”
ODBC, API
7. New Big Data Application Architecture
Hadoop as a “Data Analytics Engine”
9. Real-time Query (Telecom)
Use Case Reference Architecture
A10 / F5
Etu Hadoop Cluster: HDFS, HBase
Syslog / TCP Plug-in
Etu Data Flow
MSISDN, IP / Time
IP by MSISDN
MSISDN by IP
Real-time Query
End-to-End 5 sec
Historical Query
Internet
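The "IP by MSISDN" and "MSISDN by IP" lookups on this slide can be sketched as two sorted key/value indexes. The slide does not describe the actual schema, so everything here is an assumption: the sample records, the (key, start time) row-key layout, and the `build_index` and `lookup` helpers are hypothetical and only imitate how a sorted store such as HBase could serve point-in-time queries.

```python
import bisect

# Hypothetical session records: (msisdn, ip, start_time as an ISO string).
sessions = [
    ("886900000001", "10.0.0.5", "2014-01-01T08:00:00"),
    ("886900000002", "10.0.0.5", "2014-01-01T09:30:00"),
    ("886900000001", "10.0.0.9", "2014-01-01T10:00:00"),
]

def build_index(records, key_pos, value_pos):
    """Return rows of ((key, start_time), value), sorted by row key."""
    return sorted(((r[key_pos], r[2]), r[value_pos]) for r in records)

ip_by_msisdn = build_index(sessions, 0, 1)
msisdn_by_ip = build_index(sessions, 1, 0)

def lookup(index, key, at_time):
    """Return the value of the latest row for `key` starting at or before `at_time`."""
    # The high sentinel keeps rows whose start_time equals at_time in range.
    pos = bisect.bisect_right(index, ((key, at_time), chr(0x10FFFF)))
    if pos == 0:
        return None
    (found_key, _), value = index[pos - 1]
    return value if found_key == key else None

# Who held 10.0.0.5 at 08:30, and which IP did the first subscriber hold at 11:00?
holder = lookup(msisdn_by_ip, "10.0.0.5", "2014-01-01T08:30:00")
address = lookup(ip_by_msisdn, "886900000001", "2014-01-01T11:00:00")
```

Keeping one index per query direction trades extra storage for a single sorted-key seek per lookup, which is one plausible way to stay within a seconds-level response target.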
10. Customer Services (Telecom)
Use Case Reference Architecture
DPI
Syslog UDP
Internet
Border Router
Etu Log Collector
200K EPS
HDFS
FTP
1 TB / Day
0.7 TB / Day
Table A, Table B, Table C
Etu Hadoop Cluster for Correlation (Regional)
HDFS, HBase
Etu Hadoop Cluster for Query (180 days) (Central)