Engineering Practices in
Big Data Storage and Processing
Nov.20, 2013
Schubert (Songbo) Zhang
About me
• 张松波 (Schubert Zhang)
• Backgrounds
• Senior Engineer Tech Lead and Architect, Infrastructure Data Team, @Baidu
• VP Engineering, Cloud & Big Data R&D, @Hanborq
• Senior Engineering Manager, @UTStarcom
• 10 years of Telecom, 5 years of Cloud Storage & Big Data, 1 year of Internet

2
Categories of (Big) Data
• Rows / Records
  • Logs
  • User Profiles
  • Shopping Orders
  • …

• Files / Objects
  • Documents
  • Photos
  • Videos
  • …

• Presentation

• Presentation

• A mess -> organizing, indexing -> fast to retrieve …
• Batch and sequential processing …

• Organizing, indexing -> fast to retrieve …
• Batch and sequential processing …

• Tables with Schema
• Data Types
• Database, Data-Warehouse

• Files in File-System
• Objects in Object-Storage-System
• With metadata …

Over the common underlayer storage and IO system: Hardware, Disk, Network …
3
Products and
Engineering Projects
Object Storage, Data Warehouse, Cluster Management, etc.
For enterprise!

4
Products Line
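Big Data Engineering (Big Data)
The HB-CDW product line is a big data warehouse system, built on cloud computing technology, for storing, querying, analyzing and mining big data at PB scale. Its core products include a big data warehouse based on the Hadoop ecosystem and HugeTable, a management system for massive structured data. Built on Hadoop, HBase, Hive, Pig and other big data infrastructure software enhanced and extended by Hanborq, it implements its own data model, system architecture and standard SQL/API. It provides fast loading and real-time indexed queries over big data, plus in-depth statistics, analysis and mining based on parallel computing technologies such as MapReduce and MPP. The system offers flexible scalability, security and reliability, and is widely used in big data industries such as telecom, electric power, transportation and large Internet companies.

Cloud Storage
The HB-CSS product line provides cloud storage solutions and services for enterprises and individuals. Its cloud object storage system, oNest, offers a service-layer API and user experience similar to Amazon AWS S3 and is scalable, secure and fast. On top of oNest it provides a Storage Gateway that connects enterprises and individuals to the cloud storage service, and a Dropbox-like online cloud storage service (uDrop/eDrop). It is widely used in industries such as large Internet companies, education, telecom, media and transportation.

Hanborq Products

Management System (Management)
HB-ClusterMaster is a software system for planning large-scale data center clusters, automatically installing and deploying operating systems and applications, configuration management, monitoring, and operations and maintenance. It enables efficient deployment and operation of large-scale cloud computing clusters. The largest single system deployed and managed so far exceeds 2,000 physical server nodes.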
5
Cloud Object Storage System : oNest
• Web Service and API

• Amazon AWS S3 RESTful API
• S3 Data Model (User->Buckets->Objects)

• Backend Distributed Object Storage System
• Google GFS + Facebook Haystack
  • Triple copy of data trunks
  • Write-through, Strong consistency
  • Append only and Compaction
  • Highly efficient Local Index
  • …
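The "append only and compaction" and "local index" ideas can be illustrated with a small sketch. This is not oNest code: the class `AppendOnlyStore` and its method names are hypothetical, and an in-memory buffer stands in for one large on-disk file. It only assumes the general Haystack-style approach of appending objects to a big file and keeping a small index of offsets.

```python
import io


class AppendOnlyStore:
    """Toy append-only object store with an in-memory index (hypothetical)."""

    def __init__(self):
        self.log = io.BytesIO()   # stands in for one large on-disk file
        self.index = {}           # key -> (offset, length) of the live copy

    def put(self, key, data):
        # Writes never overwrite: the new copy is appended and the index
        # is repointed, leaving the old bytes behind as garbage.
        offset = self.log.seek(0, io.SEEK_END)
        self.log.write(data)
        self.index[key] = (offset, len(data))

    def get(self, key):
        offset, length = self.index[key]
        self.log.seek(offset)
        return self.log.read(length)

    def delete(self, key):
        # Deleting only drops the index entry; space is reclaimed later.
        self.index.pop(key, None)

    def size(self):
        return self.log.seek(0, io.SEEK_END)

    def compact(self):
        # Copy only live objects into a fresh log and rebuild the index.
        new_log, new_index = io.BytesIO(), {}
        for key, (offset, length) in self.index.items():
            self.log.seek(offset)
            new_index[key] = (new_log.tell(), length)
            new_log.write(self.log.read(length))
        self.log, self.index = new_log, new_index
```

Reads cost one index lookup plus one positioned read, and all writes are sequential; the price is that overwritten and deleted data occupies space until compaction runs.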

SDK
(C++/Java/Python/PHP/Go…)

Web Service
(RESTful API over HTTP)

Metadata Layer

• Backend Distributed Metadata Layer
• Flexible data model
• NoSQL

Object/Trunk Storage Layer
6
Cloud Object Storage System : oNest
[Diagram: Data Model and Data Organization. Logical view: User, Bucket (Bucket1 to Bucket4), Object/Pebble, Part. Physical view: Rocks & Chunks; each Rock holds many Chunks.]

7
Cloud Object Storage System : RockStor-> oNest
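[Architecture diagram, top to bottom]
• Application systems 1 … N, using the SDK (Java) for Developers over HTTP interfaces
• Interface layer: RESTful API (Cloud Service); RockStor Service Load Balancers (access request load balancers, multi-point deployment, LVS) in front of the web services
• Service layer: RockServer instances providing object functions (object access, object attributes), container functions (container access, container attributes) and user functions (authentication, authorization); RockMaster; AAA, CAS; metering information
• System management: Management Console and resource management platform, over management interfaces and an internal RESTful API; load balancing, user control, log management, statistical reports, operations and maintenance
• Storage layer (distributed storage system): Hadoop distributed storage cluster (stores and manages Rock files); HBase distributed database cluster (stores and manages metadata)
• Together these form the distributed cloud object storage system

Fast/Simple Prototype: Leverage Open Source
To be a Product and Service.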

8
Cloud Object Storage System : oNest
• The oNest cloud object storage platform stores data as objects and provides cloud storage services of up to hundreds of PB for Internet businesses and enterprise users.
• Main features of the object storage service provided by oNest:
(1) Highly reliable, multi-replica data storage, with automatic repair of data replicas in a dynamic environment
(2) Large-scale storage (capacity in the hundreds of PB and above), with linear expansion of object count and capacity
(3) Data backup within one data center and across data centers
(4) Large-scale concurrent access
(5) Secure data access

[Deployment diagram: one Region spanning data center A and data center B. Components shown: Console WebServers, ClusterMaster, AAA Cluster (Master/Slave), AAA Web Service, Stats Cluster (Master/Slave), Proxy, Discovery Service Cluster. Each AZ contains an OAS Cluster, a DataStorage Cluster (DataNodes), a Healer Cluster and a MetaNode Cluster (Master/Slave).]

To be a more Complete Product and Service.

9
Cloud Object Storage System : oNest
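[Screenshot: oNest web console. Labeled elements: Create Bucket, New directory, Upload object, Refresh list, View properties, Operation history, User name, Right-click menu, Bucket list, Object list, Basic object attributes. Click through for detailed attributes (including the object download URL) and for ACL permission management.]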

10
Cloud Object Storage System : oNest
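[Diagram: users of the education cloud applications -> education cloud application services (App-1, App-2, through the SDK or REST) -> oNest cloud storage service; registration, login and Console]
• The education cloud applications are the users of oNest cloud storage.
• oNest provides a unified, standard cloud storage interface through which education cloud applications store, read or manipulate data objects.

BC-oNest object cloud storage service
oNest is an elastic cloud object storage system, comparable to Amazon AWS S3. It provides the education cloud with storage for video, audio, images, documents and other data.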

11
Dropbox-Like NetDisk Service: uDrop / eDrop
• Hack Dropbox
[Diagram: traffic between the Dropbox Client and three groups of servers]
• Servers: 208.43.202.5 ... (Softlayer Datacenter); 67.228.78.114, 67.228.78.116, 67.228.78.117 ... (Dropbox Web Server); 75.101.145.128, 75.101.138.84 ... (Amazon S3 & EC2)
• Traffic: keep alive (http); login (https); list, delete, rename and sync (https); download and upload data (https)

• Keep-alive mechanism
• Delta update
• Mechanism of shared file blocks
• Dropbox client database: SQLite

• Data/file splitting and fingerprinting
• Incremental upload algorithm
• The so-called "instant upload"
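Splitting files into blocks, fingerprinting each block, and uploading only unknown blocks can be sketched as follows. This is an illustration of the general technique, not Dropbox's or uDrop's actual protocol: `BlockServer`, `fingerprints` and the 4-byte `CHUNK_SIZE` are hypothetical, and fixed-size blocks with SHA-256 are assumptions made for brevity.

```python
import hashlib

CHUNK_SIZE = 4  # tiny on purpose; a real client would use MB-sized blocks


def fingerprints(data, chunk_size=CHUNK_SIZE):
    """Split data into fixed-size blocks and fingerprint each with SHA-256."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    return [(hashlib.sha256(c).hexdigest(), c) for c in chunks]


class BlockServer:
    """Hypothetical server that stores each unique block once."""

    def __init__(self):
        self.blocks = {}   # fingerprint -> block bytes
        self.files = {}    # file name -> ordered list of fingerprints

    def upload(self, name, data):
        """Store a file; return how many blocks actually had to be sent."""
        sent = 0
        recipe = []
        for digest, chunk in fingerprints(data):
            if digest not in self.blocks:   # only unknown blocks travel
                self.blocks[digest] = chunk
                sent += 1
            recipe.append(digest)
        self.files[name] = recipe
        return sent

    def download(self, name):
        return b"".join(self.blocks[d] for d in self.files[name])
```

Uploading a file whose blocks are all already known sends zero blocks, which is the effect described as "instant upload"; changing one block of a file sends only that block, which is the incremental (delta) case.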
12
Dropbox-Like NetDisk Service: uDrop / eDrop
[Architecture diagram: PC Client, Mobile Client and Browser connect to REST AccessServers and a Web Server, each exposing a MetaAPI and a DataAPI. Behind them are Meta Servers, Register and Matcher. Storage and coordination: oNest, ZooKeeper, HBase.]
13
Big Data Platform
[Platform diagram, top to bottom]
• Users, Applications: SQL/Scripts/Java/Web
• Smart SQL and Execution Engine: Hive, HugeTable, Pig, ETL, Data Mining
• MapReduce/Impala, HCatalog, Oozie
• Bigtable: HBase
• Files: HDFS
• Inputs: Big Data Sources, loaded through BulkLoad (Flume, Flive)
• Operations: Backup; Ganglia, Nagios, ClusterMaster (Deployment)
• Shared Cluster of Servers

14
Big Data Warehouse: HugeTable -> Horizon
• HDFS as the base storage platform, with multiple, extensible storage formats
  • HBase/HFile
  • Row storage: TextFile, SequenceFile
  • Column storage: RCFile/ORCFile, Parquet, …
• Multiple data access models
  • HBase
  • MapReduce
  • MPP: Impala
• HugeTable's own data storage model
  • Encoding/Decoding
  • Indexing
  • Partitioning
  • …
• Unified management of Data Schema Metadata
• Smart SQL Engine and Server
  • High performance, high concurrency, high stability, distributed
  • Chooses among the different data access paths
• Compatible with Hive and Pig
• Standardized JDBC client interface and client tools
• Engineering utilities
  • Fast batch loading (BulkLoad) and export, with a SQL interface
  • Fast deployment tools

[Architecture diagram, top to bottom]
• Clients: SQuirreL SQL Client (GUI), SQLLine (CLI), Web SQL Client, Apps (Programming), each through a JDBC Driver
• Smart SQL Engine
• HugeTable Data Model (data modeling); Unified Schema (unified metadata)
• Execution: Hive, Pig, Impala (MPP), MapReduce, HBase
• File formats: HFile (SSTables), TextFile (Recorded), SequenceFile (Key-Value Rows), RCFile/ORCFile (Columnar), Parquet (ColumnIO), User-Defined Formats ...
• HDFS
15
Big Data Warehouse: HugeTable -> Horizon
[Layer diagram, top to bottom]
• Access: JDBC and ODBC, REST, API, Management, ...
• SQL Engine (Standard, Familiar, Low Learning Curve, ...)
• Data Warehouse Utilities / Tools (SpeedLoader, SpeedScan, Data LifeCycle, ...)
• Data Model (Data Organization, Indexing, Partitioning, Encoding, Compressing, ...)
• Bigtable (HBase)
• DFS (Hadoop HDFS)
• Connectors, integrating into the Hadoop ecosystem: Oozie, HCatalog, Pig, Hive, MapReduce

16
NoSQL vs. SQL
• NoSQL stores such as Bigtable and Cassandra are just the “Storage Engine Layer” of a DBMS.
• Users like SQL and are already familiar with it as the way to touch their data.
[Diagram: MySQL Server (SQL Engine Layer over a Storage Engine Layer: MyISAM, InnoDB, etc.) vs. Horizon (Distributed SQL Engine over a Distributed Storage Engine: NoSQL, HBase)]
How about building a distributed DBMS? Megastore, Greenplum/Pivotal/CitusDB, etc.
17
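Business-Analysis Big Data Platform
Plan & Design
• Data storage model definition (Schema, Types, Indexes, StorageEngine, etc.)
• Data processing operations and flow definition (SQL, Scripts, Java, WorkFlow, etc.)

[Architecture diagram, left to right]
• Big data sources (diverse): BOSS billing and call detail records (CDR); network CDR data (Gn/Gb/IuPS ...); signaling data (Iub/Iucs/mmsc ...); log data (WAP, WLAN ...); DPI-collected data; CRM user profiles; other data
• Data loading and preprocessing: batch loading tools (Files, BulkLoad, etc.); real-time loading tools (Flume, Flive, etc.); database data transfer tools (Sqoop, etc.); ETL processing logic; other engineering tools
• Data storage, organization and processing platform (unified big data storage and analysis platform): Hive, Horizon, Pig, HBase, MapReduce, Impala, over the Hadoop HDFS base storage layer
• Data processing and access: Client using SQL, Scripts, Java, ...
• Business functions: statistics, summaries, analysis and reports; ad-hoc queries; data mining; other OLAP workloads
• Annotations in the diagram: "developed and ported according to the actual business data"; "offline interfaces generally need no modification"

Principle: mainly offline, batch analysis, with data query and management as a secondary goal.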
18
Big Data Service Platform
[Architecture diagram]
• Access: JDBC for Local Deployment; RESTful for Remote Deployment
• Load Balancer (LVS, with HA) in front of the HugeTable Web Services
• SQL Engine Servers
• HugeTable Data Model over HBase and Hadoop
• Data in: Online Generated Data (CDR) through Flive; files through BulkLoad; Connector
• LifeCycle (On/Offline, DataDrop)
• Analysis and ETL: Hive/Pig over MapReduce (with SpeedScan)

Principle: mainly real-time, low-latency data query, with data analysis as a secondary goal.
19
Cluster Management: ClusterMaster

20
Cluster Management: ClusterMaster

21
Hadoop and Open Source Ecosystem
• MapReduce
  • Runtime Job/Task Schedule & Latency
    • Worker Pool
    • Transfer Job description information
    • …
  • Processing Engine Improvements
    • Shuffle: sendfile, Netty Server, Batch Fetch
    • Sort Avoidance: Spilling and Partitioning, Hash Aggregation
• HBase (to be a Data Warehouse backend)
  • Low Level HFile management
  • Speed Bulk Load
  • Speed Scan for Analysis
  • Flexible control of Flush, Compaction, Split, Balance
  • Coprocessor for parallel processing
• Flume
  • Support more Data Sources and Data Storages
  • More flexible Command Line tool
• Hive
  • Faster SQL Engine
  • Support more Storage Engines
  • More UDFs for database functions (such as NVL and DECODE from Oracle)
  • More UDFs for OLAP (such as Roll-Up, Cube, Efficient Aggregations, etc.)
  • More algorithms for efficient statistics and estimation (such as LogLog-Counter for estimated DISTINCT values)
• Pig
  • Support more Data Storages
  • More UDFs for analysis, statistics and data mining (such as K-Means, ID3 for Decision Tree, etc.)
• Tools
  • Deployment: Hdeploy, HTCfg, ClusterMaster
  • Management: Integrate Ganglia, Nagios, Puppet, etc.
  • Light and handy command line: Hman, etc.
  • Benchmark Tools: Hbench, etc.
22
Know the Details of Hadoop …

23
MapReduce Runtime Optimization
• Job/Task Schedule & Latency
• Worker Pool

[Diagram: MapReduce Client, RPC (JobConf), JobTracker, three TaskTrackers; each TaskTracker runs a Worker Pool of Child Workers.]

[Chart: Job Latency (in seconds, lower is better); Total Tasks (96 maps, 4 reduces)]
• CDH3u2 (Cloudera), reuse.jvm disabled: 43
• CDH3u2 (Cloudera), reuse.jvm enabled: 24
• HDH3u2 (Hanborq), Worker Pool: 1

24
MapReduce Processing Engine Optimization
• Shuffle: Use sendfile to reduce data copy and context switch.

• Shuffle: Netty Shuffle Server (map side) and Batch Fetch (reduce side).
• Sort Avoidance.
• Spilling and Partitioning, Counting Sort, Bytes Merge, Early Reduce, etc.
• Hash Aggregation in job implementation.
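The point of sort avoidance for aggregation jobs can be seen in a small sketch. It is not the HDH implementation; the function names are hypothetical and the records are assumed to be (key, value) pairs that fit in memory. Both functions return the same sums, but only the first one pays for a sort.

```python
from collections import defaultdict
from itertools import groupby
from operator import itemgetter


def sort_aggregate(records):
    """Sum values per key the classic way: sort, then scan each group."""
    ordered = sorted(records, key=itemgetter(0))       # O(n log n)
    return {k: sum(v for _, v in grp)
            for k, grp in groupby(ordered, key=itemgetter(0))}


def hash_aggregate(records):
    """Sum values per key with a hash table; no ordering is needed."""
    totals = defaultdict(int)
    for key, value in records:                          # O(n), one pass
        totals[key] += value
    return dict(totals)
```

When the job only needs per-key totals and not sorted output, the hash version does the same work without the ordering cost, which is the motivation behind the "Sort Avoidance" and "Hash Aggregation" bullets above.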

[Two bar charts, "Sort Avoidance and Aggregation" and "Real Aggregation Jobs", comparing CDH3u2 (Cloudera) with HDH (Hanborq); time in seconds, lower is better]

Chart with Case1 to Case3 (value pairs as labeled on the bars):
• Case1: 197 and 175
• Case2: 216 and 198
• Case3: 2186 and 615

Chart with Case1-1 to Case2-2:
                    Case1-1   Case2-1   Case1-2   Case2-2
CDH3u2 (Cloudera)   238       603       136       206
HDH (Hanborq)       233       578       96        151

25
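China Mobile BigCloud
Since 2008, in cooperation with the China Mobile Research Institute: definition, design and development of the "Big Cloud" 1.0 architecture and product family; the R&D work for "Big Cloud" 2.0 is now complete.
The "Big Cloud" system is widely deployed at China Mobile and at users in other industries, with software and hardware solutions and services. For the cloud storage and data warehouse products and services, a single data center deployment exceeds 2,000 nodes and manages more than 20 PB of storage. It provides storage, analysis and retrieval for telecom detail records, logs, signaling, documents, video, images and Internet web page data.
• BC-HugeTable (massive structured data management system)
  • Big data warehouse (analysis and query)
  • Big database (analysis and query)
• BC-Hadoop (massive data storage and analysis platform)
  • Research Institute distribution
  • Hanborq distribution, HDH
• BC-oNest (distributed object storage system)
• BC-NAS (distributed file system middleware)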
26
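CDR Billing Detail Record Warehouse and Query
[Diagram: HB-CDW cluster system. The telecom operator network (mobile core network, network switching equipment, OSS servers, BSS) feeds collection devices, in real time and in batches, which load the data storage and analysis server cluster (HB-CDW system: storage, indexing, analysis). An RDBMS and web servers serve report queries to terminals over the Internet and intranet: PC browser queries, smartphone queries and analysis reports. A cluster monitoring and management server supports PC browser and smartphone monitoring.]
[Chart: record volume (in units of 100 million records) per month, 200906 to 200912; axis 0 to 450]
[Chart: query volume (number of queries) per month, 200906 to 200912; axis 0 to 8,000,000]

Solution designed: 2009-10

- Delay before a new CDR becomes queryable: < 1 minute
- Query response (latency): < 3 seconds (average < 0.5 seconds)
- Query throughput: 200 million per month, 1,000 per second at busy hours
- Data safety: data is stored redundantly on 3 nodes
- Data analysis: KPI reports generated daily or monthly

User scale: about 100 million users
CDR detail record volume
- Per month: 50 billion records, 20 TB (more than 20,000 records per second)
- 6 months in total storage: 300 billion records, 120 TB
- The detail record volume of mobile Internet services is more than 5 times that of ordinary service CDRs
Data storage and processing cluster
- 32 DELL PE C2100 servers
- Each with 12 x 1 TB data disks and 64 GB of memory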
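The sizing figures on this slide can be cross-checked with a few lines of arithmetic. The inputs are the slide's numbers; the 30-day month and decimal terabytes are assumptions made here, and the variable names are only illustrative.

```python
SECONDS_PER_MONTH = 30 * 24 * 3600   # assumption: a 30-day month

records_per_month = 50_000_000_000   # 50 billion CDRs per month (slide)
tb_per_month = 20                    # 20 TB per month (slide)
months_retained = 6
replicas = 3                         # data kept on 3 nodes (slide)
servers, disks_per_server, tb_per_disk = 32, 12, 1

avg_records_per_sec = records_per_month / SECONDS_PER_MONTH
bytes_per_record = tb_per_month * 10**12 / records_per_month
logical_tb = tb_per_month * months_retained
replicated_tb = logical_tb * replicas
raw_tb = servers * disks_per_server * tb_per_disk

print(round(avg_records_per_sec), bytes_per_record,
      logical_tb, replicated_tb, raw_tb)
```

Under these assumptions the average ingest rate is about 19,300 records per second, each record averages 400 bytes, and six months of data at three replicas needs 360 TB against 384 TB of raw disk in the cluster.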

27
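China Mobile: Business-Analysis ETL
WorkFlow/Pipeline controller

[Data flow diagram]
• Sources: Huawei WAP log servers (FTP Server) #1, #2, …, producing WAP log files
• The big data platform interface machine (FTP Server) pulls files every hour through the firewall: 400 GB per day, about 46,000 small files. It is the platform's single external data interface (input/output).
• Every hour a Pig script runs on the interface machine and drives MapReduce jobs that read the data from the interface machine in parallel, perform format conversion, encoding, compression and cleansing, and write SequenceFiles to HDFS. This saves storage space, makes later processing more efficient, and makes it easy to add new ETL functions.
• Big data platform (Hadoop/Hive/Pig/HugeTable) on Hadoop Nodes: high performance, high concurrency, large storage
• Daily summary jobs (Hive SQL) output intermediate (fine-grained) summary data: 180 GB per month, kept in HDFS for 31 days until the monthly summary
• Monthly summary jobs (Hive SQL) run over the 31 days of daily summaries
• Daily and monthly summaries are normalized to the first-level business-analysis specification (Pig/Scripts)
• 5 GB of normalized data is output to the interface machine every day; normalized monthly data is output every month
• The number-segment dimension table is updated daily; the user-information dimension table is updated monthly
• The AsiaInfo-Linkage system fetches the previous day's summary on a daily schedule and the previous month's summary on a monthly schedule; the data must comply with the first-level business-analysis specification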

28
29
Lessons Learned
Many lessons and many feelings.

30
1. Right Design Comes from Basic Knowledge of Computer System / Computer Science
• Computer Architecture and How Computer Works
• Representing and Manipulating Information and Programs
• Processor Architecture (Pipeline, Parallel …)
• Storage Architecture
• IO System, etc.
• Memory/Storage Hierarchy
• Modern Operating System
• Networking
• Languages …

• The core issues of database.
• File-system …
• To be distributed now.
31
Basic Knowledge of CS

- Sequential vs. Random Access …
- Long latency of Disk Seek …
- Throughput
All database and big data processing solutions build on the characteristics of computer architecture, especially disk, network ...
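A back-of-the-envelope model shows why sequential access matters so much on spinning disks. The three constants below are illustrative assumptions, not measurements from this deck: roughly 10 ms per seek, 100 MB/s of sequential bandwidth, and 1 KB records.

```python
SEEK_MS = 10.0            # assumed average seek + rotational delay
SEQ_MB_PER_S = 100.0      # assumed sequential read bandwidth
RECORD_BYTES = 1024       # assumed record size


def random_read_seconds(n_records):
    """Each record costs one seek plus its own transfer time."""
    transfer = RECORD_BYTES / (SEQ_MB_PER_S * 1e6)
    return n_records * (SEEK_MS / 1000.0 + transfer)


def sequential_read_seconds(n_records):
    """One seek, then the records stream at full bandwidth."""
    total_bytes = n_records * RECORD_BYTES
    return SEEK_MS / 1000.0 + total_bytes / (SEQ_MB_PER_S * 1e6)


if __name__ == "__main__":
    n = 1_000_000
    print(random_read_seconds(n), sequential_read_seconds(n))
```

With these assumed numbers, reading one million records takes close to three hours with one seek per record and about ten seconds as a single sequential scan, so the seek count, not the byte count, dominates the design.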
32
Basic Knowledge of CS

by Jeff Dean
33
Basic Knowledge of CS
• What every data engineer needs to know about disks
• Basic Algorithms (Sorting, Searching, Strings, Bitmap, …)
• Linux Virtual Memory, Exceptions, Concurrency, etc.
•…

34
2. Keep Simple and Straightforward
• Master-Slave vs. Decentralized (DHT, Consistent Hash)
• Almost all Google products follow the Master-Slave pattern: GFS, Bigtable, MapReduce, ZooKeeper, etc.
• MapReduce: Simplified Data Processing on Large Clusters

• A simple programming model that applies to many large-scale computing problems
• Hide messy details

• Bigtable provides the simple data model, distributed B+ tree …

• Shards and Replicas

• Simple and clean API design
35
Keep Simple and Straightforward
• Example: Bigtable vs. Cassandra
Master
Master

Tablet Server

Tablet Server

Tablet Server

Tablet Server

Tablet

GFS

Bigtable

Cassandra
36
Keep Simple and Straightforward
Bigtable (++)
• Master – Tablet Servers
• Dynamic Tablet Splits
• WAL + MemTable + SSTable
• Three Level Distributed B+Tree
• Replication in GFS
• …
Bigtable's architecture and data model make more sense.

Cassandra (--)
• Identical Data Nodes, Gossip
• Consistent Hash, Virtual Nodes
• WAL + MemTable + SSTable
• Hinted Handoff
• DHT Ring (neighbor nodes)
• Eventual consistency
• Read Repair
• Merkle Tree
• Vector Clock
• Anti-entropy protocol
• …
So complex: a mistake in the architecture makes the system more and more complicated …

http://www.slideshare.net/schubertzhang/cassandra-dynamo-paper
http://www.slideshare.net/schubertzhang/dastorcassandra-report-for-cdr-solution
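For readers unfamiliar with the "Consistent Hash, Virtual Nodes" item, here is a minimal sketch of the general technique. It is not Cassandra's implementation: `HashRing`, the use of MD5 as a spreading hash, and 64 virtual nodes per server are all assumptions chosen for illustration.

```python
import bisect
import hashlib


def _hash(value):
    # MD5 is used only as a stable, well-spread hash, not for security.
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Consistent-hash ring where each node owns many virtual tokens."""

    def __init__(self, nodes, vnodes=64):
        self.vnodes = vnodes
        self.ring = []  # sorted list of (token, node)
        for node in nodes:
            self.add(node)

    def add(self, node):
        for i in range(self.vnodes):
            bisect.insort(self.ring, (_hash(f"{node}#{i}"), node))

    def remove(self, node):
        self.ring = [(t, n) for t, n in self.ring if n != node]

    def lookup(self, key):
        # First token clockwise from the key's hash, wrapping around.
        idx = bisect.bisect_right(self.ring, (_hash(key), ""))
        return self.ring[idx % len(self.ring)][1]
```

Removing a node only remaps the keys that node owned; every other key stays where it was. That property is the appeal of the decentralized design, and the list above shows how much additional machinery comes with it.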

37
3. There is no “one-size-fits-all” solution
• There are too many contradictory requirements in the structured data world.
• The contradiction of data processing
• Real-time or near-real-time data availability.
• Batch processing for large size of data, such as aggregation.

• The contradiction of data access:
• Low-latency fast query response, like Lookup.
• High-latency ad-hoc analytic query for historical data.

• But, there is no one-size-fits-all answer for above contradictory requirements.
• Identify common problems, and build systems to address them in a general way.

• “Important not to try to be all things to all people!” – Jeff Dean, Keynote at
LADIS’09

38
There is no “one-size-fits-all” solution
• MapReduce
• Dremel (MPP)
• Tez/Stinger
• NoSQL/Bigtable (and with Coprocessor)
• DBMS
•…

Lambda Architecture: New data is sent to both
layers and queries merge views from both layers.
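The Lambda Architecture sentence above can be made concrete with a toy counter. This is a sketch under simplifying assumptions: everything lives in one process, the class `LambdaCounter` is hypothetical, and the "batch layer" is just a full recount of an in-memory list.

```python
from collections import Counter


class LambdaCounter:
    """Toy page-view counter with a batch layer and a speed layer."""

    def __init__(self):
        self.master = []              # append-only raw events
        self.batch_view = Counter()   # recomputed from scratch, but stale
        self.speed_view = Counter()   # incremental, covers recent events

    def ingest(self, page):
        # New data is sent to both layers.
        self.master.append(page)
        self.speed_view[page] += 1

    def run_batch(self):
        # Recompute the batch view over all data, then drop the part of
        # the speed view that the batch view now covers.
        self.batch_view = Counter(self.master)
        self.speed_view = Counter()

    def query(self, page):
        # Queries merge the views from both layers.
        return self.batch_view[page] + self.speed_view[page]
```

The batch layer can afford slow, exact recomputation because the speed layer covers the gap since the last batch run; neither engine alone satisfies both the freshness and the volume requirement.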

39
There is no “one-size-fits-all” solution
SQL, Scripts, Java, etc.

Hive

Pig

MapReduce

Java

Impala

GoldenOrb

Dremel

Pregel

Different query and analysis requests use different parallel execution engines to operate on the data.

40
4. Monitorable and Metrizable at any time
• Sufficient Statistic, Monitoring …
• Add Sufficient Monitoring/Status/Debugging Hooks
• If your system is slow or misbehaving, can you figure out why?
• Don’t rely on logs too much; logging is costly and inefficient.
• Use real-time statistics/metrics.
• Use tools, jmxetric, JMX, Ganglia, Nagios, Noah …
41
Monitorable and Metrizable at any time
The magic matrix ??!

Captured from UTStarcom mSwitch R5 system, Guangxi Site, 2004.
42
Monitorable and Metrizable at any time
Write/Insert Operation Benchmark

Read/Query Operation Benchmark

43
Monitorable and Metrizable at any time
SLA Metrics:
• Latency
o tAvgLat: Total Average Latency (ms)
o dAvgLat: Delta Average Latency (ms)
o dMaxLat: Delta Maximum Latency (ms)
o dMinLat: Delta Minimum Latency (ms)
• Throughput
o tThrou: Total Throughput (operation count)
o dThrou: Delta Throughput (operation count)

• Total: from benchmark start to present.
• Delta: between each statistical interval (2 minutes here)

[Chart: Quantile %. Percentage of read ops (0% to 25%) per latency bucket, buckets 1 to 61 in units of 100 ms]

• Read Throughput: average ~140 ops/s
• Latency: average ~500 ms, 97% < 2 s (SLA)
• Bottleneck: disk IO (random seek) (CPU load is very low)
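The total and delta metrics named on this slide can be maintained with a handful of counters instead of log lines. The sketch below reuses the slide's metric names as dictionary keys; the class `LatencyStats` and its methods are hypothetical, and a real system would call `report()` from a timer at each statistical interval.

```python
class LatencyStats:
    """Running latency counters with a cumulative and a per-interval view."""

    def __init__(self):
        self.t_count = 0
        self.t_sum = 0.0
        self._reset_delta()

    def _reset_delta(self):
        self.d_count = 0
        self.d_sum = 0.0
        self.d_max = float("-inf")
        self.d_min = float("inf")

    def record(self, latency_ms):
        # O(1) per operation, no log line written.
        self.t_count += 1
        self.t_sum += latency_ms
        self.d_count += 1
        self.d_sum += latency_ms
        self.d_max = max(self.d_max, latency_ms)
        self.d_min = min(self.d_min, latency_ms)

    def report(self):
        """Return the metrics for the interval that just ended and reset it."""
        snapshot = {
            "tThrou": self.t_count,
            "tAvgLat": self.t_sum / self.t_count if self.t_count else 0.0,
            "dThrou": self.d_count,
            "dAvgLat": self.d_sum / self.d_count if self.d_count else 0.0,
            "dMaxLat": self.d_max if self.d_count else 0.0,
            "dMinLat": self.d_min if self.d_count else 0.0,
        }
        self._reset_delta()
        return snapshot
```

Each snapshot is small enough to push to a monitoring system such as Ganglia at every interval, which gives the "why is it slow right now" view that cumulative averages hide.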

44
Monitorable and Metrizable at any time

45
5. Try to make data in-situ
• The ability to access data ‘in place’.
• ProtocolBuffers/Parquet encoding
• Example:
• Horizon over HDFS + HBase

[Diagram: Real-Time Data Service. Writes (Puts) and Reads (Get/Scan) go through the Real-Time API, with Schema and Meta, into HBase. Bulk Load (Batch Input) produces HFiles. HBase Flush/Compaction writes HFiles to HDFS. Coprocessor and MapReduce/Impala (Batch Processing) work on the same HFiles in HDFS.]
46
6. Approximated vs. Precise
• For large data sets, it can be prohibitively expensive to find the precise
result, but there are efficient estimating methods.
• Example Queries:

• How many distinct elements are in the data set (i.e. what is the cardinality of the
data set)?
• What are the most frequent elements (the terms “heavy hitters” and “top-k
elements” are also used)?
• What are the frequencies of the most frequent elements?
• How many elements belong to the specified range (range query, in SQL it looks
like SELECT count(v) WHERE v >= c1 AND v < c2)?
• Does the data set contain a particular element (membership query)?
• …

47
Approximated vs. Precise
• The algorithms are approximate: with high probability they return approximately the correct result (e.g. ±2%).
• select count(distinct userid) from userlogs;
• select top(100) of count(*) from orders group by itemname;
•…
• Statistical and Probabilistic Analysis, Very interesting!
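As one concrete instance, a `count(distinct userid)` style query can be estimated with Linear Counting: hash every value into a bitmap of m bits and estimate the cardinality as -m * ln(fraction of bits still zero). The function name, the choice of SHA-1 as the hash, and m = 4096 are assumptions for this sketch.

```python
import hashlib
import math


def linear_count(items, m=4096):
    """Estimate the number of distinct items with an m-slot bitmap."""
    bitmap = bytearray(m)              # one byte per bit, for clarity
    for item in items:
        digest = hashlib.sha1(str(item).encode("utf-8")).digest()
        bitmap[int.from_bytes(digest[:8], "big") % m] = 1
    zeros = m - sum(bitmap)
    if zeros == 0:
        raise ValueError("bitmap saturated; use a larger m")
    return -m * math.log(zeros / m)
```

The memory used is fixed by m regardless of how many rows are scanned, and duplicates cost nothing because they set a bit that is already set. Accuracy degrades as the bitmap fills, so m has to be sized for the expected cardinality.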
48
Approximated vs. Precise
• Usually Sample/Hash/Bitmap …
• Cardinality Estimation
• Linear Counting
• Loglog Counting …

• Frequency Estimation / Heavy Hitters
• Count-Min Sketch
• Count-Mean-Min Sketch
• Stream-Summary …

• Range Query

• Array of Count-Min Sketches …

• Membership Query
• Bloom Filter

• …
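Two of the structures listed above fit in a few lines each. The sketch below is illustrative only: the class names, the salted SHA-1 hashing helper `_hashes`, and the default sizes are assumptions, not a tuned implementation.

```python
import hashlib


def _hashes(item, count, width):
    """Derive `count` bucket indexes for item from salted SHA-1 digests."""
    for seed in range(count):
        digest = hashlib.sha1(f"{seed}:{item}".encode("utf-8")).digest()
        yield int.from_bytes(digest[:8], "big") % width


class CountMinSketch:
    """Frequency estimation in fixed memory."""

    def __init__(self, width=1024, depth=4):
        self.width, self.depth = width, depth
        self.rows = [[0] * width for _ in range(depth)]

    def add(self, item, count=1):
        for row, col in zip(self.rows, _hashes(item, self.depth, self.width)):
            row[col] += count

    def estimate(self, item):
        # Collisions only inflate counters, so the minimum is the best
        # guess and never underestimates the true frequency.
        return min(row[col] for row, col in
                   zip(self.rows, _hashes(item, self.depth, self.width)))


class BloomFilter:
    """Membership query in fixed memory."""

    def __init__(self, bits=8192, hashes=4):
        self.bits, self.hashes = bits, hashes
        self.array = bytearray(bits)

    def add(self, item):
        for idx in _hashes(item, self.hashes, self.bits):
            self.array[idx] = 1

    def __contains__(self, item):
        # False means "definitely absent"; True means "probably present".
        return all(self.array[idx]
                   for idx in _hashes(item, self.hashes, self.bits))
```

Both structures err in one direction only: the sketch can overcount but never undercount, and the filter can report a false positive but never a false negative. That one-sidedness is what makes them safe to put in front of an exact but expensive lookup.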
49
7. Open Source and Open Spirit
• Choose your building blocks from an engineering point of view
• Know your basic building blocks: not just their interfaces, but also their implementations (at least at a high level)

• Make good use of open source, give back to open source, and make open source better and stronger

50
8. And more …
• Description and Documents
• Avoid inventing new interfaces for users
• From simple to complete, from prototype to product
• Make the architecture robust, try it, and then improve and complete it.

• Product vs. Tech. vs. Trick
•…
51
9. Read Books – Read English Books

52
Thank You!

53
Find me outside
• SlideShare:
http://www.slideshare.net/schubertzhang
http://www.slideshare.net/hanborq

• Github:
https://github.com/schubertzhang
https://github.com/hanborq

• Email & Gtalk:
schubert.zhang@gmail.com
• Weibo:
@schubertzh

• LinkedIn:
http://cn.linkedin.com/pub/schubertzhang/6/b51/b5b/

• Blog:
http://cloudepr.blogspot.com

• WeChat:
schubertzh

• Facebook:
https://www.facebook.com/schubertzhang
54

More Related Content

What's hot

HBaseCon 2013: HBase SEP - Reliable Maintenance of Auxiliary Index Structures
HBaseCon 2013: HBase SEP - Reliable Maintenance of Auxiliary Index StructuresHBaseCon 2013: HBase SEP - Reliable Maintenance of Auxiliary Index Structures
HBaseCon 2013: HBase SEP - Reliable Maintenance of Auxiliary Index StructuresCloudera, Inc.
 
[db tech showcase Tokyo 2017] C34: Replacing Oracle Database at DBS Bank ~Ora...
[db tech showcase Tokyo 2017] C34: Replacing Oracle Database at DBS Bank ~Ora...[db tech showcase Tokyo 2017] C34: Replacing Oracle Database at DBS Bank ~Ora...
[db tech showcase Tokyo 2017] C34: Replacing Oracle Database at DBS Bank ~Ora...Insight Technology, Inc.
 
Microsoft R - Data Science at Scale
Microsoft R - Data Science at ScaleMicrosoft R - Data Science at Scale
Microsoft R - Data Science at ScaleSascha Dittmann
 
Cassandra Meetup: Real-time Analytics using Cassandra, Spark and Shark at Ooyala
Cassandra Meetup: Real-time Analytics using Cassandra, Spark and Shark at OoyalaCassandra Meetup: Real-time Analytics using Cassandra, Spark and Shark at Ooyala
Cassandra Meetup: Real-time Analytics using Cassandra, Spark and Shark at OoyalaDataStax Academy
 
Scalable relational database with SQL Azure
Scalable relational database with SQL AzureScalable relational database with SQL Azure
Scalable relational database with SQL AzureShy Engelberg
 
MongoDB in the Middle of a Hybrid Cloud and Polyglot Persistence Architecture
MongoDB in the Middle of a Hybrid Cloud and Polyglot Persistence ArchitectureMongoDB in the Middle of a Hybrid Cloud and Polyglot Persistence Architecture
MongoDB in the Middle of a Hybrid Cloud and Polyglot Persistence ArchitectureMongoDB
 
Scylla Summit 2018: Adventures in AdTech: Processing 50 Billion User Profiles...
Scylla Summit 2018: Adventures in AdTech: Processing 50 Billion User Profiles...Scylla Summit 2018: Adventures in AdTech: Processing 50 Billion User Profiles...
Scylla Summit 2018: Adventures in AdTech: Processing 50 Billion User Profiles...ScyllaDB
 
BI, Reporting and Analytics on Apache Cassandra
BI, Reporting and Analytics on Apache CassandraBI, Reporting and Analytics on Apache Cassandra
BI, Reporting and Analytics on Apache CassandraVictor Coustenoble
 
Webinar: High Performance MongoDB Applications with IBM POWER8
Webinar: High Performance MongoDB Applications with IBM POWER8Webinar: High Performance MongoDB Applications with IBM POWER8
Webinar: High Performance MongoDB Applications with IBM POWER8MongoDB
 
Disney+ Hotstar: Scaling NoSQL for Millions of Video On-Demand Users
Disney+ Hotstar: Scaling NoSQL for Millions of Video On-Demand UsersDisney+ Hotstar: Scaling NoSQL for Millions of Video On-Demand Users
Disney+ Hotstar: Scaling NoSQL for Millions of Video On-Demand UsersScyllaDB
 
SQLSaturday #230 - Introduction to Microsoft Big Data (Part 2)
SQLSaturday #230 - Introduction to Microsoft Big Data (Part 2)SQLSaturday #230 - Introduction to Microsoft Big Data (Part 2)
SQLSaturday #230 - Introduction to Microsoft Big Data (Part 2)Sascha Dittmann
 
NewSQL Database Overview
NewSQL Database OverviewNewSQL Database Overview
NewSQL Database OverviewSteve Min
 
An Enterprise Architect's View of MongoDB
An Enterprise Architect's View of MongoDBAn Enterprise Architect's View of MongoDB
An Enterprise Architect's View of MongoDBMongoDB
 
Tales From the Field: The Wrong Way of Using Cassandra (Carlos Rolo, Pythian)...
Tales From the Field: The Wrong Way of Using Cassandra (Carlos Rolo, Pythian)...Tales From the Field: The Wrong Way of Using Cassandra (Carlos Rolo, Pythian)...
Tales From the Field: The Wrong Way of Using Cassandra (Carlos Rolo, Pythian)...DataStax
 
Document validation in MongoDB 3.2
Document validation in MongoDB 3.2Document validation in MongoDB 3.2
Document validation in MongoDB 3.2Andrew Morgan
 
Analyze and visualize non-relational data with DocumentDB + Power BI
Analyze and visualize non-relational data with DocumentDB + Power BIAnalyze and visualize non-relational data with DocumentDB + Power BI
Analyze and visualize non-relational data with DocumentDB + Power BISriram Hariharan
 
Building a Distributed Reservation System with Cassandra (Andrew Baker & Jeff...
Building a Distributed Reservation System with Cassandra (Andrew Baker & Jeff...Building a Distributed Reservation System with Cassandra (Andrew Baker & Jeff...
Building a Distributed Reservation System with Cassandra (Andrew Baker & Jeff...DataStax
 
Get started with Microsoft SQL Polybase
Get started with Microsoft SQL PolybaseGet started with Microsoft SQL Polybase
Get started with Microsoft SQL PolybaseHenk van der Valk
 
Aadhaar at 5th_elephant_v3
Aadhaar at 5th_elephant_v3Aadhaar at 5th_elephant_v3
Aadhaar at 5th_elephant_v3Regunath B
 

What's hot (20)

HBaseCon 2013: HBase SEP - Reliable Maintenance of Auxiliary Index Structures
HBaseCon 2013: HBase SEP - Reliable Maintenance of Auxiliary Index StructuresHBaseCon 2013: HBase SEP - Reliable Maintenance of Auxiliary Index Structures
HBaseCon 2013: HBase SEP - Reliable Maintenance of Auxiliary Index Structures
 
[db tech showcase Tokyo 2017] C34: Replacing Oracle Database at DBS Bank ~Ora...
[db tech showcase Tokyo 2017] C34: Replacing Oracle Database at DBS Bank ~Ora...[db tech showcase Tokyo 2017] C34: Replacing Oracle Database at DBS Bank ~Ora...
[db tech showcase Tokyo 2017] C34: Replacing Oracle Database at DBS Bank ~Ora...
 
Microsoft R - Data Science at Scale
Microsoft R - Data Science at ScaleMicrosoft R - Data Science at Scale
Microsoft R - Data Science at Scale
 
Cassandra Meetup: Real-time Analytics using Cassandra, Spark and Shark at Ooyala
Cassandra Meetup: Real-time Analytics using Cassandra, Spark and Shark at OoyalaCassandra Meetup: Real-time Analytics using Cassandra, Spark and Shark at Ooyala
Cassandra Meetup: Real-time Analytics using Cassandra, Spark and Shark at Ooyala
 
Scalable relational database with SQL Azure
Scalable relational database with SQL AzureScalable relational database with SQL Azure
Scalable relational database with SQL Azure
 
MongoDB in the Middle of a Hybrid Cloud and Polyglot Persistence Architecture
MongoDB in the Middle of a Hybrid Cloud and Polyglot Persistence ArchitectureMongoDB in the Middle of a Hybrid Cloud and Polyglot Persistence Architecture
MongoDB in the Middle of a Hybrid Cloud and Polyglot Persistence Architecture
 
Scylla Summit 2018: Adventures in AdTech: Processing 50 Billion User Profiles...
Scylla Summit 2018: Adventures in AdTech: Processing 50 Billion User Profiles...Scylla Summit 2018: Adventures in AdTech: Processing 50 Billion User Profiles...
Scylla Summit 2018: Adventures in AdTech: Processing 50 Billion User Profiles...
 
BI, Reporting and Analytics on Apache Cassandra
BI, Reporting and Analytics on Apache CassandraBI, Reporting and Analytics on Apache Cassandra
BI, Reporting and Analytics on Apache Cassandra
 
Webinar: High Performance MongoDB Applications with IBM POWER8
Webinar: High Performance MongoDB Applications with IBM POWER8Webinar: High Performance MongoDB Applications with IBM POWER8
Webinar: High Performance MongoDB Applications with IBM POWER8
 
Disney+ Hotstar: Scaling NoSQL for Millions of Video On-Demand Users
Disney+ Hotstar: Scaling NoSQL for Millions of Video On-Demand UsersDisney+ Hotstar: Scaling NoSQL for Millions of Video On-Demand Users
Disney+ Hotstar: Scaling NoSQL for Millions of Video On-Demand Users
 
SQLSaturday #230 - Introduction to Microsoft Big Data (Part 2)
SQLSaturday #230 - Introduction to Microsoft Big Data (Part 2)SQLSaturday #230 - Introduction to Microsoft Big Data (Part 2)
SQLSaturday #230 - Introduction to Microsoft Big Data (Part 2)
 
NewSQL Database Overview
NewSQL Database OverviewNewSQL Database Overview
NewSQL Database Overview
 
An Enterprise Architect's View of MongoDB
An Enterprise Architect's View of MongoDBAn Enterprise Architect's View of MongoDB
An Enterprise Architect's View of MongoDB
 
Tales From the Field: The Wrong Way of Using Cassandra (Carlos Rolo, Pythian)...
Tales From the Field: The Wrong Way of Using Cassandra (Carlos Rolo, Pythian)...Tales From the Field: The Wrong Way of Using Cassandra (Carlos Rolo, Pythian)...
Tales From the Field: The Wrong Way of Using Cassandra (Carlos Rolo, Pythian)...
 
Document validation in MongoDB 3.2
Document validation in MongoDB 3.2Document validation in MongoDB 3.2
Document validation in MongoDB 3.2
 
Relational vs. Non-Relational
Relational vs. Non-RelationalRelational vs. Non-Relational
Relational vs. Non-Relational
 
Analyze and visualize non-relational data with DocumentDB + Power BI
Analyze and visualize non-relational data with DocumentDB + Power BIAnalyze and visualize non-relational data with DocumentDB + Power BI
Analyze and visualize non-relational data with DocumentDB + Power BI
 
Building a Distributed Reservation System with Cassandra (Andrew Baker & Jeff...
Building a Distributed Reservation System with Cassandra (Andrew Baker & Jeff...Building a Distributed Reservation System with Cassandra (Andrew Baker & Jeff...
Building a Distributed Reservation System with Cassandra (Andrew Baker & Jeff...
 
Get started with Microsoft SQL Polybase
Get started with Microsoft SQL PolybaseGet started with Microsoft SQL Polybase
Get started with Microsoft SQL Polybase
 
Aadhaar at 5th_elephant_v3
Aadhaar at 5th_elephant_v3Aadhaar at 5th_elephant_v3
Aadhaar at 5th_elephant_v3
 

Engineering practices in big data storage and processing

  • 1. Engineering Practices in Big Data Storage and Processing Nov.20, 2013 Schubert (Songbo) Zhang
  • 2. About me • 张松波 (Schubert Zhang) • Backgrounds • Senior Engineer Tech Lead and Architect, Infrastructure Data Team, @Baidu • VP Engineering, Cloud & Big Data R&D, @Hanborq • Senior Engineering Manager, @UTStarcom • 10 years of Telecom, 5 years of Cloud Storage & Big Data, 1 year of Internet
  • 3. Categories of (Big) Data • Rows / Records: logs, user profiles, shopping orders, … • Presentation: a mess -> organizing, indexing -> fast to retrieve …; batch and sequential processing … • Tables with schema • Data types • Database, data warehouse • Files / Objects: documents, photos, videos, … • Presentation: organizing, indexing -> fast to retrieve …; batch and sequential processing … • Files in a file system • Objects in an object storage system • With metadata … • Both sit over the common underlying storage and IO system: hardware, disk, network …
  • 4. Products and Engineering Projects: Object Storage, Data Warehouse, Cluster Management, etc. For enterprise!
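  • 5. Product Lines (Hanborq Products) • Big Data (大数据工程): The HB-CDW product line is a big data warehouse system, built on cloud computing technology, for storing, querying, analyzing and mining big data at PB scale. Core products include a big data warehouse based on the Hadoop ecosystem and HugeTable, a management system for massive structured data. Built on Hadoop, HBase, Hive, Pig and other big data infrastructure software enhanced and extended by Hanborq, it implements its own data model, system architecture and standard SQL/API, and provides fast loading of big data, real-time indexed queries, and in-depth statistics, analysis and mining based on parallel computing technologies such as MapReduce and MPP. The system offers flexible scalability, security and reliability, and is widely used in big data industries such as telecom, electric power, transportation and large Internet companies. • Cloud Storage (云存储): The HB-CSS product line provides cloud storage solutions and services for enterprises and individuals. It offers oNest, a scalable, secure and fast cloud object storage system with a service-layer API and user experience similar to Amazon AWS S3. Built on oNest, it provides a Storage Gateway for connecting to cloud storage services and Dropbox-like online cloud storage services (uDrop/eDrop) for enterprises and individuals. It is widely used in large Internet, education, telecom, media, transportation and other industries. • Management (管理系统): HB-ClusterMaster is a software system for large-scale data center cluster planning, automated installation and deployment of operating systems and applications, configuration management, monitoring, and operations and maintenance. It enables efficient deployment and operation of large-scale cloud computing clusters. The largest single system deployed and managed so far exceeds 2,000 physical server nodes.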
  • 6. Cloud Object Storage System: oNest • Web Service and API • Amazon AWS S3 RESTful API • S3 Data Model (User -> Buckets -> Objects) • Backend Distributed Object Storage System • Google GFS + Facebook Haystack • Triple copy of data trunks • Write-through, strong consistency • Append-only and compaction • Highly efficient local index • … • Backend Distributed Metadata Layer • Flexible data model • NoSQL • [Diagram layers: SDK (C++/Java/Python/PHP/Go…); Web Service (RESTful API over HTTP); Metadata Layer; Object/Trunk Storage Layer]
  • 7. Cloud Object Storage System: oNest • Rock & Chunks: Data Model and Data Organization • [Diagram with a logical view and a physical view; labels: User, Bucket (Bucket1 to Bucket4), Object/Pebble, Part, Rock, Chunk]
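  • 8. Cloud Object Storage System: RockStor -> oNest • Fast/simple prototype • Leverage open source • To be a product and service • [Architecture diagram, labels translated: Application Systems 1 … N; SDK (Java) for developers; interface layer: HTTP interfaces, RESTful API (cloud service), RockStor Service load balancers (access-request load balancing, multi-point deployment, LVS), web services; service layer: RockMaster (metering information, AAA, CAS, system management, load balancing, management interfaces) and RockServers with object-related functions (object access, object attributes), container-related functions (container access, container attributes) and user-related functions (authentication, authorization, user control), plus log management, statistics reports and operations management; Management Console / resource management platform over RESTful API (internal); storage layer (distributed storage system): Hadoop distributed storage cluster (stores and manages Rock files) and HBase distributed database cluster (stores and manages metadata)]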
  • 9. Cloud Object Storage System: oNest • The oNest object cloud storage platform stores data as objects and provides cloud storage services of up to hundreds of PB for Internet services and enterprise users. • Main features of the oNest object cloud storage service: (1) highly reliable, multi-replica data storage, with automatic repair of data replicas in a dynamic environment; (2) large-scale storage (capacity of hundreds of PB and above), with linear scaling of object count and capacity; (3) data backup within a data center and across data centers; (4) large-scale concurrent access; (5) secure data access. • To be a more complete product and service. • [Deployment diagram: a Region spanning Data Center A and Data Center B; labels: Console, WebServer, ClusterMaster, AAA Cluster (master/slave), Stats Cluster (master/slave), Proxy, Web Service, Discovery Service Cluster, AZ, OAS Cluster, DataStorage Cluster (DataNodes), MetaNodes (master/slave), Healer Cluster]
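  • 10. Cloud Object Storage System: oNest • [Console screenshot, labels translated: Create Bucket, New Directory, Upload Object, Refresh List, View Properties, Operation Log, user name, right-click menu, bucket (object set) list, object list, basic object attributes; click through to the detailed attributes (including the object download address) and to ACL permission management]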
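  • 11. Cloud Object Storage System: oNest • oNest is an elastic object cloud storage system, comparable to Amazon AWS S3. It provides storage for video, audio, images, documents and other data for the education cloud. • oNest provides a unified, standard cloud storage interface through which education cloud applications can store, read or operate on these data objects. • The education cloud applications are themselves the users of oNest cloud storage. • [Diagram, labels translated: users of education cloud applications; education cloud application services (App-1 via SDK, App-2 via REST); registration, login, Console; oNest cloud storage service (BC-oNest object cloud storage service)]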
  • 12. Dropbox-Like NetDisk Service: uDrop / eDrop • Hack Dropbox • Keep-alive mechanism • Delta update • Mechanism of shared file blocks • Dropbox client database: SQLite • Data/file splitting and fingerprinting • Incremental upload algorithm • The so-called "instant upload" • [Diagram: the client talks to Dropbox web servers in a SoftLayer datacenter (208.43.202.5 ..., 67.228.78.114, 67.228.78.116, 67.228.78.117 ...) for keep-alive (HTTP), login (HTTPS), and list, delete, rename and sync (HTTPS), and to Amazon S3 & EC2 (75.101.145.128, 75.101.138.84 ...) to download and upload data (HTTPS)]
  • 13. Dropbox-Like NetDisk Service: uDrop / eDrop • [Architecture diagram; labels: PC Client, Mobile Client, Browser; REST; AccessServers and Web Server, each with MetaAPI and DataAPI; Meta Servers; Register; Matcher; oNest; ZooKeeper; HBase]
  • 14. Big Data Platform
    • [Platform diagram. Components shown: Users and Applications (SQL/Scripts/Java/Web); Smart SQL and Execution Engine; Hive, Pig, HugeTable; ETL, Data Mining, Backup; MapReduce/Impala; HCatalog; Oozie; Bigtable (HBase); HDFS (files); Big Data Sources loaded via BulkLoad (Flume, Flive); Ganglia, Nagios, and ClusterMaster (Deployment); all over a Shared Cluster of Servers.]
  • 15. Big Data Warehouse: HugeTable -> Horizon
    • HDFS as the base storage platform, supporting multiple storage formats, extensible
      • HBase/HFile
      • Row storage: TextFile, SequenceFile
      • Columnar storage: RCFile/ORCFile, Parquet, …
    • Multiple data access models
      • HBase
      • MapReduce
      • MPP: Impala
    • HugeTable-specific data storage model
      • Encoding/Decoding
      • Indexing
      • Partitioning
      • …
    • Unified Data Schema Metadata management
    • Smart SQL Engine and Server
      • High performance, high concurrency, high stability, distributed
      • Chooses among the different data access model paths
    • Compatible with Hive and Pig
    • Standardized JDBC client interface and client tools
    • Engineering utilities
      • Fast BulkLoad and export (with an SQL interface)
      • Fast deployment tools
    • [Layer diagram: SQuirreL SQL Client (GUI), SQLLine (CLI), Web SQL Client, and Apps (Programming), each over a JDBC Driver; Smart SQL Engine; Hive, Pig, HugeTable Data Model, Unified Schema; Impala (MPP), MapReduce, HBase; storage formats HFile (SSTables), TextFile (Recorded), SequenceFile (Key-Value Rows), RCFile/ORCFile (Columnar), Parquet (ColumnIO), User-Defined Formats ...; HDFS.]
  • 16. Big Data Warehouse: HugeTable -> Horizon
    • [Layer diagram, top to bottom:
      • JDBC and ODBC, REST API, Management, ...
      • SQL Engine (Standard, Familiar, Low Learning Curve, ...)
      • Data Warehouse Utilities / Tools (SpeedLoader, SpeedScan, Data LifeCycle, ...)
      • Data Model (Data Organization, Indexing, Partitioning, Encoding, Compressing, ...)
      • Bigtable (HBase)
      • DFS (Hadoop HDFS)
      • Alongside: Connectors integrating into the Hadoop ecosystem (Oozie, HCatalog, Pig, Hive, MapReduce)]
  • 17. NoSQL vs. SQL
    • NoSQL, BigTable, Cassandra, etc., are just the “Storage Engine Layer” of a DBMS.
    • Users like SQL and are familiar with using it to touch their data.
    • [Diagram: MySQL Server (SQL Engine Layer over Storage Engine Layer: MyISAM, InnoDB, etc.) vs. Horizon (Distributed SQL Engine over Distributed Storage Engine: NoSQL, HBase).]
    • How about building a distributed DBMS? Megastore, Greenplum/Pivotal/CitusDB, etc.
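  • 18. Business Analysis Big Data Platform
    • Plan & Design
      • Data storage model definition (Schema, Types, Indexes, StorageEngine, etc.)
      • Data processing operations and workflow definition (SQL, Scripts, Java, WorkFlow, etc.)
    • Big data sources (variety)
      • BOSS billing detail CDR data
      • Network CDR data (Gn/Gb/IuPS ...)
      • Signaling data (Iub/Iucs/mmsc ...)
      • Log data (WAP, WLAN ...)
      • DPI-collected data
      • CRM user profiles
      • Other data
    • Data loading and preprocessing
      • Batch loading tools (Files, BulkLoad, etc.)
      • Real-time loading tools (Flume, Flive, etc.)
      • Database data transfer tools (Sqoop, etc.)
      • ETL processing logic
      • Other engineering tools
      • Developed and ported according to the actual business data; offline interfaces generally need no modification
    • Data storage, organization, and processing platform
      • Unified big data storage and analysis platform: Hive, Horizon, Pig, HBase, MapReduce, Impala
      • Hadoop HDFS base storage layer
    • Data processing and access
      • Client: SQL, Scripts, Java ...
      • Developed and ported according to the actual business data; offline interfaces generally need no modification
    • Business functions
      • Statistics, summary analysis, and reporting
      • Ad-hoc query
      • Data mining
      • Other OLAP services
    • Principle: primarily offline, batch analysis, while also accommodating data query and management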
  • 19. Big Data Service Platform
    • [Architecture diagram. Components shown: JDBC for Local Deployment and RESTful for Remote Deployment; Load Balancer (LVS, with HA); HugeTable with multiple Web Service and SQL Engine Server instances; HugeTable Data Model; LifeCycle; Connector; Online Generated Data (CDR) and files (On/Offline, DataDrop) loaded via Flive and BulkLoad; Hive/Pig MapReduce for Analysis and ETL; HBase, Hadoop (with SpeedScan).]
    • Principle: primarily real-time, low-latency data query, while also accommodating data analysis
  • 22. Hadoop and Open Source Ecosystem
    • MapReduce
      • Runtime Job/Task Schedule & Latency
        • Worker Pool
        • Transfer Job description information
        • …
      • Processing Engine Improvements
        • Shuffle: sendfile, Netty Server, Batch Fetch
        • Sort Avoidance: Spilling and Partitioning, Hash Aggregation
    • HBase (to be a Data Warehouse backend)
      • Low Level HFile management
      • Speed Bulk Load
      • Speed Scan for Analysis
      • Flexible control of Flush, Compaction, Split, Balance
      • Coprocessor for parallel processing
    • Flume
      • Support more Data Sources and Data Storages
      • More flexible Command Line tool
    • Hive
      • Faster SQL Engine
      • Support more Storage Engines
      • More UDFs for database functions (such as NVL, DECODE from Oracle)
      • More UDFs for OLAP (such as Roll-Up, Cube, Efficient Aggregations, etc.)
      • More algorithms for efficient statistics and estimation (such as LogLog-Counter for estimated DISTINCT values)
    • Pig
      • Support more Data Storages
      • More UDFs for analysis, statistics and data mining (such as K-Means, ID3 for Decision Tree, etc.)
    • Tools
      • Deployment: Hdeploy, HTCfg, ClusterMaster
      • Management: Integrate Ganglia, Nagios, Puppet, etc.
      • Light and handy command line: Hman, etc.
      • Benchmark Tools: Hbench, etc.
  • 23. Know the Details of Hadoop …
  • 24. MapReduce Runtime Optimization
    • Job/Task Schedule & Latency
      • Worker Pool
    • [Diagram: the MapReduce Client submits to the JobTracker via RPC (JobConf); each TaskTracker keeps a Worker Pool of Child Workers.]
    • [Chart: Job Latency (in seconds, lower is better), Total Tasks (96 maps, 4 reduces): CDH3u2 (Cloudera, reuse.jvm disabled) 43; CDH3u2 (Cloudera, reuse.jvm enabled) 24; HDH3u2 (Hanborq) 1.]
  • 25. MapReduce Processing Engine Optimization
    • Shuffle: Use sendfile to reduce data copies and context switches.
    • Shuffle: Netty Shuffle Server (map side) and Batch Fetch (reduce side).
    • Sort Avoidance.
      • Spilling and Partitioning, Counting Sort, Bytes Merge, Early Reduce, etc.
      • Hash Aggregation in job implementation.
    • [Chart: Sort Avoidance and Aggregation, time in seconds (lower is better)]

                             Case1  Case2  Case3
        CDH3u2 (Cloudera)      197    216   2186
        HDH (Hanborq)          175    198    615

    • [Chart: Real Aggregation Jobs, time in seconds (lower is better)]

                             Case1-1  Case2-1  Case1-2  Case2-2
        CDH3u2 (Cloudera)        238      603      136      206
        HDH (Hanborq)            233      578       96      151
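To illustrate why hash aggregation lets an aggregation job avoid sorting, here is a minimal single-process sketch. It is not Hanborq's MapReduce implementation; the function names are hypothetical, and the two functions only contrast the sort-then-group pattern with the hash-table pattern on in-memory records.

```python
from itertools import groupby
from operator import itemgetter


def sort_based_sum(records):
    """Classic pattern: sort all (key, value) records by key, then sum each run."""
    totals = {}
    ordered = sorted(records, key=itemgetter(0))  # O(n log n) comparison sort
    for key, group in groupby(ordered, key=itemgetter(0)):
        totals[key] = sum(value for _, value in group)
    return totals


def hash_based_sum(records):
    """Sort avoidance: one pass over the records, accumulating in a hash table."""
    totals = {}
    for key, value in records:
        totals[key] = totals.get(key, 0) + value
    return totals
```

Both functions return the same totals; the hash-based version never orders the records, which is the point of skipping the sort when the job only needs per-key aggregates rather than sorted output.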
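  • 26. China Mobile BigCloud
    • Since 2008, in cooperation with the China Mobile Research Institute, defined, designed, and developed the “BigCloud” 1.0 architecture and product family; the R&D work for “BigCloud” 2.0 has now been completed.
    • Has supported wide deployment of “BigCloud” systems at China Mobile and at users in other industries, providing software and hardware system solutions and services. For the cloud storage and data warehouse products and services, a single data center deployment has exceeded 2,000 nodes and manages more than 20 PB of storage capacity. Provides storage, analysis, and retrieval services for telecom detail records, logs, signaling, documents, videos, images, and Internet web page data.
    • BC-HugeTable (massive structured data management system)
      • Big data warehouse (analysis and query)
      • Big database (analysis and query)
    • BC-Hadoop (massive data storage and analysis platform)
      • Research Institute distribution
      • Hanborq distribution HDH
    • BC-oNest (distributed object storage system)
    • BC-NAS (distributed file system middleware)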
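  • 27. CDR Billing Detail Warehouse and Query
    • Solution defined: 2009-10
      • CDRs take effect in real time, with a delay of < 1 minute
      • Query response latency < 3 seconds (average < 0.5 seconds)
      • Query throughput: 200 million queries per month, 1,000 per second at peak
      • Data safety: data is stored redundantly on 3 nodes
      • Data analysis: KPI reports generated daily or monthly
    • User scale: about 100 million users
    • CDR detail record volume
      • Per month: 50 billion records, 20 TB of data (more than 20,000 records per second)
      • Total storage for 6 months: 300 billion records, 120 TB of data
      • Detail record volume for mobile Internet services is more than 5 times that of ordinary service CDRs
    • Data storage and processing cluster size
      • 32 DELL PE C2100 servers
      • Each with 12 x 1 TB data disks and 64 GB of memory
    • [Charts: record volume (in units of 100 million records) and query volume (number of queries) per month, 200906 to 200912.]
    • [Deployment diagram: telecom operator network (mobile core network, network switching equipment, collection devices with real-time and batch feeds), OSS servers, and BSS feed the HB-CDW cluster (storage, indexing, analysis; data storage and analysis server cluster); RDBMS and Web servers and a cluster monitoring and management server serve terminals over the Internet (PC browser query, smartphone query) and the Intranet (analysis reports, PC browser monitoring, smartphone monitoring).]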
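The sizing figures on this slide can be cross-checked with a few lines of arithmetic. This is a back-of-the-envelope sketch, not the original capacity plan: it assumes a 30-day month, decimal terabytes (1 TB = 10^12 bytes), no compression, and that "stored redundantly on 3 nodes" means three full copies.

```python
# Figures taken from the slide.
RECORDS_PER_MONTH = 50_000_000_000   # 50 billion detail records per month
BYTES_PER_MONTH = 20 * 10**12        # 20 TB per month (decimal TB assumed)
MONTHS_RETAINED = 6
REPLICAS = 3                         # assumption: 3 full copies
SERVERS = 32
DISKS_PER_SERVER = 12
DISK_BYTES = 10**12                  # 1 TB data disks

SECONDS_PER_MONTH = 30 * 24 * 3600   # assumption: 30-day month

avg_records_per_second = RECORDS_PER_MONTH / SECONDS_PER_MONTH
avg_bytes_per_record = BYTES_PER_MONTH / RECORDS_PER_MONTH
logical_tb = MONTHS_RETAINED * BYTES_PER_MONTH / 10**12
replicated_tb = logical_tb * REPLICAS
raw_disk_tb = SERVERS * DISKS_PER_SERVER * DISK_BYTES / 10**12

print(f"average ingest rate : {avg_records_per_second:,.0f} records/s")
print(f"average record size : {avg_bytes_per_record:.0f} bytes")
print(f"6-month logical data: {logical_tb:.0f} TB")
print(f"with 3 replicas     : {replicated_tb:.0f} TB")
print(f"raw disk in cluster : {raw_disk_tb:.0f} TB")
```

Under these assumptions the average record is 400 bytes, the averaged ingest rate lands near the roughly 20,000 records per second quoted on the slide, and 6 months of data at three copies (360 TB) fits just within the 384 TB of raw disk in the 32-server cluster.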
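  • 28. China Mobile – Business Analysis ETL
    • WorkFlow/Pipeline controller
    • Periodically (every hour), a Pig script runs on the interface machine and drives MapReduce jobs that read data from the interface machine in parallel, perform format conversion, encoding, compression, and cleansing, and write the result to HDFS as SequenceFiles. This saves storage space, improves the efficiency of downstream processing, and makes it easy to add new ETL functions.
    • Input: WAP log files
      • Huawei WAP log servers (FTP Server) #1, #2, ……
      • The interface machine pulls files every hour
      • 400 GB per day, in about 46,000 small files
    • Big data platform interface machine (FTP Server), behind a firewall: the platform's overall external data interface (input/output); high performance / high concurrency / large storage
    • Big data platform (Hadoop/Hive/Pig/HugeTable) on Hadoop nodes
      • Daily summary jobs (Hive SQL), followed by daily normalization to the 一经 (first-level business analysis) specification (Pig/Scripts)
      • Intermediate (fine-grained) summary data is output at 180 GB per month and kept in HDFS for 31 days, pending the monthly summary
      • Monthly summary jobs (Hive SQL) over the 31 days of daily data, followed by monthly normalization (Pig/Scripts)
    • Output
      • 5 GB of normalized data is output to the interface machine every day
      • Normalized data is output to the interface machine every month
      • Data must conform to the 一经 specification
    • Exchange with the 亚联 system
      • Number-segment dimension table data is updated daily
      • User information dimension table data is updated monthly
      • The previous day's summary data is fetched on a daily schedule
      • The previous month's summary data is fetched on a monthly schedule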
  • 30. Lessons Learned: many lessons and many feelings.
  • 31. 1. Right Design Comes from Basic Knowledge of Computer Systems / Computer Science
    • Computer Architecture and How Computers Work
      • Representing and Manipulating Information and Programs
      • Processor Architecture (Pipeline, Parallel …)
      • Storage Architecture
      • IO System, etc.
    • Memory/Storage Hierarchy
      • The core issues of databases.
      • File systems …
      • To be distributed now.
    • Modern Operating Systems
    • Networking
    • Languages
    • …
  • 32. Basic Knowledge of CS
    • Sequential vs. Random Access …
    • Long latency of Disk Seek …
    • Throughput
    • All database and big data processing solutions stand on the characteristics of the computer architecture, especially disk, network ...
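The gap between sequential and random disk access can be made concrete with a small calculation. The two constants below are illustrative assumptions for a spinning disk (roughly 10 ms per seek and roughly 20 ms to read 1 MiB sequentially), not measurements from this talk; the function names are hypothetical.

```python
SEEK_SECONDS = 0.010            # assumed average seek + rotational latency
SEQ_SECONDS_PER_MIB = 0.020     # assumed time to stream 1 MiB sequentially

MIB = 2**20
GIB = 2**30


def random_read_seconds(total_bytes, read_bytes):
    """Time to fetch total_bytes as independent small reads, one seek each."""
    reads = total_bytes // read_bytes
    transfer = read_bytes / MIB * SEQ_SECONDS_PER_MIB
    return reads * (SEEK_SECONDS + transfer)


def sequential_read_seconds(total_bytes):
    """Time to fetch total_bytes as one sequential scan (a single seek)."""
    return SEEK_SECONDS + total_bytes / MIB * SEQ_SECONDS_PER_MIB


random_s = random_read_seconds(GIB, 4096)     # 1 GiB as 4 KiB random reads
sequential_s = sequential_read_seconds(GIB)   # 1 GiB as one scan
slowdown = random_s / sequential_s

print(f"random 4 KiB reads: {random_s:,.0f} s")
print(f"sequential scan   : {sequential_s:,.1f} s")
print(f"slowdown          : {slowdown:,.0f}x")
```

With these assumed constants, reading 1 GiB through 4 KiB random reads takes on the order of 44 minutes versus about 20 seconds for a scan, which is why storage engines built on disks favor large sequential writes and reads.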
  • 33. Basic Knowledge of CS by Jeff Dean
  • 34. Basic Knowledge of CS
    • What every data engineer needs to know about disks
    • Basic Algorithms (Sorting, Searching, Strings, Bitmap, …)
    • Linux Virtual Memory, Exceptions, Concurrency, etc.
    • …
  • 35. 2. Keep Simple and Straightforward
    • Master-Slave vs. Decentralized (DHT, Consistent Hash)
      • Almost all Google products follow the Master-Slave pattern: GFS/BigTable/MapReduce/ZooKeeper, etc.
    • MapReduce: Simplified Data Processing on Large Clusters
      • A simple programming model that applies to many large-scale computing problems
      • Hide messy details
    • Bigtable provides a simple data model, a distributed B+ tree …
    • Shards and Replicas
    • Simple and clean API design
  • 36. Keep Simple and Straightforward
    • Example: Bigtable vs. Cassandra
    • [Diagram: Bigtable, shown as Masters over Tablet Servers holding Tablets on top of GFS, next to Cassandra.]
  • 37. Keep Simple and Straightforward
    • Bigtable (++)
      • Master – Tablet Servers
      • Dynamic Tablet Splits
      • WAL + MemTable + SSTable
      • Three Level Distributed B+Tree
      • Replication in GFS
      • …
    • Cassandra (--)
      • Identical Data Nodes, Gossip
      • Consistent Hash, Virtual Nodes
      • WAL + MemTable + SSTable
      • Hinted Handoff
      • DHT Ring (neighbor nodes)
      • Eventual consistency
      • Read Repair
      • Merkle Tree
      • Vector Clock
      • Anti-entropy protocol
      • …
      • So complex: an architectural mistake makes the system more and more complex …
    • Bigtable’s architecture and data model make more sense.
    • http://www.slideshare.net/schubertzhang/cassandra-dynamo-paper
    • http://www.slideshare.net/schubertzhang/dastorcassandra-report-for-cdr-solution
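For readers unfamiliar with the "Consistent Hash, Virtual Nodes" item in the Cassandra column, the following is a minimal sketch of a consistent-hash ring with virtual nodes. It is not Cassandra's implementation: the `HashRing` class, the use of MD5 for ring positions, and the default of 64 virtual nodes per server are assumptions made for this illustration.

```python
import bisect
import hashlib


def ring_position(text):
    """Map a string to a position on the ring (MD5 is an assumption here)."""
    return int(hashlib.md5(text.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Consistent-hash ring in which each node owns several virtual nodes."""

    def __init__(self, vnodes=64):
        self.vnodes = vnodes
        self._points = []   # sorted ring positions of all virtual nodes
        self._owner = {}    # ring position -> physical node name

    def add_node(self, node):
        for i in range(self.vnodes):
            point = ring_position(f"{node}#{i}")
            self._owner[point] = node
            bisect.insort(self._points, point)

    def lookup(self, key):
        """Return the node owning the first virtual node clockwise of the key."""
        idx = bisect.bisect_right(self._points, ring_position(key))
        return self._owner[self._points[idx % len(self._points)]]
```

No master holds the key-to-server map here: every node can compute ownership from the membership list alone, and adding a node moves only the keys that the new node takes over. The price is the supporting machinery listed in the Cassandra column (gossip, hinted handoff, read repair, and so on), which a master-assigned tablet map does not need.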
  • 38. 3. There is no “one-size-fits-all” solution
    • There are too many contradictory requirements in the structured data world.
      • The contradiction of data processing:
        • Real-time or near-real-time data availability.
        • Batch processing for large volumes of data, such as aggregation.
      • The contradiction of data access:
        • Low-latency fast query response, like Lookup.
        • High-latency ad-hoc analytic query for historical data.
    • But there is no one-size-fits-all answer for the above contradictory requirements.
    • Identify common problems, and build systems to address them in a general way.
    • “Important not to try to be all things to all people!” – Jeff Dean, Keynote at LADIS’09
  • 39. There is no “one-size-fits-all” solution
    • MapReduce
    • Dremel (MPP)
    • Tez/Stinger
    • NoSQL/Bigtable (and with Coprocessor)
    • DBMS
    • …
    • Lambda Architecture: new data is sent to both layers (batch and real-time), and queries merge views from both layers.
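The Lambda Architecture sentence on this slide can be sketched in a few lines. This is a toy, single-process illustration, not a production design: the `LambdaCounts` class and its method names are hypothetical, and per-key event counting stands in for whatever views a real system would maintain.

```python
from collections import Counter


class LambdaCounts:
    """Toy Lambda Architecture: per-key event counts from two layers."""

    def __init__(self):
        self.master_log = []         # immutable raw events, input of the batch layer
        self.batch_view = Counter()  # precomputed view, refreshed by run_batch()
        self.speed_view = Counter()  # incremental view of events since the last batch

    def ingest(self, key):
        """New data is sent to both layers."""
        self.master_log.append(key)
        self.speed_view[key] += 1

    def run_batch(self):
        """Recompute the batch view from all raw data, then reset the speed layer."""
        self.batch_view = Counter(self.master_log)
        self.speed_view.clear()

    def query(self, key):
        """Queries merge the views from both layers."""
        return self.batch_view[key] + self.speed_view[key]
```

A query returns the same count before and after `run_batch()`; the batch run only moves the contribution of recent events from the speed view into the recomputed batch view.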
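  • 40. There is no “one-size-fits-all” solution
    • [Diagram: SQL, Scripts, Java, etc. on top; execution engines shown: Hive, Pig, MapReduce (Java), Impala, GoldenOrb; alongside Dremel and Pregel.]
    • Different query and analysis requests use different parallel execution engines to operate on the data.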
  • 41. 4. Monitorable and Metrizable at any time
    • Sufficient Statistics, Monitoring …
    • Add sufficient Monitoring/Status/Debugging hooks
      • If your system is slow or misbehaving, can you figure out why?
    • Don’t rely on logs too much; logging is costly and inefficient.
    • Use real-time statistics/metrics.
    • Use tools: jmxetric, JMX, Ganglia, Nagios, Noah …
  • 42. Monitorable and Metrizable at any time: The magic matrix ??! Captured from UTStarcom mSwitch R5 system, Guangxi Site, 2004.
  • 43. Monitorable and Metrizable at any time
    • [Charts: Write/Insert Operation Benchmark; Read/Query Operation Benchmark.]
  • 44. Monitorable and Metrizable at any time
    • SLA Metrics:
      • Latency
        • tAvgLat: Total Average Latency (ms)
        • dAvgLat: Delta Average Latency (ms)
        • dMaxLat: Delta Maximum Latency (ms)
        • dMinLat: Delta Minimum Latency (ms)
      • Throughput
        • tThrou: Total Throughput (operation count)
        • dThrou: Delta Throughput (operation count)
      • Total: from benchmark start to present.
      • Delta: between each statistical interval (2 minutes here)
    • [Chart: Quantile %, the percentage of read ops per latency bucket; y-axis 0% to 25%, x-axis 1 to 61 in units of 100 ms.]
    • Read Throughput: average ~140 ops/s
    • Latency: average ~500 ms, 97% < 2 s (SLA)
    • Bottleneck: disk IO (random seek) (CPU load is very low)
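The Total versus Delta metrics defined on this slide can be collected with a small in-process recorder. This is a minimal sketch, not the benchmark tool used for the talk: the `LatencyStats` class is hypothetical, it reuses the slide's metric names as dictionary keys, and the caller is assumed to invoke `snapshot()` once per statistical interval (2 minutes on the slide).

```python
class LatencyStats:
    """Tracks total (since start) and delta (since last snapshot) metrics."""

    def __init__(self):
        self.total_count = 0
        self.total_sum_ms = 0.0
        self._delta = []  # latencies recorded in the current interval

    def record(self, latency_ms):
        """Record the latency of one completed operation, in milliseconds."""
        self.total_count += 1
        self.total_sum_ms += latency_ms
        self._delta.append(latency_ms)

    def snapshot(self):
        """Report the metrics and start a new delta interval."""
        d = self._delta
        report = {
            "tAvgLat": self.total_sum_ms / self.total_count if self.total_count else None,
            "dAvgLat": sum(d) / len(d) if d else None,
            "dMaxLat": max(d) if d else None,
            "dMinLat": min(d) if d else None,
            "tThrou": self.total_count,
            "dThrou": len(d),
        }
        self._delta = []
        return report
```

The delta values show what happened in the latest interval, so a sudden slowdown is visible immediately, while the total values smooth it out over the whole run. Keeping counters in memory like this is also much cheaper than writing one log line per operation.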
  • 45. Monitorable and Metrizable at any time
  • 46. 5. Try to make data in-situ
    • The ability to access data ‘in place’.
    • ProtocolBuffers/Parquet encoding
    • Example:
      • Horizon over HDFS + HBase
    • [Diagram. Components shown: Real-Time Data Service with Writes (Puts) and Reads (Get/Scan) through the Real-Time API; Schema Meta; Bulk Load (Batch Input); HBase with Flush/Compaction and Coprocessor; MapReduce/Impala (Batch Processing); HFiles stored in HDFS.]
  • 47. 6. Approximated vs. Precise
    • For large data sets, it can be prohibitively expensive to find the precise result, but there are efficient estimating methods.
    • Example Queries:
      • How many distinct elements are in the data set (i.e. what is the cardinality of the data set)?
      • What are the most frequent elements (the terms “heavy hitters” and “top-k elements” are also used)?
      • What are the frequencies of the most frequent elements?
      • How many elements belong to the specified range (range query, in SQL it looks like SELECT count(v) WHERE v >= c1 AND v < c2)?
      • Does the data set contain a particular element (membership query)?
      • …
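As one concrete instance of the membership query above, here is a minimal Bloom filter sketch. It is an illustration under assumptions rather than a tuned implementation: the bit-array size, the number of hash functions, and the use of seeded SHA-256 digests as the hash family are all choices made for this example.

```python
import hashlib


class BloomFilter:
    """Approximate set membership: no false negatives, some false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions from seeded SHA-256 digests.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item):
        """False means definitely absent; True means probably present."""
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )
```

An item that was added always tests positive; an item that was never added usually tests negative, with a false-positive rate that depends on the number of bits, hashes, and inserted items. The filter answers in constant space regardless of how large the items are.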
  • 48. Approximated vs. Precise
    • The algorithms are approximate: with high probability they return approximately the correct result (e.g. ±2%).
      • select count(distinct userid) from userlogs;
      • select top(100) of count(*) from orders group by itemname;
      • …
    • Statistical and probabilistic analysis: very interesting!
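A query like `count(distinct userid)` can be estimated with Linear Counting, one of the simplest cardinality estimators. The sketch below is a minimal illustration, assuming SHA-256 as the hash function and a bitmap comfortably larger than the expected number of distinct values; the function name and default size are hypothetical.

```python
import hashlib
import math


def linear_count(items, num_bits=4096):
    """Estimate the number of distinct items with a Linear Counting bitmap."""
    bitmap = bytearray(num_bits)  # one byte per bit, for clarity over compactness
    for item in items:
        digest = hashlib.sha256(str(item).encode("utf-8")).digest()
        bitmap[int.from_bytes(digest[:8], "big") % num_bits] = 1
    zeros = num_bits - sum(bitmap)
    if zeros == 0:
        raise ValueError("bitmap is saturated; use more bits")
    # Estimator: n ≈ -m * ln(V), where V is the fraction of bits still zero.
    return -num_bits * math.log(zeros / num_bits)
```

Duplicates set the same bit again and therefore do not change the estimate, so memory depends on the bitmap size rather than on the number of rows scanned. The estimate degrades as the bitmap fills up, which is where LogLog-style counters become the better choice.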
  • 49. Approximated vs. Precise
    • Usually Sample/Hash/Bitmap …
    • Cardinality Estimation
      • Linear Counting
      • LogLog Counting …
    • Frequency Estimation / Heavy Hitters
      • Count-Min Sketch
      • Count-Mean-Min Sketch
      • Stream-Summary …
    • Range Query
      • Array of Count-Min Sketches …
    • Membership Query
      • Bloom Filter
    • …
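For the frequency-estimation entry in this list, a Count-Min Sketch can be written in a few lines. This is a minimal sketch under assumptions: the width and depth defaults are arbitrary, and seeded SHA-256 digests stand in for the pairwise-independent hash functions used in the literature.

```python
import hashlib


class CountMinSketch:
    """Approximate frequency counts in fixed memory; never underestimates."""

    def __init__(self, width=256, depth=4):
        self.width = width
        self.depth = depth
        self.rows = [[0] * width for _ in range(depth)]

    def _index(self, row, item):
        # One hash function per row, derived by seeding SHA-256 with the row number.
        digest = hashlib.sha256(f"{row}:{item}".encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % self.width

    def add(self, item, count=1):
        for row in range(self.depth):
            self.rows[row][self._index(row, item)] += count

    def estimate(self, item):
        """Take the minimum across rows; collisions can only inflate a counter."""
        return min(
            self.rows[row][self._index(row, item)] for row in range(self.depth)
        )
```

Because every row's counter includes the item's true count plus whatever collided with it, the minimum across rows is an upper-biased estimate: it is never below the true frequency and is bounded by the total number of insertions. Heavy hitters can be found by keeping a small candidate list alongside the sketch.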
  • 50. 7. Open Source and Open Spirit
    • Choose your building blocks from an engineering point of view
    • Know your basic building blocks: not just their interfaces, but also their implementations (at least at a high level)
    • Make good use of open source, give back to open source, and make open source better and stronger
  • 51. 8. And more …
    • Description and Documents
    • Avoid inventing new interfaces for users
    • From simple to complete, from prototype to product
      • Make the architecture robust, try it, and then improve and complete it.
    • Product vs. Tech. vs. Trick
    • …
  • 52. 9. Read Books – Read English Books
  • 54. Find me outside
    • SlideShare: http://www.slideshare.net/schubertzhang http://www.slideshare.net/hanborq
    • Github: https://github.com/schubertzhang https://github.com/hanborq
    • Email & Gtalk: schubert.zhang@gmail.com
    • Weibo: @schubertzh
    • LinkedIn: http://cn.linkedin.com/pub/schubertzhang/6/b51/b5b/
    • Blog: http://cloudepr.blogspot.com
    • WeChat: schubertzh
    • Facebook: https://www.facebook.com/schubertzhang