Shift into High Gear: Dramatically Improve Hadoop and NoSQL Performance

M. C. Srivas, CTO/Founder

MapR History

Timeline (figure): '06: Google publishes the MapReduce paper; a team struggling with Nutch copies the MR paper and calls it Hadoop. '07 onward: the large web companies adopt Hadoop. '09: MapR is founded, as the DOW drops from 14,300 to 6,800. '11-'12: MapR partners with major cloud and platform vendors; a 2500-node cluster becomes the largest non-Web installation. '13: MapR M7 launches, ultra-reliable and the fastest NoSQL on the planet; breaks all world records!

MapR Distribution for Apache Hadoop

Platform diagram: tools and applications (Web 2.0, enterprise apps, analytics, operations, vertical apps, OLTP, management) run on a NoSQL data platform that serves batch, interactive, and real-time workloads. Platform qualities: 99.999% HA, data protection, disaster recovery, enterprise integration, scalability & performance, multi-tenancy. Hardening, testing, integrating, innovating, and contributing as part of the Apache open source community.

MapR Distribution for Apache Hadoop

• 100% Apache Hadoop
• With significant enterprise-grade enhancements
• Comprehensive management
• Industry-standard interfaces
• Higher performance

The Cloud Leaders Pick MapR

• Amazon EMR is the largest Hadoop provider in revenue and number of clusters
• Google chose MapR to provide Hadoop on Google Compute Engine
• MapR's partnership with Canonical and Mirantis provides advantages for OpenStack deployments

What Makes MapR so Reliable?

How to Make a Cluster Reliable?

1. Make the storage reliable
   – Recover from disk and node failures
2. Make services reliable
   – Services need to checkpoint their state rapidly
   – Restart a failed service, possibly on another node
   – Move the check-pointed state to the restarted service, using (1) above
3. Do it fast
   – Instant-on … (1) and (2) must happen very, very fast
   – Without maintenance windows
     • No compactions (e.g., Cassandra, Apache HBase)
     • No "anti-entropy" that periodically wipes out the cluster (e.g., Cassandra)

Reliability with Commodity Hardware

• No NVRAM
• Cannot assume special connectivity
  – No separate data paths for "online" vs. replica traffic
• Cannot even assume more than 1 drive per node
  – No RAID possible
• Use replication, but …
  – Cannot assume peers have equal drive sizes
  – The drive on the first machine may be 10x larger than the drive on another
• No choice but to replicate for reliability

Reliability via Replication

Replication is easy, right? All we have to do is send the same bits to the master and the replica.

Diagram: in normal replication, clients write to the primary server, which forwards the bits to the replica. In Cassandra-style replication, clients write to both the primary and the replica directly.

But Crashes Occur…

When the replica comes back, it is stale
– It must be brought up-to-date
– Until then, the data is exposed to failure

Diagram: who re-syncs? Either the primary re-syncs the replica, or (Cassandra-style) the replica remains stale until an "anti-entropy" process is kicked off by the administrator.

Unless it’s HDFS …

• HDFS solves the problem a third way
• Makes everything read-only
  – Static data is trivial to re-sync
• Single writer, no reads allowed while writing
• File close is the transaction that allows readers to see data
  – Unclosed files are lost
  – Cannot write any further to a closed file
• Realtime is not possible with HDFS
  – To make data visible, the file must be closed immediately after writing
  – Too many files is a serious problem with HDFS (a well-documented limitation)
• HDFS therefore cannot do NFS, ever
  – There is no "close" in NFS … data can be lost at any time

This is the 21st Century…

• To support normal apps, full read/write support is needed
• Let's return to the issue: re-sync the replica when it comes back

Diagram (repeated): who re-syncs? Either the primary re-syncs the replica, or the replica remains stale until an "anti-entropy" process is kicked off by the administrator.

How Long to Re-sync?

• 24 TB / server
  – @ 1000 MB/s = 7 hours
  – In practical terms, @ 200 MB/s = 35 hours
• Did you say you want to do this online?
  – Throttle the re-sync rate to 1/10th
  – 350 hours to re-sync (= 15 days)
• What is your Mean Time To Data Loss (MTTDL)?
  – How long before a double disk failure?
  – A triple disk failure?

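The arithmetic behind these figures is easy to reproduce. A back-of-the-envelope sketch (illustrative only, not MapR code; decimal units assumed):

    // Hours needed for one peer to re-sync a dead server's data.
    public class ResyncMath {
        static double hoursToResync(double serverTB, double rateMBps, double throttle) {
            double totalMB = serverTB * 1_000_000;          // 24 TB ~= 24,000,000 MB
            return totalMB / (rateMBps * throttle) / 3600;  // seconds -> hours
        }
        public static void main(String[] args) {
            System.out.printf("%.1f h%n", hoursToResync(24, 1000, 1.0)); // ~6.7 h  ("7 hours")
            System.out.printf("%.1f h%n", hoursToResync(24, 200, 1.0));  // ~33.3 h ("35 hours")
            System.out.printf("%.1f h%n", hoursToResync(24, 200, 0.1));  // ~333 h  ("350 hours", ~2 weeks)
        }
    }
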
Traditional Solutions

Use dual-ported disks to side-step this problem: servers use NVRAM, with RAID-6 plus idle spares on a dual-ported disk array.

Commodity hardware? Large-scale clustering? No: large purchase contracts and a 5-year spare-parts plan.

Forget Performance?

Diagram: Hadoop co-locates function and data on every node. The traditional architecture separates the application/function tier from data held in an RDBMS and on SAN/NAS storage. Geographically dispersed also?

What MapR Does

• Chop the data on each node into 1000s of pieces
  – Not millions of pieces, only 1000s
  – The pieces are called containers
• Spread replicas of each container across the cluster

Why Does It Improve Things?

MapR Replication Example

• 100-node cluster
• Each node holds 1/100th of every node's data
• When a server dies, the entire cluster re-syncs the dead node's data

MapR Re-sync Speed

• 99 nodes re-sync'ing in parallel
  – 99x the number of drives
  – 99x the number of Ethernet ports
  – 99x the CPUs
• Each is re-sync'ing 1/100th of the data
• Net speed-up is about 100x: 350 hours vs. 3.5
• MTTDL is 100x better

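Reusing the hypothetical hoursToResync helper from the earlier sketch, the parallel case follows directly:

    // 99 surviving peers each re-sync 1/100th of the dead node's data in parallel.
    double serialHours = ResyncMath.hoursToResync(24, 200, 0.1); // ~333 h for one peer
    double parallelHours = serialHours / 99;                     // ~3.4 h: "350 hours vs. 3.5"
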
MapR Reliability

• Mean Time To Data Loss (MTTDL) is far better
  – Improves as cluster size increases
• Does not require idle spare drives
  – The rest of the cluster has sufficient spare capacity to absorb one node's data
  – On a 100-node cluster, 1 node's data == 1% of cluster capacity
• Utilizes all resources
  – No wasted "master-slave" nodes
  – No wasted idle spare drives … all spindles put to use
• Better reliability with fewer resources
  – On commodity hardware!

Why Is This So Difficult?

MapR's Read-write Replication

• Writes are synchronous
  – Visible immediately
• Data is replicated in a "chain" fashion
  – Utilizes the full-duplex network
• Meta-data is replicated in a "star" manner
  – Better response time

Diagram: clients (client1, client2, … clientN) write to the head of a replication chain for data; meta-data updates fan out from the primary to all replicas.

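A minimal sketch of the two topologies, under the assumption that every replica exposes a simple write() call (all names here are illustrative, not MapR's actual protocol):

    import java.util.List;

    interface Replica { void write(byte[] data); }

    // Chain replication: the primary sends once and each replica forwards
    // downstream, so receive and forward overlap on a full-duplex link.
    class ChainNode implements Replica {
        private final Replica next;                    // null at the tail
        ChainNode(Replica next) { this.next = next; }
        public void write(byte[] data) {
            store(data);                               // persist locally
            if (next != null) next.write(data);        // forward down the chain
        }
        private void store(byte[] data) { /* local disk write */ }
    }

    // Star replication: the primary fans out to every replica itself.
    // Fewer hops means lower latency, which suits small meta-data updates.
    class StarPrimary {
        void write(List<Replica> replicas, byte[] data) {
            for (Replica r : replicas) r.write(data);
        }
    }
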
Container Balancing

• Servers keep a bunch of containers "ready to go"
• Writes get distributed around the cluster
  – As data size increases, writes spread out more (like dropping a pebble in a pond)
  – Larger pebbles spread the ripples farther
  – Space is balanced by moving idle containers

MapR Container Re-sync

• MapR is 100% random write
  – Uses distributed transactions
• On a complete crash, all replicas diverge from each other
• On recovery, which one should be master?

MapR Container Re-sync

• MapR can detect exactly where replicas diverged
  – Even at a 2000 MB/s update rate
• Re-sync means
  – Roll back each replica to the divergence point
  – Roll forward to converge with the chosen master (the new master after the crash)
• Done while online
  – With very little impact on normal operations

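A highly simplified sketch of that roll-back/roll-forward idea, assuming each replica keeps a transaction log addressed by sequence number (the log layout and method names are hypothetical, not MapR internals):

    import java.util.List;

    // Find the first point where the replica's log disagrees with the
    // chosen master's log, roll back past it, then replay the master.
    class ContainerResync {
        static int divergencePoint(List<Long> masterLog, List<Long> replicaLog) {
            int i = 0;
            while (i < masterLog.size() && i < replicaLog.size()
                    && masterLog.get(i).equals(replicaLog.get(i))) i++;
            return i;
        }
        static void resync(ReplicaLog replica, List<Long> masterLog) {
            int p = divergencePoint(masterLog, replica.entries());
            replica.rollbackTo(p);                     // undo divergent updates
            for (int i = p; i < masterLog.size(); i++)
                replica.apply(masterLog.get(i));       // roll forward to converge
        }
    }

    interface ReplicaLog {
        List<Long> entries();
        void rollbackTo(int index);
        void apply(long txnId);
    }
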
MapR Does Automatic Re-sync Throttling

• Re-sync traffic is "secondary"
• Each node continuously measures RTT to all its peers
• More throttle is applied to slower peers
  – An idle system runs at full speed
• All automatically

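The deck does not spell out the algorithm, but one plausible shape for such a policy is to scale each peer's re-sync rate by how far its current RTT has drifted above its idle baseline (purely illustrative):

    // Re-sync bandwidth granted to a peer, derived from measured RTT.
    // An idle peer (rtt ~ baseline) gets full speed; a loaded peer,
    // whose RTT has inflated, is throttled proportionally.
    class ResyncThrottle {
        static double resyncRateMBps(double fullRateMBps,
                                     double baselineRttMs, double currentRttMs) {
            double factor = Math.min(1.0, baselineRttMs / currentRttMs);
            return fullRateMBps * factor;
        }
    }
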
But Where Do Containers Fit In?

MapR's Containers

• Files/directories are sharded and placed into containers
• Each container contains
  – Directories & files
  – Data blocks
• Containers are replicated on servers
• Containers are 16-32 GB segments of disk, placed on nodes
• Automatically managed

MapR’s Architectural Params

• Unit of I/O: 4K/8K (8K in MapR)
• Unit of chunking, i.e., a map-reduce split (the HDFS 'block'): 10s to 100s of megabytes
• Unit of re-sync, i.e., a replica: 10s to 100s of gigabytes; a container in MapR, automatically managed
• Unit of administration (snapshot, replication, mirror, quota, backup): 1 gigabyte to 1000s of terabytes; a volume in MapR, answering "what data is affected by my missing blocks?"

Each unit is roughly 1000x the one before it (the 10^3 / 10^6 / 10^9 markers on the slide).

Where & How Does MapR Exploit This Unique Advantage?

MapR's No-NameNode HA™ Architecture

Diagram: HDFS Federation puts multiple NameNodes (backed by a NAS appliance) in front of the DataNodes, partitioning the file meta-data across them; MapR distributes the meta-data itself across all nodes.

MapR (distributed metadata):
• HA with automatic failover
• Instant cluster restart
• Up to 1T files (> 5000x advantage)
• 10-20x higher performance
• 100% commodity hardware

HDFS Federation:
• Multiple single points of failure
• Limited to 50-200 million files
• Performance bottleneck
• Commercial NAS required

Relative Performance and Scale

                      MapR      Other      Advantage
Rate (creates/s)      14-16K    335-360    40x
Scale (files)         6B        1.3M       4615x

Charts: file creates/s vs. files (M). The MapR distribution sustains roughly 14,000-16,000 creates/s out to 6000M files; the other distribution peaks near 350 creates/s and is exhausted by about 1.5M files.

Benchmark: 100-byte file creates.
Hardware: 10 nodes, each with 2 x 4 cores, 24 GB RAM, 12 x 1 TB 7200 RPM disks, and a 1 GigE NIC.

Where & How Does MapR Exploit This Unique Advantage?

MapR’s NFS Allows Direct Deposit

Diagram: web servers, database servers, and application servers deposit data directly into the cluster over NFS.

• Random read/write
• Compression
• Distributed, HA
• Connectors not needed
• No extra scripts or clusters to deploy and maintain

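Because the cluster mounts like any other NFS file system, plain file I/O is all a web, database, or application server needs. A minimal sketch (the /mapr/my.cluster.com mount point is hypothetical):

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;
    import java.nio.file.StandardOpenOption;

    // Append a log record straight into the cluster through the NFS mount:
    // no Hadoop client, connector, or staging script involved.
    public class DirectDeposit {
        public static void main(String[] args) throws IOException {
            Path logDir = Paths.get("/mapr/my.cluster.com/apps/weblogs");
            Files.createDirectories(logDir);
            Files.write(logDir.resolve("access.log"),
                        "GET /index.html 200\n".getBytes(),
                        StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        }
    }
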
Where & How Does MapR Exploit This Unique Advantage?

MapR Volumes & Snapshots

/projects
  /tahoe
  /yosemite
/user
  /msmith
  /bjohnson

Volumes dramatically simplify the management of Big Data:
• Replication factor
• Scheduled mirroring
• Scheduled snapshots
• Data placement control
• User access and tracking
• Administrative permissions

100K volumes are OK; create as many as desired!

Where & How Does MapR Exploit This Unique Advantage?

MapR M7 Tables

• Binary compatible with Apache HBase
  – No recompilation needed to access M7 tables
  – Just set CLASSPATH
• M7 tables are accessed via pathname
  – openTable("hello") … uses HBase
  – openTable("/hello") … uses M7
  – openTable("/user/srivas/hello") … uses M7

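In client terms the switch is nothing but the table name, since a path-style name routes to M7. A sketch against the HBase 0.94-era API (the table path is illustrative):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.util.Bytes;

    // The same unmodified HBase client code works against both backends:
    // "hello" would go to Apache HBase, "/user/srivas/hello" goes to M7.
    public class M7TableExample {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            HTable table = new HTable(conf, "/user/srivas/hello");
            Put put = new Put(Bytes.toBytes("row1"));
            put.add(Bytes.toBytes("cf"), Bytes.toBytes("col"), Bytes.toBytes("v"));
            table.put(put);
            table.close();
        }
    }
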
M7 Tables

• M7 tables are integrated into storage
  – Always available on every node, zero admin
• Unlimited number of tables
  – Apache HBase is typically 10-20 tables (max 100)
• No compactions
  – Update in place
• Instant-on
  – Zero recovery time
• 5-10x better performance
• Consistent low latency
  – At the 95th & 99th percentiles

YCSB 50-50 Mix (throughput)

Chart: throughput (ops/sec), MapR M7 vs. other HBase 0.94.9.

YCSB 50-50 Mix (latency)

Chart: latency (usec), MapR M7 vs. other HBase 0.94.9.

Where & How Does MapR Exploit This Unique Advantage?

MapR Makes Hadoop Truly HA

• ALL Hadoop components have High Availability
  – e.g., YARN
• The ApplicationMaster (the old JobTracker) and TaskTracker record their state in MapR
• On node failure, the AM recovers its state from MapR
  – Works even if the entire cluster is restarted
• All jobs resume from where they were
  – Only from MapR
• Allows preemption
  – MapR can preempt any job without losing its progress
  – The ExpressLane™ feature in MapR exploits this

MapR Does MapReduce (Fast!)

TeraSort record: 1 TB in 54 seconds on 1003 nodes
MinuteSort record: 1.5 TB in 59 seconds on 2103 nodes

MapR Does MapReduce (Faster!)

TeraSort record: 1 TB in 54 seconds on 1003 nodes
MinuteSort record: 1.65 TB in 59 seconds on roughly 300 nodes, beating the previous 1.5 TB on 2103 nodes

YOUR CODE Can Exploit MapR’s Unique Advantages

• ALL your code can easily be scale-out HA
  – Save service state in MapR
  – Save data in MapR
  – Use ZooKeeper to notice service failure
  – Restart anywhere; data + state will move there automatically
  – That's what we did!
• Only from MapR: HA for Impala, Hive, Oozie, Storm, MySQL, SOLR/Lucene, HCatalog, Kafka, …

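The failure-detection half of that recipe is standard ZooKeeper usage: register the live service as an ephemeral znode and let a standby watch it. A minimal sketch (host, port, and paths are hypothetical):

    import org.apache.zookeeper.CreateMode;
    import org.apache.zookeeper.Watcher;
    import org.apache.zookeeper.ZooDefs;
    import org.apache.zookeeper.ZooKeeper;

    // The znode disappears when the service's session dies, so the standby's
    // watch fires; it can then restart the service elsewhere and recover the
    // state the service saved in the cluster.
    public class ServiceFailover {
        public static void main(String[] args) throws Exception {
            ZooKeeper zk = new ZooKeeper("zkhost:5181", 30000, event -> {});

            // Active instance registers itself:
            zk.create("/services/mysql-primary", "node17".getBytes(),
                      ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);

            // A standby watches for its disappearance:
            zk.exists("/services/mysql-primary", event -> {
                if (event.getType() == Watcher.Event.EventType.NodeDeleted) {
                    // take over: restart the service, reload its saved state
                }
            });
        }
    }
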
MapR: Unlimited Hadoop

Reliable compute, dependable storage.

• Build the cluster brick by brick, one node at a time
• Use commodity hardware at rock-bottom prices
• Get enterprise-class reliability: instant restart, snapshots, mirrors, no single point of failure, …
• Export via NFS, ODBC, Hadoop, and other standard protocols

Speaker Notes

  1. MapR provides a complete platform for Apache Hadoop. MapR has integrated, tested, and hardened a broad array of packages as part of this distribution (Hive, Pig, Oozie, Sqoop, plus additional packages such as Cascading), and all of these packages with MapR updates are available on GitHub. We have spent a well-funded, multi-year effort on deep architectural improvements to create the next-generation distribution for Hadoop. This is in stark contrast with the alternative distributions from Cloudera, Hortonworks, and Apache, which are all equivalent.
  2. The NameNode today in Hadoop is a single point of failure, a scalability limitation, and a performance bottleneck. With MapR there is no dedicated NameNode; the NameNode function is distributed across the cluster. This provides major advantages in HA, data-loss avoidance, scalability, and performance. With other distributions you have a bottleneck regardless of the number of nodes in the cluster, and the most files you can support is 200M at the absolute maximum, and that requires an extremely high-end server. 50% of the Hadoop processing at Facebook is packing and unpacking files to work around this limitation. MapR scales uniformly.
  3. Another major advantage with MapR is the distributed NameNode. The same points apply; with other distributions the most files you can support is between 70M and 100M.
  4. With MapR's Direct Access NFS we use an industry-standard protocol to load data into the cluster. There are no clients to deploy and no special agents. Is writing over NFS highly available? We have that. Do you need compression? We do it automatically. We support random read/write.
  5. Another aspect of ease of use is the support for volumes. Only MapR provides the logical volume construct to group content. Policies can enforce user access, admin permissions, data protection policies, etc. Data placement can also be controlled, e.g., for nodes that have SSD drives. Other Hadoop distros require you to set permissions on individual files or across the entire cluster: setting policies by file or directory is too granular (there are just too many files), while cluster-wide policies are not granular enough. A volume is a logical construct that lets you collect and refer to data files regardless of where each file is physically located, and with no concern for the relationship between the data files. Volumes dramatically simplify the management of your big data by letting you efficiently define data management policies.