Tales From the Cloudera Field
Kevin O’Dell, Kate Ting, Aleks Shulman
{kevin, kate, aleks}@cloudera.com
©2014 Cloudera, Inc. All rights reserved.
Who Are We?
Kevin O’Dell
- Previously HBase Support Team Lead
- Currently Systems Engineer with a focus on HBase deployments
Kate Ting
- Technical Account Manager of Cloudera’s largest HBase deployments
- Co-author of O’Reilly’s Apache Sqoop Cookbook
Aleks Shulman
- HBase Test Engineer focused on ensuring HBase is enterprise ready
- Primary focus on building compatibility frameworks for rolling upgrades
Cloudera Internal HBase Metrics
• Cloudera uses HBase internally for the Support Team
• We ingest Tickets, Cluster Stats, and Apache Mailing Lists
• Cloudera has ~20K HBase nodes under management
• Over 60% of my accounts use HBase
Agenda
● Tales Getting Production Started
● Tales Fixing Production Bugs
● Tales Upgrading Production Clusters
Tales Getting Production Started
HBase Deployment Mistakes
• Cluster Sizing
• Managing Your Regions
• General Recommendations
Why Cluster Sizing Matters
• Jobs Failing
• Writes Blocking
• Performance Issues
Heavy Write Sizing
Calculating Total Available Memstore
java_max_heap 16GB
memstore_upper .50
java_max_heap * memstore_upper = memstore_total_size
Calculating Max Regions
desired_flush_size 128MB
repl_factor 3 (default)
max_file_size 20GB
memstore_total_size / desired_flush_size = total_regions_per_rs
max_file_size * (total_regions_per_rs * repl_factor) = raw_storage_per_node
[Chart: X-axis = Flush_Size, Y-axis = Region_Count]
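As a rough worked example, the heavy-write formulas above can be evaluated with the values shown on the slide. The variable names follow the slide; the megabyte/gigabyte unit handling is an assumption made for illustration.

```python
# Worked example of the heavy-write sizing formulas, using the slide's values.
GB_IN_MB = 1024

java_max_heap_mb = 16 * GB_IN_MB   # java_max_heap 16GB
memstore_upper = 0.50              # fraction of heap available to memstores
desired_flush_size_mb = 128        # desired_flush_size 128MB
repl_factor = 3                    # repl_factor 3 (default)
max_file_size_gb = 20              # max_file_size 20GB

# java_max_heap * memstore_upper = memstore_total_size
memstore_total_size_mb = java_max_heap_mb * memstore_upper

# memstore_total_size / desired_flush_size = total_regions_per_rs
total_regions_per_rs = memstore_total_size_mb / desired_flush_size_mb

# max_file_size * (total_regions_per_rs * repl_factor) = raw_storage_per_node
raw_storage_per_node_gb = max_file_size_gb * (total_regions_per_rs * repl_factor)

print(memstore_total_size_mb)   # 8192.0 (MB)
print(total_regions_per_rs)     # 64.0
print(raw_storage_per_node_gb)  # 3840.0 (GB)
```

With these inputs, a RegionServer can hold about 64 regions that each flush at 128MB, which corresponds to roughly 3.75TB of raw disk per node once replication is counted.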
Update for Known Writes Sizing
write_throughput 20MB/s
total_data_size 350TB
Calculating force flushes
hlog_size 128MB
number_of_hlogs 64
hlog_size * number_of_hlogs = amount_of_data_before_flush
(write_throughput * 60 * 60) / amount_of_data_before_flush = number_nodes_before_flush
Calculating Max Regions
total_data_size 350TB
maxfile_size 20GB
((total_data_size * 1024) / maxfile_size) / desired_RS_count = total_regions_per_rs
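The known-writes formulas above can be sketched the same way. The slide gives no value for desired_RS_count, so the 100 used here is purely hypothetical. The slide's name number_nodes_before_flush is kept as is; one reading of the value is the hourly write volume divided by the WAL capacity that triggers a forced flush.

```python
# Worked example of the known-writes sizing formulas, using the slide's values.
write_throughput_mb_s = 20   # write_throughput, in MB per second
hlog_size_mb = 128           # hlog_size
number_of_hlogs = 64         # number_of_hlogs
total_data_size_tb = 350     # total_data_size
maxfile_size_gb = 20         # maxfile_size
desired_rs_count = 100       # hypothetical: not given on the slide

# hlog_size * number_of_hlogs = amount_of_data_before_flush
amount_of_data_before_flush_mb = hlog_size_mb * number_of_hlogs

# (write_throughput * 60 * 60) / amount_of_data_before_flush
number_nodes_before_flush = (write_throughput_mb_s * 60 * 60) / amount_of_data_before_flush_mb

# ((total_data_size * 1024) / maxfile_size) / desired_RS_count
total_regions = (total_data_size_tb * 1024) / maxfile_size_gb
total_regions_per_rs = total_regions / desired_rs_count

print(amount_of_data_before_flush_mb)  # 8192 (MB)
print(number_nodes_before_flush)       # 8.7890625
print(total_regions)                   # 17920.0
print(total_regions_per_rs)            # 179.2
```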
Why is Region Management Important
• Initial loads are failing
• Region Servers are crashing from overload
Region Management Best Practices
Region Split Policy
ConstantSize | Split on max file size | Use when pre-splitting all tables
UpperBoundSplitPolicy | Split on smarter intervals | Use when not able to pre-split all tables
Balancer Policy
SimpleLoadBalancer | Aimlessly balance regions | Use with lots of tables with low region count
ByTable | Balance by table | Use with few tables with high region count
General Recommendations
Feature | Benefit | When to Enable
Short Circuit Reads (SCR) | Speed up read times by bypassing datanode layer | Always
Snappy Compression | Speed up read times and lower data consumption | On heavily accessed tables
Bloom Filters | Speed up read times when numerous HFiles are present | Row should always be used; Row+Column is more accurate but higher in memory usage
HLog Compression | Speed up writes and recovery times | Always
Data Block Encoding | Compress long keys to store more in cache | Best for short/tall tables with long, similar keys. Scans may be slower
Tales Fixing Production Bugs
● RegionServer Hotspotting
● Faulty Hardware
● Application Bug
Fixing #1: RegionServer Hotspotting - Solution
● Spread rows over all RS by salting the row key
● 100s of regions available, but increments only done to 10s of regions
● While locks wait to time out, blocked clients hold onto handlers
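A minimal sketch of row key salting, assuming a hash-derived prefix and a hypothetical bucket count of 16. The function name and key format are illustrative and are not taken from the talk or from any HBase API.

```python
import hashlib

NUM_SALT_BUCKETS = 16  # hypothetical; in practice chosen relative to the number of RegionServers


def salted_row_key(row_key: str, buckets: int = NUM_SALT_BUCKETS) -> bytes:
    """Prefix the row key with a deterministic salt so writes spread across regions."""
    digest = hashlib.md5(row_key.encode("utf-8")).digest()
    salt = digest[0] % buckets  # same key always maps to the same bucket
    return f"{salt:02d}-".encode("utf-8") + row_key.encode("utf-8")


# Sequential keys that would otherwise land in neighboring regions get different prefixes.
keys = [salted_row_key(f"event-{i}") for i in range(1000)]
prefixes = {key[:2] for key in keys}
print(len(prefixes))
```

The trade-off is on the read side: a scan over a logical key range has to be issued once per salt bucket and the results merged.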
Fixing #1: RegionServer Hotspotting - Solution
● Option 1: Change row key to something that scales
○ Reduce contention by reducing connections: each client picks one salt and writes only to one RS
● Option 2: Implement new coalescing feature in native HBaseSink, compressing entire batch of Flume events into single HBase RPC call
[row1, colA+=1] [row1, colB+=1] [row1, colB+=1]
=> [row1 colA+=1 colB+=2]
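The coalescing idea in Option 2 can be illustrated with a small sketch. This is not the Flume HBaseSink implementation; the function name and the (row, column, amount) tuple format are assumptions chosen to mirror the example above.

```python
from collections import Counter


def coalesce_increments(events):
    """Merge increments that target the same (row, column) into one increment per cell."""
    totals = Counter()
    for row, column, amount in events:
        totals[(row, column)] += amount

    # Group the merged increments by row, so each row needs only one call.
    by_row = {}
    for (row, column), amount in totals.items():
        by_row.setdefault(row, {})[column] = amount
    return by_row


batch = [("row1", "colA", 1), ("row1", "colB", 1), ("row1", "colB", 1)]
print(coalesce_increments(batch))  # {'row1': {'colA': 1, 'colB': 2}}
```

Three increments on the same row collapse into one, so the row lock is taken once per batch rather than once per event.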
Fixing #2: Faulty Hardware
● Diagnostics run on bad hardware caused HBase failures
● HBase recoverability = RS back online + locality (compaction)
● Stress test with prod load before needed (e.g. holiday season)
● Imagine the financial impact of 7 hours of downtime
Fixing #2: Faulty Hardware - Solution
● Recover faster by failing fast
○ Too many retries cause HBase task to exit before it can print exception identifying stuck RS
● Decrease time needed to finish HBase major compaction
○ Run multiple threads during compaction
● Replay in parallel
○ Decrease HLog size to limit # of edits to be replayed, increase # of HLogs, constrain WAL file size to minimize time corresponding region is not available
Fixing #2: Faulty Hardware - Solution
● Shorten column family names
○ Reduce scan time, skip bulk loads, reduce memory usage
● Turn off write cache
○ Node crash erases writes in memory, rebuilds block with outdated data, causing corrupt replica
● Turn on checksum
○ Enables RS to use other replicas from the cluster instead of failing the operation if there’s a corrupted HFile
Fixing #3: Application Bug
● HBase timestamps were hardcoded too far out (in the future), so newly written data went unused
● Bug put backup system out of commission for one month
○ More vulnerable to HBase outages
Fixing #3: Application Bug - Solution
● Detailed knowledge of internals required to undo damage
○ Modified the timestamp to some time in the past for all records via custom MR jobs over one month:
■ back up data, generate new HFile with correct timestamp, bulkload data, run MD5
● Don’t muck around with setting the timestamp yourself
● Do use always-increasing timestamps for new puts to a row
● Do use a separate timestamp attribute of the row
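To see why a far-future timestamp makes new writes go unused, here is a toy model of a single versioned cell. It is a simplification rather than HBase code: the Cell class and the timestamp values are hypothetical, and the only behavior modeled is that a read returns the version with the highest timestamp.

```python
class Cell:
    """Toy model of one cell: a read returns the version with the highest timestamp."""

    def __init__(self):
        self.versions = {}  # timestamp in ms -> value

    def put(self, value, timestamp_ms):
        self.versions[timestamp_ms] = value

    def get(self):
        latest = max(self.versions)
        return self.versions[latest]


now_ms = 1_400_000_000_000                            # hypothetical current time
far_future_ms = now_ms + 10 * 365 * 24 * 3600 * 1000  # hardcoded about ten years ahead

cell = Cell()
cell.put("old value", far_future_ms)  # the bug: an explicit future timestamp
cell.put("new value", now_ms)         # a later write stamped with the real clock

print(cell.get())  # old value
```

The later write is stored but shadowed, which is why the recommendations above favor always-increasing timestamps and a separate timestamp attribute for application time.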
Tales Upgrading Production Clusters
Internal Case Study
CDH4->C5 (0.94->0.96) Upgrade Automation Failed
What Happened?
2013-07-12 17:11:42,656 ERROR org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader: Encountered exception on operation MkdirOp [length=0, inodeId=0, path=/hbase/.snapshot, timestamp=1373674083434, permissions=hbase:supergroup:rwxr-xr-x, opCode=OP_MKDIR, txid=614]
org.apache.hadoop.HadoopIllegalArgumentException: ".snapshot" is a reserved name. Please rename it before upgrade.
Root Cause
• HBase Snapshots vs. HDFS Snapshots
• Snapshot directory rename
Outcome
• All issues resolved before C5b1 was shipped
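A pre-upgrade check for this failure could look roughly like the sketch below. It assumes a list of HDFS paths has already been collected by some other means; the function name and the sample listing are hypothetical.

```python
from pathlib import PurePosixPath

RESERVED_NAME = ".snapshot"  # the name rejected in the log message above


def paths_with_reserved_name(paths):
    """Return the paths that contain a component named exactly '.snapshot'."""
    return [p for p in paths if RESERVED_NAME in PurePosixPath(p).parts]


# Hypothetical listing; a real check would read paths from the cluster.
listing = [
    "/hbase/.snapshot",
    "/hbase/.logs",
    "/hbase/table1/.snapshot/s1",
    "/hbase/snapshot.bak",
]
print(paths_with_reserved_name(listing))
# ['/hbase/.snapshot', '/hbase/table1/.snapshot/s1']
```

Any path it reports would need to be renamed before the upgrade, as the error message asks.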
Automating Upgrades
Testing the Upgrade lifecycle
What is Important?
The Administrator Experience Matters
● Major version upgrades
● Rolling upgrades
The Developer Experience Matters
● API Compatibility Testing
And Here Is Why It Is Important
Customer Continuity
• Smooth upgrades
• Curated process
• Understanding of customer cluster lifecycle
Developer Continuity
• Forward and backward compatibility
• Binary Compatibility
• Wire Compatibility
Automation
• You can only really make a guarantee about things that are automated
• Product is easier to support
• Confidence is only possible with testing
Upgrades
Cold vs. Rolling Upgrades
C3u5 | CDH4.0.x CDH4.1.x CDH4.2.x CDH4.3.x CDH4.4.x CDH4.5.x CDH4.6.x | C5.0 C5.1
Rolling upgrade: within CDH4.x, and from C5.0 to C5.1
Cold upgrade: from C3u5 to CDH4, and from CDH4 to C5
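One reading of the diagram above is that upgrades within a major CDH version are rolling, and upgrades that cross a major version are cold. The sketch below encodes only that reading, which is an assumption. The function names are hypothetical, and the sketch ignores per-release exceptions.

```python
import re


def major_version(release: str) -> int:
    """Extract the major version from labels such as 'C3u5', 'CDH4.2.x', or 'C5.0'."""
    match = re.match(r"C(?:DH)?(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized release label: {release!r}")
    return int(match.group(1))


def upgrade_type(source: str, target: str) -> str:
    """Classify an upgrade as rolling (same major version) or cold (crossing majors)."""
    src, dst = major_version(source), major_version(target)
    if dst < src:
        raise ValueError("downgrades are not covered here")
    if src == dst:
        return "rolling"
    if dst - src == 1:
        return "cold"
    return "cold, via an intermediate major version"


print(upgrade_type("CDH4.2.x", "CDH4.6.x"))  # rolling
print(upgrade_type("CDH4.6.x", "C5.0"))      # cold
print(upgrade_type("C3u5", "C5.0"))          # cold, via an intermediate major version
```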
Upgrades from HBase 0.90 -> 0.98
CDH Version | HBase Version
CDH3u5 | HBase 0.90.6
CDH4.1.0 | HBase 0.92.1
CDH4.2.0 | HBase 0.94.2
CDH4.4.0 | HBase 0.94.6
CDH4.6.0 | HBase 0.94.15
CDH5.0.0 | HBase 0.96.1.1
CDH5.1.0 | HBase 0.98.1
Upgrade from version A -> version B -> version C
Cold Upgrade Results
● Upgrades work!
● Steps:
○ Start at CDH3u5
○ Upgrade to a version of CDH4
○ Upgrade to CDH5.0.0
● Data Integrity
○ Different bloom filters
○ Different compression formats
● Next Steps
○ CDH 5.1.0 expected to be based on 0.98.1
Rolling Upgrade Results
● What is tested?
○ Ingest via Java API
○ MapReduce over HBase
■ Bulk load
■ RowCount/Export
● Status
○ Rolling upgrade broken (red) in CDH <=4.1.2 due to region_mover issue
○ Soft failure (yellow) for starting version <CDH4.1.0, due to MapReduce JT/TT version mismatch issue
○ All else green!
How to Read This:
Pick a column and read down to see for which versions rolling upgrades are advised
Improved Supportability Through Testing
Case Study: Customer Rolling Upgrade Simulation
Large Customer
● Upgrading from CDH4.1.4+patches
● Considered several CDH versions to upgrade to
○ Custom patches
Automation
● Automated testing added to simulate rolling upgrade
○ CM
○ HA+QJM
○ Parcels
● Scales
○ 4 nodes, 20 nodes, 80 nodes
● Subsequently used for other customers with similar upgrade paths
Here’s to Fewer Tales Next Year..
Automated Testing -> Better Cluster Mgmt -> Fewer Tales From the Field
Kevin O’Dell @kevinrodell
Kate Ting @kate_ting
Aleks Shulman @a_shulman
@clouderaTest
Questions?

Building a Hadoop Data Warehouse with ImpalaBuilding a Hadoop Data Warehouse with Impala
Building a Hadoop Data Warehouse with ImpalaSwiss Big Data User Group
 
Apache Accumulo Overview
Apache Accumulo OverviewApache Accumulo Overview
Apache Accumulo OverviewBill Havanki
 

Ähnlich wie Tales from the Cloudera Field (20)

Hadoop 3 (2017 hadoop taiwan workshop)
Hadoop 3 (2017 hadoop taiwan workshop)Hadoop 3 (2017 hadoop taiwan workshop)
Hadoop 3 (2017 hadoop taiwan workshop)
 
Hadoop Operations for Production Systems (Strata NYC)
Hadoop Operations for Production Systems (Strata NYC)Hadoop Operations for Production Systems (Strata NYC)
Hadoop Operations for Production Systems (Strata NYC)
 
Yarns About Yarn
Yarns About YarnYarns About Yarn
Yarns About Yarn
 
Kudu austin oct 2015.pptx
Kudu austin oct 2015.pptxKudu austin oct 2015.pptx
Kudu austin oct 2015.pptx
 
Hadoop Operations
Hadoop OperationsHadoop Operations
Hadoop Operations
 
Operating and supporting HBase Clusters
Operating and supporting HBase ClustersOperating and supporting HBase Clusters
Operating and supporting HBase Clusters
 
Operating and Supporting Apache HBase Best Practices and Improvements
Operating and Supporting Apache HBase Best Practices and ImprovementsOperating and Supporting Apache HBase Best Practices and Improvements
Operating and Supporting Apache HBase Best Practices and Improvements
 
HBase tales from the trenches
HBase tales from the trenchesHBase tales from the trenches
HBase tales from the trenches
 
Kudu: Resolving Transactional and Analytic Trade-offs in Hadoop
Kudu: Resolving Transactional and Analytic Trade-offs in HadoopKudu: Resolving Transactional and Analytic Trade-offs in Hadoop
Kudu: Resolving Transactional and Analytic Trade-offs in Hadoop
 
Supporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsSupporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability Improvements
 
Apache Kudu: Technical Deep Dive


Apache Kudu: Technical Deep Dive

Apache Kudu: Technical Deep Dive


Apache Kudu: Technical Deep Dive


 
Building an Apache Hadoop data application
Building an Apache Hadoop data applicationBuilding an Apache Hadoop data application
Building an Apache Hadoop data application
 
The State of HBase Replication
The State of HBase ReplicationThe State of HBase Replication
The State of HBase Replication
 
Simplifying Hadoop: A Secure and Unified Data Access Path for Computer Framew...
Simplifying Hadoop: A Secure and Unified Data Access Path for Computer Framew...Simplifying Hadoop: A Secure and Unified Data Access Path for Computer Framew...
Simplifying Hadoop: A Secure and Unified Data Access Path for Computer Framew...
 
Apache Hadoop 3
Apache Hadoop 3Apache Hadoop 3
Apache Hadoop 3
 
Apache Kudu Fast Analytics on Fast Data (Hadoop / Spark Conference Japan 2016...
Apache Kudu Fast Analytics on Fast Data (Hadoop / Spark Conference Japan 2016...Apache Kudu Fast Analytics on Fast Data (Hadoop / Spark Conference Japan 2016...
Apache Kudu Fast Analytics on Fast Data (Hadoop / Spark Conference Japan 2016...
 
Empower Hive with Spark
Empower Hive with SparkEmpower Hive with Spark
Empower Hive with Spark
 
Building a Hadoop Data Warehouse with Impala
Building a Hadoop Data Warehouse with ImpalaBuilding a Hadoop Data Warehouse with Impala
Building a Hadoop Data Warehouse with Impala
 
Introduction to HBase
Introduction to HBaseIntroduction to HBase
Introduction to HBase
 
Apache Accumulo Overview
Apache Accumulo OverviewApache Accumulo Overview
Apache Accumulo Overview
 

Mehr von HBaseCon

hbaseconasia2017: Building online HBase cluster of Zhihu based on Kubernetes
hbaseconasia2017: Building online HBase cluster of Zhihu based on Kuberneteshbaseconasia2017: Building online HBase cluster of Zhihu based on Kubernetes
hbaseconasia2017: Building online HBase cluster of Zhihu based on KubernetesHBaseCon
 
hbaseconasia2017: HBase on Beam
hbaseconasia2017: HBase on Beamhbaseconasia2017: HBase on Beam
hbaseconasia2017: HBase on BeamHBaseCon
 
hbaseconasia2017: HBase Disaster Recovery Solution at Huawei
hbaseconasia2017: HBase Disaster Recovery Solution at Huaweihbaseconasia2017: HBase Disaster Recovery Solution at Huawei
hbaseconasia2017: HBase Disaster Recovery Solution at HuaweiHBaseCon
 
hbaseconasia2017: Removable singularity: a story of HBase upgrade in Pinterest
hbaseconasia2017: Removable singularity: a story of HBase upgrade in Pinteresthbaseconasia2017: Removable singularity: a story of HBase upgrade in Pinterest
hbaseconasia2017: Removable singularity: a story of HBase upgrade in PinterestHBaseCon
 
hbaseconasia2017: HareQL:快速HBase查詢工具的發展過程
hbaseconasia2017: HareQL:快速HBase查詢工具的發展過程hbaseconasia2017: HareQL:快速HBase查詢工具的發展過程
hbaseconasia2017: HareQL:快速HBase查詢工具的發展過程HBaseCon
 
hbaseconasia2017: Apache HBase at Netease
hbaseconasia2017: Apache HBase at Neteasehbaseconasia2017: Apache HBase at Netease
hbaseconasia2017: Apache HBase at NeteaseHBaseCon
 
hbaseconasia2017: HBase在Hulu的使用和实践
hbaseconasia2017: HBase在Hulu的使用和实践hbaseconasia2017: HBase在Hulu的使用和实践
hbaseconasia2017: HBase在Hulu的使用和实践HBaseCon
 
hbaseconasia2017: 基于HBase的企业级大数据平台
hbaseconasia2017: 基于HBase的企业级大数据平台hbaseconasia2017: 基于HBase的企业级大数据平台
hbaseconasia2017: 基于HBase的企业级大数据平台HBaseCon
 
hbaseconasia2017: HBase at JD.com
hbaseconasia2017: HBase at JD.comhbaseconasia2017: HBase at JD.com
hbaseconasia2017: HBase at JD.comHBaseCon
 
hbaseconasia2017: Large scale data near-line loading method and architecture
hbaseconasia2017: Large scale data near-line loading method and architecturehbaseconasia2017: Large scale data near-line loading method and architecture
hbaseconasia2017: Large scale data near-line loading method and architectureHBaseCon
 
hbaseconasia2017: Ecosystems with HBase and CloudTable service at Huawei
hbaseconasia2017: Ecosystems with HBase and CloudTable service at Huaweihbaseconasia2017: Ecosystems with HBase and CloudTable service at Huawei
hbaseconasia2017: Ecosystems with HBase and CloudTable service at HuaweiHBaseCon
 
hbaseconasia2017: HBase Practice At XiaoMi
hbaseconasia2017: HBase Practice At XiaoMihbaseconasia2017: HBase Practice At XiaoMi
hbaseconasia2017: HBase Practice At XiaoMiHBaseCon
 
hbaseconasia2017: hbase-2.0.0
hbaseconasia2017: hbase-2.0.0hbaseconasia2017: hbase-2.0.0
hbaseconasia2017: hbase-2.0.0HBaseCon
 
HBaseCon2017 Democratizing HBase
HBaseCon2017 Democratizing HBaseHBaseCon2017 Democratizing HBase
HBaseCon2017 Democratizing HBaseHBaseCon
 
HBaseCon2017 Removable singularity: a story of HBase upgrade in Pinterest
HBaseCon2017 Removable singularity: a story of HBase upgrade in PinterestHBaseCon2017 Removable singularity: a story of HBase upgrade in Pinterest
HBaseCon2017 Removable singularity: a story of HBase upgrade in PinterestHBaseCon
 
HBaseCon2017 Quanta: Quora's hierarchical counting system on HBase
HBaseCon2017 Quanta: Quora's hierarchical counting system on HBaseHBaseCon2017 Quanta: Quora's hierarchical counting system on HBase
HBaseCon2017 Quanta: Quora's hierarchical counting system on HBaseHBaseCon
 
HBaseCon2017 Transactions in HBase
HBaseCon2017 Transactions in HBaseHBaseCon2017 Transactions in HBase
HBaseCon2017 Transactions in HBaseHBaseCon
 
HBaseCon2017 Highly-Available HBase
HBaseCon2017 Highly-Available HBaseHBaseCon2017 Highly-Available HBase
HBaseCon2017 Highly-Available HBaseHBaseCon
 
HBaseCon2017 Apache HBase at Didi
HBaseCon2017 Apache HBase at DidiHBaseCon2017 Apache HBase at Didi
HBaseCon2017 Apache HBase at DidiHBaseCon
 
HBaseCon2017 gohbase: Pure Go HBase Client
HBaseCon2017 gohbase: Pure Go HBase ClientHBaseCon2017 gohbase: Pure Go HBase Client
HBaseCon2017 gohbase: Pure Go HBase ClientHBaseCon
 

Mehr von HBaseCon (20)

hbaseconasia2017: Building online HBase cluster of Zhihu based on Kubernetes
hbaseconasia2017: Building online HBase cluster of Zhihu based on Kuberneteshbaseconasia2017: Building online HBase cluster of Zhihu based on Kubernetes
hbaseconasia2017: Building online HBase cluster of Zhihu based on Kubernetes
 
hbaseconasia2017: HBase on Beam
hbaseconasia2017: HBase on Beamhbaseconasia2017: HBase on Beam
hbaseconasia2017: HBase on Beam
 
hbaseconasia2017: HBase Disaster Recovery Solution at Huawei
hbaseconasia2017: HBase Disaster Recovery Solution at Huaweihbaseconasia2017: HBase Disaster Recovery Solution at Huawei
hbaseconasia2017: HBase Disaster Recovery Solution at Huawei
 
hbaseconasia2017: Removable singularity: a story of HBase upgrade in Pinterest
hbaseconasia2017: Removable singularity: a story of HBase upgrade in Pinteresthbaseconasia2017: Removable singularity: a story of HBase upgrade in Pinterest
hbaseconasia2017: Removable singularity: a story of HBase upgrade in Pinterest
 
hbaseconasia2017: HareQL:快速HBase查詢工具的發展過程
hbaseconasia2017: HareQL:快速HBase查詢工具的發展過程hbaseconasia2017: HareQL:快速HBase查詢工具的發展過程
hbaseconasia2017: HareQL:快速HBase查詢工具的發展過程
 
hbaseconasia2017: Apache HBase at Netease
hbaseconasia2017: Apache HBase at Neteasehbaseconasia2017: Apache HBase at Netease
hbaseconasia2017: Apache HBase at Netease
 
hbaseconasia2017: HBase在Hulu的使用和实践
hbaseconasia2017: HBase在Hulu的使用和实践hbaseconasia2017: HBase在Hulu的使用和实践
hbaseconasia2017: HBase在Hulu的使用和实践
 
hbaseconasia2017: 基于HBase的企业级大数据平台
hbaseconasia2017: 基于HBase的企业级大数据平台hbaseconasia2017: 基于HBase的企业级大数据平台
hbaseconasia2017: 基于HBase的企业级大数据平台
 
hbaseconasia2017: HBase at JD.com
hbaseconasia2017: HBase at JD.comhbaseconasia2017: HBase at JD.com
hbaseconasia2017: HBase at JD.com
 
hbaseconasia2017: Large scale data near-line loading method and architecture
hbaseconasia2017: Large scale data near-line loading method and architecturehbaseconasia2017: Large scale data near-line loading method and architecture
hbaseconasia2017: Large scale data near-line loading method and architecture
 
hbaseconasia2017: Ecosystems with HBase and CloudTable service at Huawei
hbaseconasia2017: Ecosystems with HBase and CloudTable service at Huaweihbaseconasia2017: Ecosystems with HBase and CloudTable service at Huawei
hbaseconasia2017: Ecosystems with HBase and CloudTable service at Huawei
 
hbaseconasia2017: HBase Practice At XiaoMi
hbaseconasia2017: HBase Practice At XiaoMihbaseconasia2017: HBase Practice At XiaoMi
hbaseconasia2017: HBase Practice At XiaoMi
 
hbaseconasia2017: hbase-2.0.0
hbaseconasia2017: hbase-2.0.0hbaseconasia2017: hbase-2.0.0
hbaseconasia2017: hbase-2.0.0
 
HBaseCon2017 Democratizing HBase
HBaseCon2017 Democratizing HBaseHBaseCon2017 Democratizing HBase
HBaseCon2017 Democratizing HBase
 
HBaseCon2017 Removable singularity: a story of HBase upgrade in Pinterest
HBaseCon2017 Removable singularity: a story of HBase upgrade in PinterestHBaseCon2017 Removable singularity: a story of HBase upgrade in Pinterest
HBaseCon2017 Removable singularity: a story of HBase upgrade in Pinterest
 
HBaseCon2017 Quanta: Quora's hierarchical counting system on HBase
HBaseCon2017 Quanta: Quora's hierarchical counting system on HBaseHBaseCon2017 Quanta: Quora's hierarchical counting system on HBase
HBaseCon2017 Quanta: Quora's hierarchical counting system on HBase
 
HBaseCon2017 Transactions in HBase
HBaseCon2017 Transactions in HBaseHBaseCon2017 Transactions in HBase
HBaseCon2017 Transactions in HBase
 
HBaseCon2017 Highly-Available HBase
HBaseCon2017 Highly-Available HBaseHBaseCon2017 Highly-Available HBase
HBaseCon2017 Highly-Available HBase
 
HBaseCon2017 Apache HBase at Didi
HBaseCon2017 Apache HBase at DidiHBaseCon2017 Apache HBase at Didi
HBaseCon2017 Apache HBase at Didi
 
HBaseCon2017 gohbase: Pure Go HBase Client
HBaseCon2017 gohbase: Pure Go HBase ClientHBaseCon2017 gohbase: Pure Go HBase Client
HBaseCon2017 gohbase: Pure Go HBase Client
 

Kürzlich hochgeladen

VK Business Profile - provides IT solutions and Web Development
VK Business Profile - provides IT solutions and Web DevelopmentVK Business Profile - provides IT solutions and Web Development
VK Business Profile - provides IT solutions and Web Developmentvyaparkranti
 
Powering Real-Time Decisions with Continuous Data Streams
Powering Real-Time Decisions with Continuous Data StreamsPowering Real-Time Decisions with Continuous Data Streams
Powering Real-Time Decisions with Continuous Data StreamsSafe Software
 
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...OnePlan Solutions
 
Software Coding for software engineering
Software Coding for software engineeringSoftware Coding for software engineering
Software Coding for software engineeringssuserb3a23b
 
SpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at RuntimeSpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at Runtimeandrehoraa
 
How to submit a standout Adobe Champion Application
How to submit a standout Adobe Champion ApplicationHow to submit a standout Adobe Champion Application
How to submit a standout Adobe Champion ApplicationBradBedford3
 
Ahmed Motair CV April 2024 (Senior SW Developer)
Ahmed Motair CV April 2024 (Senior SW Developer)Ahmed Motair CV April 2024 (Senior SW Developer)
Ahmed Motair CV April 2024 (Senior SW Developer)Ahmed Mater
 
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...OnePlan Solutions
 
Recruitment Management Software Benefits (Infographic)
Recruitment Management Software Benefits (Infographic)Recruitment Management Software Benefits (Infographic)
Recruitment Management Software Benefits (Infographic)Hr365.us smith
 
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...Matt Ray
 
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxKnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxTier1 app
 
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...Angel Borroy López
 
A healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdfA healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdfMarharyta Nedzelska
 
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...confluent
 
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样umasea
 
Software Project Health Check: Best Practices and Techniques for Your Product...
Software Project Health Check: Best Practices and Techniques for Your Product...Software Project Health Check: Best Practices and Techniques for Your Product...
Software Project Health Check: Best Practices and Techniques for Your Product...Velvetech LLC
 
Unveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsUnveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsAhmed Mohamed
 
CRM Contender Series: HubSpot vs. Salesforce
CRM Contender Series: HubSpot vs. SalesforceCRM Contender Series: HubSpot vs. Salesforce
CRM Contender Series: HubSpot vs. SalesforceBrainSell Technologies
 
Sending Calendar Invites on SES and Calendarsnack.pdf
Sending Calendar Invites on SES and Calendarsnack.pdfSending Calendar Invites on SES and Calendarsnack.pdf
Sending Calendar Invites on SES and Calendarsnack.pdf31events.com
 
Unveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New FeaturesUnveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New FeaturesŁukasz Chruściel
 

Kürzlich hochgeladen (20)

VK Business Profile - provides IT solutions and Web Development
VK Business Profile - provides IT solutions and Web DevelopmentVK Business Profile - provides IT solutions and Web Development
VK Business Profile - provides IT solutions and Web Development
 
Powering Real-Time Decisions with Continuous Data Streams
Powering Real-Time Decisions with Continuous Data StreamsPowering Real-Time Decisions with Continuous Data Streams
Powering Real-Time Decisions with Continuous Data Streams
 
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
 
Software Coding for software engineering
Software Coding for software engineeringSoftware Coding for software engineering
Software Coding for software engineering
 
SpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at RuntimeSpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at Runtime
 
How to submit a standout Adobe Champion Application
How to submit a standout Adobe Champion ApplicationHow to submit a standout Adobe Champion Application
How to submit a standout Adobe Champion Application
 
Ahmed Motair CV April 2024 (Senior SW Developer)
Ahmed Motair CV April 2024 (Senior SW Developer)Ahmed Motair CV April 2024 (Senior SW Developer)
Ahmed Motair CV April 2024 (Senior SW Developer)
 
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
 
Recruitment Management Software Benefits (Infographic)
Recruitment Management Software Benefits (Infographic)Recruitment Management Software Benefits (Infographic)
Recruitment Management Software Benefits (Infographic)
 
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
 
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxKnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
 
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
 
A healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdfA healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdf
 
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
 
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
 
Software Project Health Check: Best Practices and Techniques for Your Product...
Software Project Health Check: Best Practices and Techniques for Your Product...Software Project Health Check: Best Practices and Techniques for Your Product...
Software Project Health Check: Best Practices and Techniques for Your Product...
 
Unveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsUnveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML Diagrams
 
CRM Contender Series: HubSpot vs. Salesforce
CRM Contender Series: HubSpot vs. SalesforceCRM Contender Series: HubSpot vs. Salesforce
CRM Contender Series: HubSpot vs. Salesforce
 
Sending Calendar Invites on SES and Calendarsnack.pdf
Sending Calendar Invites on SES and Calendarsnack.pdfSending Calendar Invites on SES and Calendarsnack.pdf
Sending Calendar Invites on SES and Calendarsnack.pdf
 
Unveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New FeaturesUnveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New Features
 

Tales from the Cloudera Field

  • 1. Tales From the Cloudera Field Kevin O’Dell, Kate Ting, Aleks Shulman {kevin, kate, aleks}@cloudera.com
  • 2. ©2014 Cloudera, Inc. All rights reserved. Who Are We?
      Kevin O’Dell
      - Previously HBase Support Team Lead
      - Currently Systems Engineer with a focus on HBase deployments
      Kate Ting
      - Technical Account Manager of Cloudera’s largest HBase deployments
      - Co-author of O’Reilly’s Apache Sqoop Cookbook
      Aleks Shulman
      - HBase Test Engineer focused on ensuring HBase is enterprise ready
      - Primary focus on building compatibility frameworks for rolling upgrades
  • 3. Cloudera Internal HBase Metrics
      • Cloudera uses HBase internally for the Support Team
      • We ingest Tickets, Cluster Stats, and Apache Mailing Lists
      • Cloudera has ~20K HBase nodes under management
      • Over 60% of my accounts use HBase
  • 4. Agenda
      ● Tales Getting Production Started
      ● Tales Fixing Production Bugs
      ● Tales Upgrading Production Clusters
  • 5. Agenda
      ● Tales Getting Production Started
      ● Tales Fixing Production Bugs
      ● Tales Upgrading Production Clusters
  • 6. HBase Deployment Mistakes
      • Cluster Sizing
      • Managing Your Regions
      • General Recommendations
  • 7. Why Cluster Sizing Matters
      • Jobs Failing
      • Writes Blocking
      • Performance Issues
  • 8. Heavy Write Sizing
      Calculating Total Available Memstore
        java_max_heap = 16GB
        memstore_upper = .50
        java_max_heap * memstore_upper = memstore_total_size
      Calculating Max Regions
        desired_flush_size = 128MB
        repl_factor = 3 (default)
        max_file_size = 20GB
        memstore_total_size / desired_flush_size = total_regions_per_rs
        max_file_size * (total_regions_per_rs * repl_factor) = raw_storage_per_node
      [Chart: X-axis = Flush_Size, Y-axis = Region_Count]
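As a rough worked example, the heavy-write sizing arithmetic on slide 8 can be sketched in Python using the slide's own values. The function name and the GB-to-MB conversion (1GB = 1024MB) are assumptions made for illustration; the result is a back-of-the-envelope estimate, not a tuning recommendation.

```python
def heavy_write_sizing(java_max_heap_gb=16, memstore_upper=0.50,
                       desired_flush_size_mb=128, repl_factor=3,
                       max_file_size_gb=20):
    """Return (memstore_total_size_mb, total_regions_per_rs, raw_storage_per_node_gb)."""
    # Assumption: 1GB = 1024MB, so the heap is converted to MB before dividing.
    memstore_total_size_mb = java_max_heap_gb * 1024 * memstore_upper
    # Each region needs room for one memstore flush of the desired size.
    total_regions_per_rs = memstore_total_size_mb / desired_flush_size_mb
    # Raw storage counts every replica of every region at its maximum file size.
    raw_storage_per_node_gb = max_file_size_gb * (total_regions_per_rs * repl_factor)
    return memstore_total_size_mb, total_regions_per_rs, raw_storage_per_node_gb


memstore_mb, regions_per_rs, raw_storage_gb = heavy_write_sizing()
print(memstore_mb, regions_per_rs, raw_storage_gb)  # 8192.0 64.0 3840.0
```

With the slide's inputs this gives 8GB of memstore, 64 regions per RegionServer, and about 3.75TB of raw storage per node.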
  • 9. Update for Known Writes Sizing
      write_throughput = 20MB/s
      total_data_size = 350TB
      Calculating force flushes
        hlog_size = 128MB
        number_of_hlogs = 64
        hlog_size * number_of_hlogs = amount_of_data_before_flush
        (write_throughput * 60 * 60) / amount_of_data_before_flush = number_nodes_before_flush
      Calculating Max Regions
        total_data_size = 350TB
        maxfile_size = 20GB
        ((total_data_size * 1024) / maxfile_size) / desired_RS_count = total_regions_per_rs
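The known-writes formulas on slide 9 can be worked through the same way. This is a minimal sketch using the slide's values; the slide gives no value for desired_RS_count, so the default of 100 below is purely hypothetical, and the function name is illustrative.

```python
def known_writes_sizing(write_throughput_mb_s=20, total_data_size_tb=350,
                        hlog_size_mb=128, number_of_hlogs=64,
                        maxfile_size_gb=20, desired_rs_count=100):
    """Return (amount_of_data_before_flush_mb, number_nodes_before_flush, total_regions_per_rs)."""
    # Data that accumulates in the HLogs before a forced flush.
    amount_of_data_before_flush_mb = hlog_size_mb * number_of_hlogs
    # One hour of writes at the stated throughput (MB/s * 60 * 60).
    hourly_write_volume_mb = write_throughput_mb_s * 60 * 60
    number_nodes_before_flush = hourly_write_volume_mb / amount_of_data_before_flush_mb
    # Total regions for the data set (TB converted to GB), spread over the RegionServers.
    # desired_rs_count is a hypothetical input, not a value from the slides.
    total_regions = (total_data_size_tb * 1024) / maxfile_size_gb
    total_regions_per_rs = total_regions / desired_rs_count
    return amount_of_data_before_flush_mb, number_nodes_before_flush, total_regions_per_rs


flush_mb, nodes, regions_per_rs = known_writes_sizing()
print(flush_mb, nodes, regions_per_rs)
```

With the slide's inputs, 8GB accumulates in the HLogs before a forced flush and the data set needs 17,920 regions in total, which the sketch then divides by the assumed RegionServer count.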
  • 10. Why is Region Management Important
      • Initial loads are failing
      • Region Servers are crashing from overload
  • 11. Region Management Best Practices
      Region Split Policy
        ConstantSize: Split on Max Filesize. Use when pre-splitting all tables.
        UpperBoundSplitPolicy: Split on smarter intervals. Use when not able to pre-split all tables.
      Balancer Policy
        SimpleLoadBalancer: Aimlessly balance regions. Use with lots of tables with low region count.
        ByTable: Balance by table. Use with few tables with high region count.
  • 12. General Recommendations
      Short Circuit Reads (SCR)
        Benefit: Speed up read times by bypassing the datanode layer
        When to enable: Always
      Snappy Compression
        Benefit: Speed up read times and lower data consumption
        When to enable: On heavily accessed tables
      Bloom Filters
        Benefit: Speed up read times when numerous HFiles are present
        When to enable: Row should always be used; Row+Column is more accurate but uses more memory
      HLog Compression
        Benefit: Speed up writes and recovery times
        When to enable: Always
      Data Block Encoding
        Benefit: Compress long keys to store more in cache
        When to enable: Best for short/tall tables with long, similar keys. Scans may be slower
  • 13. Agenda
      ● Tales Getting Production Started
      ● Tales Fixing Production Bugs
      ● Tales Upgrading Production Clusters
  • 14. Tales Fixing Production Bugs
      ● RegionServer Hotspotting
      ● Faulty Hardware
      ● Application Bug
  • 15. Tales Fixing Production Bugs
      ● RegionServer Hotspotting
      ● Faulty Hardware
      ● Application Bug
  • 16. Fixing #1: RegionServer Hotspotting - Solution
      ● Spread rows over all RS by salting the row key
      ● Hundreds of regions available, but increments only go to tens of regions
      ● While locks wait to time out, blocked clients hold onto handlers
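Salting the row key, as mentioned on slide 16, can be sketched as follows. This is a minimal illustration of one common scheme (a hash-derived prefix), not necessarily the scheme used in the deployment described; the function name, the bucket count of 16, and the `NN-` prefix format are all assumptions.

```python
import hashlib


def salted_row_key(row_key: str, num_buckets: int = 16) -> str:
    """Prefix row_key with a stable salt so that keys spread over num_buckets ranges."""
    # A hash of the key (rather than a random number) keeps the salt
    # reproducible, so readers can recompute it when looking a row up.
    digest = hashlib.md5(row_key.encode("utf-8")).digest()
    salt = int.from_bytes(digest[:4], "big") % num_buckets
    return f"{salt:02d}-{row_key}"


# Sequential keys that would otherwise land in neighbouring regions
# get prefixes drawn from the full set of buckets.
for key in ("event-0001", "event-0002", "event-0003"):
    print(salted_row_key(key))
```

The trade-off is that range scans over the original key order now have to fan out across every salt bucket.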
  • 17. Fixing #1: RegionServer Hotspotting - Solution
      ● Option 1: Change row key to something that scales
        ○ Reduce contention by reducing connections: each client picks one salt and writes only to one RS
      ● Option 2: Implement new coalescing feature in native HBaseSink, compressing entire batch of Flume events into single HBase RPC call
        [row1, colA+=1] [row1, colB+=1] [row1, colB+=1] => [row1 colA+=1 colB+=2]
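The coalescing idea in Option 2 can be illustrated with a small Python sketch that merges a batch of increments per row and column before anything is sent. This is not the HBaseSink implementation; the function name and the `(row, column, delta)` tuple layout are hypothetical.

```python
from collections import defaultdict


def coalesce_increments(events):
    """Merge (row, column, delta) increments into one total per row and column."""
    merged = defaultdict(lambda: defaultdict(int))
    for row, column, delta in events:
        merged[row][column] += delta
    # Convert to plain dicts so the result is easy to inspect or serialize.
    return {row: dict(columns) for row, columns in merged.items()}


batch = [("row1", "colA", 1), ("row1", "colB", 1), ("row1", "colB", 1)]
print(coalesce_increments(batch))  # {'row1': {'colA': 1, 'colB': 2}}
```

Three increments against `row1` collapse into a single update, which is what lets a whole batch travel in one RPC call instead of one call per event.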
  • 18. Tales Fixing Production Bugs
      ● RegionServer Hotspotting
      ● Faulty Hardware
      ● Application Bug
  • 19. Fixing #2: Faulty Hardware
      ● Diagnostics run on bad hardware caused HBase failures
      ● HBase recoverability = RS back online + locality (compaction)
      ● Stress test with prod load before needed (i.e. holiday season)
      ● Imagine financial impact of 7 hours of downtime?
  • 20. Fixing #2: Faulty Hardware - Solution
      ● Recover faster by failing fast
        ○ Too many retries cause HBase task to exit before it can print exception identifying stuck RS
      ● Decrease time needed to finish HBase major compaction
        ○ Run multiple threads during compaction
      ● Replay in parallel
        ○ Decrease HLog size to limit # of edits to be replayed, increase # of HLogs, constrain WAL file size to minimize time corresponding region is not available
  • 21. Fixing #2: Faulty Hardware - Solution
      ● Shorten column family names
        ○ Reduce scan time, skip bulk loads, reduce memory usage
      ● Turn off write cache
        ○ Node crash erases writes in memory, rebuilds block with outdated data, causing corrupt replica
      ● Turn on checksum
        ○ Enables RS to use other replicas from the cluster instead of failing the operation if there’s a corrupted HFile
  • 22. Tales Fixing Production Bugs
      ● RegionServer Hotspotting
      ● Faulty Hardware
      ● Application Bug
  • 23. Fixing #3: Application Bug
      ● HBase timestamps were hardcoded to be too far out - new data written went unused
      ● Bug put backup system out of commission for one month
        ○ More vulnerable to HBase outages
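Why a far-future timestamp makes newer writes go unused can be shown with a toy model. The sketch below is not HBase code: it only mimics the default read behaviour of returning the version with the highest timestamp, and the class name and timestamp values are made up for illustration.

```python
class ToyCell:
    """Toy versioned cell: a read returns the value with the highest timestamp."""

    def __init__(self):
        self.versions = {}  # timestamp (ms) -> value

    def put(self, value, timestamp_ms):
        self.versions[timestamp_ms] = value

    def get(self):
        # The newest timestamp wins, regardless of the order of the writes.
        return self.versions[max(self.versions)]


cell = ToyCell()
cell.put("stale value", timestamp_ms=9_999_999_999_999)  # hardcoded far in the future
cell.put("fresh value", timestamp_ms=1_400_000_000_000)  # written later, with the real clock
print(cell.get())  # stale value
```

The later write is stored but never surfaces on a read, because the earlier write still carries the larger timestamp.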
  • 24. Fixing #3: Application Bug
  • 25. Fixing #3: Application Bug - Solution
      ● Detailed knowledge of internals required to undo damage
        ○ Modified the timestamp to some time in the past for all records via custom MR jobs over one month:
          ■ back up data, generate new HFile with correct timestamp, bulkload data, run MD5
      ● Don’t muck around with setting the timestamp yourself
      ● Do use always-increasing timestamps for new puts to a row
      ● Do use a separate timestamp attribute of the row
  • 26. Agenda
      ● Tales Getting Production Started
      ● Tales Fixing Production Bugs
      ● Tales Upgrading Production Clusters
  • 27. Internal Case Study: CDH4->C5 (0.94->0.96)
    What Happened?
    • Upgrade automation failed
    Root Cause
    • HBase Snapshots vs. HDFS Snapshots
    • Snapshot directory rename
    Outcome
    • All issues resolved before C5b1 was shipped
    2013-07-12 17:11:42,656 ERROR org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader: Encountered exception on operation MkdirOp [length=0, inodeId=0, path=/hbase/.snapshot, timestamp=1373674083434, permissions=hbase:supergroup:rwxr-xr-x, opCode=OP_MKDIR, txid=614]
    org.apache.hadoop.HadoopIllegalArgumentException: ".snapshot" is a reserved name. Please rename it before upgrade.
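A pre-upgrade check for this kind of name collision could look like the following Python sketch. It assumes you already have a list of HDFS paths as strings (for example, from a recursive directory listing); the function name and the sample listing are hypothetical, and only the reserved name `.snapshot` comes from the error message above.

```python
from pathlib import PurePosixPath

RESERVED_NAME = ".snapshot"  # name reported as reserved in the NameNode error

def find_reserved_paths(paths):
    """Return the paths that contain a component exactly equal to the reserved name."""
    return [p for p in paths if RESERVED_NAME in PurePosixPath(p).parts]

# Hypothetical listing; in practice this would come from the cluster.
listing = [
    "/hbase/.snapshot",
    "/hbase/.snapshot/snap1",
    "/hbase/t1/.snapshots_old",
    "/user/hbase/data",
]

conflicts = find_reserved_paths(listing)
for path in conflicts:
    print("rename before upgrade:", path)
```

Matching on whole path components rather than substrings avoids flagging names such as `.snapshots_old` that merely start with the reserved word.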
  • 28. Automating Upgrades: Testing the Upgrade Lifecycle
  • 29. What is Important?
    The Administrator Experience Matters
    ● Major version upgrades
    ● Rolling upgrades
    The Developer Experience Matters
    ● API Compatibility Testing
  • 30. And Here Is Why It Is Important
    Customer Continuity
    • Smooth upgrades
    • Curated process
    • Understanding of customer cluster lifecycle
    Developer Continuity
    • Forward and backward compatibility
    • Binary Compatibility
    • Wire Compatibility
    Automation
    • You can only really make a guarantee about things that are automated
    • Product is easier to support
    • Confidence is only possible with testing
  • 32. Cold vs. Rolling Upgrades
    [Diagram: version timeline C3u5, CDH4.0.x, CDH4.1.x, CDH4.2.x, CDH4.3.x, CDH4.4.x, CDH4.5.x, CDH4.6.x, C5.0, C5.1, with arrows labeled "Rolling Upgrade" and "Cold Upgrade"]
  • 33. Upgrades from HBase 0.90 -> 0.98
    CDH Version    HBase Version
    CDH3u5         HBase 0.90.6
    CDH4.1.0       HBase 0.92.1
    CDH4.2.0       HBase 0.94.2
    CDH4.4.0       HBase 0.94.6
    CDH4.6.0       HBase 0.94.15
    CDH5.0.0       HBase 0.96.1.1
    CDH5.1.0       HBase 0.98.1
    A, B, C: Upgrade from version A -> version B -> version C
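The CDH-to-HBase mapping on this slide can be captured as a small lookup table, which makes an A -> B -> C upgrade path easy to inspect. This is an illustrative Python sketch: the version pairs come from the slide, while the helper names and the choice of CDH4.4.0 as the intermediate step are assumptions made for the example.

```python
# CDH release -> bundled HBase version, as listed on the slide.
CDH_TO_HBASE = {
    "CDH3u5": "0.90.6",
    "CDH4.1.0": "0.92.1",
    "CDH4.2.0": "0.94.2",
    "CDH4.4.0": "0.94.6",
    "CDH4.6.0": "0.94.15",
    "CDH5.0.0": "0.96.1.1",
    "CDH5.1.0": "0.98.1",
}

def version_key(hbase_version):
    """Turn '0.94.15' into (0, 94, 15) so versions compare numerically."""
    return tuple(int(part) for part in hbase_version.split("."))

def hbase_path(cdh_path):
    """Map a CDH upgrade path to HBase versions, rejecting any downgrade step."""
    versions = [CDH_TO_HBASE[cdh] for cdh in cdh_path]
    keys = [version_key(v) for v in versions]
    if any(later <= earlier for earlier, later in zip(keys, keys[1:])):
        raise ValueError("upgrade path must move to a newer HBase at every step")
    return versions

# Hypothetical A -> B -> C path; the intermediate CDH4 release is an arbitrary pick.
path = hbase_path(["CDH3u5", "CDH4.4.0", "CDH5.0.0"])
print(" -> ".join(path))
```

Comparing versions as integer tuples matters here because a plain string comparison would order 0.94.15 before 0.94.2.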
  • 34. Cold Upgrade Results
    ● Upgrades work!
    ● Steps:
      ○ Start at CDH3u5
      ○ Upgrade to a version of CDH4
      ○ Upgrade to CDH5.0.0
    ● Data Integrity
      ○ Different bloom filters
      ○ Different compression formats
    ● Next Steps
      ○ CDH 5.1.0 expected to be based on 0.98.1
  • 35. Rolling Upgrade Results
    ● What is tested?
      ○ Ingest via Java API
      ○ MapReduce over HBase
        ■ Bulk load
        ■ RowCount/Export
    ● Status
      ○ Rolling upgrade broken (red) in CDH <=4.1.2 due to region_mover issue
      ○ Soft failure (yellow) for starting version <CDH4.1.0 due to MapReduce JT/TT version mismatch issue
      ○ All else green!
    How to Read This: Pick a column and read down to see for which versions rolling upgrades are advised
  • 36. Improved Supportability Through Testing
    Case Study: Customer Rolling Upgrade Simulation
    Large Customer
    ● Upgrading from CDH4.1.4+patches
    ● Considered several CDH versions to upgrade to
      ○ Custom patches
    Automation
    ● Automated testing added to simulate rolling upgrade
      ○ CM
      ○ HA+QJM
      ○ Parcels
    ● Scales
      ○ 4 nodes, 20 nodes, 80 nodes
    ● Subsequently used for other customers with similar upgrade paths
  • 37. Here’s to Fewer Tales Next Year...
    ● Automated Testing
    ● Better Cluster Mgmt
    ● Fewer Tales From the Field
  • 38. Questions?
    Kevin O’Dell @kevinrodell
    Kate Ting @kate_ting
    Aleks Shulman @a_shulman
    @clouderaTest