SlideShare ist ein Scribd-Unternehmen logo
1 von 28
Downloaden Sie, um offline zu lesen
hbase-2.0.0
Michael Stack
Release Manager
stack@apache.org
Not Yet Released!
● 2.0.0-alpha1 released June 8th, 2017
○ Preview Release, ‘rough cut’.
● Plan: Alphas => Betas => Final
Release
● hbase-2.0.0 EOY 2017EOY 2017
Goals: Compatibility
● Double-down on Semantic Versioning, semver
○ Adopted in hbase-1.0.0
○ MAJOR.MINOR.PATCH[-IDENTIFIER]
■ E.g. 2.0.0-alpha1
But….
● Semantic Versioning is about API only
○ What about….
■ Internal/External Interfaces
● Where is Client Interface when Spark/MapReduce
API Annotations
● From Hadoop…
○ InterfaceAudience.Public
■ Get/Put/Scan/Connection
○ InterfaceAudience.LimitedPublic
■ Coprocessors, Replication, etc.
○ InterfaceAudience.Private
■ Internal only
But (Again)...
● What about…
○ Source/Binary compatibility
○ Serializations
■ Wire
■ Formats in HDFS/Zookeeper
○ Dependencies
● See refguide semver section
But still….
● Grey areas…
○ Protobufs
■ Protobufs in API
■ Protobufs as Interface (c++/go clients)
Goal: Rolling Upgrade 1.x to 2.x
● SemVer for DML at least in 2.x
○ No admin of hbase-2.x with hbase-1.x client
○ Coprocessors will need to be upgraded
● hbase-1.x client can work against hbase-2.x cluster
○ Even 1.x Coprocessor Endpoints will work with an hbase-2.x cluster
● No Singularity! No downtime.
Other Goals
● Scale
○ More Regions, bigger clusters
● Performance
○ Inline read/write but also macro restart, assign, etc.
○ Better resource utilization
■ I/O, RAM
● Fix primary root of operational woes/bugs
○ Master Region Assignment
● Cleanup
○ Spark narrative
○ Interfaces
2.0.0
What’s inside hbase-2.0.0?
● Currently >4400 issues resolved
○ ~500 exclusive to 2.0.0
Inside: Prerequisites
● JDK8 only
● Hadoop-2.7.1 minimum
○ Will work against the coming Hadoop-3.x
Inside: Main Features
● New Master Core (A.K.A AMv2)
○ Prompt assign of millions of Regions, faster startup, larger scale!
○ Many small Regions instead of a few big ones
○ Promise: new degree of Resilience
● Offheap Read/Write path
○ Less JVM
● Accordion
○ In-memory compaction
○ Less write amplification
● New Async Client
● etc.
Assignment Manager
● Assignment Manager v1 root of many operational headaches
● Redo based on custom “ProcedureV2”-based State Machine
○ Scale/Performance
○ All Master ops recast as Pv2 procedures
■ E.g. Move is a Procedure composed of an Unassign and Assign subprocedure...
● Operation aggregation
● Entity locking mechanism
○ Isolate operations on tables, regions, servers
○ Parallelization
● One hbase:meta writer, the Master only
Assignment Manager
● No more intermediate state in ZK
○ At other end of an RPC...
○ Record intent in local Master ops WAL, persisted up in HDFS
○ Only final state published to hbase:meta
● No more HBCK, no more RIT
○ No more distributed state
● Faster Assign
○ PE AMv2
● Standalone Testable
Offheap
● Smaller JVM heaps, less copying
○ More accounting!
● Offheap Read Path
○ L1 on Heap for Indexes & Blooms
○ L2 off Heap via BucketCache for Data
■ Big
○ Better latency
■ Cache more
■ Less GC, less erratic
● Offheap Write Path
○ RPC=>HDFS data kept offheap (+Async WAL client)
● Follow-ons:
○ BucketCache always-on
Accordion
● In-memory LSM
○ In-memory flush from ConcurrentSkipListMap to read-only pipeline of ‘segments’
■ Memory ‘breathes’ like an accordion’s bellows...
■ Optional in-memory compaction of Segments
● Prune deletes/versions early
● Benefit depends on # of versions
■ Less flushing, write amplification
● Flavors:
○ NONE
○ BASIC (Default)
○ ENHANCED
● Symbiosis w/ Offheaping
○ MemStore SLABs offheap
○ Compact in-memory layout (CellChunkMap)
Miscellaneous: Complete
● Medium Object Blobs (MOB)
○ Documents, Images, etc.
○ 100k => 10MB
● RegionServer Groups (rsgroups)
○ Coarse Isolation
○ Tables vs RegionServers
● WALs and HFiles in different Filesystems
○ AWS EMR HBase on S3 offering (Amazon S3 Storage Mode)
● New Netty Server/Client chassis
● Filesystem Quotas on Table/Namespace sizes
○ Per Table, per Namespace -- not per User
○ Violation Policy
○ Via Shell
Miscellaneous: In progress, BLOCKERs
● Spark module
○ Cleanup of our HBase Spark story
● AsyncDFSClient
○ Done but for the testing
● Update+Relocation of core dependencies
○ hbase-thirdparty
■ Guava 0.12 => 0.22
■ Protobuf 2.5 => 3.3
■ Netty
○ Etc.
Miscellaneous: In progress, Nice-to-have
● C++Client
● Backup/Restore
● Hybrid Logical Clock
○ Leap Seconds or Errant clock
○ Overload timestamp to do time and sequence id duty
○ Metadata only for 2.x
TODOs: BLOCKERs
● Edit of default configs
○ Stale
● API compatibility review
○ And that it wholesome!
● Testing
○ Each new feature/configuration
○ Rolling upgrade
■ Different data provenance/Starting point
● Operation & Scan Timeout Narrative Cleanup
● Sequence Identifier narrative
● Tie-it-off
○ We keep adding to 2.0.0… it’ll never release.
Moving to 2.x
● API Cleanup
○ Interfaces are returned rather than Implementations
■ TableDescriptor instead of HTableDescriptor, etc.
○ Purged Protobuf and Guava classes from API
■ Purged from CLASSPATH too….(relocated)
● Evangelism:
○ Use shaded hbase-client and hbase-server instead going forward
hbase-3.0.0
● Split hbase:meta
● Refactor filesystem Interface and Implementation
○ Less dependency on HDFS
○ Scale
● HLC everywhere all the time on all tables
● Replication v2
● Dynamic Online Configuration
That’s it!
● Thank you to our host Huawei
● Running 2.0 Status: hbase-2.0.0 Status Doc
● References:
○ Matteo Bertozzi on hbase2: https://speakerdeck.com/matteobertozzi/hbase-2-dot-0-sf-meetup-oct-15
○ Enis Soztutar on hbase2: https://www.slideshare.net/enissoz/meet-hbase-20
● Images:
● Accordion: https://static.roland.com/assets/images/products/gallery/fr-1x_top_open_gal.jpg
● Version2: http://logofaves.com/wp-content/uploads/2009/06/version_m.jpg?9cf02b
● Long Time: http://www.junodownload.com/products/one-night-only-long-time-coming/1957247-02/
● Comic:
http://4.bp.blogspot.com/-upwza0_lLn4/TmXB4lKkPKI/AAAAAAAAAHY/9lA7VYCmSkI/s1600/heap_0001.jpg
● Stork: http://phicenter.org/wp-content/uploads/2012/08/stork_baby_delivery_720x540.jpg
● Apache Helicopter: https://i.ytimg.com/vi/SQp-3fUBefA/maxresdefault.jpg
● Goals: http://az616578.vo.msecnd.net/files/2016/02/29/6359235488335557721545302495_GOALS600.png

Weitere ähnliche Inhalte

Was ist angesagt?

Was ist angesagt? (20)

Apache HBase, Accelerated: In-Memory Flush and Compaction
Apache HBase, Accelerated: In-Memory Flush and Compaction Apache HBase, Accelerated: In-Memory Flush and Compaction
Apache HBase, Accelerated: In-Memory Flush and Compaction
 
HBaseCon 2012 | Learning HBase Internals - Lars Hofhansl, Salesforce
HBaseCon 2012 | Learning HBase Internals - Lars Hofhansl, SalesforceHBaseCon 2012 | Learning HBase Internals - Lars Hofhansl, Salesforce
HBaseCon 2012 | Learning HBase Internals - Lars Hofhansl, Salesforce
 
Meet HBase 1.0
Meet HBase 1.0Meet HBase 1.0
Meet HBase 1.0
 
HBase at Xiaomi
HBase at XiaomiHBase at Xiaomi
HBase at Xiaomi
 
HBaseCon2017 Improving HBase availability in a multi tenant environment
HBaseCon2017 Improving HBase availability in a multi tenant environmentHBaseCon2017 Improving HBase availability in a multi tenant environment
HBaseCon2017 Improving HBase availability in a multi tenant environment
 
HBaseCon 2015: Taming GC Pauses for Large Java Heap in HBase
HBaseCon 2015: Taming GC Pauses for Large Java Heap in HBaseHBaseCon 2015: Taming GC Pauses for Large Java Heap in HBase
HBaseCon 2015: Taming GC Pauses for Large Java Heap in HBase
 
HBaseCon 2015: HBase 2.0 and Beyond Panel
HBaseCon 2015: HBase 2.0 and Beyond PanelHBaseCon 2015: HBase 2.0 and Beyond Panel
HBaseCon 2015: HBase 2.0 and Beyond Panel
 
HBaseCon 2015: HBase Operations at Xiaomi
HBaseCon 2015: HBase Operations at XiaomiHBaseCon 2015: HBase Operations at Xiaomi
HBaseCon 2015: HBase Operations at Xiaomi
 
Date-tiered Compaction Policy for Time-series Data
Date-tiered Compaction Policy for Time-series DataDate-tiered Compaction Policy for Time-series Data
Date-tiered Compaction Policy for Time-series Data
 
Keynote: Apache HBase at Yahoo! Scale
Keynote: Apache HBase at Yahoo! ScaleKeynote: Apache HBase at Yahoo! Scale
Keynote: Apache HBase at Yahoo! Scale
 
HBaseCon 2012 | Base Metrics: What They Mean to You - Cloudera
HBaseCon 2012 | Base Metrics: What They Mean to You - ClouderaHBaseCon 2012 | Base Metrics: What They Mean to You - Cloudera
HBaseCon 2012 | Base Metrics: What They Mean to You - Cloudera
 
HBase: Where Online Meets Low Latency
HBase: Where Online Meets Low LatencyHBase: Where Online Meets Low Latency
HBase: Where Online Meets Low Latency
 
HBaseCon 2015: HBase at Scale in an Online and High-Demand Environment
HBaseCon 2015: HBase at Scale in an Online and  High-Demand EnvironmentHBaseCon 2015: HBase at Scale in an Online and  High-Demand Environment
HBaseCon 2015: HBase at Scale in an Online and High-Demand Environment
 
HBaseCon 2015: Multitenancy in HBase
HBaseCon 2015: Multitenancy in HBaseHBaseCon 2015: Multitenancy in HBase
HBaseCon 2015: Multitenancy in HBase
 
Breaking the Sound Barrier with Persistent Memory
Breaking the Sound Barrier with Persistent Memory Breaking the Sound Barrier with Persistent Memory
Breaking the Sound Barrier with Persistent Memory
 
hbaseconasia2017: Building online HBase cluster of Zhihu based on Kubernetes
hbaseconasia2017: Building online HBase cluster of Zhihu based on Kuberneteshbaseconasia2017: Building online HBase cluster of Zhihu based on Kubernetes
hbaseconasia2017: Building online HBase cluster of Zhihu based on Kubernetes
 
HBaseConAsia2018 Track1-2: WALLess HBase with persistent memory devices
HBaseConAsia2018 Track1-2: WALLess HBase with persistent memory devicesHBaseConAsia2018 Track1-2: WALLess HBase with persistent memory devices
HBaseConAsia2018 Track1-2: WALLess HBase with persistent memory devices
 
HBaseCon 2012 | Solbase - Kyungseog Oh, Photobucket
HBaseCon 2012 | Solbase - Kyungseog Oh, PhotobucketHBaseCon 2012 | Solbase - Kyungseog Oh, Photobucket
HBaseCon 2012 | Solbase - Kyungseog Oh, Photobucket
 
Application Caching: The Hidden Microservice (SAConf)
Application Caching: The Hidden Microservice (SAConf)Application Caching: The Hidden Microservice (SAConf)
Application Caching: The Hidden Microservice (SAConf)
 
Streaming Replication (Keynote @ PostgreSQL Conference 2009 Japan)
Streaming Replication (Keynote @ PostgreSQL Conference 2009 Japan)Streaming Replication (Keynote @ PostgreSQL Conference 2009 Japan)
Streaming Replication (Keynote @ PostgreSQL Conference 2009 Japan)
 

Ähnlich wie hbaseconasia2017: hbase-2.0.0

Large Scale NoSql DB Migration Under Fire - Ido Barkan - DevOpsDays Tel Aviv ...
Large Scale NoSql DB Migration Under Fire - Ido Barkan - DevOpsDays Tel Aviv ...Large Scale NoSql DB Migration Under Fire - Ido Barkan - DevOpsDays Tel Aviv ...
Large Scale NoSql DB Migration Under Fire - Ido Barkan - DevOpsDays Tel Aviv ...
DevOpsDays Tel Aviv
 

Ähnlich wie hbaseconasia2017: hbase-2.0.0 (20)

HBaseConAsia2018 Keynote1: Apache HBase Project Status
HBaseConAsia2018 Keynote1: Apache HBase Project StatusHBaseConAsia2018 Keynote1: Apache HBase Project Status
HBaseConAsia2018 Keynote1: Apache HBase Project Status
 
Logs @ OVHcloud
Logs @ OVHcloudLogs @ OVHcloud
Logs @ OVHcloud
 
Large scale overlay networks with ovn: problems and solutions
Large scale overlay networks with ovn: problems and solutionsLarge scale overlay networks with ovn: problems and solutions
Large scale overlay networks with ovn: problems and solutions
 
M|18 Why Abstract Away the Underlying Database Infrastructure
M|18 Why Abstract Away the Underlying Database InfrastructureM|18 Why Abstract Away the Underlying Database Infrastructure
M|18 Why Abstract Away the Underlying Database Infrastructure
 
Tips & Tricks for Apache Kafka®
Tips & Tricks for Apache Kafka®Tips & Tricks for Apache Kafka®
Tips & Tricks for Apache Kafka®
 
Apache Kafka - Martin Podval
Apache Kafka - Martin PodvalApache Kafka - Martin Podval
Apache Kafka - Martin Podval
 
My Sql Proxy
My Sql ProxyMy Sql Proxy
My Sql Proxy
 
Como creamos QuestDB Cloud, un SaaS basado en Kubernetes alrededor de QuestDB...
Como creamos QuestDB Cloud, un SaaS basado en Kubernetes alrededor de QuestDB...Como creamos QuestDB Cloud, un SaaS basado en Kubernetes alrededor de QuestDB...
Como creamos QuestDB Cloud, un SaaS basado en Kubernetes alrededor de QuestDB...
 
Percona XtraBackup - New Features and Improvements
Percona XtraBackup - New Features and ImprovementsPercona XtraBackup - New Features and Improvements
Percona XtraBackup - New Features and Improvements
 
Netflix Open Source Meetup Season 4 Episode 2
Netflix Open Source Meetup Season 4 Episode 2Netflix Open Source Meetup Season 4 Episode 2
Netflix Open Source Meetup Season 4 Episode 2
 
MariaDB MaxScale: an Intelligent Database Proxy
MariaDB MaxScale:  an Intelligent Database ProxyMariaDB MaxScale:  an Intelligent Database Proxy
MariaDB MaxScale: an Intelligent Database Proxy
 
Type safe, versioned, and rewindable stream processing with Apache {Avro, K...
Type safe, versioned, and rewindable stream processing  with  Apache {Avro, K...Type safe, versioned, and rewindable stream processing  with  Apache {Avro, K...
Type safe, versioned, and rewindable stream processing with Apache {Avro, K...
 
MariaDB MaxScale: an Intelligent Database Proxy
MariaDB MaxScale: an Intelligent Database ProxyMariaDB MaxScale: an Intelligent Database Proxy
MariaDB MaxScale: an Intelligent Database Proxy
 
MySQL Parallel Replication: inventory, use-case and limitations
MySQL Parallel Replication: inventory, use-case and limitationsMySQL Parallel Replication: inventory, use-case and limitations
MySQL Parallel Replication: inventory, use-case and limitations
 
Accumulo Summit Keynote 2018
Accumulo Summit Keynote 2018Accumulo Summit Keynote 2018
Accumulo Summit Keynote 2018
 
Paper reading: HashKV and beyond
Paper reading: HashKV and beyondPaper reading: HashKV and beyond
Paper reading: HashKV and beyond
 
Build real time stream processing applications using Apache Kafka
Build real time stream processing applications using Apache KafkaBuild real time stream processing applications using Apache Kafka
Build real time stream processing applications using Apache Kafka
 
Large Scale NoSql DB Migration Under Fire - Ido Barkan - DevOpsDays Tel Aviv ...
Large Scale NoSql DB Migration Under Fire - Ido Barkan - DevOpsDays Tel Aviv ...Large Scale NoSql DB Migration Under Fire - Ido Barkan - DevOpsDays Tel Aviv ...
Large Scale NoSql DB Migration Under Fire - Ido Barkan - DevOpsDays Tel Aviv ...
 
RubiX
RubiXRubiX
RubiX
 
PGConf APAC 2018 - High performance json postgre-sql vs. mongodb
PGConf APAC 2018 - High performance json  postgre-sql vs. mongodbPGConf APAC 2018 - High performance json  postgre-sql vs. mongodb
PGConf APAC 2018 - High performance json postgre-sql vs. mongodb
 

Mehr von HBaseCon

HBaseCon2017 Spark HBase Connector: Feature Rich and Efficient Access to HBas...
HBaseCon2017 Spark HBase Connector: Feature Rich and Efficient Access to HBas...HBaseCon2017 Spark HBase Connector: Feature Rich and Efficient Access to HBas...
HBaseCon2017 Spark HBase Connector: Feature Rich and Efficient Access to HBas...
HBaseCon
 
HBaseCon2017 HBase/Phoenix @ Scale @ Salesforce
HBaseCon2017 HBase/Phoenix @ Scale @ SalesforceHBaseCon2017 HBase/Phoenix @ Scale @ Salesforce
HBaseCon2017 HBase/Phoenix @ Scale @ Salesforce
HBaseCon
 
HBaseCon2017 Achieving HBase Multi-Tenancy with RegionServer Groups and Favor...
HBaseCon2017 Achieving HBase Multi-Tenancy with RegionServer Groups and Favor...HBaseCon2017 Achieving HBase Multi-Tenancy with RegionServer Groups and Favor...
HBaseCon2017 Achieving HBase Multi-Tenancy with RegionServer Groups and Favor...
HBaseCon
 

Mehr von HBaseCon (20)

hbaseconasia2017: HBase on Beam
hbaseconasia2017: HBase on Beamhbaseconasia2017: HBase on Beam
hbaseconasia2017: HBase on Beam
 
hbaseconasia2017: HBase Disaster Recovery Solution at Huawei
hbaseconasia2017: HBase Disaster Recovery Solution at Huaweihbaseconasia2017: HBase Disaster Recovery Solution at Huawei
hbaseconasia2017: HBase Disaster Recovery Solution at Huawei
 
hbaseconasia2017: Removable singularity: a story of HBase upgrade in Pinterest
hbaseconasia2017: Removable singularity: a story of HBase upgrade in Pinteresthbaseconasia2017: Removable singularity: a story of HBase upgrade in Pinterest
hbaseconasia2017: Removable singularity: a story of HBase upgrade in Pinterest
 
hbaseconasia2017: HareQL:快速HBase查詢工具的發展過程
hbaseconasia2017: HareQL:快速HBase查詢工具的發展過程hbaseconasia2017: HareQL:快速HBase查詢工具的發展過程
hbaseconasia2017: HareQL:快速HBase查詢工具的發展過程
 
hbaseconasia2017: 基于HBase的企业级大数据平台
hbaseconasia2017: 基于HBase的企业级大数据平台hbaseconasia2017: 基于HBase的企业级大数据平台
hbaseconasia2017: 基于HBase的企业级大数据平台
 
hbaseconasia2017: HBase at JD.com
hbaseconasia2017: HBase at JD.comhbaseconasia2017: HBase at JD.com
hbaseconasia2017: HBase at JD.com
 
hbaseconasia2017: Ecosystems with HBase and CloudTable service at Huawei
hbaseconasia2017: Ecosystems with HBase and CloudTable service at Huaweihbaseconasia2017: Ecosystems with HBase and CloudTable service at Huawei
hbaseconasia2017: Ecosystems with HBase and CloudTable service at Huawei
 
HBaseCon2017 Democratizing HBase
HBaseCon2017 Democratizing HBaseHBaseCon2017 Democratizing HBase
HBaseCon2017 Democratizing HBase
 
HBaseCon2017 Quanta: Quora's hierarchical counting system on HBase
HBaseCon2017 Quanta: Quora's hierarchical counting system on HBaseHBaseCon2017 Quanta: Quora's hierarchical counting system on HBase
HBaseCon2017 Quanta: Quora's hierarchical counting system on HBase
 
HBaseCon2017 Transactions in HBase
HBaseCon2017 Transactions in HBaseHBaseCon2017 Transactions in HBase
HBaseCon2017 Transactions in HBase
 
HBaseCon2017 Highly-Available HBase
HBaseCon2017 Highly-Available HBaseHBaseCon2017 Highly-Available HBase
HBaseCon2017 Highly-Available HBase
 
HBaseCon2017 Apache HBase at Didi
HBaseCon2017 Apache HBase at DidiHBaseCon2017 Apache HBase at Didi
HBaseCon2017 Apache HBase at Didi
 
HBaseCon2017 Spark HBase Connector: Feature Rich and Efficient Access to HBas...
HBaseCon2017 Spark HBase Connector: Feature Rich and Efficient Access to HBas...HBaseCon2017 Spark HBase Connector: Feature Rich and Efficient Access to HBas...
HBaseCon2017 Spark HBase Connector: Feature Rich and Efficient Access to HBas...
 
HBaseCon2017 Efficient and portable data processing with Apache Beam and HBase
HBaseCon2017 Efficient and portable data processing with Apache Beam and HBaseHBaseCon2017 Efficient and portable data processing with Apache Beam and HBase
HBaseCon2017 Efficient and portable data processing with Apache Beam and HBase
 
HBaseCon2017 HBase at Xiaomi
HBaseCon2017 HBase at XiaomiHBaseCon2017 HBase at Xiaomi
HBaseCon2017 HBase at Xiaomi
 
HBaseCon2017 HBase/Phoenix @ Scale @ Salesforce
HBaseCon2017 HBase/Phoenix @ Scale @ SalesforceHBaseCon2017 HBase/Phoenix @ Scale @ Salesforce
HBaseCon2017 HBase/Phoenix @ Scale @ Salesforce
 
HBaseCon2017 Community-Driven Graphs with JanusGraph
HBaseCon2017 Community-Driven Graphs with JanusGraphHBaseCon2017 Community-Driven Graphs with JanusGraph
HBaseCon2017 Community-Driven Graphs with JanusGraph
 
HBaseCon2017 Warp 10, a novel approach to managing and analyzing time series ...
HBaseCon2017 Warp 10, a novel approach to managing and analyzing time series ...HBaseCon2017 Warp 10, a novel approach to managing and analyzing time series ...
HBaseCon2017 Warp 10, a novel approach to managing and analyzing time series ...
 
HBaseCon2017 Analyzing cryptocurrencies in real time with hBase, Kafka and St...
HBaseCon2017 Analyzing cryptocurrencies in real time with hBase, Kafka and St...HBaseCon2017 Analyzing cryptocurrencies in real time with hBase, Kafka and St...
HBaseCon2017 Analyzing cryptocurrencies in real time with hBase, Kafka and St...
 
HBaseCon2017 Achieving HBase Multi-Tenancy with RegionServer Groups and Favor...
HBaseCon2017 Achieving HBase Multi-Tenancy with RegionServer Groups and Favor...HBaseCon2017 Achieving HBase Multi-Tenancy with RegionServer Groups and Favor...
HBaseCon2017 Achieving HBase Multi-Tenancy with RegionServer Groups and Favor...
 

Kürzlich hochgeladen

Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 

Kürzlich hochgeladen (20)

Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontology
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
WSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering Developers
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital Adaptability
 
Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..
 

hbaseconasia2017: hbase-2.0.0

  • 2.
  • 3. Not Yet Released! ● 2.0.0-alpha1 released June 8th, 2017 ○ Preview Release, ‘rough cut’. ● Plan: Alphas => Betas => Final
  • 5.
  • 6. Goals: Compatibility ● Double-down on Semantic Versioning, semver ○ Adopted in hbase-1.0.0 ○ MAJOR.MINOR.PATCH[-IDENTIFIER] ■ E.g. 2.0.0-alpha1
  • 7. But…. ● Semantic Versioning is about API only ○ What about…. ■ Internal/External Interfaces ● Where is Client Interface when Spark/MapReduce
  • 8. API Annotations ● From Hadoop… ○ InterfaceAudience.Public ■ Get/Put/Scan/Connection ○ InterfaceAudience.LimitedPublic ■ Coprocessors, Replication, etc. ○ InterfaceAudience.Private ■ Internal only
  • 9. But (Again)... ● What about… ○ Source/Binary compatibility ○ Serializations ■ Wire ■ Formats in HDFS/Zookeeper ○ Dependencies ● See refguide semver section
  • 10. But still…. ● Grey areas… ○ Protobufs ■ Protobufs in API ■ Protobufs as Interface (c++/go clients)
  • 11. Goal: Rolling Upgrade 1.x to 2.x ● SemVer for DML at least in 2.x ○ No admin of hbase-2.x with hbase-1.x client ○ Coprocessors will need to be upgraded ● hbase-1.x client can work against hbase-2.x cluster ○ Even 1.x Coprocessor Endpoints will work with an hbase-2.x cluster ● No Singularity! No downtime.
  • 12. Other Goals ● Scale ○ More Regions, bigger clusters ● Performance ○ Inline read/write but also macro restart, assign, etc. ○ Better resource utilization ■ I/O, RAM ● Fix primary root of operational woes/bugs ○ Master Region Assignment ● Cleanup ○ Spark narrative ○ Interfaces
  • 13. 2.0.0
  • 14. What’s inside hbase-2.0.0? ● Currently >4400 issues resolved ○ ~500 exclusive to 2.0.0
  • 15. Inside: Prerequisites ● JDK8 only ● Hadoop-2.7.1 minimum ○ Will work against the coming Hadoop-3.x
  • 16. Inside: Main Features ● New Master Core (A.K.A AMv2) ○ Prompt assign of millions of Regions, faster startup, larger scale! ○ Many small Regions instead of a few big ones ○ Promise: new degree of Resilience ● Offheap Read/Write path ○ Less JVM ● Accordion ○ In-memory compaction ○ Less write amplification ● New Async Client ● etc.
  • 17. Assignment Manager ● Assignment Manager v1 root of many operational headaches ● Redo based on custom “ProcedureV2”-based State Machine ○ Scale/Performance ○ All Master ops recast as Pv2 procedures ■ E.g. Move is a Procedure composed of an Unassign and Assign subprocedure... ● Operation aggregation ● Entity locking mechanism ○ Isolate operations on tables, regions, servers ○ Parallelization ● One hbase:meta writer, the Master only
  • 18. Assignment Manager ● No more intermediate state in ZK ○ At other end of an RPC... ○ Record intent in local Master ops WAL, persisted up in HDFS ○ Only final state published to hbase:meta ● No more HBCK, no more RIT ○ No more distributed state ● Faster Assign ○ PE AMv2 ● Standalone Testable
  • 19. Offheap ● Smaller JVM heaps, less copying ○ More accounting! ● Offheap Read Path ○ L1 on Heap for Indexes & Blooms ○ L2 off Heap via BucketCache for Data ■ Big ○ Better latency ■ Cache more ■ Less GC, less erratic ● Offheap Write Path ○ RPC=>HDFS data kept offheap (+Async WAL client) ● Follow-ons: ○ BucketCache always-on
  • 20. Accordion ● In-memory LSM ○ In-memory flush from ConcurrentSkipListMap to read-only pipeline of ‘segments’ ■ Memory ‘breathes’ like an accordion’s bellows... ■ Optional in-memory compaction of Segments ● Prune deletes/versions early ● Benefit depends on # of versions ■ Less flushing, write amplification ● Flavors: ○ NONE ○ BASIC (Default) ○ ENHANCED ● Symbiosis w/ Offheaping ○ MemStore SLABs offheap ○ Compact in-memory layout (CellChunkMap)
  • 21. Miscellaneous: Complete ● Medium Object Blobs (MOB) ○ Documents, Images, etc. ○ 100k => 10MB ● RegionServer Groups (rsgroups) ○ Coarse Isolation ○ Tables vs RegionServers ● WALs and HFiles in different Filesystems ○ AWS EMR HBase on S3 offering (Amazon S3 Storage Mode) ● New Netty Server/Client chassis ● Filesystem Quotas on Table/Namespace sizes ○ Per Table, per Namespace -- not per User ○ Violation Policy ○ Via Shell
  • 22. Miscellaneous: In progress, BLOCKERs ● Spark module ○ Cleanup of our HBase Spark story ● AsyncDFSClient ○ Done but for the testing ● Update+Relocation of core dependencies ○ hbase-thirdparty ■ Guava 0.12 => 0.22 ■ Protobuf 2.5 => 3.3 ■ Netty ○ Etc.
  • 23. Miscellaneous: In progress, Nice-to-have ● C++Client ● Backup/Restore ● Hybrid Logical Clock ○ Leap Seconds or Errant clock ○ Overload timestamp to do time and sequence id duty ○ Metadata only for 2.x
  • 24. TODOs: BLOCKERs ● Edit of default configs ○ Stale ● API compatibility review ○ And that it wholesome! ● Testing ○ Each new feature/configuration ○ Rolling upgrade ■ Different data provenance/Starting point ● Operation & Scan Timeout Narrative Cleanup ● Sequence Identifier narrative ● Tie-it-off ○ We keep adding to 2.0.0… it’ll never release.
  • 25. Moving to 2.x ● API Cleanup ○ Interfaces are returned rather than Implementations ■ TableDescriptor instead of HTableDescriptor, etc. ○ Purged Protobuf and Guava classes from API ■ Purged from CLASSPATH too….(relocated) ● Evangelism: ○ Use shaded hbase-client and hbase-server instead going forward
  • 26.
  • 27. hbase-3.0.0 ● Split hbase:meta ● Refactor filesystem Interface and Implementation ○ Less dependency on HDFS ○ Scale ● HLC everywhere all the time on all tables ● Replication v2 ● Dynamic Online Configuration
  • 28. That’s it! ● Thank you to our host Huawei ● Running 2.0 Status: hbase-2.0.0 Status Doc ● References: ○ Matteo Bertozzi on hbase2: https://speakerdeck.com/matteobertozzi/hbase-2-dot-0-sf-meetup-oct-15 ○ Enis Soztutar on hbase2: https://www.slideshare.net/enissoz/meet-hbase-20 ● Images: ● Accordion: https://static.roland.com/assets/images/products/gallery/fr-1x_top_open_gal.jpg ● Version2: http://logofaves.com/wp-content/uploads/2009/06/version_m.jpg?9cf02b ● Long Time: http://www.junodownload.com/products/one-night-only-long-time-coming/1957247-02/ ● Comic: http://4.bp.blogspot.com/-upwza0_lLn4/TmXB4lKkPKI/AAAAAAAAAHY/9lA7VYCmSkI/s1600/heap_0001.jpg ● Stork: http://phicenter.org/wp-content/uploads/2012/08/stork_baby_delivery_720x540.jpg ● Apache Helicopter: https://i.ytimg.com/vi/SQp-3fUBefA/maxresdefault.jpg ● Goals: http://az616578.vo.msecnd.net/files/2016/02/29/6359235488335557721545302495_GOALS600.png