HBase Data Types 
Nick Dimiduk, Hortonworks 
@xefyr n10k.com
Agenda 
• Motivations 
• Progress thus far 
• Future work 
• Examples 
• More Examples 
Licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.
2014-11-18
Why introduce types? 
• Δ(SQL, byte[]): (╯°□°)╯︵ ┻━┻ 
• Rule of least surprise 
• Interoperability across tools 
• Distill best practices 
Considerations 
• Opt-in for current users 
• Easy transition for existing applications 
• Client-side only, mostly
– Filters, Split policies, Coprocessors, Block encoding 
• Avoid POJO constraints 
– No required base-class/interface 
– No magic (avoid ASM, ORM) 
• Non-Java clients 
• HBASE-8089 
Inspiration 
• Orderly 
• PostgreSQL / PostGIS 
• HBASE-7221 
• HBASE-7692 
Features: Encoding 
• Order preservation 
• Override direction (ASC/DSC) 
• Fixed, variable-width 
• Nullable
• Self-identifying 
• Efficient 
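To make "order preservation" and "override direction" concrete, here is a minimal sketch in plain Java. It is a toy encoding for illustration and is not claimed to be the OrderedBytes wire format; the class and method names are hypothetical. It shows why a plain big-endian int sorts incorrectly under byte-wise comparison, how flipping the sign bit fixes that, and how inverting the bits reverses the direction.

```java
/** Illustrative only: a toy order-preserving int32 encoding (hypothetical names). */
class OrderedIntSketch {

  /** Big-endian two's complement, as a plain serializer would write it. */
  static byte[] plain(int v) {
    return new byte[] { (byte) (v >>> 24), (byte) (v >>> 16), (byte) (v >>> 8), (byte) v };
  }

  /** Flip the sign bit so negative values sort before positive ones. */
  static byte[] ascending(int v) {
    return plain(v ^ 0x80000000);
  }

  /** Invert every bit of the ascending form to reverse the sort direction. */
  static byte[] descending(int v) {
    byte[] b = ascending(v);
    for (int i = 0; i < b.length; i++) {
      b[i] = (byte) ~b[i];
    }
    return b;
  }

  /** Unsigned lexicographic comparison, the order in which byte[] row keys sort. */
  static int compare(byte[] a, byte[] b) {
    int n = Math.min(a.length, b.length);
    for (int i = 0; i < n; i++) {
      int d = (a[i] & 0xff) - (b[i] & 0xff);
      if (d != 0) {
        return d;
      }
    }
    return a.length - b.length;
  }

  public static void main(String[] args) {
    System.out.println(compare(plain(-1), plain(1)) < 0);           // false: -1 sorts after 1
    System.out.println(compare(ascending(-1), ascending(1)) < 0);   // true
    System.out.println(compare(descending(-1), descending(1)) > 0); // true
  }
}
```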
Features: API 
• Complex type encoding 
– Compound rowkey pattern 
– Order preservation 
– Nullable fields 
• Runtime metadata 
• User-extensible 
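The compound rowkey pattern can be sketched as follows. This is an illustrative toy, not HBase's Struct encoding: the (text, int32) key layout, the NUL terminator, and all names are assumptions made for the example. It shows that naively concatenating a variable-length field with the next field can break sort order, while a terminated text field keeps the compound key ordered field by field.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

/** Illustrative only: a hypothetical (text, int32) compound key. */
class CompoundKeySketch {

  private static void putInt(ByteArrayOutputStream out, int v) {
    out.write(v >>> 24);
    out.write(v >>> 16);
    out.write(v >>> 8);
    out.write(v);
  }

  /** Naive concatenation: raw UTF-8 text followed by a big-endian int. */
  static byte[] naive(String s, int v) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    byte[] text = s.getBytes(StandardCharsets.UTF_8);
    out.write(text, 0, text.length);
    putInt(out, v);
    return out.toByteArray();
  }

  /** NUL-terminated text (assumes no NUL inside s), then a sign-flipped int. */
  static byte[] ordered(String s, int v) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    byte[] text = s.getBytes(StandardCharsets.UTF_8);
    out.write(text, 0, text.length);
    out.write(0x00);              // terminator sorts below any text byte
    putInt(out, v ^ 0x80000000);  // sign flip keeps negatives before positives
    return out.toByteArray();
  }

  /** Unsigned lexicographic comparison of two keys. */
  static int compare(byte[] a, byte[] b) {
    int n = Math.min(a.length, b.length);
    for (int i = 0; i < n; i++) {
      int d = (a[i] & 0xff) - (b[i] & 0xff);
      if (d != 0) {
        return d;
      }
    }
    return a.length - b.length;
  }

  public static void main(String[] args) {
    // Logically ("a", 0x70000000) < ("ab", 1) because "a" < "ab".
    System.out.println(compare(naive("a", 0x70000000), naive("ab", 1)) < 0);     // false
    System.out.println(compare(ordered("a", 0x70000000), ordered("ab", 1)) < 0); // true
  }
}
```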
Implementation
HBASE-8089
Implementation: Encoding 
o.a.h.h.util.OrderedBytes 
• null 
• numeric, +/-Inf, NaN 
• int8, int16, int32, int64 
• float32, float64 
• variable-length text 
• variable-length blob 
o.a.h.h.util.Bytes 
• numeric 
• boolean 
• int16, int32, int64 
• float32, float64 
• variable-length text 
Implementation: API 
interface DataType<T> 
• decode() 
• encode() 
• encodedClass() 
• encodedLength() 
• getOrder() 
• isNullable() 
• isOrderPreserving() 
• isSkippable() 
• skip() 
implements DataType 
• OrderedXXX 
• RawXXX 
• Struct 
– StructBuilder 
– StructIterator 
– TerminatedWrapper 
– FixedLengthWrapper 
• Union{2,3,4} 
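As a rough picture of how encode(), decode(), encodedLength() and skip() fit together, here is a trimmed-down, hypothetical stand-in written against java.nio.ByteBuffer. The real DataType<T> interface works on PositionedByteRange and has more methods; SketchType, SketchInt and SketchText are names invented for this sketch, and the encodings are toys.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Illustrative only: a hypothetical, simplified take on the DataType contract. */
class DataTypeSketch {

  interface SketchType<T> {
    int encodedLength(T val);            // bytes needed to encode val
    int encode(ByteBuffer dst, T val);   // returns bytes written
    T decode(ByteBuffer src);            // reads one value, advancing src
    int skip(ByteBuffer src);            // advances past one value, returns bytes skipped
  }

  /** Fixed-width int32 with the sign bit flipped. */
  static final class SketchInt implements SketchType<Integer> {
    public int encodedLength(Integer val) { return 4; }
    public int encode(ByteBuffer dst, Integer val) {
      dst.putInt(val ^ 0x80000000);
      return 4;
    }
    public Integer decode(ByteBuffer src) { return src.getInt() ^ 0x80000000; }
    public int skip(ByteBuffer src) {
      src.position(src.position() + 4);
      return 4;
    }
  }

  /** Variable-width, NUL-terminated UTF-8 text (assumes no NUL in the value). */
  static final class SketchText implements SketchType<String> {
    public int encodedLength(String val) {
      return val.getBytes(StandardCharsets.UTF_8).length + 1;
    }
    public int encode(ByteBuffer dst, String val) {
      byte[] bytes = val.getBytes(StandardCharsets.UTF_8);
      dst.put(bytes).put((byte) 0);
      return bytes.length + 1;
    }
    public String decode(ByteBuffer src) {
      int start = src.position();
      int len = skip(src) - 1;           // exclude the terminator
      byte[] bytes = new byte[len];
      for (int i = 0; i < len; i++) {
        bytes[i] = src.get(start + i);   // absolute read, position already advanced
      }
      return new String(bytes, StandardCharsets.UTF_8);
    }
    public int skip(ByteBuffer src) {
      int start = src.position();
      while (src.get() != 0) {
        // advance until the terminator has been consumed
      }
      return src.position() - start;
    }
  }

  public static void main(String[] args) {
    SketchText text = new SketchText();
    SketchInt num = new SketchInt();
    ByteBuffer buf = ByteBuffer.allocate(text.encodedLength("foo") + num.encodedLength(42));
    text.encode(buf, "foo");
    num.encode(buf, 42);
    buf.flip();
    text.skip(buf);                      // jump over the first field without decoding it
    System.out.println(num.decode(buf)); // prints 42
  }
}
```

The point of skip() in this sketch is that a reader can reach a later field of a compound value without materializing the earlier ones.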
Up Next 
• “Default” types 
• More complex types 
– Arrays/Lists 
– Maps/Dicts 
• Tool integration 
– Apache Phoenix 
– Cloudera Kite 
• Performance audit, HBASE-8694 
• Improved metadata, HBASE-8863
– isCastableTo 
– isCoercableTo 
– isComparableTo 
• TypedTable, HBASE-7941 
• Beyond Java, HBASE-10091 
– REST 
– Thrift 
– Shell 
• ImportTsv, HBASE-8593 
• User documentation 
• Coprocessors? 
• Filters? 
• CAS? 
• DataBlockEncoders? 
Examples
A case for TypedTable 
Put p = new Put(Bytes.toBytes(u.user)); 
p.add(INFO_FAM, USER_COL, Bytes.toBytes(u.user)); 
p.add(INFO_FAM, NAME_COL, Bytes.toBytes(u.name)); 
p.add(INFO_FAM, EMAIL_COL, Bytes.toBytes(u.email)); 
p.add(INFO_FAM, PASS_COL, Bytes.toBytes(u.password)); 
A case for TypedTable
static final RawString ENC_STR = new RawString();
static final RawLong ENC_LONG = new RawLong();
--

SimplePositionedByteRange pbr =
    new SimplePositionedByteRange(100);
ENC_STR.encode(pbr, u.user);
Put p = new Put(Bytes.copy(pbr.getBytes(), pbr.getOffset(),
    pbr.getPosition()));
p.add(INFO_FAM, USER_COL, Bytes.copy(pbr.getBytes(), ...));
pbr.setPosition(0);
ENC_STR.encode(pbr, u.name);
p.add(INFO_FAM, NAME_COL, Bytes.copy(pbr.getBytes(), ...));
...
Structs: writing
Struct struct = new StructBuilder()
    .add(OrderedNumeric.ASCENDING)
    .add(OrderedString.ASCENDING)
    .toStruct();
PositionedByteRange buf1 =
    new SimplePositionedByteRange(7);
struct.encode(buf1,
    new Object[] { BigDecimal.ONE, "foo" });
Structs: reading
buf1.setPosition(0);
StructIterator it = struct.iterator(buf1);
while (it.hasNext()) {
  System.out.print(it.next() + ", ");
}

> BigDecimal.ONE, foo
Structs: schema migration
Struct addedFields = new StructBuilder()
    .add(OrderedNumeric.ASCENDING)
    .add(OrderedString.ASCENDING)
    .add(OrderedString.ASCENDING)
    .add(OrderedNumeric.ASCENDING)
    .toStruct();

buf1.setPosition(0);
StructIterator it = addedFields.iterator(buf1);
while (it.hasNext()) {
  System.out.print(it.next() + ", ");
}
> BigDecimal.ONE, foo, null, null
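The idea behind the migration slide, reading an older and shorter value with a longer field list, can be sketched without HBase. This is a hypothetical illustration and not the StructIterator implementation: the readAll helper and the fixed-width int fields are assumptions. Fields that fall past the end of the stored bytes simply read as null.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

/** Illustrative only: trailing fields missing from an old value decode as null. */
class TrailingNullSketch {

  /** Decode one value per field; fields past the end of the buffer yield null. */
  static List<Object> readAll(ByteBuffer src, List<Function<ByteBuffer, Object>> fields) {
    List<Object> out = new ArrayList<>();
    for (Function<ByteBuffer, Object> field : fields) {
      out.add(src.hasRemaining() ? field.apply(src) : null);
    }
    return out;
  }

  public static void main(String[] args) {
    // A value written under the "old" two-field schema.
    ByteBuffer buf = ByteBuffer.allocate(8);
    buf.putInt(1).putInt(7);
    buf.flip();

    // The "new" schema appends two more fields.
    Function<ByteBuffer, Object> int32 = b -> b.getInt();
    List<Function<ByteBuffer, Object>> newSchema = Arrays.asList(int32, int32, int32, int32);

    System.out.println(readAll(buf, newSchema)); // prints [1, 7, null, null]
  }
}
```

This only works when new fields are appended at the end and readers treat them as nullable, which is the situation the slide shows.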
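Protobuf (HBASE-11161)
class PBKeyValue extends PBType<CellProtos.KeyValue> {

  @Override
  public int encode(PositionedByteRange dst, CellProtos.KeyValue val) {
    CodedOutputStream os = outputStreamFromByteRange(dst);
    try {
      int before = os.spaceLeft(), after, written;
      val.writeTo(os);  // declared to throw IOException
      after = os.spaceLeft();
      written = before - after;
      // advance the range past the bytes protobuf just wrote
      dst.setPosition(dst.getPosition() + written);
      return written;
    } catch (IOException e) {
      throw new RuntimeException("Error while encoding type.", e);
    }
  }
  // remaining DataType methods are not shown on the slide
}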
More Examples
https://gist.github.com/ndimiduk/bcf33f09cc7e4408f684
Thanks! 
HBase in Action (Manning), Nick Dimiduk and Amandeep Khurana, foreword by Michael Stack
hbaseinaction.com
Nick Dimiduk 
github.com/ndimiduk 
@xefyr 
n10k.com 
http://s.apache.org/bGN 

More Related Content

Viewers also liked

The inherent complexity of stream processing
The inherent complexity of stream processingThe inherent complexity of stream processing
The inherent complexity of stream processingnathanmarz
 
The Secrets of Building Realtime Big Data Systems
The Secrets of Building Realtime Big Data SystemsThe Secrets of Building Realtime Big Data Systems
The Secrets of Building Realtime Big Data Systemsnathanmarz
 
Data Engineering Quick Guide
Data Engineering Quick GuideData Engineering Quick Guide
Data Engineering Quick GuideAsim Jalis
 
Apache Big Data EU 2015 - HBase
Apache Big Data EU 2015 - HBaseApache Big Data EU 2015 - HBase
Apache Big Data EU 2015 - HBaseNick Dimiduk
 
11 Hard to Ignore Data Analytics Quotes
11 Hard to Ignore Data Analytics Quotes11 Hard to Ignore Data Analytics Quotes
11 Hard to Ignore Data Analytics QuotesCloudlytics
 
Demystifying Data Engineering
Demystifying Data EngineeringDemystifying Data Engineering
Demystifying Data Engineeringnathanmarz
 
Big Data: The 6 Key Skills Every Business Needs
Big Data: The 6 Key Skills Every Business NeedsBig Data: The 6 Key Skills Every Business Needs
Big Data: The 6 Key Skills Every Business NeedsBernard Marr
 
Big Data: The 4 Layers Everyone Must Know
Big Data: The 4 Layers Everyone Must KnowBig Data: The 4 Layers Everyone Must Know
Big Data: The 4 Layers Everyone Must KnowBernard Marr
 
Big Data - 25 Amazing Facts Everyone Should Know
Big Data - 25 Amazing Facts Everyone Should KnowBig Data - 25 Amazing Facts Everyone Should Know
Big Data - 25 Amazing Facts Everyone Should KnowBernard Marr
 

Viewers also liked (12)

Introduction to Data Engineering
Introduction to Data EngineeringIntroduction to Data Engineering
Introduction to Data Engineering
 
The inherent complexity of stream processing
The inherent complexity of stream processingThe inherent complexity of stream processing
The inherent complexity of stream processing
 
Big data road map
Big data road mapBig data road map
Big data road map
 
The Secrets of Building Realtime Big Data Systems
The Secrets of Building Realtime Big Data SystemsThe Secrets of Building Realtime Big Data Systems
The Secrets of Building Realtime Big Data Systems
 
Data Engineering Quick Guide
Data Engineering Quick GuideData Engineering Quick Guide
Data Engineering Quick Guide
 
Apache Big Data EU 2015 - HBase
Apache Big Data EU 2015 - HBaseApache Big Data EU 2015 - HBase
Apache Big Data EU 2015 - HBase
 
11 Hard to Ignore Data Analytics Quotes
11 Hard to Ignore Data Analytics Quotes11 Hard to Ignore Data Analytics Quotes
11 Hard to Ignore Data Analytics Quotes
 
Demystifying Data Engineering
Demystifying Data EngineeringDemystifying Data Engineering
Demystifying Data Engineering
 
Big Data: The 6 Key Skills Every Business Needs
Big Data: The 6 Key Skills Every Business NeedsBig Data: The 6 Key Skills Every Business Needs
Big Data: The 6 Key Skills Every Business Needs
 
Big Data: The 4 Layers Everyone Must Know
Big Data: The 4 Layers Everyone Must KnowBig Data: The 4 Layers Everyone Must Know
Big Data: The 4 Layers Everyone Must Know
 
What is Big Data?
What is Big Data?What is Big Data?
What is Big Data?
 
Big Data - 25 Amazing Facts Everyone Should Know
Big Data - 25 Amazing Facts Everyone Should KnowBig Data - 25 Amazing Facts Everyone Should Know
Big Data - 25 Amazing Facts Everyone Should Know
 

Similar to HBase Data Types

OpenTSDB for monitoring @ Criteo
OpenTSDB for monitoring @ CriteoOpenTSDB for monitoring @ Criteo
OpenTSDB for monitoring @ CriteoNathaniel Braun
 
OpenStack Swift的性能调优
OpenStack Swift的性能调优OpenStack Swift的性能调优
OpenStack Swift的性能调优Hardway Hou
 
Native Cloud-Native: Building Agile Microservices with the Micronaut Framework
Native Cloud-Native: Building Agile Microservices with the Micronaut FrameworkNative Cloud-Native: Building Agile Microservices with the Micronaut Framework
Native Cloud-Native: Building Agile Microservices with the Micronaut FrameworkZachary Klein
 
SophiaConf2010 Présentation des Retours d'expériences de la Conférence du 08 ...
SophiaConf2010 Présentation des Retours d'expériences de la Conférence du 08 ...SophiaConf2010 Présentation des Retours d'expériences de la Conférence du 08 ...
SophiaConf2010 Présentation des Retours d'expériences de la Conférence du 08 ...TelecomValley
 
Postcards from the post xss world- content exfiltration null
Postcards from the post xss world- content exfiltration nullPostcards from the post xss world- content exfiltration null
Postcards from the post xss world- content exfiltration nullPiyush Pattanayak
 
2.28.17 Introducing DSpace 7 Webinar Slides
2.28.17 Introducing DSpace 7 Webinar Slides2.28.17 Introducing DSpace 7 Webinar Slides
2.28.17 Introducing DSpace 7 Webinar SlidesDuraSpace
 
Introduction to InfluxDB 2.0 & Your First Flux Query by Sonia Gupta, Develope...
Introduction to InfluxDB 2.0 & Your First Flux Query by Sonia Gupta, Develope...Introduction to InfluxDB 2.0 & Your First Flux Query by Sonia Gupta, Develope...
Introduction to InfluxDB 2.0 & Your First Flux Query by Sonia Gupta, Develope...InfluxData
 
Deep Dive on Accelerating Content, APIs, and Applications with Amazon CloudFr...
Deep Dive on Accelerating Content, APIs, and Applications with Amazon CloudFr...Deep Dive on Accelerating Content, APIs, and Applications with Amazon CloudFr...
Deep Dive on Accelerating Content, APIs, and Applications with Amazon CloudFr...Amazon Web Services
 
Meetup 12-12-2017 - Application Isolation on Kubernetes
Meetup 12-12-2017 - Application Isolation on KubernetesMeetup 12-12-2017 - Application Isolation on Kubernetes
Meetup 12-12-2017 - Application Isolation on Kubernetesdtoledo67
 
Developing applications with Hyperledger Fabric SDK
Developing applications with Hyperledger Fabric SDKDeveloping applications with Hyperledger Fabric SDK
Developing applications with Hyperledger Fabric SDKHorea Porutiu
 
Arcomem training Specifying Crawls Beginners
Arcomem training Specifying Crawls BeginnersArcomem training Specifying Crawls Beginners
Arcomem training Specifying Crawls Beginnersarcomem
 
Three Years of Lessons Running Potentially Malicious Code Inside Containers
Three Years of Lessons Running Potentially Malicious Code Inside ContainersThree Years of Lessons Running Potentially Malicious Code Inside Containers
Three Years of Lessons Running Potentially Malicious Code Inside ContainersBen Hall
 
FIWARE Primer - Learn FIWARE in 60 Minutes
FIWARE Primer - Learn FIWARE in 60 MinutesFIWARE Primer - Learn FIWARE in 60 Minutes
FIWARE Primer - Learn FIWARE in 60 MinutesFederico Michele Facca
 
Federico Michele Facca - FIWARE Primer - Learn FIWARE in 60 Minutes
Federico Michele Facca - FIWARE Primer - Learn FIWARE in 60 MinutesFederico Michele Facca - FIWARE Primer - Learn FIWARE in 60 Minutes
Federico Michele Facca - FIWARE Primer - Learn FIWARE in 60 MinutesCodemotion
 
CCi Technology Infrastructure 2006
CCi Technology Infrastructure 2006CCi Technology Infrastructure 2006
CCi Technology Infrastructure 2006Mike Linksvayer
 
Building APIs in an easy way using API Platform
Building APIs in an easy way using API PlatformBuilding APIs in an easy way using API Platform
Building APIs in an easy way using API PlatformAntonio Peric-Mazar
 
AWS Public Data Sets: How to Stage Petabytes of Data for Analysis in AWS (WPS...
AWS Public Data Sets: How to Stage Petabytes of Data for Analysis in AWS (WPS...AWS Public Data Sets: How to Stage Petabytes of Data for Analysis in AWS (WPS...
AWS Public Data Sets: How to Stage Petabytes of Data for Analysis in AWS (WPS...Amazon Web Services
 
Building APIs in an easy way using API Platform
Building APIs in an easy way using API PlatformBuilding APIs in an easy way using API Platform
Building APIs in an easy way using API PlatformAntonio Peric-Mazar
 
OpenStack Architecture
OpenStack ArchitectureOpenStack Architecture
OpenStack ArchitectureMirantis
 

Similar to HBase Data Types (20)

OpenTSDB for monitoring @ Criteo
OpenTSDB for monitoring @ CriteoOpenTSDB for monitoring @ Criteo
OpenTSDB for monitoring @ Criteo
 
OpenStack Swift的性能调优
OpenStack Swift的性能调优OpenStack Swift的性能调优
OpenStack Swift的性能调优
 
Native Cloud-Native: Building Agile Microservices with the Micronaut Framework
Native Cloud-Native: Building Agile Microservices with the Micronaut FrameworkNative Cloud-Native: Building Agile Microservices with the Micronaut Framework
Native Cloud-Native: Building Agile Microservices with the Micronaut Framework
 
SophiaConf2010 Présentation des Retours d'expériences de la Conférence du 08 ...
SophiaConf2010 Présentation des Retours d'expériences de la Conférence du 08 ...SophiaConf2010 Présentation des Retours d'expériences de la Conférence du 08 ...
SophiaConf2010 Présentation des Retours d'expériences de la Conférence du 08 ...
 
Postcards from the post xss world- content exfiltration null
Postcards from the post xss world- content exfiltration nullPostcards from the post xss world- content exfiltration null
Postcards from the post xss world- content exfiltration null
 
2.28.17 Introducing DSpace 7 Webinar Slides
2.28.17 Introducing DSpace 7 Webinar Slides2.28.17 Introducing DSpace 7 Webinar Slides
2.28.17 Introducing DSpace 7 Webinar Slides
 
Introduction to InfluxDB 2.0 & Your First Flux Query by Sonia Gupta, Develope...
Introduction to InfluxDB 2.0 & Your First Flux Query by Sonia Gupta, Develope...Introduction to InfluxDB 2.0 & Your First Flux Query by Sonia Gupta, Develope...
Introduction to InfluxDB 2.0 & Your First Flux Query by Sonia Gupta, Develope...
 
Deep Dive on Accelerating Content, APIs, and Applications with Amazon CloudFr...
Deep Dive on Accelerating Content, APIs, and Applications with Amazon CloudFr...Deep Dive on Accelerating Content, APIs, and Applications with Amazon CloudFr...
Deep Dive on Accelerating Content, APIs, and Applications with Amazon CloudFr...
 
Meetup 12-12-2017 - Application Isolation on Kubernetes
Meetup 12-12-2017 - Application Isolation on KubernetesMeetup 12-12-2017 - Application Isolation on Kubernetes
Meetup 12-12-2017 - Application Isolation on Kubernetes
 
Developing applications with Hyperledger Fabric SDK
Developing applications with Hyperledger Fabric SDKDeveloping applications with Hyperledger Fabric SDK
Developing applications with Hyperledger Fabric SDK
 
Building Client-Side Attacks with HTML5 Features
Building Client-Side Attacks with HTML5 FeaturesBuilding Client-Side Attacks with HTML5 Features
Building Client-Side Attacks with HTML5 Features
 
Arcomem training Specifying Crawls Beginners
Arcomem training Specifying Crawls BeginnersArcomem training Specifying Crawls Beginners
Arcomem training Specifying Crawls Beginners
 
Three Years of Lessons Running Potentially Malicious Code Inside Containers
Three Years of Lessons Running Potentially Malicious Code Inside ContainersThree Years of Lessons Running Potentially Malicious Code Inside Containers
Three Years of Lessons Running Potentially Malicious Code Inside Containers
 
FIWARE Primer - Learn FIWARE in 60 Minutes
FIWARE Primer - Learn FIWARE in 60 MinutesFIWARE Primer - Learn FIWARE in 60 Minutes
FIWARE Primer - Learn FIWARE in 60 Minutes
 
Federico Michele Facca - FIWARE Primer - Learn FIWARE in 60 Minutes
Federico Michele Facca - FIWARE Primer - Learn FIWARE in 60 MinutesFederico Michele Facca - FIWARE Primer - Learn FIWARE in 60 Minutes
Federico Michele Facca - FIWARE Primer - Learn FIWARE in 60 Minutes
 
CCi Technology Infrastructure 2006
CCi Technology Infrastructure 2006CCi Technology Infrastructure 2006
CCi Technology Infrastructure 2006
 
Building APIs in an easy way using API Platform
Building APIs in an easy way using API PlatformBuilding APIs in an easy way using API Platform
Building APIs in an easy way using API Platform
 
AWS Public Data Sets: How to Stage Petabytes of Data for Analysis in AWS (WPS...
AWS Public Data Sets: How to Stage Petabytes of Data for Analysis in AWS (WPS...AWS Public Data Sets: How to Stage Petabytes of Data for Analysis in AWS (WPS...
AWS Public Data Sets: How to Stage Petabytes of Data for Analysis in AWS (WPS...
 
Building APIs in an easy way using API Platform
Building APIs in an easy way using API PlatformBuilding APIs in an easy way using API Platform
Building APIs in an easy way using API Platform
 
OpenStack Architecture
OpenStack ArchitectureOpenStack Architecture
OpenStack Architecture
 

More from Nick Dimiduk

Apache Big Data EU 2015 - Phoenix
Apache Big Data EU 2015 - PhoenixApache Big Data EU 2015 - Phoenix
Apache Big Data EU 2015 - PhoenixNick Dimiduk
 
Apache HBase 1.0 Release
Apache HBase 1.0 ReleaseApache HBase 1.0 Release
Apache HBase 1.0 ReleaseNick Dimiduk
 
HBase Low Latency, StrataNYC 2014
HBase Low Latency, StrataNYC 2014HBase Low Latency, StrataNYC 2014
HBase Low Latency, StrataNYC 2014Nick Dimiduk
 
HBase Blockcache 101
HBase Blockcache 101HBase Blockcache 101
HBase Blockcache 101Nick Dimiduk
 
Apache HBase Low Latency
Apache HBase Low LatencyApache HBase Low Latency
Apache HBase Low LatencyNick Dimiduk
 
Apache HBase for Architects
Apache HBase for ArchitectsApache HBase for Architects
Apache HBase for ArchitectsNick Dimiduk
 
HBase Data Types (WIP)
HBase Data Types (WIP)HBase Data Types (WIP)
HBase Data Types (WIP)Nick Dimiduk
 
Bring Cartography to the Cloud
Bring Cartography to the CloudBring Cartography to the Cloud
Bring Cartography to the CloudNick Dimiduk
 
HBase for Architects
HBase for ArchitectsHBase for Architects
HBase for ArchitectsNick Dimiduk
 
HBase Client APIs (for webapps?)
HBase Client APIs (for webapps?)HBase Client APIs (for webapps?)
HBase Client APIs (for webapps?)Nick Dimiduk
 
Pig, Making Hadoop Easy
Pig, Making Hadoop EasyPig, Making Hadoop Easy
Pig, Making Hadoop EasyNick Dimiduk
 
Introduction to Hadoop, HBase, and NoSQL
Introduction to Hadoop, HBase, and NoSQLIntroduction to Hadoop, HBase, and NoSQL
Introduction to Hadoop, HBase, and NoSQLNick Dimiduk
 

More from Nick Dimiduk (12)

Apache Big Data EU 2015 - Phoenix
Apache Big Data EU 2015 - PhoenixApache Big Data EU 2015 - Phoenix
Apache Big Data EU 2015 - Phoenix
 
Apache HBase 1.0 Release
Apache HBase 1.0 ReleaseApache HBase 1.0 Release
Apache HBase 1.0 Release
 
HBase Low Latency, StrataNYC 2014
HBase Low Latency, StrataNYC 2014HBase Low Latency, StrataNYC 2014
HBase Low Latency, StrataNYC 2014
 
HBase Blockcache 101
HBase Blockcache 101HBase Blockcache 101
HBase Blockcache 101
 
Apache HBase Low Latency
Apache HBase Low LatencyApache HBase Low Latency
Apache HBase Low Latency
 
Apache HBase for Architects
Apache HBase for ArchitectsApache HBase for Architects
Apache HBase for Architects
 
HBase Data Types (WIP)
HBase Data Types (WIP)HBase Data Types (WIP)
HBase Data Types (WIP)
 
Bring Cartography to the Cloud
Bring Cartography to the CloudBring Cartography to the Cloud
Bring Cartography to the Cloud
 
HBase for Architects
HBase for ArchitectsHBase for Architects
HBase for Architects
 
HBase Client APIs (for webapps?)
HBase Client APIs (for webapps?)HBase Client APIs (for webapps?)
HBase Client APIs (for webapps?)
 
Pig, Making Hadoop Easy
Pig, Making Hadoop EasyPig, Making Hadoop Easy
Pig, Making Hadoop Easy
 
Introduction to Hadoop, HBase, and NoSQL
Introduction to Hadoop, HBase, and NoSQLIntroduction to Hadoop, HBase, and NoSQL
Introduction to Hadoop, HBase, and NoSQL
 

Recently uploaded

08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAndikSusilo4
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?XfilesPro
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 

Recently uploaded (20)

08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping Elbows
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & Application
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 

HBase Data Types

  • 1. HBase Data Types Nick Dimiduk, Hortonworks @xefyr n10k.com
  • 2. Agenda • Motivations • Progress thus far • Future work • Examples • More Examples Licensed under a Crea3ve Commons A8ribu3on-­‐ShareAlike 3.0 Unported License. 2014-­‐11-­‐18 2
  • 3. Why introduce types? • Δ(SQL, byte[]): (╯°□°)╯︵ ┻━┻ • Rule of least surprise • Interoperability across tools • Distill best practices Licensed under a Crea3ve Commons A8ribu3on-­‐ShareAlike 3.0 Unported License. 2014-­‐11-­‐18 3
  • 4. Considerations • Opt-in for current users • Easy transition for existing applications • Client-side only mostly – Filters, Split policies, Coprocessors, Block encoding • Avoid POJO constraints – No required base-class/interface – No magic (avoid ASM, ORM) • Non-Java clients • HBASE-8089
  • 6. Inspiration • Orderly • PostgreSQL / PostGIS • HBASE-7221 • HBASE-7692
  • 7. Features: Encoding • Order preservation • Override direction (ASC/DSC) • Fixed, variable-width • Null-able • Self-identifying • Efficient
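To make "order preservation" and "override direction" concrete, here is a minimal sketch for a 32-bit integer. It assumes a simple sign-bit flip for ascending order and a bitwise inversion for descending order; the class and method names are hypothetical, and this is not necessarily the byte layout that HBase's OrderedBytes produces.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical illustration; not the byte layout used by HBase's OrderedBytes.
class OrderedIntSketch {

  /** Big-endian two's complement, as a plain serialization would produce. */
  static byte[] raw(int v) {
    return ByteBuffer.allocate(4).putInt(v).array();
  }

  /** Flip the sign bit so unsigned byte order matches numeric order. */
  static byte[] ascending(int v) {
    return ByteBuffer.allocate(4).putInt(v ^ Integer.MIN_VALUE).array();
  }

  /** Invert every bit of the ascending form to reverse the sort direction. */
  static byte[] descending(int v) {
    byte[] b = ascending(v);
    for (int i = 0; i < b.length; i++) {
      b[i] = (byte) ~b[i];
    }
    return b;
  }

  /** Unsigned lexicographic comparison, which is how HBase orders rowkeys. */
  static int compare(byte[] a, byte[] b) {
    return Arrays.compareUnsigned(a, b);
  }

  /** Undo the sign-bit flip applied by ascending(). */
  static int decodeAscending(byte[] b) {
    return ByteBuffer.wrap(b).getInt() ^ Integer.MIN_VALUE;
  }

  public static void main(String[] args) {
    // Raw two's complement sorts -1 after 1 when compared as unsigned bytes.
    System.out.println(compare(raw(-1), raw(1)) > 0);
    // The sign-flipped form sorts -1 before 1.
    System.out.println(compare(ascending(-1), ascending(1)) < 0);
    // The inverted form sorts -1 after 1.
    System.out.println(compare(descending(-1), descending(1)) > 0);
  }
}
```

The point of the sketch is that the sort order lives in the bytes themselves, so the store only ever needs an unsigned byte comparison.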
  • 8. Features: API • Complex type encoding – Compound rowkey pattern – Order preservation – Nullable fields • Runtime metadata • User-extensible
  • 10. Implementation: Encoding o.a.h.h.util.OrderedBytes • null • numeric, +/-Inf, NaN • int8, int16, int32, int64 • float32, float64 • variable-length text • variable-length blob o.a.h.h.util.Bytes • numeric • boolean • int16, int32, int64 • float32, float64 • variable-length text
  • 11. Implementation: API interface DataType<T> • decode() • encode() • encodedClass() • encodedLength() • getOrder() • isNullable() • isOrderPreserving() • isSkippable() • skip() implements DataType • OrderedXXX • RawXXX • Struct – StructBuilder – StructIterator – TerminatedWrapper – FixedLengthWrapper • Union{2,3,4}
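The contract listed on this slide can be approximated with a much smaller interface. The sketch below is an assumption-laden analogue, not the HBase API: `SketchType`, `TerminatedText`, and the use of `java.nio.ByteBuffer` in place of `PositionedByteRange` are all hypothetical, and only a subset of the listed methods is modeled. It shows why `skip()` is useful: a reader can step over a field without materializing it.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical, simplified analogue of the DataType<T> contract listed above.
// ByteBuffer stands in for PositionedByteRange; these signatures are assumptions.
interface SketchType<T> {
  int encodedLength(T val);
  int encode(ByteBuffer dst, T val);  // returns the number of bytes written
  T decode(ByteBuffer src);           // advances src past the value
  int skip(ByteBuffer src);           // advances src without decoding; returns bytes skipped
  boolean isOrderPreserving();
}

/** UTF-8 text followed by a 0x00 terminator; assumes the text contains no NUL. */
class TerminatedText implements SketchType<String> {
  public int encodedLength(String val) {
    return val.getBytes(StandardCharsets.UTF_8).length + 1;
  }

  public int encode(ByteBuffer dst, String val) {
    byte[] b = val.getBytes(StandardCharsets.UTF_8);
    dst.put(b).put((byte) 0);
    return b.length + 1;
  }

  public String decode(ByteBuffer src) {
    int start = src.position();
    int len = skip(src) - 1;          // length without the terminator
    byte[] b = new byte[len];
    ByteBuffer view = src.duplicate();
    view.position(start);
    view.get(b);
    return new String(b, StandardCharsets.UTF_8);
  }

  public int skip(ByteBuffer src) {
    int start = src.position();
    while (src.get() != 0) {
      // scan forward to the terminator
    }
    return src.position() - start;
  }

  public boolean isOrderPreserving() {
    return true;
  }
}

class SketchTypeDemo {
  public static void main(String[] args) {
    SketchType<String> text = new TerminatedText();
    ByteBuffer buf = ByteBuffer.allocate(text.encodedLength("foo") + text.encodedLength("bar"));
    text.encode(buf, "foo");
    text.encode(buf, "bar");
    buf.flip();
    text.skip(buf);                        // step over "foo" without decoding it
    System.out.println(text.decode(buf));  // decodes the second field
  }
}
```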
  • 12. Up Next • “Default” types • More complex types – Arrays/Lists – Maps/Dicts • Tool integration – Apache Phoenix – Cloudera Kite • Performance audit, HBASE-8694 • Improved metadata, HBASE-8863 – isCastableTo – isCoercableTo – isComparableTo • TypedTable, HBASE-7941 • Beyond Java, HBASE-10091 – REST – Thrift – Shell • ImportTsv, HBASE-8593 • User documentation • Coprocessors? • Filters? • CAS? • DataBlockEncoders?
  • 14. A case for TypedTable
      Put p = new Put(Bytes.toBytes(u.user));
      p.add(INFO_FAM, USER_COL, Bytes.toBytes(u.user));
      p.add(INFO_FAM, NAME_COL, Bytes.toBytes(u.name));
      p.add(INFO_FAM, EMAIL_COL, Bytes.toBytes(u.email));
      p.add(INFO_FAM, PASS_COL, Bytes.toBytes(u.password));
  • 15. A case for TypedTable
      static final RawString ENC_STR = new RawString();
      static final RawLong ENC_LONG = new RawLong();
      --
      SimplePositionedByteRange pbr =
          new SimplePositionedByteRange(100);
      ENC_STR.encode(pbr, u.user);
      Put p = new Put(Bytes.copy(pbr.getBytes(), pbr.getOffset(), pbr.getPosition()));
      p.add(INFO_FAM, USER_COL, Bytes.copy(pbr.getBytes(), ...);
      pbr.setPosition(0);
      ENC_STR.encode(pbr, u.name);
      p.add(INFO_FAM, NAME_COL, Bytes.copy(pbr.getBytes(), ...);
      ...
  • 16. Structs: writing
      Struct struct = new StructBuilder()
          .add(OrderedNumeric.ASCENDING)
          .add(OrderedString.ASCENDING)
          .toStruct();
      PositionedByteRange buf1 =
          new SimplePositionedByteRange(7);
      struct.encode(buf1,
          new Object[] { BigDecimal.ONE, "foo" });
  • 17. Structs: reading
      buf1.setPosition(0);
      StructIterator it = struct.iterator(buf1);
      while (it.hasNext()) {
        System.out.print(it.next() + ", ");
      }

      > BigDecimal.ONE, foo
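The compound rowkey pattern behind the struct slides can be sketched without HBase: concatenate order-preserving field encodings, and the resulting keys sort by the first field, then by the second. Everything below is hypothetical (`CompoundKeySketch`, the sign-flipped int, the NUL-terminated UTF-8 text); it does not reproduce the bytes that `Struct`, `OrderedNumeric`, or `OrderedString` write, and it assumes the text never contains a NUL byte.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical compound key: (int id, String name). Not HBase's Struct encoding.
class CompoundKeySketch {

  /** Sign-flipped big-endian int, then UTF-8 text, then a 0x00 terminator. */
  static byte[] encode(int id, String name) {
    byte[] text = name.getBytes(StandardCharsets.UTF_8);
    return ByteBuffer.allocate(4 + text.length + 1)
        .putInt(id ^ Integer.MIN_VALUE)
        .put(text)
        .put((byte) 0)
        .array();
  }

  /** Read the fields back in declaration order. */
  static Object[] decode(byte[] key) {
    int id = ByteBuffer.wrap(key).getInt() ^ Integer.MIN_VALUE;
    String name = new String(key, 4, key.length - 5, StandardCharsets.UTF_8);
    return new Object[] { id, name };
  }

  /** Sort encoded keys as unsigned bytes, the way a rowkey-ordered store would. */
  static byte[][] sorted(byte[][] keys) {
    byte[][] copy = keys.clone();
    Arrays.sort(copy, (a, b) -> Arrays.compareUnsigned(a, b));
    return copy;
  }

  public static void main(String[] args) {
    byte[][] keys = {
      encode(2, "a"), encode(-5, "zed"), encode(2, "Z"), encode(-5, "bob")
    };
    for (byte[] k : sorted(keys)) {
      System.out.println(Arrays.toString(decode(k)));
    }
  }
}
```

Because the text comparison is bytewise, uppercase "Z" sorts before lowercase "a"; any collation beyond raw code point order would need a different field encoding.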
  • 18. Structs: schema migration
      Struct addedFields = new StructBuilder()
          .add(OrderedNumeric.ASCENDING)
          .add(OrderedString.ASCENDING)
          .add(OrderedString.ASCENDING)
          .add(OrderedNumeric.ASCENDING)
          .toStruct();

      buf1.setPosition(0);
      StructIterator it = addedFields.iterator(buf1);
      while (it.hasNext()) {
        System.out.print(it.next() + ", ");
      }
      > BigDecimal.ONE, foo, null, null
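One way to get the migration behavior shown on this slide is to treat an exhausted buffer as "field absent" and return null for every remaining schema position. The sketch below illustrates that idea only; `TrailingNullSketch`, its `Field` enum, and the byte layout are hypothetical and are not taken from HBase's `Struct` or `StructIterator`.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of "missing trailing fields decode as null".
// Field, decode and the byte layout are illustrative, not HBase's Struct.
class TrailingNullSketch {

  enum Field {
    INT {
      Object read(ByteBuffer src) {
        return src.getInt();
      }
    },
    TEXT {
      Object read(ByteBuffer src) {
        // Collect bytes up to (not including) the 0x00 terminator.
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte b = src.get(); b != 0; b = src.get()) {
          out.write(b);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
      }
    };

    abstract Object read(ByteBuffer src);
  }

  /** Decode with the given schema; positions past the end of the data yield null. */
  static List<Object> decode(byte[] value, Field... schema) {
    ByteBuffer src = ByteBuffer.wrap(value);
    List<Object> fields = new ArrayList<>();
    for (Field f : schema) {
      fields.add(src.hasRemaining() ? f.read(src) : null);
    }
    return fields;
  }

  /** Value written under the old two-field schema: int, then terminated text. */
  static byte[] encodeV1(int n, String s) {
    byte[] text = s.getBytes(StandardCharsets.UTF_8);
    return ByteBuffer.allocate(4 + text.length + 1)
        .putInt(n)
        .put(text)
        .put((byte) 0)
        .array();
  }

  public static void main(String[] args) {
    byte[] old = encodeV1(1, "foo");
    // Old schema reads both fields.
    System.out.println(decode(old, Field.INT, Field.TEXT));
    // Extended schema reads the same bytes and pads the new fields with null.
    System.out.println(decode(old, Field.INT, Field.TEXT, Field.TEXT, Field.INT));
  }
}
```

This only works when new fields are appended at the end of the schema; inserting a field in the middle would shift every later field.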
  • 19. Protobuf (HBASE-11161)
      class PBKeyValue extends PBType<CellProtos.KeyValue> {

        @Override
        public int encode(PositionedByteRange dst, KeyValue val) {
          CodedOutputStream os = outputStreamFromByteRange(dst);
          int before = os.spaceLeft(), after, written;
          val.writeTo(os);
          after = os.spaceLeft();
          written = before - after;
          dst.setPosition(dst.getPosition() + written);
          return written;
        }
  • 21. Thanks! Manning: Nick Dimiduk, Amandeep Khurana; foreword by Michael Stack; hbaseinaction.com • Nick Dimiduk: github.com/ndimiduk, @xefyr, n10k.com, http://s.apache.org/bGN