SlideShare ist ein Scribd-Unternehmen logo
1 von 15
Downloaden Sie, um offline zu lesen
HBase
           Introduction to
           column oriented
           databases




        Luís Cipriani
        @lfcipriani (twitter, linkedin, github, ...)
        22o. GURU (2012-02-25) - Sao Paulo/Brazil

sexta-feira, 24 de fevereiro de 12
ME




sexta-feira, 24 de fevereiro de 12
intro




                                     “A BigTable HBase is a sparse,
                                         distributed, persistent
                                     multidimensional sorted map”




                                                     http://research.google.com/archive/bigtable.html
sexta-feira, 24 de fevereiro de 12
intro > data model
                           {                           <-- table
                                // ...
                                "aaaaa" : {            <--   row
                                   "A" : {             <--   column family
                                      "foo" : {        <--   column (qualifier)
                                        15: "y",       <--   timestamp, value
                                         4: "m"
                                      }
                                      "bar" : {...}
                                   },
                                   "B" : {
                                      "" : {...}
                                   }
                                },
                                "aaaab" : {
                                   "A" : {
                                      "foo" : {...},
                                      "bar" : {...},
                                      "joe" : {...}
                                   },
                                   "B" : {
                                      "" : {...}
                                   }
                                },
                                // ...
                           }
sexta-feira, 24 de fevereiro de 12
intro > data model




      (Table, RowKey, Family, Column, Timestamp) → Value




sexta-feira, 24 de fevereiro de 12
intro > hadoop stack




                          • hadoop HDFS (or not)
                          • hadoop MapReduce
                          • hadoop ZooKeeper
                          • hadoop HBase
                          • hadoop Hue, Whirr, etc...



sexta-feira, 24 de fevereiro de 12
architecture




sexta-feira, 24 de fevereiro de 12
key design > read/write model




               • randon reads (get)
               • sequential reads (scan)
                 • partial key scans
               • writes (put = update)



sexta-feira, 24 de fevereiro de 12
key design > storage model




                                     http://ofps.oreilly.com/titles/9781449396107/advanced.html
sexta-feira, 24 de fevereiro de 12
key design > strategies


                                 • tall-narrow vs flat-wide
                                 • partial key scans
                                 • pagination
                                 • time series
                                    • salting
                                    • field swap
                                    • randomization
                                 • secondary indexes

sexta-feira, 24 de fevereiro de 12
key design > example




sexta-feira, 24 de fevereiro de 12
development




            • installation modes
               • standalone, pseudo-distributed, distributed
            • JRuby console
            • Access
               • java/jruby API (more features)
               • entrypoints REST, Thrift, Avro, Protobuffers
               • there several other libs


sexta-feira, 24 de fevereiro de 12
cons




            • complex config and maintenance
            • hot regions
            • no secondary index built-in
            • no transactions built-in
            • complex schema design




sexta-feira, 24 de fevereiro de 12
pros




               • distributed
               • scalable (auto-sharding)
               • built on Hadoop stack
               • handles Big Data
               • high performance for write and read
               • no SPOF
               • fault tolerant, no data loss
               • active community



sexta-feira, 24 de fevereiro de 12
Reformulação Box de Login                                                       Abril ID
   http://engineering.abril.com.br/
   http://abr.io/hbase-intro
   https://pinboard.in/u:lfcipriani/t:hbase/
   http://hbase.apache.org/




                                     ?   http://shop.oreilly.com/product/0636920014348.do




sexta-feira, 24 de fevereiro de 12

Weitere ähnliche Inhalte

Ähnlich wie Hbase: Introduction to column oriented databases

Hive introduction 介绍
Hive  introduction 介绍Hive  introduction 介绍
Hive introduction 介绍ablozhou
 
Intro to HBase - Lars George
Intro to HBase - Lars GeorgeIntro to HBase - Lars George
Intro to HBase - Lars GeorgeJAX London
 
Hadoop Overview & Architecture
Hadoop Overview & Architecture  Hadoop Overview & Architecture
Hadoop Overview & Architecture EMC
 
Brust hadoopecosystem
Brust hadoopecosystemBrust hadoopecosystem
Brust hadoopecosystemAndrew Brust
 
Generalized framework for using NoSQL Databases
Generalized framework for using NoSQL DatabasesGeneralized framework for using NoSQL Databases
Generalized framework for using NoSQL DatabasesKIRAN V
 
Hadoop/Mahout/HBaseで テキスト分類器を作ったよ
Hadoop/Mahout/HBaseで テキスト分類器を作ったよHadoop/Mahout/HBaseで テキスト分類器を作ったよ
Hadoop/Mahout/HBaseで テキスト分類器を作ったよNaoki Yanai
 
Big Data Everywhere Chicago: Unleash the Power of HBase Shell (Conversant)
Big Data Everywhere Chicago: Unleash the Power of HBase Shell (Conversant) Big Data Everywhere Chicago: Unleash the Power of HBase Shell (Conversant)
Big Data Everywhere Chicago: Unleash the Power of HBase Shell (Conversant) BigDataEverywhere
 
HBaseConEast2016: HBase and Spark, State of the Art
HBaseConEast2016: HBase and Spark, State of the ArtHBaseConEast2016: HBase and Spark, State of the Art
HBaseConEast2016: HBase and Spark, State of the ArtMichael Stack
 
HBase and Impala Notes - Munich HUG - 20131017
HBase and Impala Notes - Munich HUG - 20131017HBase and Impala Notes - Munich HUG - 20131017
HBase and Impala Notes - Munich HUG - 20131017larsgeorge
 
Hadoop @ eBay: Past, Present, and Future
Hadoop @ eBay: Past, Present, and FutureHadoop @ eBay: Past, Present, and Future
Hadoop @ eBay: Past, Present, and FutureRyan Hennig
 

Ähnlich wie Hbase: Introduction to column oriented databases (14)

Hive introduction 介绍
Hive  introduction 介绍Hive  introduction 介绍
Hive introduction 介绍
 
Intro to HBase - Lars George
Intro to HBase - Lars GeorgeIntro to HBase - Lars George
Intro to HBase - Lars George
 
Hadoop Overview & Architecture
Hadoop Overview & Architecture  Hadoop Overview & Architecture
Hadoop Overview & Architecture
 
Brust hadoopecosystem
Brust hadoopecosystemBrust hadoopecosystem
Brust hadoopecosystem
 
Generalized framework for using NoSQL Databases
Generalized framework for using NoSQL DatabasesGeneralized framework for using NoSQL Databases
Generalized framework for using NoSQL Databases
 
Hadoop/Mahout/HBaseで テキスト分類器を作ったよ
Hadoop/Mahout/HBaseで テキスト分類器を作ったよHadoop/Mahout/HBaseで テキスト分類器を作ったよ
Hadoop/Mahout/HBaseで テキスト分類器を作ったよ
 
20100128ebay
20100128ebay20100128ebay
20100128ebay
 
Big Data Everywhere Chicago: Unleash the Power of HBase Shell (Conversant)
Big Data Everywhere Chicago: Unleash the Power of HBase Shell (Conversant) Big Data Everywhere Chicago: Unleash the Power of HBase Shell (Conversant)
Big Data Everywhere Chicago: Unleash the Power of HBase Shell (Conversant)
 
HBase, no trouble
HBase, no troubleHBase, no trouble
HBase, no trouble
 
Hadoop - Introduction to Hadoop
Hadoop - Introduction to HadoopHadoop - Introduction to Hadoop
Hadoop - Introduction to Hadoop
 
HBaseConEast2016: HBase and Spark, State of the Art
HBaseConEast2016: HBase and Spark, State of the ArtHBaseConEast2016: HBase and Spark, State of the Art
HBaseConEast2016: HBase and Spark, State of the Art
 
HBase and Impala Notes - Munich HUG - 20131017
HBase and Impala Notes - Munich HUG - 20131017HBase and Impala Notes - Munich HUG - 20131017
HBase and Impala Notes - Munich HUG - 20131017
 
MongoDB at GUL
MongoDB at GULMongoDB at GUL
MongoDB at GUL
 
Hadoop @ eBay: Past, Present, and Future
Hadoop @ eBay: Past, Present, and FutureHadoop @ eBay: Past, Present, and Future
Hadoop @ eBay: Past, Present, and Future
 

Mehr von Luis Cipriani

Adventures with Raspberry Pi and Twitter API
Adventures with Raspberry Pi and Twitter APIAdventures with Raspberry Pi and Twitter API
Adventures with Raspberry Pi and Twitter APILuis Cipriani
 
Capturando o pulso do planeta com as APIs de Streaming do Twitter
Capturando o pulso do planeta com as APIs de Streaming do TwitterCapturando o pulso do planeta com as APIs de Streaming do Twitter
Capturando o pulso do planeta com as APIs de Streaming do TwitterLuis Cipriani
 
Twitter e suas APIs de Streaming - Campus Party Brasil 7
Twitter e suas APIs de Streaming - Campus Party Brasil 7Twitter e suas APIs de Streaming - Campus Party Brasil 7
Twitter e suas APIs de Streaming - Campus Party Brasil 7Luis Cipriani
 
Segurança de APIs HTTP, um guia sensato para desenvolvedores preocupados
Segurança de APIs HTTP, um guia sensato para desenvolvedores preocupadosSegurança de APIs HTTP, um guia sensato para desenvolvedores preocupados
Segurança de APIs HTTP, um guia sensato para desenvolvedores preocupadosLuis Cipriani
 
API Caching, why your server needs some rest
API Caching, why your server needs some restAPI Caching, why your server needs some rest
API Caching, why your server needs some restLuis Cipriani
 
Explaining A Programming Model for Context-Aware Applications in Large-Scale ...
Explaining A Programming Model for Context-Aware Applications in Large-Scale ...Explaining A Programming Model for Context-Aware Applications in Large-Scale ...
Explaining A Programming Model for Context-Aware Applications in Large-Scale ...Luis Cipriani
 
Alexandria: um Sistema de Sistemas para Publicação de Conteúdo Digital utiliz...
Alexandria: um Sistema de Sistemas para Publicação de Conteúdo Digital utiliz...Alexandria: um Sistema de Sistemas para Publicação de Conteúdo Digital utiliz...
Alexandria: um Sistema de Sistemas para Publicação de Conteúdo Digital utiliz...Luis Cipriani
 
Como um verdadeiro sistema REST funciona: arquitetura e performance na Abril
Como um verdadeiro sistema REST funciona: arquitetura e performance na AbrilComo um verdadeiro sistema REST funciona: arquitetura e performance na Abril
Como um verdadeiro sistema REST funciona: arquitetura e performance na AbrilLuis Cipriani
 
Explaining Semantic Web
Explaining Semantic WebExplaining Semantic Web
Explaining Semantic WebLuis Cipriani
 
Fearless HTTP requests abuse
Fearless HTTP requests abuseFearless HTTP requests abuse
Fearless HTTP requests abuseLuis Cipriani
 

Mehr von Luis Cipriani (10)

Adventures with Raspberry Pi and Twitter API
Adventures with Raspberry Pi and Twitter APIAdventures with Raspberry Pi and Twitter API
Adventures with Raspberry Pi and Twitter API
 
Capturando o pulso do planeta com as APIs de Streaming do Twitter
Capturando o pulso do planeta com as APIs de Streaming do TwitterCapturando o pulso do planeta com as APIs de Streaming do Twitter
Capturando o pulso do planeta com as APIs de Streaming do Twitter
 
Twitter e suas APIs de Streaming - Campus Party Brasil 7
Twitter e suas APIs de Streaming - Campus Party Brasil 7Twitter e suas APIs de Streaming - Campus Party Brasil 7
Twitter e suas APIs de Streaming - Campus Party Brasil 7
 
Segurança de APIs HTTP, um guia sensato para desenvolvedores preocupados
Segurança de APIs HTTP, um guia sensato para desenvolvedores preocupadosSegurança de APIs HTTP, um guia sensato para desenvolvedores preocupados
Segurança de APIs HTTP, um guia sensato para desenvolvedores preocupados
 
API Caching, why your server needs some rest
API Caching, why your server needs some restAPI Caching, why your server needs some rest
API Caching, why your server needs some rest
 
Explaining A Programming Model for Context-Aware Applications in Large-Scale ...
Explaining A Programming Model for Context-Aware Applications in Large-Scale ...Explaining A Programming Model for Context-Aware Applications in Large-Scale ...
Explaining A Programming Model for Context-Aware Applications in Large-Scale ...
 
Alexandria: um Sistema de Sistemas para Publicação de Conteúdo Digital utiliz...
Alexandria: um Sistema de Sistemas para Publicação de Conteúdo Digital utiliz...Alexandria: um Sistema de Sistemas para Publicação de Conteúdo Digital utiliz...
Alexandria: um Sistema de Sistemas para Publicação de Conteúdo Digital utiliz...
 
Como um verdadeiro sistema REST funciona: arquitetura e performance na Abril
Como um verdadeiro sistema REST funciona: arquitetura e performance na AbrilComo um verdadeiro sistema REST funciona: arquitetura e performance na Abril
Como um verdadeiro sistema REST funciona: arquitetura e performance na Abril
 
Explaining Semantic Web
Explaining Semantic WebExplaining Semantic Web
Explaining Semantic Web
 
Fearless HTTP requests abuse
Fearless HTTP requests abuseFearless HTTP requests abuse
Fearless HTTP requests abuse
 

Kürzlich hochgeladen

Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Principled Technologies
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsRoshan Dwivedi
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdflior mazor
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...apidays
 

Kürzlich hochgeladen (20)

Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 

Hbase: Introduction to column oriented databases

  • 1. HBase Introduction to column oriented databases Luís Cipriani @lfcipriani (twitter, linkedin, github, ...) 22o. GURU (2012-02-25) - Sao Paulo/Brazil sexta-feira, 24 de fevereiro de 12
  • 2. ME sexta-feira, 24 de fevereiro de 12
  • 3. intro “A BigTable HBase is a sparse, distributed, persistent multidimensional sorted map” http://research.google.com/archive/bigtable.html sexta-feira, 24 de fevereiro de 12
  • 4. intro > data model { <-- table // ... "aaaaa" : { <-- row "A" : { <-- column family "foo" : { <-- column (qualifier) 15: "y", <-- timestamp, value 4: "m" } "bar" : {...} }, "B" : { "" : {...} } }, "aaaab" : { "A" : { "foo" : {...}, "bar" : {...}, "joe" : {...} }, "B" : { "" : {...} } }, // ... } sexta-feira, 24 de fevereiro de 12
  • 5. intro > data model (Table, RowKey, Family, Column, Timestamp) → Value sexta-feira, 24 de fevereiro de 12
  • 6. intro > hadoop stack • hadoop HDFS (or not) • hadoop MapReduce • hadoop ZooKeeper • hadoop HBase • hadoop Hue, Whirr, etc... sexta-feira, 24 de fevereiro de 12
  • 8. key design > read/write model • randon reads (get) • sequential reads (scan) • partial key scans • writes (put = update) sexta-feira, 24 de fevereiro de 12
  • 9. key design > storage model http://ofps.oreilly.com/titles/9781449396107/advanced.html sexta-feira, 24 de fevereiro de 12
  • 10. key design > strategies • tall-narrow vs flat-wide • partial key scans • pagination • time series • salting • field swap • randomization • secondary indexes sexta-feira, 24 de fevereiro de 12
  • 11. key design > example sexta-feira, 24 de fevereiro de 12
  • 12. development • installation modes • standalone, pseudo-distributed, distributed • JRuby console • Access • java/jruby API (more features) • entrypoints REST, Thrift, Avro, Protobuffers • there several other libs sexta-feira, 24 de fevereiro de 12
  • 13. cons • complex config and maintenance • hot regions • no secondary index built-in • no transactions built-in • complex schema design sexta-feira, 24 de fevereiro de 12
  • 14. pros • distributed • scalable (auto-sharding) • built on Hadoop stack • handles Big Data • high performance for write and read • no SPOF • fault tolerant, no data loss • active community sexta-feira, 24 de fevereiro de 12
  • 15. Reformulação Box de Login Abril ID http://engineering.abril.com.br/ http://abr.io/hbase-intro https://pinboard.in/u:lfcipriani/t:hbase/ http://hbase.apache.org/ ? http://shop.oreilly.com/product/0636920014348.do sexta-feira, 24 de fevereiro de 12