Nick Dimiduk - @xefyr
Founder, Drawn to Scale
nick@drawntoscalehq.com

April 28, 2010
Agenda

- what NoSQL is not
- motivation
- Hadoop
- HBase
whoami
Computer Science & Engineering at Ohio State: Artificial Intelligence, Programming Languages, Systems Engineering
Applied Technical Systems: Hierarchical, non-relational data storage and analysis systems (no-sql before there was NoSQL). Information Retrieval, Wire Serialization/RPC (before there was Thrift/Avro), Data Visualization (GBs)
Visible Technologies: Social Media Storage, Processing, Analytics. Monitoring, Engagement, Warehousing, and BI (TBs)
Drawn to Scale: Big Data Storage, Processing, Retrieval, Analytics (TBs, PBs)
What NoSQL is not.

movement - no ANSI NoSQL-2010
one-size-fits-all
It’s not Anti-RDBMS
It’s about Choice!
Image: http://www.flickr.com/photos/zakh/337938459/
What NoSQL is not.

movement - no ANSI NoSQL-2010
one-size-fits-all - it’s about choice
silver bullet - guarantees are hard
motivation
more, More, MORE Data!
ACID Burns
BASE is good enough
Life’s too short
“typical” application
“typical” application
[Diagram: Data Server, App Server, Village People]
growing pains
[Diagram: Data Server, App Servers, Villages of People]
vertical partitioning
[Diagram: independent copies of the full stack, each with its own Data Server, App Servers, and Villages of People]
“typical” application
growing pains
[Diagram: Data Servers, App Servers, Villages of People]
horizontal partitioning
[Diagram: Villages of People served by a Data Layer and an Application Layer]
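As a rough illustration of horizontal partitioning, the sketch below routes each record key to one of several data servers by hashing the key. The server names and the `shard_for` and `partition` helpers are hypothetical, and hash-based placement is only one option; the speaker notes also mention logical and range boundaries.

```python
import hashlib

# Hypothetical data servers; the names are illustrative only.
DATA_SERVERS = ["data-0", "data-1", "data-2", "data-3"]


def shard_for(key, servers=DATA_SERVERS):
    """Pick the server that owns `key` by hashing the key."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    # Read the first 8 bytes as an unsigned integer, then wrap around the server list.
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


def partition(keys, servers=DATA_SERVERS):
    """Group keys by the server that owns them."""
    placement = {server: [] for server in servers}
    for key in keys:
        placement[shard_for(key, servers)].append(key)
    return placement


if __name__ == "__main__":
    for server, owned in partition(["alice", "bob", "carol", "dave", "erin"]).items():
        print(server, owned)
```

Every app server that uses the same function finds the same owner for a key, so no one needs to know each data server by name. Plain modulo placement reshuffles most keys when the server list changes, which is one reason real systems tend to use something more elaborate.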
Hadoop
“open source, reliable, distributed computing”
MapReduce - API for parallel computing
HDFS - distributed, replicated file system
ZooKeeper - distributed synchronization
Avro - Data Serialization / RPC
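To make the MapReduce programming model concrete, here is a minimal single-process sketch of word counting. It does not use Hadoop's API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are assumptions chosen for illustration, and a real job would run the map and reduce steps in parallel across a cluster.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse the values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce in sequence over all documents."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data", "Big table"]))
```

Because each map call sees only one document and each reduce call sees only one key, both phases can be spread over many machines without coordination beyond the shuffle.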
HBase
structured, distributed database for your horizontally scalable FS
random access
real-time reads/writes
simple API
big table
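The "big table" idea can be pictured as a sorted map from row key to columns to timestamped values. The toy class below is an in-memory sketch of that data model only; `ToyTable` and its methods are hypothetical names, not HBase's client API, and nothing here is distributed or persistent.

```python
import bisect
import time


class ToyTable:
    """In-memory stand-in for a BigTable-style table.

    Layout: row key -> column name -> list of (timestamp, value), oldest first.
    """

    def __init__(self):
        self._rows = {}
        self._sorted_keys = []  # row keys kept in sorted order for range scans

    def put(self, row, column, value, timestamp=None):
        """Write one cell version; defaults to the current time as timestamp."""
        if timestamp is None:
            timestamp = time.time()
        if row not in self._rows:
            self._rows[row] = {}
            bisect.insort(self._sorted_keys, row)
        versions = self._rows[row].setdefault(column, [])
        versions.append((timestamp, value))
        versions.sort(key=lambda version: version[0])

    def get(self, row, column):
        """Random access by key: return the newest value, or None if absent."""
        versions = self._rows.get(row, {}).get(column)
        return versions[-1][1] if versions else None

    def scan(self, start_row, stop_row):
        """Yield (row, newest cells) for start_row <= row < stop_row, in key order."""
        lo = bisect.bisect_left(self._sorted_keys, start_row)
        hi = bisect.bisect_left(self._sorted_keys, stop_row)
        for row in self._sorted_keys[lo:hi]:
            cells = {col: vers[-1][1] for col, vers in self._rows[row].items()}
            yield row, cells
```

Keeping row keys sorted is what makes both single-key lookups and range scans cheap, and storing columns per row means rows need not share a fixed schema.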
references
NoSQL: http://www.nosql-database.org
Eventually Consistent: http://www.allthingsdistributed.com/2007/12/eventually_consistent.html
Soft State: http://mercury.lcs.mit.edu/~jnc/tech/hard_soft.html
Accuracy and Precision: http://en.wikipedia.org/wiki/Accuracy_and_precision
Compare and Swap: http://en.wikipedia.org/wiki/Compare-and-swap
Apache Hadoop: http://hadoop.apache.org
Google MapReduce: http://labs.google.com/papers/mapreduce.html
Google FS: http://labs.google.com/papers/gfs.html
Apache Thrift: http://incubator.apache.org/thrift/
Protobuf: http://code.google.com/p/protobuf/
Google BigTable: http://labs.google.com/papers/bigtable.html
Google Chubby: http://labs.google.com/papers/chubby.html
Questions?
Introduction to Hadoop, HBase, and NoSQL

Editor's notes

  1. I’m Not an RDBMS Guy!
  2. squish the FUD
  3. no central point of organization; no committee or standardizing body; no plan/strategy/illuminati to take down the RDBMS; lots of "in-fighting"
  4. central tenet - there IS NO one-size-fits-all; unlike RDBMS assumptions, each engineering effort must be evaluated for its data needs
  5. is it “anti-RDBMS”?
  6. not so much
  7. will not magically solve all your data or performance problems; applications won’t magically stop crashing, corrupting data, etc. Big Data is still hard. These tools make it possible/affordable/approachable
  8. data persistence comes down to guarantees
  9. why are we here?
  10. "web scale": more users, content, connections; more trends, insight, knowledge
  11. Atomicity: fault-tolerance is moving to the application layer - smaller atomic units. Consistency: yes! but not necessarily immediate - "availability" (latency, reads) is more important. Isolation: smaller atomic units (multi-step transaction vs. compare-and-swap), greater availability, denormalization => reduced dependency on isolation. Durability: some things are more important than getting every last detail, e.g. latency of response, view in aggregate
  12. Basically Available: is the data layer up or not? are we serving content to our users or not? Soft State: shifting burden of "correctness" up to application layer; availability is more important than precision; accuracy (correct) vs. precision (repeatable). Eventual Consistency: all operations are recorded and ordered, then played back as resources permit.
  13. agile dev moves too fast for schema and constraints - this isn’t waterfall; data models change quickly; up-front schema modeling is akin to waterfall development - not always practical/feasible/possible; data is messy - record what you have and leave constraints up to the application
  14. at scale, data services look like a DHT anyway! isolated independent services; introduced caching layers; partitioned data by logical and range boundaries.
  15. webapp
  16. app servers/session self-contained - load-balanced; data’s in one spot - what do you do?
  17. 37-signals approach - DHH “scaling is a good thing because scaling => users => $$$”
  18. more users, more instances. easy!
  19. doesn’t work for social applications - users cannot interact - old MMOs vs. new social games
  20. redesign data server as “data services”: separate, independent logical components
  21. knowing each service by name becomes “vexing”
  22. configuration/logistical nightmare!
  23. abstractions! wouldn’t it be nice if...
  24. Distributed Computing Made Easy Less Hard
  25. programming model/API for parallel computing; see Google's MapReduce paper
  26. replicated, high throughput, fairly UNIX-y (not POSIX); see the Google FS paper
  27. Distributed Group Services - coordination, synchronization, configuration, naming; see the Google Chubby paper
  28. efficient, cross-language messaging; compare Facebook/Apache Thrift and Google Protobufs
  29. Google BigTable
  30. Addresses limitations of raw M/R and HDFS access
  31. request by key vs. HDFS sequential reads
  32. low-latency, ms response times vs. M/R high latency
  33. row/column concepts; DHT semantics; Java, REST, Thrift
  34. Billions of rows, millions of columns
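
Note 11 contrasts multi-step transactions with compare-and-swap. The sketch below shows one way that pattern can look in application code: a single cell that only changes if it still holds the value the caller last read, plus a retry loop built on top. The `Cell` class and `increment` helper are hypothetical, and the lock merely stands in for whatever single-cell atomicity a data store provides.

```python
import threading


class Cell:
    """One value with an atomic compare-and-swap operation."""

    def __init__(self, value):
        self._value = value
        # Assumption: the lock models the store's guarantee that one cell updates atomically.
        self._lock = threading.Lock()

    def read(self):
        with self._lock:
            return self._value

    def compare_and_swap(self, expected, new):
        """Set the cell to `new` only if it still equals `expected`; report success."""
        with self._lock:
            if self._value != expected:
                return False
            self._value = new
            return True


def increment(cell):
    """Application-level retry loop: re-read and try again until the swap succeeds."""
    while True:
        current = cell.read()
        if cell.compare_and_swap(current, current + 1):
            return current + 1
```

The atomic unit is one cell rather than a multi-statement transaction, so correctness under contention becomes the application's job: it retries instead of asking the store to isolate a whole sequence of steps.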