3. Daisy
(this is what we do)
Java open source CMS
MySQL: system indexes and metadata
Lucene: full-text indexes
filesystem: actual content
4. rather nice query language
rather nice publishing features
wanted to move away from WCMS
fast access control and facet browser require the active document set to reside in memory cache
SCALE?
5. Findings
we learned a lot (about content management) over the past 6 years (like: versioning, staging, multilinguality, searching, access control, publishing)
people don’t like Cocoon / XSLT, prefer templating instead
a lot of our project specifics were about finding the correct storage model for specific data structures (not all data was fit for our CMS)
customers with growing ambitions → Daisy has to follow!
6. Overhaul
get rid of Cocoon → Kauri
quest for differentiation: “the internet barrier”
combine massive storage with useful query-ability
REST, Maven, Spring, webapps
www.kauriproject.org
7. The Internet Barrier
fundamental technology gap: SQL → NoSQL
buy/use vs build
rise of semi- or free-structured information
focus on architecture and infrastructure
layered approach
http://github.com/blog/530-how-we-made-github-fast
8. Conclusions
let’s start from scratch (ouch)
a different architecture / foundations
scale big and be available
modularity (pluggability)
we’re not a banking application, so consistency might be less important
9. Storage challenges
sparse data structures
flexible, evolving data structures
lack of good fault-tolerance setups
cope with scale
CAP vs BASE
(Google) BigTable and (Amazon) Dynamo
10. ACID vs BASE
ACID | BASE
Atomicity, Consistency, Isolation, Durability | Basically Available, Soft state, Eventually consistent
Strong Consistency | Weak Consistency
Isolation | Availability First
Focus on “commit” | Best Effort
Nested Transactions | Approximate Answers OK
Availability? | Aggressive (optimistic)
Conservative (pessimistic) | Simpler!
Difficult Evolution (schema) | Faster, Easier Evolution
spectrum slide: Eric Brewer
11. Our CAP multilemma
consistency of data
availability (ping means results)
partition tolerance (‘cluster splits’ should not block)
scale
fault tolerance
12. C?A?P?
Initial gut feeling: cAP
A was a given
C would be a function of our datastore choice
however the P seemed like a nice-to-have (aka over-ambitious use-case)
13. CAPondering
HBase vs Cassandra
consistency vs SPOF ?
possible higher latency vs possibly frailer community ?
Cocoon trauma
http://www.cs.cornell.edu/projects/ladis2009/talks/ramakrishnan-keynote-ladis2009.pdf
14. Comparison Matrix
System | Partitioning (hash/sort) | Dynamic | Failures handled | Routing | Reads/writes during failure | Sync/async | Local/geo | Consistency | Durability | Storage
PNUTS | H+S | Y | Colo+server | Rtr | Read+write | Async | Local+geo | Timeline + Eventual | Double WAL | Buffer pages
MySQL | H+S | N | Colo+server | Cli | Read | Async | Local+nearby | ACID | WAL | Buffer pages
HDFS | Other | Y | Colo+server | Rtr | Read+write | Sync | Local+nearby | N/A (no updates) | Triple replication | Files
BigTable | S | Y | Colo+server | Rtr | Read+write | Sync | Local+nearby | Multi-version | Triple replication | LSM/SSTable
Dynamo | H | Y | Colo+server | P2P | Read+write | Async | Local+nearby | Eventual | WAL | Buffer pages
Cassandra | H+S | Y | Colo+server | P2P | Read+write | Sync+Async | Local+nearby | Eventual | WAL, triple replication | LSM/SSTable
Megastore | S | Y | Colo+server | Rtr | Read+write | Sync | Local+nearby | ACID/other | Triple replication | LSM/SSTable
Azure | S | N | Server | Cli | Read+write | Sync | Local | ACID | WAL | Buffer pages
15. HBase
a sorted, distributed, persisted, multi-dimensional, column-oriented, highly-available, high-performance storage system
adds random access reads and writes atop HDFS
17. People
Inventors:
Google BigTable ☺
Jim Kellerman (Powerset/Microsoft)
Mike Cafarella (UMich)
Project leads:
Michael Stack (Powerset/Microsoft)
Jonathan Gray (Streamy.com)
Ryan Rawson (StumbleUpon)
Jean-Daniel Cryans (SU)
Bryan Duxbury (Rapleaf)
19. HBase data model
Distributed multi-dimensional Keys are arbitrary strings
sparse map
Access to row data is atomic
Multi-dimensional keys:
(table, row, family:column,
timestamp) → value
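The (table, row, family:column, timestamp) → value model can be sketched with plain sorted maps. A toy illustration only, not the HBase client API; all names below are made up:

```java
import java.util.Comparator;
import java.util.NavigableMap;
import java.util.TreeMap;

// Toy model of one HBase table: row -> "family:qualifier" -> timestamp -> value.
// Rows and columns are kept sorted (TreeMap); timestamps are ordered
// newest-first, so a plain read returns the latest version, as in HBase.
public class DataModelSketch {
    static final NavigableMap<String, NavigableMap<String, NavigableMap<Long, String>>> table = new TreeMap<>();

    static void put(String row, String column, long ts, String value) {
        table.computeIfAbsent(row, r -> new TreeMap<>())
             .computeIfAbsent(column, c -> new TreeMap<Long, String>(Comparator.reverseOrder()))
             .put(ts, value);
    }

    // Default read: newest timestamp wins.
    static String get(String row, String column) {
        NavigableMap<String, NavigableMap<Long, String>> cols = table.get(row);
        if (cols == null || !cols.containsKey(column)) return null; // sparse: absent cells cost nothing
        return cols.get(column).firstEntry().getValue();
    }

    public static void main(String[] args) {
        put("page1", "info:title", 1L, "old title");
        put("page1", "info:title", 2L, "new title");
        System.out.println(get("page1", "info:title")); // prints: new title
    }
}
```

The nesting also shows why the map is "sparse": a cell that was never written simply has no entry, unlike a NULL column in a relational row.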
20. Date: Thu, 12 Nov 2009 18:19:50 -0800
Message-ID: <78568af10911121819x292527b2t7f8b7d857c3650b2@mail.gmail.com>
Subject: Re: newbie: need help on understanding HBase
From: Ryan Rawson <ryanobjc@gmail.com>
To: hbase-user@hadoop.apache.org
HBase is semi-column oriented. Column families are the storage model -
everything in a column family is stored in a file linearly in HDFS.
That means accessing data from a column family is really cheap and
easy. Adding more column families adds more files - it has the
performance profile of adding new tables, except you don't actually
have additional tables, so the conceptual complexity stays low.
Data is stored at the "intersection" of the rowid, and the column
family + qualifier. This is sometimes called a "Cell" - contains a
timestamp as well. You can have multiple versions all timestamped.
The timestamp is by default the int64/java system milli time. I have
to recommend against setting the timestamp explicitly if you can avoid
it. So when you retrieve a row, you can get everything, a list of
column qualifiers or a list of families or any combo. (eg: list of
these qualifiers out of family A and everything from family B)
[...]
The terms to use are:
- Column family (or just family): the unit of locality in hbase.
Everything in a family is stored in 1 (or a set) of files. A table is
a name and a list of families with attributes for those families (eg:
compression). A family is a string.
- Column qualifier (or just qualifier): allows you to store multiple
values for the same row in 1 family. This value is a byte array and
can be anything. The API converts null => new byte[0]. This is the
tricky bit, since most people don't think of "column names" as being
dynamic.
- Cell - the old name for a value + timestamp. The new API (see:
class Result) doesn't use this term, instead provides a different path
to read data.
You can use HBase as a normal datastore and use static names for the
qualifiers, and that is just fine. But if you need something special
to get past the lack of relations, you can start to do fun things with
the qualifier as data. Building a secondary index for example. The
row key would be the secondary value (eg: city) and the qualifier
would be the primary key (eg: userid) and the value would be a
placeholder to indicate the value exists.
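The qualifier-as-data trick described above can be sketched the same way, again with in-memory sorted maps standing in for an index table. A hypothetical illustration, not HBase client code:

```java
import java.util.Collections;
import java.util.NavigableMap;
import java.util.Set;
import java.util.TreeMap;

// "Users by city" secondary index: row key = secondary value (city),
// qualifier = primary key (userid), value = empty placeholder byte[].
// The existence of the cell *is* the information.
public class SecondaryIndexSketch {
    static final NavigableMap<String, NavigableMap<String, byte[]>> index = new TreeMap<>();

    static void addToIndex(String city, String userid) {
        index.computeIfAbsent(city, c -> new TreeMap<>())
             .put(userid, new byte[0]); // placeholder value
    }

    // One row read answers "who lives in this city?": the qualifiers are the answer.
    static Set<String> usersIn(String city) {
        NavigableMap<String, byte[]> row = index.get(city);
        return row == null ? Collections.emptySet() : row.keySet();
    }

    public static void main(String[] args) {
        addToIndex("Ghent", "user42");
        addToIndex("Ghent", "user7");
        addToIndex("Paris", "user99");
        System.out.println(usersIn("Ghent"));
    }
}
```

Because qualifiers within a family are stored sorted, a single cheap row read returns all userids for a city, already ordered.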
22. Getting data in and out
Java API
Thrift multi-language API
Stargate REST connector
HBase shell (JRuby IRB-based)
Processing: MapReduce and Cascading
Atomicity. All of the operations in the transaction will complete, or none will.
Consistency. The database will be in a consistent state when the transaction begins and ends.
Isolation. The transaction will behave as if it is the only operation being performed upon the database.
Durability. Upon completion of the transaction, the operation will not be reversed.
consistency of data: think serializability
availability: pinging a live node should produce results
partition tolerance: live nodes should not be blocked by partitions