CouchDB

CouchDB

King Chung Huang
Information Technologies
University of Calgary

Document-oriented Databases
Today’s Talk CouchDB Overview
Demonstrations

Flat
Hierarchical
Network
Relational Databases

Post-Relational Databases
Dimensional
Object
Document-oriented

Comparable to documents in the real world
•

Records are stored as schema-less documents
•
Each document is uniquely named
■

Documents are the primary unit of storage
■

Structures are not explicitly defined
•
No tables with uniform, pre-defined fields
■

Every document can have varying fields of different types
■

Documents are self contained
•
Data is not decomposed into tables with relations
■

Documents contain the context needed to understand them
■

Examples
•
Lotus Notes
■

Amazon SimpleDB
■

CouchDB
■

Key-Value Stores
•
Amazon S3
■

Dynamo: Amazon’s Highly Available Key-value Store, DeCandia, et al., 2007
■

Facebook Cassandra
■

Recently accepted as an Apache incubation project
■

Google BigTable
■

Bigtable: A Distributed Storage System for Structured Data, Chang, et al.,
■

2006

Document database server
REST API
What is CouchDB? JSON documents
Views with MapReduce
Highly Scalable

Document Database Server
Implemented in Erlang
•
Ericsson Language
■

Highly concurrent, functional programming language
■

Designed with modern web applications in mind
•

Atomic Consistent Isolated Durable (ACID)
•

“Crash-only” design
•

Supports external handlers
•
Change notification
■

Custom processing
■

•

REST HTTP API
Representational State Transfer
•
A set of principles about how resources are defined and addressed
■

World Wide Web (HTTP) is RESTful
•
Uniform interface for accessing resources
■

Resources identified by URI
■

Actions transmitted in HTTP methods
■

Status communicated in status codes
■

REST HTTP API
CRUD
Create, Read, Update, and Delete
•
• In HTTP
POST /some/resource/id
■

GET /some/resource/id
■

PUT /some/resource/id
■

DELETE /some/resource/id
■

JSON Documents
JavaScript Object Notation
•
Considered language-independent
■

CouchDB stored XML documents before version 0.8
•
Suitable if content is already in XML
■

Human readable, but can be onerous to type
■

Markup language, requires transformation from/to data structures
■

Represents primitive data types and structures
•
Strings, numbers, booleans
■

Arrays, dictionaries
■

Null
■

Documents can have attachments
•

JSON Documents
Example
{
_id: “post1”,
_rev: “123456”,
title: “A Blog Post”,
tags: [“blue”, “glue”],
post_date: 1239910768,
body: “Once upon a time…”,
is_published: true
}

JSON Documents
Example
{
_id: “post1”,
_rev: “123456”,
…
_attachments: {
“picture.png”: {
stub: true,
content_type: “image/png”,
length: 384
}
}
}

Views
Used to sort and filter through data
•

Lazily evaluated, highly efficient
•
Similar to indexing in relational databases
■

Defined in design documents
•
Documents named _design/…
■

Consist of map and reduce functions
•
Language independent
■

JavaScript supported by default
■

Mozilla Spidermonkey included
■

Data Processing with MapReduce
Programming model for processing and generating large data sets
•

Related, but not equivalent to map and reduce operations in
•
functional languages
Take and produce key/value pairs with map and reduce functions
•

Map functions
•
Take input key/value pairs and produce an intermediate set of key/value pairs
■

Reduce functions
•
Take intermediate key and set of values for the key, and merges them into a
■

possibly smaller set of values
MapReduce: Simplified Data Processing on Large Clusters
•
Jeff Dean, Sanjay Ghemawat, Google Inc.

Example
{
_id: “post1”,
_rev: “123456”,
post_date: 1239910768,
is_published: true
}

Example
“post1” = {
_id: “post1”,
_rev: “123456”,
post_date: 1239910768,
is_published: true
}

Example
“post1” = {
post_date: 1239910768,
body: “Once upon a time…”
}

Emit Posts by post_date
“post1” = {
post_date: 1239910768,
}
1239910768 = {
post_date: 1239910768,
}

Emit Posts by post_date

1208456184 {title: “A bloody long time ago”, …}

1215421546 {title: “A blue moon ago”, …}

1222654641 {title: “Just Yesterday”, …}

1239910768 {title: “A Blog Post”, …}

1246816518 {title: “That was Then”, …}

1251687980 {title: “This is Now”, …}

1264836981 {title: “When Will Then Be Now?”, …}

Emit Posts by tag
“post1” = {
post_date: 1239910768,
}

“blue” = { title: “A Blog Post”, … }
“glue” = { title: “A Blog Post”, … }

Emit Posts by tag

blue {title: “Just Yesterday”, …}

blue {title: “A Blog Post”, …}

clue {title: “Just Yesterday”, …}

flue {title: “When Will Then Be Now?”, …}

flue {title: “This is Now”, …}

glue {title: “A Blog Post”, …}

wazoo {title: “That was Then”, …}

Emit Posts by tag, Reduced
{title: “Just Yesterday”, …},
blue
{title: “A Blog Post”, …}

clue {title: “Just Yesterday”, …}

{title: “When Will Then Be Now?”, …},
flue
{title: “This is Now”, …}

glue {title: “A Blog Post”, …}

wazoo {title: “That was Then”, …}

Scalability
Incremental MapReduce
•

Multiversion Concurrency Control (MVCC)
•
Achieves serializability through multiversioning instead of locking
■

Eliminates waits to access objects
■

Updates create new documents
■

Tradeoff point: no waits, increased data storage
■

Incremental Distributed Replication
•

Eventual Consistency
•
Changes eventually propagate through distributed systems
■

Tradeoff point: increase availability and tolerancy, decreased freshness
■

CouchDB

Empfohlen

Empfohlen

Weitere ähnliche Inhalte

Was ist angesagt?

Was ist angesagt? (20)

Andere mochten auch

Andere mochten auch (7)

Ähnlich wie CouchDB

Ähnlich wie CouchDB (20)

Kürzlich hochgeladen

Kürzlich hochgeladen (20)

CouchDB