5. [Diagram: Ceph architecture]
APP → LIBRADOS: A library allowing apps to directly access RADOS, with support for C, C++, Java, Python, Ruby, and PHP
APP → RADOSGW: A bucket-based REST gateway, compatible with S3 and Swift
HOST/VM → RBD: A reliable and fully-distributed block device, with a Linux kernel client and a QEMU/KVM driver
CLIENT → CEPH FS: A POSIX-compliant distributed file system, with a Linux kernel client and support for FUSE
RADOS (base layer): A reliable, autonomous, distributed object store comprised of self-healing, self-managing, intelligent storage nodes
6. Let’s Start With a Good, Old-Fashioned Origin Story
JD Hancock, Flickr / CC BY 2.0
7. The Evolution of Storage
A brief history of information storage technology
8. Cave Paintings: The Earliest Form (maybe) of Information Storage
Chico.Ferreira, Flickr / CC BY 2.0
9. Technology Review: Cave Painting
The good
• Low cost per smudge
• Multitouch
The bad
• Limited storage capacity
• 10 caveman ideas per wall
• No support for CIFS
11. Technology Review: Books and Libraries
The good
• Cost per scroll is high
• Can be eased w/slave labor
The bad
• No automatic replication
• Must complete backups before Caesar’s invasion of Egypt!
28. [Diagram: one HUMAN, one COMPUTER, and several DISKs]
29. What Happens When Two HUMANs Need Access to the Same Resource?
wFourier, Flickr / CC BY 2.0
30. [Diagram: three HUMANs, one COMPUTER, and several DISKs]
31. [Diagram: many HUMANs and many DISKs around a single (COMPUTER)]
(actually more like this…)
32. [Diagram: three HUMANs and twelve COMPUTER nodes, each with a DISK]
34. [Diagram: cells labeled aa through dc alongside binary values, with an X marker; layout not recoverable]
35. [Diagram: one APP and twelve COMPUTER nodes, each with a DISK]
36. [Diagram: cluster of COMPUTER nodes, each with a DISK; exact layout not recoverable]
37. [Diagram: three VMs and twelve COMPUTER nodes, each with a DISK]
38. The Current State of Storage
How people store information today, and why it’s still not perfect yet
39. How Much Store Things All Human History!!
[Chart, labels as extracted: Ceph, Cloud computing, Distributed storage, Shared storage, Computers, Writing, Painting]
Time-scale: Roughly logarithmic. Content: Whatever the opposite of “scientific” is.
40. [Diagram: three HUMANs and twelve COMPUTER nodes, each with a DISK]
41. [Diagram: twelve COMPUTER nodes, each with a DISK]
42. [Diagram: twelve C D pairs, apparently abbreviating COMPUTER and DISK]
43. [Diagram: three HUMANs and twelve C D pairs]
50. Growing With Hardware Appliances
First PB
• Proprietary storage hardware
• Well-known storage vendor
$14 b’zillion
Second PB
• Proprietary storage hardware
• Same storage vendor
Another $14 b’zillion
56. [Diagram: grid of C D pairs with extra labels C, D, and C++; layout not recoverable]
57. [Diagram: grid of C D pairs with extra labels C, D, and C++, plus an X marker; layout not recoverable]
58. [Diagram: twelve C D pairs and one HUMAN [DEVELOPER] marked “!!”]
59. Give More Money To The Big Proprietary Vendors
It will make them very, very happy.
60. Storage Should Be Better
People need storage solutions that…
• …are open
• …are easy to manage
• …satisfy their requirements: performance, functional, financial
61. The Birth of a New Storage Solution
We think our roots are showing
69. Open Source is the Best Way to Spread Ideas
orchidgalore, Flickr / CC BY 2.0
70. philosophy / design
• philosophy: OPEN SOURCE, COMMUNITY-FOCUSED
71. All of Us Are Smarter Than Some of Us
rturk, LinkedIn InMap
72. philosophy / design
• philosophy: OPEN SOURCE, COMMUNITY-FOCUSED
• design: SCALABLE
73. Ceph is Built to Scale
[Chart, labels as extracted: Ceph, Too much for a room, Too much for a computer, Too much for a drive, Too much for a book, Too much for a cave]
Time-scale: Roughly logarithmic. Content: Whatever the opposite of “scientific” is.
74. philosophy / design
• philosophy: OPEN SOURCE, COMMUNITY-FOCUSED
• design: SCALABLE, NO SINGLE POINT OF FAILURE
87. [Diagram: five OSDs, each on a local file system (btrfs, xfs, or ext4) on its own DISK, plus three monitors (M)]
89. Monitors:
• Maintain cluster map
• Provide consensus for distributed decision-making
• Must have an odd number
• Do not serve stored objects to clients
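The odd-number guidance follows from majority voting. The sketch below is a minimal illustration, assuming the monitors need a strict majority to reach consensus; the function names are illustrative and are not Ceph APIs.

```python
def quorum_size(monitors: int) -> int:
    """Smallest strict majority among `monitors` voters."""
    if monitors < 1:
        raise ValueError("need at least one monitor")
    return monitors // 2 + 1


def tolerated_failures(monitors: int) -> int:
    """How many monitors can be down while a majority is still reachable."""
    return monitors - quorum_size(monitors)


if __name__ == "__main__":
    for n in range(1, 8):
        print(f"{n} monitors: quorum {quorum_size(n)}, "
              f"tolerates {tolerated_failures(n)} failure(s)")
```

Under this assumption, going from three monitors to four raises the quorum size without raising the number of failures the cluster survives, which is why odd counts are the usual choice.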
OSDs:
• One per disk (recommended)
• At least three in a cluster
• Serve stored objects to clients
• Intelligently peer to perform replication tasks
• Support object classes
92. LIBRADOS
• Provides direct access to RADOS for applications
• C, C++, Python, PHP, Java
• No HTTP overhead
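As a rough illustration of "direct access", the sketch below writes one object and reads it back through an I/O context. `InMemoryIoctx` is a hypothetical stand-in so the example runs without a cluster; it mimics only the `write_full` and `read` calls of the Python `rados` binding. The config path and pool name in the comment are assumptions.

```python
class InMemoryIoctx:
    """Hypothetical stand-in for a RADOS I/O context (no cluster needed)."""

    def __init__(self):
        self._objects = {}

    def write_full(self, key: str, data: bytes) -> None:
        # Replace the whole object with `data`.
        self._objects[key] = bytes(data)

    def read(self, key: str, length: int = 8192, offset: int = 0) -> bytes:
        # Return up to `length` bytes starting at `offset`.
        return self._objects[key][offset:offset + length]


def store_and_fetch(ioctx, name: str, payload: bytes) -> bytes:
    """Write one object and read it back through an ioctx-like handle."""
    ioctx.write_full(name, payload)
    return ioctx.read(name, length=len(payload))


# With a real cluster, the handle would instead come from the rados binding
# (config path and pool name below are assumptions):
#   import rados
#   cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
#   cluster.connect()
#   ioctx = cluster.open_ioctx("mypool")

if __name__ == "__main__":
    print(store_and_fetch(InMemoryIoctx(), "greeting", b"hello rados"))
```

The point of the design is that the application talks to the object store through a library call rather than an HTTP request.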
95. RADOS Gateway:
• REST-based interface to RADOS
• Supports buckets, accounting
• Compatible with S3 and Swift applications
100. RADOS Block Device:
• Storage of virtual disks in RADOS
• Allows decoupling of VMs and containers
• Live migration!
• Images are striped across the cluster
• Boot support in QEMU, KVM, and OpenStack Nova
• Mount support in the Linux kernel
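To make "striped across the cluster" concrete, here is a minimal sketch assuming the simplest layout: the image is cut into fixed-size chunks, and each chunk is stored as one RADOS object. The 4 MiB object size and the function names are assumptions for illustration; an actual image may use different settings.

```python
from typing import List, Tuple

OBJECT_SIZE = 4 * 1024 * 1024  # assumed object size: 4 MiB


def locate(offset: int, object_size: int = OBJECT_SIZE) -> Tuple[int, int]:
    """Map a byte offset in the image to (object index, offset inside that object)."""
    if offset < 0:
        raise ValueError("offset must be non-negative")
    return divmod(offset, object_size)


def objects_for_range(offset: int, length: int,
                      object_size: int = OBJECT_SIZE) -> List[int]:
    """List the object indexes touched by `length` bytes starting at `offset`."""
    if length <= 0:
        return []
    first, _ = locate(offset, object_size)
    last, _ = locate(offset + length - 1, object_size)
    return list(range(first, last + 1))


if __name__ == "__main__":
    # A 10 MiB read starting 1 MiB into the image spans several objects.
    print(objects_for_range(1 * 1024 * 1024, 10 * 1024 * 1024))
```

Because consecutive chunks are separate objects, a large read or write can be served by many storage nodes at once rather than by a single disk.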
103. Metadata Server
• Manages metadata for a POSIX-compliant shared filesystem
• Directory hierarchy
• File metadata (owner, timestamps, mode, etc.)
• Stores metadata in RADOS
• Does not serve file data to clients
• Only required for shared filesystem
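The split between metadata and file data can be pictured with a toy model. Everything here is hypothetical (the class names, the `data_key` field, and the plain dict standing in for the object store); it only illustrates that the client asks the metadata server about a path and then fetches the bytes from the store itself.

```python
from dataclasses import dataclass


@dataclass
class FileMetadata:
    owner: str
    mode: int
    mtime: float
    size: int
    data_key: str  # hypothetical key of the object holding the file's bytes


class ToyMetadataServer:
    """Holds the directory hierarchy and file metadata, never file contents."""

    def __init__(self):
        self._by_path = {}

    def create(self, path: str, meta: FileMetadata) -> None:
        self._by_path[path] = meta

    def lookup(self, path: str) -> FileMetadata:
        return self._by_path[path]


def read_file(mds: ToyMetadataServer, object_store: dict, path: str) -> bytes:
    """Look up metadata on the server, then read the bytes from the store directly."""
    meta = mds.lookup(path)
    return object_store[meta.data_key][:meta.size]


if __name__ == "__main__":
    store = {"obj-1": b"hello ceph fs"}
    mds = ToyMetadataServer()
    mds.create("/docs/readme", FileMetadata("alice", 0o644, 0.0, 13, "obj-1"))
    print(read_file(mds, store, "/docs/readme"))
```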
138. [Diagram: Ceph architecture as on slide 5, with maturity labels added: four components marked AWESOME and one marked NEARLY AWESOME]
145. What do we want from you??
• Try Ceph! Tell us what you think. Ask if you need help. Help others if you can!
• Are you a company? Consider dedicating dev resources to the project.