ceph – a unified distributed storage system


                    sage weil
           cloudopen – august 29, 2012
outline
●   why you should care
●   what is it, what it does
●   how it works
    ●   architecture
●   how you can use it
    ●   librados
    ●   radosgw
    ●   RBD
    ●   file system
●   who we are, why we do this
why should you care about another storage system?
requirements
●   diverse storage needs
    ●   object storage
    ●   block devices (for VMs) with snapshots, cloning
    ●   shared file system with POSIX, coherent caches
    ●   structured data... files, block devices, or objects?
●   scale
    ●   terabytes, petabytes, exabytes
    ●   heterogeneous hardware
    ●   reliability and fault tolerance
time
●   ease of administration
●   no manual data migration, load balancing
●   painless scaling
    ●   expansion and contraction
    ●   seamless migration
cost
●   linear function of size or performance
●   incremental expansion
    ●   no fork-lift upgrades
●   no vendor lock-in
    ●   choice of hardware
    ●   choice of software
●   open
what is ceph?
unified storage system
●   objects
    ●   native
    ●   RESTful
●   block
    ●   thin provisioning, snapshots, cloning
●   file
    ●   strong consistency, snapshots
●   LIBRADOS (used by APP): a library allowing apps to directly access RADOS, with support for C, C++, Java, Python, Ruby, and PHP
●   RADOSGW (used by APP): a bucket-based REST gateway, compatible with S3 and Swift
●   RBD (used by HOST/VM): a reliable and fully-distributed block device, with a Linux kernel client and a QEMU/KVM driver
●   CEPH FS (used by CLIENT): a POSIX-compliant distributed file system, with a Linux kernel client and support for FUSE
●   RADOS (the layer underneath all of the above): a reliable, autonomous, distributed object store comprised of self-healing, self-managing, intelligent storage nodes
open source
●   LGPLv2
    ●   copyleft
    ●   ok to link to proprietary code
●   no copyright assignment
    ●   no dual licensing
    ●   no “enterprise-only” feature set
●   active community
●   commercial support
distributed storage system
●   data center scale
    ●   10s to 10,000s of machines
    ●   terabytes to exabytes
●   fault tolerant
    ●   no single point of failure
    ●   commodity hardware
●   self-managing, self-healing
ceph object model
●   pools
    ●   1s to 100s
    ●   independent namespaces or object collections
    ●   replication level, placement policy
●   objects
    ●   bazillions
    ●   blob of data (bytes to gigabytes)
    ●   attributes (e.g., “version=12”; bytes to kilobytes)
    ●   key/value bundle (bytes to gigabytes)
why start with objects?
●   more useful than (disk) blocks
    ●   names in a single flat namespace
    ●   variable size
    ●   simple API with rich semantics
●   more scalable than files
    ●   no hard-to-distribute hierarchy
    ●   update semantics do not span objects
    ●   workload is trivially parallel
[diagram: one HUMAN, one COMPUTER, many DISKs]
[diagram: several HUMANs, one COMPUTER, many DISKs]
[diagram: a crowd of HUMANs, one (COMPUTER), many DISKs]
(actually more like this…)
[diagram: several HUMANs, many COMPUTERs, each COMPUTER with its own DISK]
OSD    OSD    OSD    OSD    OSD
FS     FS     FS     FS     FS     (btrfs, xfs, or ext4)
DISK   DISK   DISK   DISK   DISK

M      M      M
Monitors:
    •   Maintain cluster membership and state
    •   Provide consensus for distributed decision-making
    •   Small, odd number
    •   These do not serve stored objects to clients
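
One way to read the "small, odd number" advice, sketched in Python. This assumes the monitors reach agreement by strict majority vote; the slide itself only says they provide consensus. The helper names `quorum_size` and `tolerated_failures` are hypothetical.

```python
def quorum_size(num_monitors: int) -> int:
    """Smallest number of monitors that forms a strict majority."""
    if num_monitors < 1:
        raise ValueError("need at least one monitor")
    return num_monitors // 2 + 1


def tolerated_failures(num_monitors: int) -> int:
    """How many monitors can be lost while a majority is still reachable."""
    return num_monitors - quorum_size(num_monitors)


if __name__ == "__main__":
    # Print cluster size, quorum size, and tolerated failures side by side.
    for n in range(1, 8):
        print(n, quorum_size(n), tolerated_failures(n))
```

Under this assumption a fourth monitor adds no fault tolerance over three: both survive exactly one failure, which is why even counts buy nothing.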



Object Storage Daemons (OSDs):
    •   At least three in a cluster
    •   One per disk or RAID group
    •   Serve stored objects to clients
    •   Intelligently peer to perform replication tasks
[diagram: a HUMAN and three monitors (M)]
data distribution
●   all objects are replicated N times
●   objects are automatically placed, balanced, migrated
    in a dynamic cluster
●   must consider physical infrastructure
    ●   ceph-osds on hosts in racks in rows in data centers

●   three approaches
    ●   pick a spot; remember where you put it
    ●   pick a spot; write down where you put it
    ●   calculate where to put it, where to find it
CRUSH
•   Pseudo-random placement algorithm
•   Fast calculation, no lookup
•   Repeatable, deterministic
•   Ensures even distribution
•   Stable mapping
    •   Limited data migration
•   Rule-based configuration
    •   specifiable replication
    •   infrastructure topology aware
    •   allows weighting
[diagram: objects are mapped to placement groups by hash(object name) % num pg; placement groups are mapped to OSDs by CRUSH(pg, cluster state, policy)]
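
The two-step "calculate where to put it" idea can be sketched as follows. Step 1 is the formula from the slide. Step 2 is NOT the real CRUSH algorithm: rendezvous hashing is swapped in as a much simpler stand-in that is also deterministic, needs no lookup table, and moves little data when the OSD set changes, but ignores weights, topology, and rules. All function names here are hypothetical.

```python
from __future__ import annotations

import hashlib


def stable_hash(*parts: str) -> int:
    """Deterministic 64-bit hash (Python's built-in hash() is salted per process)."""
    digest = hashlib.sha256("/".join(parts).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def object_to_pg(object_name: str, num_pg: int) -> int:
    """Step 1 from the slide: hash(object name) % num pg."""
    return stable_hash(object_name) % num_pg


def pg_to_osds(pg: int, osds: list[str], replicas: int) -> list[str]:
    """Step 2, simplified: rank OSDs by a per-(pg, osd) hash and take the top N.

    This is rendezvous hashing, a stand-in for CRUSH, not CRUSH itself.
    """
    ranked = sorted(osds, key=lambda osd: stable_hash(str(pg), osd), reverse=True)
    return ranked[:replicas]


def locate(object_name: str, num_pg: int, osds: list[str], replicas: int) -> list[str]:
    """Any client can compute an object's location from its name and the OSD list."""
    return pg_to_osds(object_to_pg(object_name, num_pg), osds, replicas)


if __name__ == "__main__":
    osds = [f"osd.{i}" for i in range(6)]
    print(locate("my-object", num_pg=64, osds=osds, replicas=3))
```

Because the ranking depends only on the placement group and the OSD names, dropping one OSD changes the result only for placement groups that had a replica on it.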
RADOS
●   monitors publish osd map that describes cluster state
    ●   ceph-osd node status (up/down, weight, IP)
    ●   CRUSH function specifying desired data distribution
●   object storage daemons (OSDs)
    ●   safely replicate and store objects
    ●   migrate data as the cluster changes over time
    ●   coordinate based on shared view of reality
●   decentralized, distributed approach allows
    ●   massive scales (10,000s of servers or more)
    ●   the illusion of a single copy with consistent behavior
CLIENT   ??
[diagram: an APP linked with LIBRADOS talks to the cluster (monitors M) over the native protocol]
LIBRADOS
    • Provides direct access to RADOS for applications
    • C, C++, Python, PHP, Java
    • No HTTP overhead
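
A minimal sketch of what "direct access" looks like from Python. `store_and_fetch` only relies on four methods that the Python librados bindings provide on a pool handle (`write_full`, `set_xattr`, `read`, `get_xattr`). `InMemoryIoctx` is a hypothetical stand-in so the sketch runs without a cluster, and the config path and the pool name `data` are assumptions.

```python
def store_and_fetch(ioctx, name: str, data: bytes, version: bytes):
    """Write an object and one attribute, then read both back.

    `ioctx` is expected to behave like a pool handle from the Python
    librados bindings (write_full, set_xattr, read, get_xattr).
    """
    ioctx.write_full(name, data)               # replace the object's data
    ioctx.set_xattr(name, "version", version)  # attach a small attribute
    return ioctx.read(name), ioctx.get_xattr(name, "version")


class InMemoryIoctx:
    """Hypothetical stand-in for a pool handle; keeps everything in dicts."""

    def __init__(self):
        self._data = {}
        self._xattrs = {}

    def write_full(self, name, data):
        self._data[name] = bytes(data)

    def set_xattr(self, name, key, value):
        self._xattrs.setdefault(name, {})[key] = bytes(value)

    def read(self, name, length=8192, offset=0):
        return self._data[name][offset:offset + length]

    def get_xattr(self, name, key):
        return self._xattrs[name][key]


if __name__ == "__main__":
    try:
        import rados  # Python bindings for librados; may not be installed
    except ImportError:
        rados = None

    if rados is None:
        print(store_and_fetch(InMemoryIoctx(), "greeting", b"hello", b"12"))
    else:
        # Assumed config path and pool name; adjust for a real cluster.
        cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
        cluster.connect()
        ioctx = cluster.open_ioctx("data")
        try:
            print(store_and_fetch(ioctx, "greeting", b"hello", b"12"))
        finally:
            ioctx.close()
            cluster.shutdown()
```

Keeping the object operations in one function that accepts any pool-like handle makes the same code usable against a real cluster and in tests.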
[diagram: APPs speak REST to RADOSGW; each RADOSGW uses LIBRADOS to talk to the cluster (monitors M) over the native protocol]
RADOS Gateway:
• REST-based interface to RADOS
• Supports buckets, accounting
• Compatible with S3 and Swift applications
[diagram: one COMPUTER in front of a cluster of COMPUTER + DISK storage nodes]
[diagram: the same COMPUTER, now with a DISK of its own, in front of the cluster]
[diagram: several VMs in front of the cluster of COMPUTER + DISK storage nodes]
[diagram: a VM runs in a VIRTUALIZATION CONTAINER, which uses LIBRBD on top of LIBRADOS to reach the cluster (monitors M)]
[diagram: two CONTAINERs, each with LIBRBD on LIBRADOS, and a VM between them]
[diagram: a HOST uses KRBD (KERNEL MODULE) on top of LIBRADOS to reach the cluster (monitors M)]
RADOS Block Device:
• Storage of virtual disks in RADOS
• Decouples VMs and containers
  • Live migration!
• Images are striped across the cluster
• Snapshots!
• Support in
  • QEMU/KVM
  • OpenStack, CloudStack
  • Mainline Linux kernel
HOW DO YOU SPIN UP THOUSANDS OF VMs INSTANTLY AND EFFICIENTLY?
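instant copy
[diagram: a parent image (144) and four instant copies (0 each): 144 + 0 + 0 + 0 + 0 = 144]
write
[diagram: CLIENT writes land in one copy, which grows to 4: 144 + 4 = 148]
read
[diagram: CLIENT reads are served from the copy and its parent: 144 + 4 = 148]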
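
The instant copy / write / read sequence above can be sketched as a copy-on-write clone. This is an illustration, not RBD's implementation: the `Image` class is hypothetical, and reading the slide's 144 and 4 as object counts is an assumption. Writes here always replace a whole object, which sidesteps partial-object handling.

```python
class Image:
    """A block image stored as a sparse map of object index -> bytes."""

    def __init__(self, parent=None):
        self.parent = parent   # image this one was cloned from, if any
        self.objects = {}      # only the objects this image itself owns

    def clone(self):
        """Instant copy: the clone starts with no objects of its own."""
        return Image(parent=self)

    def write(self, index, data):
        """Writes always land in this image, never in the parent."""
        self.objects[index] = data

    def read(self, index):
        """Use our own object if present, otherwise fall through to the parent."""
        image = self
        while image is not None:
            if index in image.objects:
                return image.objects[index]
            image = image.parent
        return b""  # never written anywhere: reads as empty

    def stored(self):
        """Number of objects this image itself occupies."""
        return len(self.objects)


if __name__ == "__main__":
    parent = Image()
    for i in range(144):
        parent.write(i, b"base")

    clones = [parent.clone() for _ in range(4)]
    print(parent.stored() + sum(c.stored() for c in clones))  # 144 + 0 + 0 + 0 + 0

    for i in range(4):
        clones[0].write(i, b"new")
    print(parent.stored() + sum(c.stored() for c in clones))  # 144 + 4

    print(clones[0].read(0), clones[0].read(10), clones[1].read(0))
```

Creating a clone costs nothing up front, so total storage grows only with the data each VM actually changes.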
[diagram: a CLIENT sends metadata operations and data I/O to the cluster (monitors M) along separate paths]
Metadata Server
• Manages metadata for a POSIX-compliant shared filesystem
  • Directory hierarchy
  • File metadata (owner, timestamps, mode, etc.)
• Stores metadata in RADOS
• Does not serve file data to clients
• Only required for shared filesystem
one tree
three metadata servers
??
DYNAMIC SUBTREE PARTITIONING
recursive accounting
●   ceph-mds tracks recursive directory stats
    ●   file sizes
    ●   file and directory counts
    ●   modification time
●   virtual xattrs present full stats
●   efficient
        $ ls -alSh | head
        total 0
        drwxr-xr-x 1 root            root      9.7T 2011-02-04 15:51 .
        drwxr-xr-x 1 root            root      9.7T 2010-12-16 15:06 ..
        drwxr-xr-x 1 pomceph         pg4194980 9.6T 2011-02-24 08:25 pomceph
        drwxr-xr-x 1 mcg_test1       pg2419992  23G 2011-02-02 08:57 mcg_test1
        drwx--x--- 1 luko            adm        19G 2011-01-21 12:17 luko
        drwx--x--- 1 eest            adm        14G 2011-02-04 16:29 eest
        drwxr-xr-x 1 mcg_test2       pg2419992 3.0G 2011-02-02 09:34 mcg_test2
        drwx--x--- 1 fuzyceph        adm       1.5G 2011-01-18 10:46 fuzyceph
        drwxr-xr-x 1 dallasceph      pg275     596M 2011-01-14 10:06 dallasceph
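
Why a directory can report 9.7T without being walked: a sketch in which every directory keeps running totals for its whole subtree. The eager walk up to the root is a simplification for illustration, not a description of how ceph-mds propagates stats. The `Dir` class is hypothetical; the field names `rbytes` and `rfiles` are chosen to echo the recursive-stat virtual xattrs.

```python
class Dir:
    """A directory that keeps recursive totals for everything beneath it."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.rbytes = 0   # total size of all files below this directory
        self.rfiles = 0   # total number of files below this directory

    def mkdir(self, name):
        """Create a subdirectory linked back to this one."""
        return Dir(name, parent=self)

    def add_file(self, size):
        """Account for a new file here and in every ancestor."""
        node = self
        while node is not None:
            node.rbytes += size
            node.rfiles += 1
            node = node.parent


if __name__ == "__main__":
    root = Dir("/")
    home = root.mkdir("home")
    user = home.mkdir("user")
    user.add_file(100)
    home.add_file(20)
    # Reading the totals is a field lookup, no tree walk needed.
    print(root.rbytes, root.rfiles, user.rbytes, user.rfiles)
```

The cost is paid at update time, proportional to directory depth, so a size listing like the one above never has to scan the subtree.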
snapshots
●   volume or subvolume snapshots unusable at petabyte scale
    ●   snapshot arbitrary subdirectories
●   simple interface
    ●   hidden '.snap' directory
    ●   no special tools


        $ mkdir foo/.snap/one      # create snapshot
        $ ls foo/.snap
        one
        $ ls foo/bar/.snap
        _one_1099511627776         # parent's snap name is mangled
        $ rm foo/myfile
        $ ls -F foo
        bar/
        $ ls -F foo/.snap/one
        myfile bar/
        $ rmdir foo/.snap/one      # remove snapshot
multiple protocols, implementations
●   Linux kernel client
    ●   mount -t ceph 1.2.3.4:/ /mnt
    ●   export (NFS), Samba (CIFS)
●   ceph-fuse
●   libcephfs.so
    ●   your app
    ●   Samba (CIFS)
    ●   Ganesha (NFS)
    ●   Hadoop (map/reduce)
[diagram: NFS via Ganesha on libcephfs; SMB/CIFS via Samba on libcephfs; Hadoop and your app on libcephfs; ceph-fuse on fuse and the ceph client, both in the kernel]
●   LIBRADOS: AWESOME
●   RADOSGW: AWESOME
●   RBD: AWESOME
●   CEPH FS: NEARLY AWESOME
●   RADOS: AWESOME
why we do this
●   limited options for scalable open source storage
●   proprietary solutions
    ●   expensive
    ●   don't scale (well or out)
    ●   marry hardware and software


●   industry needs to change
who we are
●   Ceph created at UC Santa Cruz (2007)
●   supported by DreamHost (2008-2011)
●   Inktank (2012)
    ●   Los Angeles, Sunnyvale, San Francisco, remote
●   growing user and developer community
    ●   Linux distros, users, cloud stacks, SIs, OEMs


                       http://ceph.com/
thanks
BoF tonight @ 5:15




sage weil
sage@inktank.com     http://github.com/ceph
@liewegas            http://ceph.com/
why we like btrfs
●   pervasive checksumming
●   snapshots, copy-on-write
●   efficient metadata (xattrs)
●   inline data for small files
●   transparent compression
●   integrated volume management
    ●   software RAID, mirroring, error recovery
    ●   SSD-aware
●   online fsck
●   active development community

Unified Distributed Storage System Ceph Provides Object, Block and File Storage

  • 1. ceph – a unified distributed storage system sage weil cloudopen – august 29, 2012
  • 2. outline ● why you should care ● what is it, what it does ● how it works ● architecture ● how you can use it ● librados ● radosgw ● RBD ● file system ● who we are, why we do this
  • 3. why should you care about another storage system?
  • 4. requirements ● diverse storage needs ● object storage ● block devices (for VMs) with snapshots, cloning ● shared file system with POSIX, coherent caches ● structured data... files, block devices, or objects? ● scale ● terabytes, petabytes, exabytes ● heterogeneous hardware ● reliability and fault tolerance
  • 5. time ● ease of administration ● no manual data migration, load balancing ● painless scaling ● expansion and contraction ● seamless migration
  • 6. cost ● linear function of size or performance ● incremental expansion ● no fork-lift upgrades ● no vendor lock-in ● choice of hardware ● choice of software ● open
  • 8. unified storage system ● objects ● native ● RESTful ● block ● thin provisioning, snapshots, cloning ● file ● strong consistency, snapshots
  • 9. [architecture diagram] LIBRADOS (used by an APP): a library allowing apps to directly access RADOS, with support for C, C++, Java, Python, Ruby, and PHP ● RADOSGW (used by an APP): a bucket-based REST gateway, compatible with S3 and Swift ● RBD (used by a HOST/VM): a reliable and fully-distributed block device, with a Linux kernel client and a QEMU/KVM driver ● CEPH FS (used by a CLIENT): a POSIX-compliant distributed file system, with a Linux kernel client and support for FUSE ● RADOS (underneath all four): a reliable, autonomous, distributed object store comprised of self-healing, self-managing, intelligent storage nodes
  • 10. open source ● LGPLv2 ● copyleft ● ok to link to proprietary code ● no copyright assignment ● no dual licensing ● no “enterprise-only” feature set ● active community ● commercial support
  • 11. distributed storage system ● data center scale ● 10s to 10,000s of machines ● terabytes to exabytes ● fault tolerant ● no single point of failure ● commodity hardware ● self-managing, self-healing
  • 12. ceph object model ● pools ● 1s to 100s ● independent namespaces or object collections ● replication level, placement policy ● objects ● bazillions ● blob of data (bytes to gigabytes) ● attributes (e.g., “version=12”; bytes to kilobytes) ● key/value bundle (bytes to gigabytes)
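The object model on slide 12 can be pictured as a small data structure. The following is only an illustrative sketch: the class names `Pool` and `StoredObject`, the pool names, and the field names are hypothetical and are not part of any Ceph API.

```python
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    """One object: a blob of data plus two kinds of metadata."""
    data: bytes = b""                           # blob of data (bytes to gigabytes)
    xattrs: dict = field(default_factory=dict)  # small attributes, e.g. version=12
    kv: dict = field(default_factory=dict)      # key/value bundle


@dataclass
class Pool:
    """An independent, flat namespace with its own replication level."""
    replicas: int
    objects: dict = field(default_factory=dict)  # object name -> StoredObject


# Hypothetical pools; the names and replica counts are made up for illustration.
pools = {"images": Pool(replicas=3), "logs": Pool(replicas=2)}

# The same object name can exist in two pools without conflict.
pools["images"].objects["item-1"] = StoredObject(
    data=b"image bytes", xattrs={"version": b"12"}
)
pools["logs"].objects["item-1"] = StoredObject(data=b"log bytes")
```

The point of the sketch is that names live in a single flat namespace per pool, and replication level is a property of the pool rather than of the individual object.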
  • 13. why start with objects? ● more useful than (disk) blocks ● names in a single flat namespace ● variable size ● simple API with rich semantics ● more scalable than files ● no hard-to-distribute hierarchy ● update semantics do not span objects ● workload is trivially parallel
  • 14. [diagram: one HUMAN, one COMPUTER, several DISKs]
  • 15. [diagram: several HUMANs, one COMPUTER, several DISKs]
  • 16. [diagram: many HUMANs, one (COMPUTER), many DISKs] (actually more like this…)
  • 17. [diagram: a few HUMANs, many COMPUTERs, each COMPUTER with its own DISK]
  • 18. [diagram: five OSDs, each on a local file system (FS: btrfs, xfs, or ext4) on its own DISK; three monitors (M)]
  • 19. Monitors (M): • Maintain cluster membership and state • Provide consensus for distributed decision-making • Small, odd number • These do not serve stored objects to clients ● Object Storage Daemons (OSDs): • At least three in a cluster • One per disk or RAID group • Serve stored objects to clients • Intelligently peer to perform replication tasks
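One way to see why the slide asks for a small, odd number of monitors is to work through the arithmetic, assuming that a decision requires a strict majority of monitors (the slide itself says only "consensus"). The function names below are made up for this sketch.

```python
def majority(n_monitors: int) -> int:
    """Smallest number of monitors that forms a strict majority."""
    return n_monitors // 2 + 1


def tolerated_failures(n_monitors: int) -> int:
    """How many monitors can fail while a majority is still reachable."""
    return n_monitors - majority(n_monitors)


for n in range(1, 8):
    print(f"{n} monitors: majority={majority(n)}, "
          f"tolerated failures={tolerated_failures(n)}")
```

Under that assumption, going from 3 to 4 monitors does not buy any extra fault tolerance (both survive one failure), while going from 3 to 5 does.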
  • 20. [diagram: a HUMAN and three monitors (M)]
  • 21. data distribution ● all objects are replicated N times ● objects are automatically placed, balanced, migrated in a dynamic cluster ● must consider physical infrastructure ● ceph-osds on hosts in racks in rows in data centers ● three approaches ● pick a spot; remember where you put it ● pick a spot; write down where you put it ● calculate where to put it, where to find it
  • 22. CRUSH • Pseudo-random placement algorithm • Fast calculation, no lookup • Repeatable, deterministic • Ensures even distribution • Stable mapping • Limited data migration • Rule-based configuration • specifiable replication • infrastructure topology aware • allows weighting
  • 23. [diagram: objects are mapped to placement groups by hash(object name) % num pg; placement groups are mapped to storage nodes by CRUSH(pg, cluster state, policy)]
  • 24. [diagram: the same objects, placement groups, and storage nodes, shown without labels]
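The two-step mapping on slide 23 can be sketched in a few lines. This is not the CRUSH algorithm: the second step below uses plain rendezvous (highest-random-weight) hashing as a stand-in, and it ignores weights, failure domains, and placement rules. The function names, the use of SHA-256, and the OSD names are all assumptions made for illustration.

```python
import hashlib


def stable_hash(*parts) -> int:
    """Deterministic 64-bit hash of the given parts (stand-in hash function)."""
    text = "/".join(str(p) for p in parts)
    return int.from_bytes(hashlib.sha256(text.encode()).digest()[:8], "big")


def object_to_pg(name: str, num_pg: int) -> int:
    """Step 1: hash(object name) % num pg."""
    return stable_hash(name) % num_pg


def pg_to_osds(pg: int, osds: list, replicas: int) -> list:
    """Step 2 (stand-in for CRUSH): rank OSDs by a per-(pg, osd) score.

    Every client computes the same ranking from the same inputs,
    so no lookup table is needed.
    """
    ranked = sorted(osds, key=lambda osd: stable_hash(pg, osd), reverse=True)
    return ranked[:replicas]


def locate(name: str, num_pg: int, osds: list, replicas: int = 3):
    """Return (placement group, list of OSDs) for an object name."""
    pg = object_to_pg(name, num_pg)
    return pg, pg_to_osds(pg, osds, replicas)


osds = [f"osd.{i}" for i in range(6)]  # hypothetical cluster of six OSDs
print(locate("my-object", num_pg=64, osds=osds))
```

With this stand-in, adding an OSD changes a placement group's replica set only when the new OSD itself joins that set, which is a simplified version of the "stable mapping, limited data migration" property listed for CRUSH.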
  • 25. RADOS ● monitors (M) publish osd map that describes cluster state ● ceph-osd node status (up/down, weight, IP) ● CRUSH function specifying desired data distribution ● object storage daemons (OSDs) ● safely replicate and store object ● migrate data as the cluster changes over time ● coordinate based on shared view of reality ● decentralized, distributed approach allows ● massive scales (10,000s of servers or more) ● the illusion of a single copy with consistent behavior
  • 29. CLIENT ??
  • 30. [architecture diagram from slide 9, repeated with LIBRADOS highlighted]
  • 31. [diagram: an APP linked with LIBRADOS, speaking the native protocol to the cluster (three monitors, M)]
  • 32. LIBRADOS (L): • Provides direct access to RADOS for applications • C, C++, Python, PHP, Java • No HTTP overhead
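As a rough illustration of what "direct access to RADOS" looks like from Python, the sketch below writes an object with an attribute and reads both back. The helper `store_and_fetch` is hypothetical. The calls it makes (`write_full`, `set_xattr`, `read`, `get_xattr`) and the connection calls in `demo_against_cluster` are from the `rados` Python binding as I understand it; the config path, the pool name `data`, and the object name are assumptions, and the demo function needs a reachable cluster, so it is defined but not called here.

```python
def store_and_fetch(ioctx, name: str, data: bytes, version: bytes):
    """Write an object and a 'version' attribute, then read both back.

    `ioctx` is expected to behave like a rados I/O context for one pool.
    """
    ioctx.write_full(name, data)               # replace the whole object
    ioctx.set_xattr(name, "version", version)  # small per-object attribute
    return ioctx.read(name), ioctx.get_xattr(name, "version")


def demo_against_cluster(conffile="/etc/ceph/ceph.conf", pool="data"):
    """Run the helper against a real cluster (requires the rados module)."""
    import rados  # imported here so the helper above works without it

    cluster = rados.Rados(conffile=conffile)
    cluster.connect()
    try:
        ioctx = cluster.open_ioctx(pool)
        try:
            return store_and_fetch(ioctx, "greeting", b"hello", b"12")
        finally:
            ioctx.close()
    finally:
        cluster.shutdown()
```

Note that `read` in the binding takes a length with a modest default, so larger objects would need an explicit length or a loop; the sketch only handles small objects.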
  • 33. [architecture diagram from slide 9, repeated with RADOSGW highlighted]
  • 34. [diagram: two APPs speak REST to two RADOSGW instances, each using LIBRADOS to speak the native protocol to the cluster (three monitors, M)]
  • 35. RADOS Gateway: • REST-based interface to RADOS • Supports buckets, accounting • Compatible with S3 and Swift applications
  • 36. APP APP APP APP HOST/VM HOST/VM CLIENT CLIENT RADOSGW RADOSGW RBD CEPH FS CEPH FS LIBRADOS LIBRADOS A bucket-based A bucket-based A reliable and fully- A POSIX-compliant A POSIX-compliant A library allowing A library allowing REST gateway, REST gateway, distributed block distributed file distributed file apps to directly apps to directly compatible with S3 compatible with S3 device, with a Linux system, with aa system, with access RADOS, access RADOS, and Swift and Swift kernel client and a Linux kernel client Linux kernel client with support for with support for QEMU/KVM driver and support for and support for C, C++, Java, C, C++, Java, FUSE FUSE Python, Ruby, Python, Ruby, and PHP and PHP RADOS RADOS A reliable, autonomous, distributed object store comprised of self-healing, self-managing, A reliable, autonomous, distributed object store comprised of self-healing, self-managing, intelligent storage nodes intelligent storage nodes
  • 37. [Diagram: a cluster of COMPUTER nodes, each with a DISK.]
  • 38. [Diagram: the same cluster of COMPUTER/DISK nodes, with VMs placed among them.]
  • 39. [Diagram: a VM in a VIRTUALIZATION CONTAINER, with LIBRBD on LIBRADOS, above the cluster (M).]
  • 40. [Diagram: a VM between two CONTAINERs, each with LIBRBD on LIBRADOS, above the cluster (M).]
  • 41. [Diagram: a HOST with KRBD (KERNEL MODULE) and LIBRADOS, above the cluster (M).]
  • 42. RADOS Block Device: • Storage of virtual disks in RADOS • Decouples VMs and containers • Live migration! • Images are striped across the cluster • Snapshots! • Support in: QEMU/KVM; OpenStack, CloudStack; mainline Linux kernel
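Striping an image across the cluster can be pictured as cutting the virtual disk into fixed-size objects, each of which can then be placed independently. The following is a minimal sketch of that mapping, not the actual RBD format: the 4 MiB object size is an assumption, and the function names and the object naming scheme are hypothetical.

```python
OBJECT_SIZE = 4 * 1024 * 1024  # assumed object size in bytes


def locate(image_name, offset, object_size=OBJECT_SIZE):
    """Map a byte offset in a virtual disk to (object name, offset inside it)."""
    if offset < 0:
        raise ValueError("offset must be non-negative")
    index, inner = divmod(offset, object_size)
    # Hypothetical naming scheme: image name plus a zero-padded hex index.
    return f"{image_name}.{index:016x}", inner


def objects_for_range(image_name, offset, length, object_size=OBJECT_SIZE):
    """List the object names touched by a read or write of `length` bytes."""
    if length <= 0:
        return []
    first = offset // object_size
    last = (offset + length - 1) // object_size
    return [f"{image_name}.{i:016x}" for i in range(first, last + 1)]
```

A request that crosses an object boundary touches two objects, which may live on different storage nodes; that is what lets I/O for a single image spread across the cluster.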
  • 43. HOW DO YOU SPIN UP THOUSANDS OF VMs INSTANTLY AND EFFICIENTLY?
  • 44. instant copy: 144 + 0 + 0 + 0 + 0 = 144
  • 45. write: CLIENT writes; 144 + 4 = 148
  • 46. read: CLIENT reads; 144 + 4 = 148
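The arithmetic on slides 44 to 46 can be reproduced with a toy copy-on-write model. This is a sketch under simplifying assumptions, not RBD's implementation: the `Image` class and `stored_objects` helper are hypothetical, writes replace whole objects, and the parent is treated as read-only once cloned.

```python
class Image:
    """Toy copy-on-write image: a dict of object index -> bytes."""

    def __init__(self, parent=None):
        self.parent = parent      # image this clone was copied from, if any
        self.objects = {}         # only objects written to *this* image

    def clone(self):
        # An instant copy stores nothing; it only remembers its parent.
        return Image(parent=self)

    def write(self, index, data):
        # Writes land in the clone; the parent is never modified.
        self.objects[index] = data

    def read(self, index):
        # Reads fall through to the parent when the clone has no copy.
        image = self
        while image is not None:
            if index in image.objects:
                return image.objects[index]
            image = image.parent
        return None  # never written anywhere


def stored_objects(*images):
    """Total objects physically stored across the given images."""
    return sum(len(image.objects) for image in images)


base = Image()
for i in range(144):
    base.write(i, b"base")
clones = [base.clone() for _ in range(4)]
print(stored_objects(base, *clones))   # 144

for i in range(4):
    clones[0].write(i, b"new")
print(stored_objects(base, *clones))   # 148
```

Four clones of a 144-object image cost nothing until a client writes; four written objects bring the total to 148, and reads of untouched objects are served from the parent.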
  • 47. [Architecture diagram, as on slide 33: LIBRADOS, RADOSGW, RBD, and CEPH FS on top of RADOS.]
  • 48. [Diagram: a CLIENT with separate metadata and data paths to the cluster (M).]
  • 49. [Diagram: the cluster (M).]
  • 50. Metadata Server • Manages metadata for a POSIX-compliant shared filesystem • Directory hierarchy • File metadata (owner, timestamps, mode, etc.) • Stores metadata in RADOS • Does not serve file data to clients • Only required for shared filesystem
  • 57. recursive accounting ● ceph-mds tracks recursive directory stats ● file sizes ● file and directory counts ● modification time ● virtual xattrs present full stats ● efficient
        $ ls -alSh | head
        total 0
        drwxr-xr-x 1 root            root      9.7T 2011-02-04 15:51 .
        drwxr-xr-x 1 root            root      9.7T 2010-12-16 15:06 ..
        drwxr-xr-x 1 pomceph         pg4194980 9.6T 2011-02-24 08:25 pomceph
        drwxr-xr-x 1 mcg_test1       pg2419992  23G 2011-02-02 08:57 mcg_test1
        drwx--x--- 1 luko            adm        19G 2011-01-21 12:17 luko
        drwx--x--- 1 eest            adm        14G 2011-02-04 16:29 eest
        drwxr-xr-x 1 mcg_test2       pg2419992 3.0G 2011-02-02 09:34 mcg_test2
        drwx--x--- 1 fuzyceph        adm       1.5G 2011-01-18 10:46 fuzyceph
        drwxr-xr-x 1 dallasceph      pg275     596M 2011-01-14 10:06 dallasceph
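One way to see why recursive stats can be cheap to read is to update every ancestor when a file changes, so a directory's totals are available without walking the tree. The sketch below is a toy model of that idea only; the `Dir` class and its `rbytes` and `rfiles` fields are names chosen for illustration, and nothing here describes how ceph-mds actually maintains its stats.

```python
class Dir:
    """Toy directory that keeps recursive byte and file counts up to date."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.rbytes = 0   # bytes in all files at or below this directory
        self.rfiles = 0   # number of files at or below this directory

    def mkdir(self, name):
        return Dir(name, parent=self)

    def add_file(self, size):
        # Propagate the change to every ancestor at update time, so that
        # reading a directory's totals later needs no tree walk.
        node = self
        while node is not None:
            node.rbytes += size
            node.rfiles += 1
            node = node.parent


root = Dir("/")
a = root.mkdir("a")
b = a.mkdir("b")
b.add_file(100)
a.add_file(20)
print(root.rbytes, root.rfiles)  # 120 2
print(b.rbytes, b.rfiles)        # 100 1
```

With totals kept per directory, a listing like the `ls -alSh` output above can show the size of everything under each entry directly.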
  • 58. snapshots ● volume or subvolume snapshots unusable at petabyte scale ● snapshot arbitrary subdirectories ● simple interface ● hidden '.snap' directory ● no special tools
        $ mkdir foo/.snap/one     # create snapshot
        $ ls foo/.snap
        one
        $ ls foo/bar/.snap
        _one_1099511627776        # parent's snap name is mangled
        $ rm foo/myfile
        $ ls -F foo
        bar/
        $ ls -F foo/.snap/one
        myfile bar/
        $ rmdir foo/.snap/one     # remove snapshot
  • 59. multiple protocols, implementations ● Linux kernel client ● mount -t ceph 1.2.3.4:/ /mnt ● export (NFS), Samba (CIFS) ● ceph-fuse ● libcephfs.so ● your app ● Samba (CIFS) ● Ganesha (NFS) ● Hadoop (map/reduce) [Diagram labels: NFS, SMB/CIFS, Ganesha, Samba, Hadoop, your app, libcephfs, ceph-fuse, fuse, kernel]
  • 60. [Architecture diagram, as on slide 33: LIBRADOS, RADOSGW, RBD, and CEPH FS on top of RADOS, now with status labels on the components: AWESOME, AWESOME, NEARLY AWESOME, AWESOME; RADOS: AWESOME.]
  • 61. why we do this ● limited options for scalable open source storage ● proprietary solutions ● expensive ● don't scale (well or out) ● marry hardware and software ● industry needs to change
  • 62. who we are ● Ceph created at UC Santa Cruz (2007) ● supported by DreamHost (2008-2011) ● Inktank (2012) ● Los Angeles, Sunnyvale, San Francisco, remote ● growing user and developer community ● Linux distros, users, cloud stacks, SIs, OEMs http://ceph.com/
  • 63. thanks BoF tonight @ 5:15 sage weil sage@inktank.com http://github.com/ceph @liewegas http://ceph.com/
  • 65. why we like btrfs ● pervasive checksumming ● snapshots, copy-on-write ● efficient metadata (xattrs) ● inline data for small files ● transparent compression ● integrated volume management ● software RAID, mirroring, error recovery ● SSD-aware ● online fsck ● active development community