Ceph for Big Science
Dan van der Ster, CERN IT
Cephalocon APAC 2018
23 March 2018 | Beijing
27 km circumference
5
~30MHz interactions filtered to ~1kHz recorded collisions
6
~30MHz interactions filtered to ~1kHz recorded collisions
ATLAS Detector, 100m underground
Higgs Boson Candidate
300 petabytes storage, 230 000 CPU cores
Worldwide LHC Compute Grid
Worldwide LHC Compute Grid
Beijing IHEP
WLCG Centre
Ceph at CERN: Yesterday & Today
12
14
First production cluster built mid to late 2013
for OpenStack Cinder block storage.
3 PB, 48 × 24 × 3 TB drives, 200 journaling SSDs
Ceph dumpling v0.67 on Scientific Linux 6
We were very cautious: 4 replicas! (now 3)
History
• March 2013: 300TB proof of concept
• Dec 2013: 3PB in prod for RBD
• 2014-15: EC, radosstriper, radosfs
• 2016: 3PB to 6PB, no downtime
• 2017: 8 prod clusters
15
CERN Ceph Clusters Size Version
OpenStack Cinder/Glance Production 5.5PB jewel
Satellite data centre (1000km away) 0.4PB luminous
CephFS (HPC+Manila) Production 0.8PB luminous
Manila testing cluster 0.4PB luminous
Hyperconverged HPC 0.4PB luminous
CASTOR/XRootD Production 4.2PB luminous
CERN Tape Archive 0.8PB luminous
S3+SWIFT Production 0.9PB luminous
16
CephFS
17
CephFS: Filer Evolution
• Virtual NFS filers are stable and
perform well:
• nfsd, ZFS, zrep, OpenStack VMs,
Cinder/RBD
• We have ~60TB on ~30 servers
• High performance, but not scalable:
• Quota management tedious
• Labour-intensive to create new filers
• Can’t scale performance horizontally
18
CephFS: Filer Evolution
• OpenStack Manila (with CephFS) has most of the needed features:
• Multi-tenant with security isolation + quotas
• Easy self-service share provisioning (see the sketch after this slide)
• Scalable performance (add more MDSs or OSDs as needed)
• Successful testing with preproduction users since mid-2017.
• Single MDS was seen as a bottleneck. Luminous has stable multi-MDS.
• Manila + CephFS now in production:
• One user already asked for 2000 shares
• Also using for Kubernetes: we are working on a new CSI CephFS plugin
• Really need kernel quota support!
19
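For illustration, the self-service flow boils down to a few Manila CLI calls. A minimal sketch, assuming a CephFS share type named "cephfs" and a cephx client ID "alice" (names and paths are examples, not our actual configuration):

```bash
# Create a 100 GB CephFS share and grant a cephx client access
manila create CephFS 100 --name myshare --share-type cephfs
manila access-allow myshare cephx alice

# Find the exported path, then mount that subtree with the granted identity
manila share-export-location-list myshare
ceph-fuse -n client.alice -r /volumes/_nogroup/<share-id> /mnt/myshare
```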
Multi-MDS in Production
• ~20 tenants on our pre-prod
environment for several
months
• 2 active MDSs since luminous
• Enabled multi-MDS on our
production cluster on Jan 24
• Currently have 3 active MDSs
• default balancer plus directory pinning (see the sketch after this slide)
20
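Roughly, the knobs involved are the following (Luminous-era commands; the filesystem name "cephfs" and the pinned path are just examples):

```bash
# Allow multiple active MDSs, then raise the number of active ranks to 3
ceph fs set cephfs allow_multimds true
ceph fs set cephfs max_mds 3

# Pin one tenant's subtree to MDS rank 1; unpinned trees are left to the default balancer
setfattr -n ceph.dir.pin -v 1 /cephfs/volumes/tenant1
```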
HPC on CephFS?
• CERN is mostly a high
throughput computing lab:
• File-based parallelism
• Several smaller HPC use-cases exist within our lab:
• Beams, plasma, CFD,
QCD, ASICs
• Need full POSIX,
consistency, parallel IO
21
“Software Defined HPC”
• CERN’s approach is to build HPC clusters with commodity
parts: “Software Defined HPC”
• Compute side is solved with HTCondor & SLURM
• Typical HPC storage is not very attractive (missing expertise + budget)
• 200-300 HPC nodes accessing ~1PB CephFS since mid-2016:
• Manila + HPC use-cases on the same clusters. HPC is just another
user.
• Quite stable but not super high performance
22
IO-500
• Storage benchmark announced by John Bent on ceph-users ML
(from SuperComputing 2017)
• « goal is to improve parallel file systems by ensuring that sites
publish results of both "hero" and "anti-hero" runs and by
sharing the tuning and configuration »
• We have just started testing on our CephFS clusters:
• IOR throughput tests, mdtest + find metadata tests
• Easy/hard modes for shared/unique file tests (example invocations sketched after this slide)
23
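To give a flavour of the runs involved, a hedged sketch of the underlying IOR and mdtest invocations (not the exact IO-500 harness; process counts, sizes and paths are placeholders):

```bash
# "ior_easy"-style bandwidth test: one file per process, large sequential transfers
mpirun -np 64 ior -w -r -F -t 1m -b 16g -o /cephfs/io500/ior_easy/testfile

# "mdtest_easy"-style metadata test: create, stat and remove many small files per process
mpirun -np 64 mdtest -F -C -T -r -n 10000 -u -d /cephfs/io500/mdt_easy
```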
IO-500 First Look…No tuning!
Test Result
ior_easy_write 2.595 GB/s
ior_hard_write 0.761 GB/s
ior_easy_read 4.951 GB/s
ior_hard_read 0.944 GB/s
24
Test Result
mdtest_easy_write 1.774 kiops
mdtest_hard_write 1.512 kiops
find 50.00 kiops
mdtest_easy_stat 8.722 kiops
mdtest_hard_stat 7.076 kiops
mdtest_easy_delete 0.747 kiops
mdtest_hard_read 2.521 kiops
mdtest_hard_delete 1.351 kiops
Luminous v12.2.4 -- Tested March 2018
411 OSDs: 800TB SSDs, 2 per server
OSDs running on same HW as clients
2 active MDSs running on VMs
[SCORE] Bandwidth 1.74 GB/s : IOPS 3.47 kiops : TOTAL 2.46
RGW
26
S3 @ CERN
• Ceph luminous cluster with VM gateways. Single region.
• 4+2 erasure coding. Physics data for small objects, volunteer computing, some backups.
• Pre-signed URLs and object expiration working well (example after this slide).
• HAProxy is very useful:
• High-availability & mapping special buckets to dedicated gateways
27
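Both features can be exercised with the standard AWS CLI pointed at the RGW endpoint; a sketch only (endpoint, bucket and prefix below are placeholders):

```bash
# Generate a pre-signed download URL valid for one hour
aws s3 presign s3://mybucket/myobject --expires-in 3600 --endpoint-url https://rgw.example.org

# Expire objects under a prefix after 7 days using a bucket lifecycle rule
aws s3api put-bucket-lifecycle-configuration --bucket mybucket \
    --endpoint-url https://rgw.example.org \
    --lifecycle-configuration '{"Rules":[{"ID":"expire-tmp","Status":"Enabled",
      "Filter":{"Prefix":"tmp/"},"Expiration":{"Days":7}}]}'
```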
RBD
28
RBD: Ceph + OpenStack
29
Cinder Volume Types
Volume Type Size (TB) Count
standard 871 4,758
io1 440 608
cp1 97 118
cpio1 269 107
wig-cp1 26 19
wig-cpio1 106 13
io-test10k 20 1
Totals: 1,811 5,624
30
RBD @ CERN
• OpenStack Cinder + Glance use-cases continue to be highly
reliable:
• QoS via IOPS/BW throttles is essential (example after this slide).
• Spectre/Meltdown reboots updated all clients to luminous!
• Ongoing work:
• Recently finished an expanded rbd trash feature
• Just starting work on a persistent cache for librbd
• CERN Openlab collaboration with Rackspace!
• Writing a backup driver for glance (RBD to S3)
31
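The throttles are ordinary Cinder QoS specs attached to the volume types listed earlier; a hedged example (spec name, limit values and the "standard" type are illustrative):

```bash
# Front-end QoS spec: limits applied by QEMU/libvirt on the attached RBD device
openstack volume qos create std-iops --consumer front-end \
    --property read_iops_sec=1000 --property write_iops_sec=500 \
    --property read_bytes_sec=120000000 --property write_bytes_sec=120000000

# Attach the spec to an existing volume type
openstack volume qos associate std-iops standard
```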
Hyperconverged Ceph/Cloud
• Experimenting with co-located ceph-osd on hypervisors (HVs) and HPC nodes:
• New cluster with 384 SSDs on HPC nodes
• Minor issues related to server isolation:
• cgroups or NUMA pinning are options but not yet used (one possible approach sketched after this slide).
• Issues are related to our operations culture:
• We (Ceph team) don’t own the servers – need to co-operate with the
cloud/HPC teams.
• E.g. when is it OK to reboot a node? How do we drain a node? What are the software upgrade procedures?
32
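One way to implement such pinning would be to confine the co-located OSDs to a fixed set of cores via systemd; a sketch only, not something we run today, and the core list is arbitrary:

```bash
# Restrict every ceph-osd instance on this host to cores 0-7, leaving the rest for guests/jobs
mkdir -p /etc/systemd/system/ceph-osd@.service.d
cat > /etc/systemd/system/ceph-osd@.service.d/cpuaffinity.conf <<'EOF'
[Service]
CPUAffinity=0 1 2 3 4 5 6 7
EOF
systemctl daemon-reload && systemctl restart ceph-osd.target
```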
User Feedback:
From Jewel to Luminous
33
Jewel to Luminous Upgrade Notes
• In general, upgrades went well with no big problems.
• New/replaced OSDs are BlueStore (ceph-volume lvm; commands sketched after this slide)
• Preparing a FileStore conversion script for our infrastructure
• ceph-mgr balancer is very interesting:
• Actively testing the crush-compat mode
• Python module can be patched in place for quick fixes
34
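Concretely, for the BlueStore and balancer bullets above, the commands look roughly like this (the device name is an example):

```bash
# New or replaced OSDs are created directly as BlueStore with ceph-volume
ceph-volume lvm create --bluestore --data /dev/sdb

# Enable and try the ceph-mgr balancer in crush-compat mode
ceph mgr module enable balancer
ceph balancer mode crush-compat
ceph balancer on
```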
How to replace many OSDs?
35
Fully replaced 3PB of block storage
with 6PB new hardware over several
weeks, transparent to users.
cernceph/ceph-scripts
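The real automation lives in cernceph/ceph-scripts; the underlying pattern is roughly the following (a simplified sketch, not the production tooling; the OSD id and sleep interval are arbitrary):

```bash
# Gently drain an old OSD: lower its reweight in steps, letting backfill settle each time
for w in 0.75 0.5 0.25 0.0; do
    ceph osd reweight 123 $w
    while ceph health | grep -q HEALTH_WARN; do sleep 300; done
done

# Once empty, take it out and purge it, then bring in the replacement hardware
ceph osd out 123
ceph osd purge 123 --yes-i-really-mean-it
```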
Current Challenges
• RBD / OpenStack Cinder:
• Ops: how to identify active volumes?
• “rbd top”
• Performance: µs latencies and kHz IOPS.
• Need persistent SSD caches.
• On-the-wire encryption, client-side volume encryption
• OpenStack: volume type / availability zone coupling for
hyper-converged clusters
36
Current Challenges
• CephFS HPC:
• HPC: parallel MPI I/O and single-MDS metadata perf (IO-500!)
• Copying data across /cephfs: need "rsync --ceph"
• CephFS general use-case:
• Scaling to 10,000 (or 100,000!) clients:
• client throttles, tools to block/disconnect noisy users/clients.
• Need “ceph client top”
• native Kerberos (without NFS gateway), group accounting and quotas
• HA CIFS and NFS gateways for non-Linux clients
• How to back up a 10-billion-file CephFS?
• e.g. how about binary diff between snaps, similar to ZFS send/receive?
37
Current Challenges
• RADOS:
• How to phase in new features on old clusters
• e.g. we have 3PB of RBD data with hammer tunables
• Pool-level object backup (convert from replicated to EC, copy to non-Ceph)
• rados export the diff between two pool snapshots?
• Areas we cannot use Ceph yet:
• Storage for large enterprise databases (are we close?)
• Large scale batch processing
• Single filesystems spanning multiple sites
• HSM use-cases (CephFS with tape backend?, Glacier for S3?)
38
Future…
39
HEP Computing for the 2020s
• Run-2 (2015-18):
~50-80PB/year
• Run-3 (2020-23):
~150PB/year
• Run-4: ~600PB/year?!
“Data Lakes” – globally distributed, flexible placement, ubiquitous access
https://arxiv.org/abs/1712.06982
Ceph Bigbang Scale Testing
• Bigbang scale tests mutually benefit
CERN & Ceph project
• Bigbang I: 30PB, 7200 OSDs, Ceph
hammer. Several osdmap limitations
• Bigbang II: Similar size, Ceph jewel.
Scalability limited by OSD/MON
messaging. Motivated ceph-mgr
• Bigbang III: 65PB, 10800 OSDs
41
https://ceph.com/community/new-luminous-scalability/
Thanks…
42
Thanks to my CERN Colleagues
• Ceph team at CERN
• Hervé Rousseau, Teo Mouratidis, Roberto Valverde, Paul Musset, Julien Collet
• Massimo Lamanna / Alberto Pace (Storage Group Leadership)
• Andreas-Joachim Peters (Intel EC)
• Sebastien Ponce (radosstriper)
• OpenStack & Containers teams at CERN
• Tim Bell, Jan van Eldik, Arne Wiebalck (also co-initiator of Ceph at CERN),
Belmiro Moreira, Ricardo Rocha, Jose Castro Leon
• HPC team at CERN
• Nils Hoimyr, Carolina Lindqvist, Pablo Llopis
43
!