CERN, the European Laboratory for Particle Physics, provides infrastructure and resources to thousands of scientists around the world to uncover the mysteries of the Universe. In its quest to build a private Cloud Infrastructure to support its users, CERN began evaluating the OpenStack project early on, building several prototypes and engaging with the community. In 2013, CERN released its production Cloud Infrastructure based on OpenStack. Since then, we have moved from a few hundred cores to a multi-cell deployment spread across different regions. After 7 years of deploying and managing OpenStack in production at large scale, we now look back and discuss the challenges of building a massive-scale infrastructure, growing from 0 to more than 300K cores. In this talk we will dive into the history, architecture, tools and technical decisions behind the CERN Cloud Infrastructure over the years.
27. Nova - Cells
- Allows Nova to scale to thousands of compute nodes
- Biggest Nova Cells deployment
- Moved from 2 cells to more than 80 cells
- Upgraded from Cells v1 to Cells v2 in 2018
28. Ceilometer - The Rise & Fall
- OpenStack Ceilometer deployed
- Removed after running it for 3 years: it was not scalable and data was difficult to retrieve
29. Storage - Cinder, Manila, S3
- OpenStack Cinder with Ceph backend (2014)
  - Several volume types available
- OpenStack Manila (file share service), backed by CephFS (2017)
- S3 available (end of 2018)
30. Container Orchestration - Magnum
- OpenStack Magnum service available since 2016
- Extremely popular service, with more than 500 clusters
32. Baremetal Provisioning - Ironic
- In production since 2018
- All new hardware is enrolled using Ironic. More than 5000 nodes are managed by Ironic
- Existing hardware will be enrolled into Ironic during 2020
35. Operations
- Experience growing and managing the infrastructure during the last 7 years
- Several upgrades during this journey
  - OpenStack release cycle is every 6 months!
  - SLC6 to CC7 upgrade
  - CC7 upgrades
  - CC7 to C8 upgrade?
- Supported KVM and Hyper-V in the same infrastructure for a few years
  - Migrated CVI VMs to OpenStack Hyper-V and then to OpenStack KVM
- Security updates required rebooting the entire cloud
- Most user management operations are automated
  - Project creation, quotas, ...
37. Preemptible Instances
- Public Clouds
  - Based on different pricing/SLA considering resource availability
  - Reserved instances vs spot-market
- Private Clouds
  - Quotas are hard limits, which leads to a reduction in resource utilization
- Preemptible instances
  - Projects that exhausted their quota can continue to create instances
  - Opportunistic workloads
  - Low SLA
- Preemptible Instances Workflow in OpenStack Nova
  - The creation of a non-preemptible VM fails because there are no available resources
  - Instances that fail with "No Valid Host" go to the "PENDING" state instead of "ERROR"
  - The Reaper service is notified and tries to free the requested resources
    - Rebuild the instance
    - Or change the instance state to "ERROR"
https://techblog.web.cern.ch/techblog/post/preemptible-instances/
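The workflow on this slide can be illustrated with a small, self-contained simulation. This is only a sketch under stated assumptions, not the CERN or Nova implementation: the `Cloud`, `Instance`, `reap` and `create` names are hypothetical, capacity is reduced to a single core count, the Reaper picks victims in list order, and a preemptible request that cannot be placed is assumed to go straight to "ERROR" (the slide does not say what happens in that case).

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(eq=False)  # compare instances by identity, not by field values
class Instance:
    name: str
    cores: int
    preemptible: bool = False
    state: str = "BUILDING"


@dataclass
class Cloud:
    """Toy model of a cloud: one pool of cores, no hosts or cells."""
    total_cores: int
    instances: List[Instance] = field(default_factory=list)

    def free_cores(self) -> int:
        used = sum(i.cores for i in self.instances if i.state == "ACTIVE")
        return self.total_cores - used

    def schedule(self, inst: Instance) -> bool:
        """Place the instance if enough cores are free."""
        if inst.cores > self.free_cores():
            return False  # stands in for a "No Valid Host" failure
        inst.state = "ACTIVE"
        self.instances.append(inst)
        return True


def reap(cloud: Cloud, needed_cores: int) -> bool:
    """Hypothetical Reaper: delete preemptible instances until
    needed_cores are free. Deletes nothing if that is impossible."""
    victims = []
    reclaimable = cloud.free_cores()
    for inst in cloud.instances:
        if reclaimable >= needed_cores:
            break
        if inst.preemptible and inst.state == "ACTIVE":
            victims.append(inst)
            reclaimable += inst.cores
    if reclaimable < needed_cores:
        return False
    for victim in victims:
        cloud.instances.remove(victim)
        victim.state = "DELETED"
    return True


def create(cloud: Cloud, inst: Instance) -> str:
    """Create an instance and return its final state."""
    if cloud.schedule(inst):
        return inst.state
    if inst.preemptible:
        inst.state = "ERROR"  # assumption: no reaping for preemptible requests
        return inst.state
    inst.state = "PENDING"  # instead of failing immediately with ERROR
    if reap(cloud, inst.cores) and cloud.schedule(inst):
        return inst.state  # resources were freed and the instance was rebuilt
    inst.state = "ERROR"
    return inst.state


if __name__ == "__main__":
    cloud = Cloud(total_cores=8)
    for request in (
        Instance("batch-1", 4, preemptible=True),
        Instance("batch-2", 4, preemptible=True),
        Instance("service-1", 4),
        Instance("service-2", 8),
    ):
        print(request.name, create(cloud, request))
```

In this toy run the two preemptible instances fill the cloud, the first non-preemptible request displaces one of them, and the second non-preemptible request ends in "ERROR" because reclaiming every preemptible core would still not be enough.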
41. Challenges
- Leveraging Container Orchestration to deploy the OpenStack control plane
- Re-enroll existing physical resources into OpenStack Ironic
- Introduction of GPU resources
- Move all resources from nova-network to Neutron
- Exploring how to provide ML platforms and Functions as a Service to our users
43. Summary
- During the last 10 years, the resource management and deployment model changed completely
  - From Virtualization and Server consolidation to a Cloud Infrastructure
  - From Baremetal to VMs, to managed Baremetal to Containers
- Continue to adapt the infrastructure to new technologies and requirements
  - Iterative approach to introduce new services, new functionality
- Continue to explore new approaches to deploy/manage a large infrastructure
  - Control Plane managed by Kubernetes
  - New regions
  - Preemptible instances
44. Hall of Fame
Stefano Zilli
Wataru Takase
Thomas Hartland
Mihai Patrascoiu
Belmiro Moreira
Mateusz Kowalski
Thomas Oulevey
Arne Wiebalck
Jan van Eldik
Jose Castro
Spyridon Trigazis
Daniel Abad
Luis Pigueiras
Vitor Araujo
Luis Fernandez Alvarez
Daniel Fernandez Rodriguez
Gary McGilvary
Marek Denis
Andrea Giardini
Bruno Bompastor
Joe Harrison
Thodoris Tsioutias
Clenimar Filemon
Markus Sommer
Vipin Rathi
Ran Du
Cris Cordeiro
Luca Tartarini
Dinika Saxena
Shweta Oak
Jakub
Pavel
Antonio Marino
Marcos Fermin Lobo
Davide Michelino
Parin Pocheba
Nitin Aggarwal
Sean Crosby
Ignacio Dominguez Martinez-Casanueva
Monika Talach
Mathieu Velten
Bertrand Noel
Konstantinos Samaras-Tsakiris
Surya Seetharaman
Robert Vasek
Ricardo Brito da Rocha
Costin Gament
Domenico Giordano
Iago Santos Pardo
Victor Araujo
Chirag Arora
Cas van der Laan
Zygimantas Matonis
Patrycja Gorniak
Elizaveta Svitanko
Venkata Ravicharan Nudurupati
Fedor Kitashov
Juan Dupuis
Serena Ziviani
Diogo Guerra
Evangelia Santorinaiou
Henni Mohamed
Roberto Soares
Theodoros Tsioutsias
Dheeraj Gupta
Vineet Menon
Lalit Dagre
Pranav Gaur