This document summarizes CERN's migration from CellsV1 to CellsV2 when upgrading their OpenStack deployment from Ocata to Queens. Key aspects included setting up local placement APIs in each cell initially, then consolidating to a central placement service. Challenges included scheduling limitations in CellsV2 and increased database load after the upgrade. Overall the migration was successful and CERN's cloud is now running Nova Queens with CellsV2 at large scale.
Moving from CellsV1 to CellsV2 at CERN
1.
2. Moving from
CellsV1 to CellsV2
at CERN
OpenStack Summit - Vancouver 2018
Belmiro Moreira
belmiro.moreira@cern.ch @belmiromoreira
3.
4.
5. CERN - Cloud resources status board - 11/05/2018@09:23
6. Cells at CERN
● CERN has used cells since 2013
● Why cells?
○ Single endpoint. Scale transparently between different Data Centres
○ Availability and Resilience
○ Isolate failure domains
○ Dedicate cells to projects
○ Hardware type per cell
○ Easy to introduce new configurations
7. Cells at CERN
● Disadvantages
○ Unmaintained upstream
○ Only a few deployments using Cells
○ Several features missing
■ Flavor propagation
■ Aggregates
■ Server groups
■ Security groups
■ ...
8. CellsV1 architecture at CERN
Nova API Servers
CellA compute nodes
CellA controller
CellB compute nodes
CellB controller
TOP Cell controllers
nova DB
nova DB
nova DB
RabbitMQ
RabbitMQ
RabbitMQ
9. CellsV1 architecture at CERN (Newton)
compute node
nova-compute
child cell controller
nova-api
nova-conductor
nova-scheduler
nova-network
nova-cells
RabbitMQ
top cell controller
nova-novncproxy
nova-consoleauth
nova-cells
nova DB
API nodes
nova-api
nova DB
x10
RabbitMQ cluster
x4
x200
x1
x70
nova_api DB
11. Before Ocata Upgrade
● Enable Placement
○ Introduced in the Newton release
○ Required in Ocata
○ nova-scheduler runs per cell in cellsV1
● How to deploy Placement with cellsV1 in a large production environment?
○ Placement retrieves the allocation candidates for the scheduler
○ Placement is not cell aware
○ Global vs Local (in the Cell)
■ Global: scheduler gets all allocation candidates available in the cloud
■ Local: scheduler gets only the allocation candidates available in the cell
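The global-vs-local distinction can be sketched with a toy model (this is not Nova code; the provider data is invented): a central placement returns candidates from every cell, while a per-cell placement only sees the providers registered in its own database.

```python
# Simplified sketch of global vs local allocation-candidate retrieval
# when placement is not cell aware. Provider records are invented.

def global_candidates(providers):
    """A central placement returns every eligible resource provider."""
    return list(providers)

def local_candidates(providers, cell):
    """A per-cell placement only knows providers in its own DB."""
    return [p for p in providers if p["cell"] == cell]

providers = [
    {"name": "compute-001", "cell": "cellA"},
    {"name": "compute-002", "cell": "cellA"},
    {"name": "compute-003", "cell": "cellB"},
]

assert len(global_candidates(providers)) == 3
assert [p["name"] for p in local_candidates(providers, "cellB")] == ["compute-003"]
```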
12. Setup Placement per cell
● Create a region per cell
● Create a placement endpoint per region
● Configure a “nova_api” DB per cell
● Run a placement service per cell in each cell controller
● Configure the compute nodes of the cell to use the cell placement
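As an illustration, a compute node in a cell could be pointed at its cell-local placement endpoint through the Ocata-era `[placement]` options; the region name and credentials below are assumptions, not CERN's actual values:

```ini
# Hypothetical compute-node nova.conf fragment for the per-cell setup.
# "cellA" as region name, the Keystone URL and credentials are examples.
[placement]
os_region_name = cellA
auth_type = password
auth_url = https://keystone.example.org:5000/v3
project_name = services
username = placement
password = ***
```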
13. CellsV1 architecture with local placement
compute node
nova-compute
child cell controller
nova-api
nova-conductor
nova-scheduler
nova-network
nova-cells
RabbitMQ
top cell controller
nova-novncproxy
nova-consoleauth
nova-cells
nova DB
API nodes
nova-api
nova DB
RabbitMQ cluster
placement-api
nova_api DB
nova_api DB
14. Enable placement per cell
● Issues
○ “build_requests” not deleted in the top “nova_api”
○ https://review.openstack.org/#/c/523187/
● Keystone needs to scale accordingly
Keystone - number of requests when enabling placement
15. Upgrade to Ocata
● Data migrations
○ flavors, keypairs, aggregates moved to nova_api (Top cell DB)
○ migrate_instance_keypairs required to run in cells DBs
■ However keypairs only exist in the Top cell DB
■ https://bugs.launchpad.net/nova/+bug/1761197
■ Migration tool that populates the cells’ “instance_extra” table from the “nova_api” DB
○ No data migrations required in cells DBs
○ “db sync” in child cells fails because there are flavors not moved to nova_api (local)
● DB schema
○ migration 346 can take a lot of time (removes the 'scheduled_at' column from the instances table)
■ consider archiving and then truncating the shadow tables
○ “api_db sync” fails if cells are not defined, even if running cellsV1
16. Upgrade to Ocata
● Add cells mapping in all “nova_api” DBs
â—‹ cell0 (will not be used) and Top cell
â—‹ Other cells mapping are not required
● “use_local” removed in Ocata
â—‹ Changed nova-network to continue to support it!
â—Ź Inventory min_unit, max_unit and step_size constraints are enforced in Ocata
â—‹ https://bugs.launchpad.net/nova/+bug/1638681
â—‹ Problematic if not all compute nodes are upgraded to Ocata
17. Consolidate Placement
compute node
nova-compute
child cell controller
nova-api
nova-conductor
nova-scheduler
nova-network
nova-cells
RabbitMQ
top cell controller
nova-novncproxy
nova-consoleauth
nova-cells
nova DB
API nodes
nova-api
nova DB
RabbitMQ cluster
nova_api DB
top cell controller
placement-api
nova_api DB
placement-api
18. Consolidate Placement
● Change endpoints to “central” placement
○ “placement_region” and “nova_api”
â—‹ Applied per cell (few cells per day)
â– Need to learning how to scale placement-api
â—‹ Scheduling time expected to go up
19. Consolidate Placement
● Local placement disabled in all cells
○ Moved the last 15 cells to “central” placement
○ Scheduling time increased
○ Placement request time also increased
20. Consolidate Placement
● Fell apart during the night...
● Memcached reached the “max_connections”
○ Increased “max_connections”
○ Increased the number of “placement-api” servers
21. Consolidate Placement
● Moved 70 local Placements to the central Placement
○ Didn’t copy the data from the local nova_api DBs
○ resource_providers, inventory and allocations are recreated
● Running on Apache WSGI
● 10 servers (VMs 4 vcpus/8 GiB)
○ 4 processes/20 threads
○ Increased the number of connections on the “nova_api” DB
● ~1000 compute nodes per placement-api server
● memcached cluster for keystone auth_token
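The WSGI deployment above might correspond to an Apache mod_wsgi fragment like the following sketch; the process/thread counts match the numbers in the slide, while the script path and user are examples rather than CERN's actual configuration:

```apache
# Hypothetical mod_wsgi fragment: 4 processes / 20 threads per
# placement-api server, as in the deployment described above.
WSGIDaemonProcess placement-api processes=4 threads=20 user=placement
WSGIScriptAlias /placement /var/www/cgi-bin/nova/nova-placement-api
```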
22. Scheduling time after Consolidate Placement
● Scheduling time in a few cells was better than expected
● The Ocata scheduler only uses Placement after all compute nodes are upgraded
if service_version < 16:
    LOG.debug("Skipping call to placement, as upgrade in progress.")
23. CellsV2 in Queens
● Advantages
○ Finally using the “loved” code
○ Can remove all internal cellsV1 patches
● Concerns
○ Is someone else running cellsV2 with more than one cell?
○ Scheduling limitations
○ Availability/Resilience issues
24. Scheduling
● How to dedicate cells to projects?
○ No cell_filters equivalent in cellsV2
● Scheduler is global
○ Scheduler doesn't know about cells
○ Placement doesn’t know about cells
○ Scheduler needs to receive all available allocation candidates from placement
■ https://review.openstack.org/#/c/531517/ (scheduler/max_placement_results)
○ Availability zone selection is a scheduler filter
● Can’t enable/disable scheduler filters per cell
● Can’t enable/disable a cell
○ https://review.openstack.org/#/c/546684/
25. Scheduling
● Placement request-filter
○ https://review.openstack.org/#/c/544585/
● Initial work already done for Rocky
● CERN backported it to Queens
● Created our own filters
○ AVZ support
○ project-cell mapping
○ flavor-cell mapping
● A few commits you may want to consider backporting to Queens
○ https://review.openstack.org/#/q/project:openstack/nova+branch:master+topic:bp/placement-req-filter
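A project-cell mapping filter of the kind listed above can be sketched as follows. This is a simplified stand-in for the real request-filter API: the mapping table, field names and aggregate UUIDs are invented, but the idea is the same, and the filter narrows the placement query to the aggregate of the project's cell.

```python
import uuid

# Hypothetical project -> cell-aggregate mapping; in a real deployment
# this would come from the cell aggregates' metadata.
CELL_A_AGG = str(uuid.uuid4())
PROJECT_TO_CELL_AGG = {"project-atlas": CELL_A_AGG}

def map_project_to_cell(request_spec):
    """Restrict allocation candidates to the project's cell aggregate."""
    agg = PROJECT_TO_CELL_AGG.get(request_spec["project_id"])
    if agg:
        request_spec.setdefault("required_aggregates", []).append(agg)
    return request_spec

spec = {"project_id": "project-atlas"}
map_project_to_cell(spec)
assert spec["required_aggregates"] == [CELL_A_AGG]
```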
26. Scheduling
● Placement request-filter uses aggregates
○ Create an aggregate per cell
○ Add hosts to the aggregates
○ Add the aggregate metadata for the request-filter
○ Placement aggregates are created and resource providers mapped
■ Mirror host aggregates to placement: https://review.openstack.org/#/c/545057/
● Difficult to manage in large deployments
○ “Forgotten” nodes will not receive instances
○ Mistakes can lead to wrong scheduling
○ Deleting a cell doesn’t delete resource_providers, resource_provider_aggregates, aggregate_hosts
■ https://bugs.launchpad.net/nova/+bug/1749734
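A check for such “forgotten” nodes can be sketched as a simple set difference: any host known to the cell but missing from the cell's aggregate will never receive instances through the request filter. Host names here are invented.

```python
# Sanity-check sketch for "forgotten" nodes: every hypervisor in a cell
# should be a member of that cell's host aggregate.

def forgotten_hosts(cell_hosts, aggregate_hosts):
    """Hosts present in the cell but missing from its aggregate."""
    return sorted(set(cell_hosts) - set(aggregate_hosts))

cell_hosts = ["c001", "c002", "c003"]
aggregate_hosts = ["c001", "c003"]
assert forgotten_hosts(cell_hosts, aggregate_hosts) == ["c002"]
```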
27. Availability
● If a cell/DB is down the whole cloud is affected
○ Can’t list instances
○ Can’t create instances
○ …
● Looking back we only had a few issues with DBs
○ Felt confident to move to CellsV2
● Upstream discussion on how to fix/improve the availability problem
○ https://review.openstack.org/#/c/557369/
28. Upgrade to Queens
● “Shutdown the cloud”
â—Ź Steps we followed for the upgrade
â—‹ Upgrade packages
â—‹ Data migrations / DB schema
â– Pike/Queens data migrations
â—Ź Quotas, service UUIDs, block_device UUIDs, migrations UUIDs
â– Top cell DB will be removed
â—‹ Create cells in nova_api DB
â—‹ Delete current instance_mappings
â—‹ Recreate instance_mappings per cell
â—‹ Discover hosts
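The instance_mappings rebuild step can be sketched as follows; the data layout is a simplification of the real nova_api tables, with each instance found in a cell DB mapped back to that cell.

```python
# Sketch of the instance_mappings rebuild: after deleting the
# CellsV1-era mappings, every instance row found in a cell DB is
# mapped to that cell. Cell names and UUIDs are invented.

def rebuild_instance_mappings(instances_per_cell):
    """Return {instance_uuid: cell_name} from per-cell instance lists."""
    mappings = {}
    for cell, uuids in instances_per_cell.items():
        for u in uuids:
            mappings[u] = cell
    return mappings

mappings = rebuild_instance_mappings({
    "cellA": ["uuid-1", "uuid-2"],
    "cellB": ["uuid-3"],
})
assert mappings["uuid-3"] == "cellB"
assert len(mappings) == 3
```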
29. Upgrade to Queens
○ Create aggregates per cell and populate aggregate_hosts, aggregate_metadata
○ Create placement aggregates and populate resource_provider_aggregates
○ Setup AVZs
○ Enable nova-scheduler and nova-conductor services in the top control plane
○ Remove nova-cells service from parent and child cells
○ Remove nova-scheduler from child cell controllers
○ Upgrade compute nodes
● Start the cloud
32. What changed in Placement?
● Refresh aggregates, traits and aggregate-associated sharing providers
○ ASSOCIATION_REFRESH = 5m
○ Made the option configurable:
■ Master: https://review.openstack.org/#/c/565526/
■ Backported to Queens: https://review.openstack.org/#/c/566288/
○ Set it to a very large value
● However it still runs when nova-compute restarts
○ Problematic with Ironic
○ In the end we removed this code path
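With the option made configurable, the refresh interval can be raised in nova.conf; the option name comes from the reviews above, and the value below is only an example of a “very large” setting:

```ini
# Example nova.conf fragment: raise the association refresh interval
# (default 300 seconds) so the periodic refresh effectively stops.
[compute]
resource_provider_association_refresh = 604800
```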
36. Database load pattern
● Number of queries in Cell DBs more than doubled after the upgrade
○ APIs only available to a few users
● Connection rate increased
○ Clients could not connect. API calls failed
○ Reviewed DB configuration. Related to the ulimits of mysql processes
37. nova list / nova boot
● To list instances the request goes to all cells DBs
○ Problematic if a group of DBs is slow or has connection issues
○ Fails if a DB is down
● DBs for the Wigner data centre cells are located in Wigner
○ API servers are located in Geneva
● To minimize the impact we deployed a few patches
○ nova list only queries the cell DBs where the project has instances
■ https://review.openstack.org/#/c/509003
○ Quota calculation only queries the cell DBs where the project has instances
■ https://bugs.launchpad.net/nova/+bug/1771810
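The targeted “nova list” optimization can be sketched with a toy instance_mappings table (a simplification of the real nova_api schema): only cells holding at least one of the project's instances need to be queried.

```python
# Sketch: derive the set of cell DBs worth querying for a project's
# instance list from instance_mappings. Records are invented.

def cells_to_query(instance_mappings, project_id):
    """Distinct cells holding at least one instance of the project."""
    return sorted({m["cell"] for m in instance_mappings
                   if m["project_id"] == project_id})

mappings = [
    {"project_id": "p1", "cell": "cellA"},
    {"project_id": "p1", "cell": "cellA"},
    {"project_id": "p2", "cell": "cellB"},
]
assert cells_to_query(mappings, "p1") == ["cellA"]
assert cells_to_query(mappings, "p3") == []
```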
38. Minor issues
● Availability zones in api-metadata
○ https://bugs.launchpad.net/nova/+bug/1768876
● nova-compute (ironic) creates a new resource provider on failover
○ resource_provider_aggregate lost
○ https://bugs.launchpad.net/nova/+bug/1771806
● Scheduler host_manager gathering info
○ Made it parallel. Ignore cells that are down: https://review.openstack.org/#/c/539617/
● Service list
○ Not parallel. Fails if a cell is down: https://bugs.launchpad.net/nova/+bug/1726310
● nova-network doesn’t start when using cellsV2
40. CellsV2 architecture at CERN (Queens)
compute node
nova-compute
child cell controller
nova-api
nova-conductor
nova-network
RabbitMQ
top cell controller
nova-scheduler
nova-conductor
nova-placement
nova DB
API nodes
nova-api
nova DB
RabbitMQ cluster
nova_api DB
x20
x16
x200
x1
x70
nova-novncproxy
nova-consoleauth
41. Summary
CERN cloud is running Nova Queens with CellsV2
● Moving from CellsV1 is not a trivial upgrade
● CellsV2 works at scale
● Availability/Resilience issues remain
Thanks to all Nova Team!