OpenStack Compute (Nova) has been a core component of OpenStack since the original Austin release in 2010. In the intervening years, development has proceeded at a rapid pace, adding support for new virtualization technologies and exposing additional features. Learn how Compute fits into the OpenStack architecture and how it interacts with other OpenStack components and the hypervisors it manages.
OPENSTACK COMPUTE 101
What is OpenStack?
● A group of related projects that when combined form an
Open Source cloud infrastructure platform for providing
Infrastructure-as-a-Service.
● Intended to be “massively scalable”, scales horizontally
not vertically, on commodity hardware.
● Modular architecture allows consumers of the platform
to deploy only what they need.
What is OpenStack Compute (Nova)?
● One of the two original OpenStack projects, along with
Object Storage (Swift).
● Exposes a rich API for defining compute instances and
managing their lifecycle.
● Pluggable support for multiple common hypervisor
platforms, relatively solution agnostic.
Compute Components
● RESTful nova-api
interface exposed on TCP
port 8774.
● AMQP message queue
used for RPC
communications.
● nova-scheduler handles
hypervisor selection for
instance placement.
Components (cont.)
● nova-compute acts as the
Compute agent, interacting
with the relevant
hypervisor APIs to
launch/manage guests.
● nova-conductor handles
database access
(no-db-compute).
Other Components
● Metadata service - nova-metadata-api
● Traditional networking model - nova-network
● L2 agent - e.g.:
○ neutron-openvswitch-agent
○ neutron-linuxbridge-agent
● Ceilometer agent:
○ openstack-ceilometer-compute
● EC2 API: nova-ec2, nova-cert
● Console Auth and Proxies: noVNC, SPICE, etc.
Instance Creation
● Instance creation is achieved using the nova boot command.
● The minimal set of arguments includes a flavor and an
image:
$ nova boot --flavor <flavor> --image <image>
[--nic net-id=<net-id>] <name>
● Flavor determines the “size” of an instance.
● Image determines the disk image used to boot the
instance.
Image Selection
$ glance image-list
+--------------------------------------+-------------------------------+-------------+------------------+...
| ID                                   | Name                          | Disk Format | Container Format |...
+--------------------------------------+-------------------------------+-------------+------------------+...
| 834c3cbd-8be0-4d4a-b9e8-48ba61d6a999 | cirros                        | qcow2       | bare             |...
| 3a752292-4484-469c-a716-de2542b5742f | rhel-guest-image-7.1-20150224 | qcow2       | bare             |...
+--------------------------------------+-------------------------------+-------------+------------------+...
Flavor Selection
● Simplify process of packing
instances onto physical hosts.
● Each flavor is typically twice
the size (CPU, RAM, disk) of
the next smaller flavor.
● Admin may want to customize
depending on workload
patterns.
http://bit.ly/1QPNVaZ
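The doubling pattern can be sketched in Python. This is only an illustration: the `Flavor` tuple, the `doubling_ladder` helper, the flavor names, and the base size are all assumptions made for the example, not values taken from any deployment.

```python
from typing import List, NamedTuple


class Flavor(NamedTuple):
    name: str
    vcpus: int
    ram_mb: int
    disk_gb: int


def doubling_ladder(base: Flavor, names: List[str]) -> List[Flavor]:
    """Build flavors where each one is twice the size of the previous one."""
    ladder = []
    for step, name in enumerate(names):
        factor = 2 ** step  # 1, 2, 4, 8, ...
        ladder.append(Flavor(name,
                             base.vcpus * factor,
                             base.ram_mb * factor,
                             base.disk_gb * factor))
    return ladder


# Hypothetical base size and names, chosen only for illustration.
flavors = doubling_ladder(Flavor("base", 1, 2048, 20),
                          ["small", "medium", "large", "xlarge"])
for flavor in flavors:
    print(flavor)
```

With sizes that double, two instances of one flavor fit exactly in the space of one instance of the next flavor up, which is what makes packing onto hosts simpler.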
What just happened?
● Retrieved token and endpoints from Keystone API
○ Compute end-point of the form: http[s]://<ip>:8774/v2/%(tenant_id)s
● Confirm image identifier:
○ Retrieved list of available images from Nova API
■ http://93.184.216.34:8774/v2/fc50f6843ba644baaae2af0398e7f04e/images
○ Retrieved specific image detail from Nova API
■ .../v2/fc50f6843ba644baaae2af0398e7f04e/images/3a752292-4484-469c-a716-de2542b5742f
● Confirm flavor identifier:
○ Retrieved list of available flavors from Nova API
■ .../v2/fc50f6843ba644baaae2af0398e7f04e/flavors
○ Retrieved specific flavor detail from Nova API
■ .../v2/fc50f6843ba644baaae2af0398e7f04e/flavors/2
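The `%(tenant_id)s` placeholder in the compute endpoint has the shape of a Python named format specifier. The sketch below expands it that way and builds the image and flavor URLs from this slide. The host name and the `compute_url` helper are assumptions for illustration; no request is actually sent.

```python
# Endpoint template in the form shown above; the host is a placeholder.
ENDPOINT_TEMPLATE = "http://controller.example:8774/v2/%(tenant_id)s"


def compute_url(template: str, tenant_id: str, *path: str) -> str:
    """Expand the %(tenant_id)s placeholder and append path segments."""
    base = template % {"tenant_id": tenant_id}
    return "/".join([base.rstrip("/"), *path])


tenant = "fc50f6843ba644baaae2af0398e7f04e"
print(compute_url(ENDPOINT_TEMPLATE, tenant, "images"))
print(compute_url(ENDPOINT_TEMPLATE, tenant, "flavors", "2"))
```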
What just happened? (cont.)
● User request was sent to the compute endpoint in
JSON format:
{"server":
{"name": "test-instance",
"imageRef": "3a752292-4484-469c-a716-de2542b5742f",
"flavorRef": "2", "max_count": 1, "min_count": 1,
"networks": [{"uuid": "7a9a376d-88cc-41ae-a08f-e3ca274f88cd"}]
}
}
● Request is picked up by nova-api service.
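As a rough sketch, the request body above could be assembled in Python like this. The `build_boot_request` helper is hypothetical; it only serializes the same fields shown on the slide and does not contact any API.

```python
import json


def build_boot_request(name, image_ref, flavor_ref, network_ids, count=1):
    """Return the JSON body for a create-server request as a string."""
    server = {
        "name": name,
        "imageRef": image_ref,
        "flavorRef": flavor_ref,
        "min_count": count,
        "max_count": count,
        # One entry per network the instance should be attached to.
        "networks": [{"uuid": net_id} for net_id in network_ids],
    }
    return json.dumps({"server": server}, sort_keys=True)


body = build_boot_request(
    "test-instance",
    "3a752292-4484-469c-a716-de2542b5742f",
    "2",
    ["7a9a376d-88cc-41ae-a08f-e3ca274f88cd"],
)
print(body)
```

A client would send this body to the compute endpoint together with the token retrieved from Keystone.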
What just happened? (cont.)
● nova-api:
○ Extracts parameters for basic validation.
○ Retrieves a reference to the selected flavor.
○ Retrieves a reference to selected boot media:
■ Image using Glance client (in this example); OR
■ Volume using Cinder client (boot from volume)
○ Saves initial instance state to database.
○ Puts a message on the message queue for the conductor.
● API call returns at this point, with instance status of
BUILD, task state SCHEDULING.
Scheduling
● Conductor asks the scheduler where to build the
instance.
● Default implementation is a filter scheduler
● Applies filters and weights based on configuration
○ Filter examples:
■ ComputeFilter - is this host on?
■ CoreFilter - is this host exposing enough free vCPUs?
■ RamFilter - is this host exposing enough free vRAM?
■ ImagePropertiesFilter - does this host conform to selected image properties
(architecture, hypervisor type, etc.)?
○ Weight examples:
■ RAM Weigher - give preference to hosts with more or less RAM free.
● Can also take user provided hints
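The filter-then-weigh idea can be sketched in a few lines of Python. This is a simplified model, not Nova's implementation: the `Host` and `Request` classes, the function names, and the sample hosts are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Host:
    name: str
    is_up: bool
    free_vcpus: int
    free_ram_mb: int


@dataclass
class Request:
    vcpus: int
    ram_mb: int


# Each filter answers one yes/no question about a host.
def compute_filter(host, req):
    return host.is_up


def core_filter(host, req):
    return host.free_vcpus >= req.vcpus


def ram_filter(host, req):
    return host.free_ram_mb >= req.ram_mb


FILTERS = [compute_filter, core_filter, ram_filter]


def ram_weigher(host, multiplier=1.0):
    """Positive multiplier prefers more free RAM; negative prefers less."""
    return multiplier * host.free_ram_mb


def schedule(hosts, req, filters, multiplier=1.0):
    """Drop hosts that fail any filter, then pick the best-weighted host."""
    candidates = [h for h in hosts if all(f(h, req) for f in filters)]
    if not candidates:
        raise LookupError("no valid host found")
    return max(candidates, key=lambda h: ram_weigher(h, multiplier))


hosts = [
    Host("node-1", True, 8, 4096),
    Host("node-2", False, 16, 65536),  # powered off: fails compute_filter
    Host("node-3", True, 4, 16384),
    Host("node-4", True, 1, 32768),    # too few vCPUs: fails core_filter
]
req = Request(vcpus=2, ram_mb=2048)
print(schedule(hosts, req, FILTERS).name)
print(schedule(hosts, req, FILTERS, multiplier=-1.0).name)
```

Filters only remove hosts; weights only rank the survivors. Flipping the sign of the multiplier switches between spreading instances across hosts and stacking them onto fewer hosts.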
Scheduling (cont.)
● Updates instance state in database.
● Returns to conductor, conductor places message on the
queue for openstack-nova-compute (the compute
agent) on the selected compute node.
Compute Agent
● Prepares for instance launch:
○ Calls Glance and/or Cinder to retrieve boot media info (image or
volume).
○ Calls Neutron or nova-network to get network and security group
information and “plug” virtual interfaces.
○ Calls Cinder to attach volume if necessary.
○ Sets up configuration drive if necessary.
● Uses hypervisor APIs to create virtual machine!
● Updates virtual machine state in DB (using conductor).
Driver Selection
● Two tools to help guide operators:
○ Driver testing status
■ “Is this driver tested using unit and/or functional tests in the gate?”
○ Hypervisor support matrix
■ “Does this driver support actions x, y, and z?”
Driver Testing Status
● Multi-tiered:
○ Group A - Fully supported.
■ Coverage includes unit and functional tests in the gate.
○ Group B - Middle ground.
■ Test coverage includes unit tests that gate commits, functional testing by an external
system that does not gate but does comment on patches.
○ Group C - Drivers that have limited testing, use at own risk.
■ Test coverage includes (potentially) unit tests that gate commits and no public
functional testing.
● https://wiki.openstack.org/wiki/HypervisorSupportMatrix#Driver_Testing_Status
Hypervisor Support Matrix
● Lists mandatory and optional driver capabilities:
○ http://docs.openstack.org/developer/nova/support-matrix.html
● Examples of capabilities:
○ Launch instance (mandatory)
○ Attach block volume to instance (optional)
Scaling Compute
● Compute services scale
horizontally (simply add
more).
● Scheduler needs to be
scaled a little more
carefully.
● Message queue and
database can be
clustered.
Cells
● Divide multiple compute
installations into “cells”.
● API cell handles incoming
requests, schedules to a compute
cell.
● Each cell has an instance of
nova-cells, its own message
queue and database.
Cells
● Pros:
○ Maintain a single compute endpoint.
○ Relieve pressure on queues/database at
scale.
○ Introduce additional layer of scheduling.
Cells
● Cons:
○ Lack of “cell awareness” in other projects
(e.g. Neutron).
○ Minimal test coverage in the gate.
○ Some standard functionality remains
broken with cells (Security Groups, Host
Aggregates).
● CellsV2, currently under
development, offers more promise
for the future.
Why Segregate Compute Resources?
● Expose logical groupings:
○ Geographical region, data center, rack, power source, network, etc.
● Expose special capabilities:
○ Faster NICs, storage, special devices, etc.
● The divisions mean whatever you want them to mean!
Regions
● Complete OpenStack deployments
○ Share as many or as few services as
needed.
○ Implement their own targetable API
endpoints, networks, and compute.
● By default all services in one region:
$ keystone endpoint-create --region
"RegionTwo" ...
● Target actions at a region's endpoint:
$ nova --os-region-name "RegionTwo" boot ...
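Conceptually, targeting a region means picking the endpoint for the requested service from the catalog entries registered for that region. The sketch below shows that lookup with a hypothetical, heavily simplified catalog; the entry layout, host names, and `find_endpoint` helper are assumptions, and a real Keystone catalog carries more fields.

```python
# Hypothetical, simplified catalog entries for illustration only.
CATALOG = [
    {"type": "compute", "region": "RegionOne", "url": "http://r1.example:8774/v2"},
    {"type": "compute", "region": "RegionTwo", "url": "http://r2.example:8774/v2"},
    {"type": "image", "region": "RegionOne", "url": "http://r1.example:9292"},
]


def find_endpoint(catalog, service_type, region):
    """Return the URL of the first endpoint matching service type and region."""
    for entry in catalog:
        if entry["type"] == service_type and entry["region"] == region:
            return entry["url"]
    raise LookupError(f"no {service_type} endpoint in region {region}")


print(find_endpoint(CATALOG, "compute", "RegionTwo"))
```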
Host Aggregates
● Logical groupings of hosts based on metadata.
● Typically metadata describes capabilities hosts expose:
○ SSD hard disks for ephemeral data storage.
○ PCI devices for passthrough.
○ Etc.
● Hosts can be in multiple host aggregates:
○ “Hosts that have SSD storage and 40G interfaces”.
Host Aggregates (cont.)
● Implicitly user targetable:
○ Admin defines host aggregate with metadata and flavor to match:
■ $ nova aggregate-create hypervisors-with-SSD
■ $ nova aggregate-set-metadata 1 SSDs=true
■ $ nova aggregate-add-host 1 hypervisor-1
■ $ nova flavor-key 1 set
aggregate_instance_extra_specs:SSDs=true
○ User selects flavor when requesting instance.
○ Scheduler places on host aggregate with metadata matching flavor
extra specifications using AggregateInstanceExtraSpecsFilter
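The matching step can be approximated in Python as follows. This is a simplified sketch of the idea behind the filter, not its actual code: it only handles exact string matches on keys carrying the `aggregate_instance_extra_specs:` prefix, and the data structures and the `host_passes` helper are assumptions.

```python
SCOPE = "aggregate_instance_extra_specs:"


def host_passes(host_aggregates, flavor_extra_specs):
    """True if every scoped extra spec is matched by some aggregate's metadata.

    host_aggregates: list of metadata dicts, one per aggregate the host is in.
    """
    for key, wanted in flavor_extra_specs.items():
        if not key.startswith(SCOPE):
            continue  # unscoped keys are ignored in this sketch
        meta_key = key[len(SCOPE):]
        if not any(agg.get(meta_key) == wanted for agg in host_aggregates):
            return False
    return True


# hypervisor-1 is in the SSD aggregate created above; hypervisor-2 is in none.
aggregates_by_host = {
    "hypervisor-1": [{"SSDs": "true"}],
    "hypervisor-2": [],
}
flavor_extra_specs = {"aggregate_instance_extra_specs:SSDs": "true"}

eligible = [host for host, aggs in aggregates_by_host.items()
            if host_passes(aggs, flavor_extra_specs)]
print(eligible)
```

The user never names the aggregate; choosing the flavor is what steers the instance onto matching hosts.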
Availability Zones
● Logical groupings of hosts based on arbitrary factors
like:
○ Location (country, data center, rack, etc.)
○ Network layout
○ Power source
● Explicitly user targetable:
$ nova boot --availability-zone "rack-1"
Availability Zones
● Host aggregates are made explicitly user targetable by
creating them as an AZ:
○ $ nova aggregate-create tier-1 us-east-tier-1
○ tier-1 is the aggregate name, us-east-tier-1 is the AZ name.
● The host aggregate is the availability zone!
○ Unlike with plain aggregates, a host cannot be in more than one availability zone.
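The one-zone-per-host rule can be modeled with a small check. This is a toy model written for illustration, not Nova code: the `Aggregate` class, the `add_host` helper, and the second zone name `us-east-tier-2` are assumptions.

```python
class Aggregate:
    def __init__(self, name, availability_zone=None):
        self.name = name
        self.availability_zone = availability_zone  # None: plain aggregate
        self.hosts = set()


def add_host(aggregates, target, host):
    """Add host to target, refusing a second, different availability zone."""
    if target.availability_zone is not None:
        for agg in aggregates:
            if (host in agg.hosts
                    and agg.availability_zone is not None
                    and agg.availability_zone != target.availability_zone):
                raise ValueError(
                    f"{host} is already in zone {agg.availability_zone}")
    target.hosts.add(host)


tier1 = Aggregate("tier-1", "us-east-tier-1")
ssd = Aggregate("hypervisors-with-SSD")          # no AZ: plain aggregate
tier2 = Aggregate("tier-2", "us-east-tier-2")    # hypothetical second AZ
aggregates = [tier1, ssd, tier2]

add_host(aggregates, tier1, "hypervisor-1")
add_host(aggregates, ssd, "hypervisor-1")        # fine: many aggregates allowed
try:
    add_host(aggregates, tier2, "hypervisor-1")  # rejected: second AZ
except ValueError as err:
    print(err)
```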
API Microversions
● Compute API V2 has been in place for some time, was
to be superseded by V3.
● Determined that implementing new major version of API
would be too difficult:
○ User impact.
○ Developer overhead.
● V2 is extended by adding “extensions”, lots of them.
API Microversions
● Microversions aim to:
○ Make it possible to evolve the API incrementally.
○ Provide backwards compatibility for REST API users.
○ Improve code cleanliness to make doing the “right thing” easier.
API Microversions
● Use a single monotonic counter of the form X.Y where:
○ X will only be changed when a significant backwards-incompatible
API change is made. Expected to be incremented rarely, if ever.
○ Y will be changed when making any change to the API. Whether such
a change is backwards compatible or not will be reflected via
documentation.
● Client will specify the version it supports, e.g.:
○ X-OpenStack-Nova-API-Version: 2.114
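Because the version is a pair of counters rather than a decimal number, it has to be compared component by component. The Python sketch below illustrates this; the `parse_microversion` and `supports` helpers and the example version range are assumptions for illustration, and no request is sent.

```python
def parse_microversion(value):
    """Split 'X.Y' into a tuple of integers, e.g. '2.114' -> (2, 114)."""
    major, minor = value.split(".")
    return int(major), int(minor)


def supports(requested, minimum, maximum):
    """True if minimum <= requested <= maximum, compared numerically."""
    return (parse_microversion(minimum)
            <= parse_microversion(requested)
            <= parse_microversion(maximum))


# The header a client would send to ask for a specific microversion.
headers = {"X-OpenStack-Nova-API-Version": "2.114"}

# 2.114 is newer than 2.9, even though 2.114 < 2.9 when read as decimals.
print(parse_microversion("2.114") > parse_microversion("2.9"))
print(supports(headers["X-OpenStack-Nova-API-Version"], "2.1", "2.114"))
```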
API Microversions
● Initial implementation in Kilo:
○ v2.0 API code still used to serve v2.0 API requests.
■ The plan is that in Liberty the v2.1 API code will serve both v2.0 and v2.1.
○ v2.0 API is frozen:
■ All new features will be added to v2.1 using microversions.
○ python-novaclient does not yet support v2.1.
vCPU Pinning
● Allows assignment of vCPU cores, and the associated
emulator threads, to dedicated pCPU cores.
● Administrator defines host(s) that accept dedicated
resourcing requests, scheduler places guests on them.
○ Reserve cores for guests using kernel isolcpus and nova
vcpu_pin_set
○ Create flavor and matching host aggregates.
● Scheduler and agent work together to assign
appropriate CPU cores for vCPUs.
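A rough Python sketch of the two steps, reserving a set of host cores and then assigning guest vCPUs to them one-to-one. It assumes a CPU list syntax of single IDs, `a-b` ranges, and `^n` exclusions (for example `4-12,^8,15`); the `parse_cpu_set` and `pin_vcpus` helpers are hypothetical, and the real placement logic also considers NUMA topology and existing guests.

```python
def parse_cpu_set(spec):
    """Parse a CPU list such as '4-12,^8,15' into a sorted list of CPU IDs."""
    included, excluded = set(), set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if part.startswith("^"):          # explicit exclusion of one CPU
            excluded.add(int(part[1:]))
        elif "-" in part:                 # inclusive range
            start, end = (int(p) for p in part.split("-", 1))
            if start > end:
                raise ValueError(f"invalid range: {part}")
            included.update(range(start, end + 1))
        else:                             # single CPU
            included.add(int(part))
    return sorted(included - excluded)


def pin_vcpus(vcpu_count, free_pcpus):
    """Map each guest vCPU to its own dedicated host pCPU."""
    if vcpu_count > len(free_pcpus):
        raise ValueError("not enough dedicated pCPUs")
    return dict(zip(range(vcpu_count), free_pcpus))


pcpus = parse_cpu_set("4-12,^8,15")
print(pcpus)
print(pin_vcpus(2, pcpus))
```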
Huge Pages
● Huge pages allow the use of larger page sizes (2 MB,
1 GB), increasing CPU TLB cache efficiency.
○ Backing guest memory with huge pages allows predictable memory
access, at the expense of the ability to over-commit.
○ Different workloads extract different performance characteristics from
different page sizes - bigger is not always better!
● Administrator reserves large pages during compute
node setup and creates flavors to match:
○ hw:mem_page_size=large|small|any|2048|1048576
● User requests using flavor or image properties.
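To size the reservation, an administrator needs to know how many pages a guest will consume. A minimal sketch, assuming the numeric page sizes above are in KiB (2048 for 2 MB pages, 1048576 for 1 GB pages) and that guest RAM must be a whole multiple of the page size; the `pages_needed` helper is hypothetical.

```python
def pages_needed(ram_mb, page_size_kb):
    """Number of pages of page_size_kb needed to back ram_mb of guest memory."""
    ram_kb = ram_mb * 1024
    if ram_kb % page_size_kb:
        # Assumption of this sketch: partial pages are not allowed.
        raise ValueError("guest RAM is not a multiple of the page size")
    return ram_kb // page_size_kb


# A 4096 MB guest backed by 2 MB pages versus 1 GB pages.
print(pages_needed(4096, 2048))
print(pages_needed(4096, 1048576))
```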
I/O (PCIe) based NUMA Scheduling
● Extends Libvirt driver to capture NUMA locality of PCI
devices on the host.
● Extends NUMATopologyFilter to take into account
locality of any PCI devices being passed to the guest.
Standalone EC2 API
● Aims to:
○ Implement AWS Virtual Private Cloud API.
○ Provide the EC2 API as a standalone service.
○ Ultimately replace/supersede current Nova EC2 implementation.
● Current state:
○ Recent 0.1.0 release:
■ https://launchpad.net/ec2-api/trunk/0.1.0
○ In addition to Nova EC2 API coverage includes:
■ VPC API
■ Filtering
■ Tags
■ Paging
Storage Enhancements
● Consistent snapshots using qemu-guest-agent
● Libvirt driver support for KVM/QEMU built-in iSCSI
initiator - allow direct attachment of volumes to guests.
● vCenter driver support for vSAN datastores.
● vCenter driver support for ephemeral disks.
● Libvirt and Hyper-V driver support for SMB based
volumes.
New In-tree Driver Support
● Libvirt driver support for IBM System Z (KVM)
● Libvirt driver support for Parallels Cloud Server