1. General Bare-metal Provisioning Framework
Mikyung Kang, USC/ISI
David Kang, USC/ISI
Ken Igarashi, NTT docomo
Mana Kenoko, NTT docomo
Hiromichi Ito, Virtual Tech Japan
Arata Notsu, Virtual Tech Japan
3. Why Bare-metal Provisioning?
• Manage bare-metal machines using OpenStack
[Figure: OpenStack managing both virtual machines and bare-metal machines. Labels: real-time analysis, various CPU support, management using OpenStack]
General Bare-Metal Provisioning Framework (Speaker Session)
4. Why Bare-metal Provisioning?
• Difference between VMs and bare-metal machines
• Virtual machines
  • A hypervisor exists between the physical resources and the virtual machines
  • Image provisioning, VM power management, volume isolation (iSCSI), console access (VNC), VM snapshot
[Figure: Virtual Machine vs. Bare-Metal Machine stacks. Labels: Hypervisor (OpenStack), Host OS, HW (CPU, MEM, HDD, NIC), NW, Storage, OS, iSCSI, VLAN, image DB]
• Bare-metal machines
  • There is no hypervisor
  • A bare-metal machine can access physical resources freely
  • Need to achieve the same security level as virtual environments
5. Why Bare-metal Provisioning?
• Virtual machine vs. bare-metal machine instances
[Figure: a bare-metal Nova-Compute uses the bare-metal driver to aggregate CPU, MEM and HDD resources and offers bm1.tiny and bm1.medium; a virtual Nova-Compute (Hypervisor on Host OS on HW: CPU, MEM, HDD, NIC) offers m1.tiny, m1.medium and m1.large]
6. OpenStack Bare-metal History
Essex Release: April 2012
• Non-PXE Tilera multi-core bare-metal machines
Folsom Release: Sept. 2012
• Non-PXE Tilera multi-core bare-metal machines
• Pending review: PXE support & bare-metal MySQL DB
Grizzly Release: April 2013
• Finish review → merge to upstream: basic functions
• New features including fault-tolerance and security enhancement as well as scheduler changes
7. OpenStack Bare-metal History
• Initial design for Tilera (Non-PXE) image provisioning (TFTP/NFS)
• Releases covered: Essex, Folsom
10. Bare-metal Provisioning Framework
[Figure: Essex/Folsom design]
• Nova-Compute with the bare-metal driver registers aggregate bare-metal resources (CPU, MEM, HDD) and offers bm1.tiny and bm1.medium
• Homogeneous capability; bare-metal nodes are kept in a TEXT file
• Nova-Scheduler receives maximum capability information, including the total number of bare-metal machines
• Bare-metal filter: cpu_arch & hypervisor_type
11. Bare-metal Provisioning Framework
[Figure: Grizzly design]
• Nova-Compute with the bare-metal driver registers aggregate bare-metal resources (CPU, MEM, HDD) and offers bm1.tiny and bm1.medium
• Multiple capabilities; bare-metal nodes are kept in a bare-metal MySQL DB
• Bare-metal filter in Nova-Scheduler: cpu_arch & hypervisor_type
• baremetal_sql_connection = mysql://$ID:$Password@$IP/nova_bm
12. Bare-metal Release Plan
Grizzly-1: Nov. 22nd
Grizzly-3: Feb. 21st
14. Benchmarking
o CPU (Coremark)
[Figure: Coremark score, Baremetal vs. Virtual; higher is better]
o Context Switch (LMBench)
[Figure: context-switch time [µs] vs. number of processes (2 to 96), Baremetal vs. Virtual; lower is better]
o TCP Throughput (Netperf)
[Figure: transmit and receive throughput [Mbps], Baremetal vs. Virtual vs. SR-IOV; higher is better]
o Ping
[Figure: latency [ms] vs. packet size [bytes] (64, 1024, 1500), Baremetal vs. SR-IOV vs. Virtual; lower is better]
DOCOMO, INC All Rights Reserved 14
21. VM Provisioning Procedure in Nova
1. Instance Request
2. Choose Nova-Compute
3. Image Provisioning
4. Network Isolation
5. Nova-Volume Attachment
6. VNC Access
7. Snapshot
[Figure: Nova-API and Nova-Scheduler above three Nova-Compute hosts (Hypervisor on Host OS) running VMs; Glance holds the AMIs; Nova-Volume serves USER1 and USER2 storage with volumes Vol-11 to Vol-14]
22. Bare-Metal Provisioning Functions
o We need to implement the same functions for bare-metal provisioning
1. Instance Request – Description for bare-metal machine instances
2. Choose Nova-Compute – Scheduler for bare-metal machines
3. Image Provisioning – Turn bare-metal machines on/off and deploy images to them
4. Network Isolation – Create a private LAN among bare-metal machines
5. Nova-Volume Attachment – Provide secure iSCSI access
6. VNC Access – Provide console access to bare-metal servers
7. Snapshot – Create a new AMI from a running VM
How to achieve these functions without a hypervisor?
Ø Keep compatibility (same API)
Ø Less impact on Nova
23. 1. Instance Request
o Create instance types for bare-metal machines
Name       Id  memory_mb  VCPUS  local_gb
m1.tiny    1   512        1      40
m1.medium  2   4096       2      80
b1.tiny    3   512        1      40
b1.medium  4   4096       2      80
o Bare-metal machine instances have "instance_type_extra_specs"
Id  key       value
3   cpu_arch  tilepro64
4   cpu_arch  x86_64
Ø euca-run-instances -t m1.tiny -> Create a virtual instance
Ø euca-run-instances -t b1.tiny -> Create a bare-metal instance
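Note (illustrative sketch, not from the talk): the two tables on this slide can be modelled as instance types that optionally carry extra specs. The class name InstanceType and the helper is_bare_metal are hypothetical, and treating "has a cpu_arch extra spec" as the bare-metal marker is an assumption made only for this example.

```python
from dataclasses import dataclass, field


@dataclass
class InstanceType:
    """One row of the instance-type table (hypothetical model)."""
    name: str
    id: int
    memory_mb: int
    vcpus: int
    local_gb: int
    extra_specs: dict = field(default_factory=dict)


# Values copied from the two tables on the slide.
INSTANCE_TYPES = {
    "m1.tiny": InstanceType("m1.tiny", 1, 512, 1, 40),
    "m1.medium": InstanceType("m1.medium", 2, 4096, 2, 80),
    "b1.tiny": InstanceType("b1.tiny", 3, 512, 1, 40, {"cpu_arch": "tilepro64"}),
    "b1.medium": InstanceType("b1.medium", 4, 4096, 2, 80, {"cpu_arch": "x86_64"}),
}


def is_bare_metal(instance_type):
    # Assumption: only bare-metal types carry a cpu_arch extra spec.
    return "cpu_arch" in instance_type.extra_specs
```

With this model, a request for b1.tiny would be classified as bare-metal and a request for m1.tiny as virtual.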
24. 2. Choose Nova-Compute (Scheduler)
o Create pseudo Nova-Computes for bare-metal machines
[Figure: a bare-metal Nova-Compute with the bare-metal driver pools CPU, MEM and HDD and offers b1.tiny and b1.medium; a virtual Nova-Compute (Hypervisor on Host OS on HW: CPU, MEM, HDD, NIC) offers m1.tiny, m1.medium and m1.large]
o Filter scheduler can classify virtual and bare-metal machines
[Figure: Nova-API passes requests to Nova-Scheduler; the filter scheduler routes m1.tiny, m1.medium and m1.large to virtual hosts and b1.tiny and b1.medium to bare-metal hosts]
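Note (illustrative sketch, not from the talk): the classification done by the filter scheduler can be pictured as matching the requested extra specs against the capabilities each Nova-Compute reports. The host names, the capability values "kvm" and "baremetal", and the functions host_passes and filter_hosts are hypothetical; this is not Nova's actual filter code.

```python
def host_passes(host_caps, extra_specs):
    """Return True when every requested extra spec equals the host capability."""
    return all(host_caps.get(key) == value for key, value in extra_specs.items())


def filter_hosts(hosts, extra_specs):
    """Keep only the hosts whose capabilities satisfy the request."""
    return [name for name, caps in hosts.items() if host_passes(caps, extra_specs)]


# Hypothetical capability reports from three Nova-Compute services.
HOSTS = {
    "compute-virt": {"hypervisor_type": "kvm", "cpu_arch": "x86_64"},
    "compute-bm-x86": {"hypervisor_type": "baremetal", "cpu_arch": "x86_64"},
    "compute-bm-tile": {"hypervisor_type": "baremetal", "cpu_arch": "tilepro64"},
}
```

A request carrying hypervisor_type=baremetal and cpu_arch=tilepro64 would then only match compute-bm-tile.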
25. 3. Image Provisioning (x86_64)
0. Preparation
o Create "kernel + ramdisk" with "baremetal-mkinitrd.sh" and register them to glance (AKI, ARI)
o Run bare-metal deployment servers on nova-compute
- dnsmasq (PXE server)
- bm_deploy_server
o Edit nova.conf
# Specify nova-compute type
compute_driver=nova.virt.baremetal.driver.BareMetalDriver
# Driver for nova-compute and power manager
baremetal_driver=nova.virt.baremetal.pxe.PXE
power_manager=nova.virt.baremetal.ipmi.Ipmi
# AKI and ARI for 1st boot
baremetal_deploy_ramdisk = 843adb6d-e0f8-452d-9a60-d8c883a0983c
baremetal_deploy_kernel = 7dfd792c-fc85-480e-8d07-7d9b20d58c24
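Note (illustrative sketch, not from the talk): before starting nova-compute it can help to check that the options shown on this slide are all present. The snippet below uses Python's standard configparser; placing the options under a [DEFAULT] section and the helper name missing_options are assumptions for this example.

```python
import configparser

# Option values copied from the slide; the [DEFAULT] header is an assumption.
NOVA_CONF = """
[DEFAULT]
compute_driver=nova.virt.baremetal.driver.BareMetalDriver
baremetal_driver=nova.virt.baremetal.pxe.PXE
power_manager=nova.virt.baremetal.ipmi.Ipmi
baremetal_deploy_ramdisk = 843adb6d-e0f8-452d-9a60-d8c883a0983c
baremetal_deploy_kernel = 7dfd792c-fc85-480e-8d07-7d9b20d58c24
"""

REQUIRED = (
    "compute_driver",
    "baremetal_driver",
    "power_manager",
    "baremetal_deploy_ramdisk",
    "baremetal_deploy_kernel",
)


def missing_options(text):
    """Return the required option names that the config text does not set."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["DEFAULT"]
    return [opt for opt in REQUIRED if opt not in section]
```

An empty result means every option listed on the slide is set.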
26. 3. Image Provisioning (x86_64)
1. 1st Boot
Ø euca-run-instances -t b1.tiny --ramdisk ari-bare (--kernel aki-bare) ami-bare
Ø Nova-API -> Nova-Scheduler -> nova-compute / PXE server
Ø PXE boot the bare-metal machine (b1.tiny), using the deploy kernel/ramdisk (AKI, ARI) for the deployment
2. System Setup
Ø nova-compute / bm_deploy_server sends the AMI (ami-bare) via iSCSI
Ø Read configuration (Nova-Network): MAC and IP address
1. Create file system (SWAP)
2. Configure MAC and IP address
3. Set up PXE for 2nd boot
4. Reboot
27. 3. Image Provisioning (x86_64)
3. 2nd Boot
Ø euca-run-instances -t b1.tiny --ramdisk ari-bare (--kernel aki-bare) ami-bare
Ø PXE boot from nova-compute / PXE server, using the kernel/ramdisk (aki-bare, ari-bare) for the provisioning
Ø Boot from local HDD (ami-bare)
Ø Result: bare-metal instance
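Note (illustrative sketch, not from the talk): the x86_64 image-provisioning slides describe a fixed sequence of stages. It can be written as a small state machine; the Stage names and the functions next_stage and run are hypothetical and only mirror the order given on the slides.

```python
from enum import Enum


class Stage(Enum):
    FIRST_BOOT = "PXE boot with the deploy kernel/ramdisk"
    SYSTEM_SETUP = "receive AMI via iSCSI, create file system, configure MAC/IP"
    SECOND_BOOT = "PXE boot with the kernel/ramdisk for provisioning"
    RUNNING = "boot from local HDD"


# Order taken from the slides: 1st boot, system setup, 2nd boot, running instance.
ORDER = [Stage.FIRST_BOOT, Stage.SYSTEM_SETUP, Stage.SECOND_BOOT, Stage.RUNNING]


def next_stage(stage):
    """Advance one stage; RUNNING is terminal and maps to itself."""
    index = ORDER.index(stage)
    return ORDER[min(index + 1, len(ORDER) - 1)]


def run(start=Stage.FIRST_BOOT):
    """Return the list of stages visited until the instance is running."""
    trace = [start]
    while trace[-1] is not Stage.RUNNING:
        trace.append(next_stage(trace[-1]))
    return trace
```

Calling run() walks through all four stages in order and stops at RUNNING.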
28. 4. Network Isolation
o Virtual Machines
Ø Hypervisor checks addresses (IP and MAC), and puts VLAN tag
Ø IP address spoofing (pretending to be others)
o Bare-Metal Machines
Ø User can change address and VLAN tag freely
Ø MAC, IP address, VLAN spoofing (pretending to be others)
[Figure: application, middleware and OS stacks on a hypervisor (virtual) and directly on HW (bare-metal); outcome labels: OK, NG]
29. 4. Network Isolation (β version)
o Use Quantum – NEC's Trema + OpenFlow Switch
Ø Protect against address spoofing (MAC and IP)
Ø Create a private network among instances
Flow rules:
of_in_port=<switch's port> src_mac != <Instance's MAC> -> DROP
of_in_port=<switch's port> src_ip != <Instance's IP> -> DROP
of_in_port=* dst_ip=<Instance's IP> protocol and dst_port allowed by security group -> ALLOW
of_in_port=* dst_ip=<BROADCAST> protocol and dst_port allowed by security group -> ALLOW
[Figure: Nova-Compute (bare-metal) and Quantum drive the OpenFlow controller (Trema from NEC), which controls the OpenFlow switch connecting machines in Security Group A and Security Group B]
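Note (illustrative sketch, not from the talk): the four flow rules on this slide can be read as a small packet classifier. The Packet and Instance classes, the decide function, the example broadcast address and the final default of DROP when no rule matches are all assumptions made for this example; this is not Trema or Quantum code.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    in_port: int
    src_mac: str
    src_ip: str
    dst_ip: str
    protocol: str
    dst_port: int


@dataclass
class Instance:
    port: int      # switch port the bare-metal machine is plugged into
    mac: str
    ip: str
    allowed: set   # {(protocol, dst_port)} permitted by its security group


def decide(pkt, instances, broadcast_ip="10.0.0.255"):
    """Apply the slide's rules in order; broadcast_ip is a hypothetical value."""
    sender = {inst.port: inst for inst in instances}.get(pkt.in_port)
    if sender is not None:
        if pkt.src_mac != sender.mac:   # rule 1: MAC spoofing
            return "DROP"
        if pkt.src_ip != sender.ip:     # rule 2: IP spoofing
            return "DROP"
    service = (pkt.protocol, pkt.dst_port)
    for inst in instances:
        if service not in inst.allowed:
            continue
        if pkt.dst_ip == inst.ip:       # rule 3: unicast allowed by security group
            return "ALLOW"
        if pkt.dst_ip == broadcast_ip:  # rule 4: broadcast allowed by security group
            return "ALLOW"
    return "DROP"                       # assumption: default deny
```

A packet arriving on a machine's switch port with someone else's MAC or IP is dropped before any security-group check is made.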
30. 5. Nova-Volume Attachment
o Virtual Machines
Ø Nova-Volume is transparent to users
Ø "iscsiadm -m discovery" from inside a VM doesn't work
o Bare-Metal Machines
Ø User can see all Nova-Volumes
Ø "iscsiadm -m discovery" can see all the volumes
[Figure: VMs on hypervisors and bare-metal machines connected to Nova-Volume storage for USER1 to USER4 (Vol-11 to Vol-14)]
31. 5. Nova-Volume Attachment (β version)
o Use Nova-Compute as a proxy of Nova-Volume
Ø Separate Nova-Volume network and provide ACL using CHAP
1. Isolate iSCSI network
2. Provide ACL for each bare-metal machine
[Figure: Servers A to D reach Nova-Volume storage (USER1 to USER4, Vol-11 to Vol-14) through an OpenFlow switch; the Nova-Volume network is separated from the bare-metal Nova-Volume network]
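Note (illustrative sketch, not from the talk): the per-machine ACL can be pictured as a table of node secrets plus a table of which node may attach which volume. The VolumeAcl class and its methods are hypothetical. Real CHAP is a challenge-response exchange handled by the iSCSI target; this sketch only models the access decision, using a direct constant-time secret comparison as a stand-in.

```python
import hmac
import secrets


class VolumeAcl:
    """Hypothetical per-node access control for volume attachment."""

    def __init__(self):
        self._secrets = {}   # node name -> CHAP-style shared secret
        self._volumes = {}   # volume name -> node allowed to attach it

    def register_node(self, node):
        """Create and return a fresh secret for a bare-metal node."""
        self._secrets[node] = secrets.token_hex(16)
        return self._secrets[node]

    def grant(self, volume, node):
        """Allow exactly one node to attach the given volume."""
        self._volumes[volume] = node

    def may_attach(self, node, secret, volume):
        """Check the node's secret, then check the volume grant."""
        expected = self._secrets.get(node)
        if expected is None or not hmac.compare_digest(expected, secret):
            return False
        return self._volumes.get(volume) == node
```

Under this model a node with a valid secret still cannot attach a volume that was granted to a different node.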
32. 6. VNC Access (β version)
o Provide console access by Serial over LAN (SOL)
[Figure: Nova-Compute connects to the bare-metal machine's serial console over SOL]
o Use Ajax Console (shellinabox)
Ø http://code.google.com/p/shellinabox/
33. Bare-Metal Provisioning
1. Instance Request
- Create new instance type with “extra_specs = bare-metal”
2. Choose Nova-Compute
- Create new scheduler called “Heterogeneous Scheduler”
3. Image Provisioning
- Use Intel vPro and IPMI to turn bare-metal machines on/off
4. Network Isolation
- Use Quantum (OpenFlow) to protect against address spoofing and create a private LAN within a security group
5. Nova Volume Attachment
- Network ACL (VLAN and CHAP)
6. VNC Access
- Serial over LAN
7. Snapshot
- TBD
34. Libvirt and Bare-Metal Driver
o Compare operations supported by Horizon
Category  Operation        Libvirt  Bare-Metal
Instance  Activate         O        O (IPMI)
          Reboot           O        O (IPMI)
          Suspend          O        X
          Terminate        O        O (IPMI)
          MAC/IP Address   O        O (Deploy Ramdisk)
          Floating IP      O        O
          Snapshot         O        X
Security  Security Groups  O        O (OpenFlow)
          Keypair          O        O
          Console          O (VNC)  △ (SOL)
38. Bare-Metal Machine Provisioning
o Manage bare-metal machines the same way as virtual machines
Ø Run an instance through the OpenStack API
- euca-run-instances -t b1.tiny --ramdisk ari-bare (--kernel aki-bare) ami-bare
Ø Utilize all the ecosystem created on top of OpenStack
[Figure: virtual machines and bare-metal machines under management using OpenStack. Label: Auto-Scaling]
39. Auto-Scaling of the Nova-Compute
o Change resources dynamically based on load
[Figure: common computing pool]
40. How Does Zabbix Scale a Nova-Compute?
o Nova-Compute side (Zabbix agent)
Ø Information from Libvirt (Libvirt Collectd Plugin): VM's CPU load, total vCPUs, VM's memory, VM's disk, etc.
Ø System information of the host (Plugin): total CPUs, total memory, total disk, etc.
o Zabbix management side
Ø ITEM: "Item1" = Total CPUs, "Item2" = Total vCPUs
Ø TRIGGER: scale-out trigger, scale-in trigger
Ø ACTION: scale-out action, scale-in action
41. Trigger & Action for scaling the Nova-Compute
Item List
Item1  Total CPUs
Item2  Total vCPUs
Trigger List
Trigger    Expression                                                     Value
Scale-out  Total vCPUs.ave(60) > Total CPUs                               True: PROBLEM, False: OK
Scale-in   Total vCPUs.ave(180) < Total CPUs - number of CPUs per server  True: PROBLEM, False: OK
Action List
Action     Value Status  Operation
Scale-out  PROBLEM       Execute "euca-run-instances~" command to Nova-api
Scale-in   PROBLEM       Execute "euca-terminate-instances~" command to Nova-api
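Note (illustrative sketch, not from the talk): the two trigger expressions can be evaluated with plain averages. The function names are hypothetical, and reading ave(60) and ave(180) as averages over the samples of the last 60 and 180 seconds is an assumption.

```python
def average(samples):
    """Arithmetic mean of a non-empty list of samples."""
    return sum(samples) / len(samples)


def scale_out(vcpus_60s, total_cpus):
    # Scale-out trigger: Total vCPUs.ave(60) > Total CPUs
    return average(vcpus_60s) > total_cpus


def scale_in(vcpus_180s, total_cpus, cpus_per_server):
    # Scale-in trigger: Total vCPUs.ave(180) < Total CPUs - number of CPUs per server
    return average(vcpus_180s) < total_cpus - cpus_per_server


def decide(vcpus_60s, vcpus_180s, total_cpus, cpus_per_server):
    """Return which action would fire, or 'ok' when neither trigger is in PROBLEM."""
    if scale_out(vcpus_60s, total_cpus):
        return "scale-out"   # would run euca-run-instances against Nova-api
    if scale_in(vcpus_180s, total_cpus, cpus_per_server):
        return "scale-in"    # would run euca-terminate-instances against Nova-api
    return "ok"
```

The longer window on the scale-in side makes removal of a Nova-Compute slower to fire than addition, which matches the 60 versus 180 values in the trigger table.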
43. Bare-metal codes for submission
o Updated scheduler and compute for multiple bare-metal
capabilities
Ø https://review.openstack.org/13920
o Added separate bare-metal MySQL DB
Ø https://review.openstack.org/10726
o A script for bare-metal node management
Ø https://review.openstack.org/#/c/11366/
o Updated bare-metal provisioning framework
Ø https://review.openstack.org/11354
o Added PXE back-end bare-metal
Ø https://review.openstack.org/11088
o Added bare-metal host manager
Ø https://review.openstack.org/11357
44. Bare-metal docs
OpenStack Wiki
• http://wiki.openstack.org/GeneralBareMetalProvisioningFramework
OpenStack Source
• nova/virt/baremetal/docs/*.rst
• README and installation documents
The Latest Github branch
• https://github.com/NTTdocomo-openstack/nova/
45. Bare-metal provisioning interests
o Contact: USC/ISI & NTT Docomo
o Interested companies: collaboration / testing
o Tuesday @4:30-5:10pm [Emma AB]: summit session, Design & Implementation meetup