The XenServer virtualization platform is used by well over 100,000 organizations to meet their IT objectives. Common scenarios include traditional server virtualization comparable to VMware vSphere, delivery of large-scale cloud services via Apache CloudStack or OpenStack, and high-performance desktop virtualization through XenDesktop. All of these use cases have scale and manageability requirements that demand a solid deployment.
The content in this deck was presented in workshop form at FOSSETCON in 2015. Much of the information applies to any XenServer version, but XenServer 6.5 was the version covered. The audience was assumed to have some familiarity with virtualization concepts, but no assumptions about XenServer were made. Core concepts covered include storage design, network design and operations, scalability and failure domains, and core features such as virtualized graphics.
2. #whoami
Name: Tim Mackey
Current roles: XenServer Community Manager and Evangelist; occasional coder
Cool things I’ve done
• Designed laser communication systems
• Early designer of retail self-checkout machines
• Embedded special relativity algorithms into industrial control system
Find me
• Twitter: @XenServerArmy
• SlideShare: slideshare.net/TimMackey
• LinkedIn: www.linkedin.com/in/mackeytim
• Github: github.com/xenserverarmy
3. We’re following “MasterClass Format”
Admins matter
• No sales pitch
• No cost
• Just the facts man
Interactive
• Ask questions; the harder the better
• Get what you need to be successful
5. What is a “XenServer”?
Packaged Linux distribution for virtualization
• All software required in a single ISO
Designed to behave as an appliance
• Managed via SDK, CLI, UI
Not intended to be a toolkit
• Customization requires special attention
Open Source
• Open source roots
• Acquired by Citrix in 2007
• Made open source in 2013 (xenserver.org)
6. XenServer market dynamic
Millions of Downloads
Over 1 million servers deployed
Optimized for XenDesktop
Powering NetScaler SDX
Supporting Hyper-Dense Clouds
7. Why XenServer?
Broad provisioning support
• Apache CloudStack
• Citrix CloudPlatform and XenDesktop
• OpenStack
• Microsoft System Center
• VMware vCloud
Full type-1 hypervisor
• Strong VM isolation
• Intel TXT support for trusted boot
Designed for scale
• 1000 VMs per host
• Over 140 Gbps throughput in NetScaler SDX
• Up to 96 shared hardware GPU instances per host
10. Simplified XenServer architecture
Diagram: the Xen Project hypervisor sits on the hardware and schedules compute, networking and storage. A standard Linux distribution (dom0) acts as the control domain and hosts qemu, the hardware drivers and the XenAPI toolstack (xapi). Each guest runs paravirtualized front-end drivers that connect to back-end drivers in dom0.
11. dom0 in detail (XenServer 6.5)
3.10+ kernel.org kernel with a CentOS 5.10 distribution
Kernel-space
• netback, blkback, blktap3
• Hardware drivers
User-space
• XenAPI (xapi), SM, xha, xenopsd
• squeezed, alertd, multipathd, perfmon
• stunnel, xenstored, networkd
• ovs-vswitchd, qemu-dm, Likewise
15. XenServer Pool: Live Storage XenMotion
Migrates VM disks from any storage type to any other storage type
• Local, DAS, iSCSI, FC
Supports cross-pool migration
• Requires compatible CPUs
Encrypted migration model
Specify the management interface used for migration for optimal performance
(Diagram: a live virtual machine and its VDIs moving between XenServer hosts; CLI sketch below.)
More about Storage XenMotion
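A minimal CLI sketch of a cross-pool storage live migration as exposed by the xe CLI in the 6.x series; the address, credentials and UUIDs are placeholders and the parameter names should be verified against your XenServer version:
# Live-migrate a running VM and move its disks to an SR in another pool
xe vm-migrate vm=<vm name> live=true \
    remote-master=<destination pool master IP> remote-username=root remote-password=<password> \
    destination-sr-uuid=<destination SR uuid>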
16. Migration vs. Storage Migration
XenMotion (memory only)
• Start VM migration
• Copy the VM’s RAM to the destination host
• Copy the VM’s RAM delta; repeat until no delta is left
• End VM migration: the VM now uses its hard disk from the destination host
Storage XenMotion (disks plus memory)
• Start storage VM migration
• Snapshot the VM’s first/next disk and transfer the snapshot disks to the destination
• Mirror all write activity after the snapshot to the destination host
• If the transfer is finished, repeat until no disk is left to copy
• With all disks mirroring to the destination host, perform a “normal” XenMotion
• End storage VM migration
18. High Availability in XenServer
Automatically monitors hosts and VMs
Easily configured within XenCenter
Relies on Shared Storage
• iSCSI, NFS, HBA
Reports failure capacity for DR planning
purposes
More about HA
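For reference, a minimal sketch of enabling HA from the CLI rather than XenCenter; the UUIDs and failure count are placeholders:
# Enable HA using a shared SR for the heartbeat/statefile, then set the failure capacity
xe pool-ha-enable heartbeat-sr-uuids=<shared SR uuid>
xe pool-param-set uuid=<pool uuid> ha-host-failures-to-tolerate=1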
19. Taking advantage of GPUs
NVIDIA
• vGPU with NVIDIA GRID providing 96 GPU instances
• GPU pass-through
• CUDA support on Linux
• Uses NVIDIA drivers for capability
Intel
• GVT-d support with Haswell and newer
• No extra hardware!!
• Uses standard Intel drivers
AMD
• GPU pass-through
More about GPU
20. Distributed Virtual Network Switching
Virtual Switch
• Open source: www.openvswitch.org
• Provides a rich layer 2 feature set
• Cross host private networks
• Rich traffic monitoring options
• ovs 1.4 compliant
DVS Controller
• Virtual appliance
• Web-based GUI
• Can manage multiple pools
• Can exist within pool it manages
• Note: Controller is deprecated, but supported
22. Host requirements
Validate Hardware Compatibility List (HCL)
• http://hcl.xenserver.org
• Component’s firmware version could be important
BIOS configuration
• VT extensions enabled
• EFI profiles disabled
Limits
• Up to 1TB RAM
• Up to 160 pCPUs
• Up to 16 physical NICs
• Up to 16 hosts per cluster
23. Network topologies
Management networks
• Handle pool configuration and storage traffic
• Require default VLAN configuration
• IPv4 only
VM networks
• Handle guest traffic
• IPv4 and IPv6
• Can assign VLAN and QoS
• Can define ACL and mirroring policy
• Should be separated from mgmt networks
All networks in the pool must match (VLAN creation example below)
More about network design
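As an illustration of a tagged VM network, a short xe sketch; the VLAN number and names are examples only:
# Create a VM network and bind it to VLAN 101 on a physical NIC's PIF
xe network-create name-label="VM traffic VLAN 101"
xe vlan-create network-uuid=<new network uuid> pif-uuid=<physical NIC PIF uuid> vlan=101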
24. Storage topologies
Local storage
• Yes: SAS, SATA, RAID, DAS
• No: USB, Flash, SW RAID
• Supports live migration
Shared Storage
• iSCSI Unipath/Multipath, NFSv3
• HBA – Check HCL
• Supports live migration
Cloud Storage
• Only if presented as iSCSI/NFS
ISO storage
• CIFS/NFSv3
More about storage design
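As a concrete example of shared storage, a sketch of creating a pool-wide NFSv3 SR; the server address and export path are placeholders:
# Create a shared NFS SR; VDIs on it are thin-provisioned VHD files
xe sr-create type=nfs shared=true content-type=user name-label="NFS VM storage" \
    device-config:server=<NFS server IP> device-config:serverpath=/export/xenserver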
26. Installation options
Boot from DVD/USB
• Intended for low volume
• ISO media on device
• Install from local/NFS/HTTP/FTP
Boot from PXE
• For scale deployments
• Install from NFS/HTTP/FTP
• Post installation script capabilities
Boot from SAN/iSCSI
• Diskless option
27. Driver disks
Shipped as supplemental packs
• Often updated when kernel is patched
• Option to specify during manual install
Network drivers
• Slipstream into XenServer installer
• Modify XS-REPOSITORY-LIST
Storage drivers
• Add to unattend.xml
• <driver-source type="url">ftp://192.168.1.1/ftp/xs62/driver.qlcnic</driver-source>
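For context, an illustrative (and unverified) fragment of an unattended-install answer file showing where such a <driver-source> entry sits; the element names follow the XenServer installer's answer-file format and should be checked against the installation guide for your version:
<installation>
  <primary-disk>sda</primary-disk>
  <source type="url">http://192.168.1.1/xs65/</source>
  <driver-source type="url">ftp://192.168.1.1/ftp/xs62/driver.qlcnic</driver-source>
  <root-password>changeme</root-password>
  <post-install-script type="url">http://192.168.1.1/scripts/firstboot.sh</post-install-script>
</installation>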
29. Types of updates
New version
• Delivered as ISO installer
• Requires host reboot
Feature Pack
• Typically delivered as ISO installer
• Typically requires host reboot
Hotfix
• Delivered as .xsupdate file
• Applied via CLI/XenCenter
• May require host reboot
• Subscribe to KB updates
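A minimal sketch of applying a hotfix from the CLI; the file name is illustrative:
# Upload the hotfix (returns its UUID), apply it pool-wide, then verify
xe patch-upload file-name=XS65ESP1.xsupdate
xe patch-pool-apply uuid=<patch uuid>
xe patch-list params=name-label,hosts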
30. Backup more than just your VMs
Local storage
• Always use RAID controller with battery backup to reduce risk of corruption
dom0 (post install or reconfiguration)
• xe host-backup file-name=<filename> -h <hostname> -u root -pw <password>
Pool metadata (weekly – or when pool structure changes)
• xe pool-dump-database file-name=<NFS backup>
VM to infrastructure relationships (daily or as VMs created/destroyed)
• xe-backup-metadata -c -i -u <SR UUID for backup>
LVM metadata (weekly)
• /etc/lvm/backup
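The matching restore commands, sketched under the assumption that the backups above exist; dry-run the database restore first:
# Validate and restore the pool database, and restore a dom0 backup
xe pool-restore-database file-name=<NFS backup> dry-run=true
xe pool-restore-database file-name=<NFS backup>
xe host-restore file-name=<filename> -h <hostname> -u root -pw <password>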
31. XenServer host upgrade
XenServer uses two 4GB disk partitions; the remaining disk space is local storage.
1. Initial installation: XenServer 6.2 is installed on the 1st partition; the 2nd (backup) partition is empty.
2. Upgrade: the XenServer 6.5 install media backs up the existing installation to the 2nd partition, then installs version 6.5 on the 1st partition.
3. Result: XenServer 6.5 runs from the 1st partition, with XenServer 6.2 kept as a backup on the 2nd partition. The local storage space is untouched.
32. Pool upgrade process
1. Place the host into maintenance mode and evacuate its virtual machines
2. Upgrade the host to the new version
3. Place the host back into normal operation, then proceed with the next host
33. 3rd party components
dom0 is tuned for XenServer usage
• yum is intentionally disabled
• Avoid installing new packages into dom0
• Performance/scalability/stability uncertain
Updates preserve XenServer config only!
• Unknown drivers will not be preserved
• Unknown packages will be removed
• Manual configuration changes may be lost
Citrix Ready Marketplace has validated components
34. Exchange SSL certificate on XenServer
• By default, XenServer uses a self-signed certificate, created during installation, to encrypt XAPI/HTTPS communication.
• To trust this certificate, verify its fingerprint against the one shown on the host’s physical console (xsconsole status display).
• The certificate can also be exchanged for a certificate issued by a trusted corporate certificate authority.
Workflow: request a certificate and key from the company CA, have the CA issue them, convert them to PEM format, upload them to /etc/xensource on the XenServer host, and replace xapi-ssl.pem (command sketch below).
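One possible command sequence for the replacement step, assuming the CA delivers either a PKCS#12 bundle or a PEM key/certificate pair; the file names are illustrative and the exact procedure should be checked for your version:
# Convert to PEM (pick whichever matches what the CA issued)
openssl pkcs12 -in host-cert.p12 -out xapi-ssl.pem -nodes
cat host.key host.crt > xapi-ssl.pem
# Install the new certificate and restart the toolstack so xapi picks it up
cp xapi-ssl.pem /etc/xensource/xapi-ssl.pem
xe-toolstack-restart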
36. Configuration Maximums: XenServer 6.2 vs 6.5
Per-VM scalability limits (more is better)
• vCPUs per VM: XS 6.2: 16 (Windows), XS 6.5: 16 (Windows) / 32 (Linux)
• RAM per VM: XS 6.2: 128 GB, XS 6.5: 192 GB
Per-host scalability limits (more is better)
• pCPUs per host: XS 6.2: 160, XS 6.5: 160
• RAM per host: XS 6.2: 1 TB, XS 6.5: 1 TB
• Running VMs per host: XS 6.2: 500, XS 6.5: 1,000
• VBDs per host: XS 6.2: 512, XS 6.5: 2,048
• Multipathed LUNs per host: XS 6.2: 150, XS 6.5: 256
37. Highlights of XenServer 6.5 Performance Improvements
Bootstorm (lower is better)
• Data transferred: XS 6.2: 18.0 GB, XS 6.5: 0.7 GB (-96%)
• Duration: XS 6.2: 470 s, XS 6.5: 140 s (-70%)
• Booting a large number of VMs is significantly quicker in XS 6.5 due to the read-caching feature, which also significantly reduces the IOPS hitting the storage array (NFS in the test configuration) when VMs share a common base image.
Aggregate network throughput (higher is better)
• XS 6.2: 3 Gb/s, XS 6.5: 25 Gb/s (+700%)
• XS 6.5 brings many improvements relating to network throughput. For example, the capacity for a large number of VMs to send or receive data at high throughput has been significantly improved.
Aggregate storage throughput (higher is better)
• Read: XS 6.2: 2.2 GB/s, XS 6.5: 9.9 GB/s (+350%)
• Write: XS 6.2: 2.8 GB/s, XS 6.5: 7.8 GB/s (+175%)
• The new, optimized storage datapath in XS 6.5 lets aggregate throughput scale much better with a large number of VMs, allowing them to sustain I/O at a significantly higher rate for both reads and writes.
vGPU scalability (higher is better)
• VMs sharing a GPU: XS 6.2: 64, XS 6.5: 96 (+50%)
• The number of VMs that can share a GPU has increased in XS 6.5, reducing TCO for deployments using vGPU-enabled VMs.
Measurements were taken on various hardware in representative configurations. Measurements made on other hardware or in other configurations may differ.
38. 64 bit control domain improves overall scalability
In XenServer 6.2:
• dom0 was 32-bit so had 1GB of ‘low memory’
• Each running VM ate about 1 MB of dom0’s low memory
• Depending on what devices you had in the host, you would exhaust dom0’s low memory with a
few hundred VMs
In Creedence:
• dom0 is 64-bit so has a practically unlimited supply of low memory
• There is no longer any chance of running out of low memory
• Performance will not degrade with larger dom0 memory allocations
39. Limits on number of VMs per host
Scenario 1: HVM guests, each having 1 vCPU, 1 VBD, 1 VIF, and having PV drivers
Limitation XenServer 6.1 XenServer 6.2 Creedence
dom0 event channels 225 800 no limit
tapdisk minor numbers 1024 2048 2048
aio requests 1105 2608 2608
dom0 grant references 372 no limit no limit
xenstored connections 333 500 1000
consoled connections no limit no limit no limit
dom0 low memory 650 650 no limit
Overall limit 225 500 1000
40. Limits on number of VMs per host
Scenario 2: HVM guests, each having 1 vCPU, 3 VBDs, 1 VIF, and having PV drivers
Limitation XenServer 6.1 XenServer 6.2 Creedence
dom0 event channels 150 570 no limit
tapdisk minor numbers 341 682 682
aio requests 368 869 869
dom0 grant references 372 no limit no limit
xenstored connections 333 500 1000
consoled connections no limit no limit no limit
dom0 low memory 650 650 no limit
Overall limit 150 500 682
41. Limits on number of VMs per host
Scenario 3: PV guests each having 1 vCPU, 1 VBD, 1 VIF
Limitation XenServer 6.1 XenServer 6.2 Creedence
dom0 event channels 225 1000 no limit
tapdisk minor numbers 1024 2048 2048
aio requests 1105 2608 2608
dom0 grant references no limit no limit no limit
xenstored connections no limit no limit no limit
consoled connections 341 no limit no limit
dom0 low memory 650 650 no limit
Overall limit 225 650 2048
42. Netback thread-per-VIF model improves fairness
Improves fairness and reduces interference from other VMs
• XenServer 6.2: a small, fixed set of netback threads in dom0 is shared across all VM VIFs, so a busy VM can slow its neighbours
• Creedence: each VIF gets its own netback thread, so network processing is scheduled fairly per VM
43. OVS 2.1 support for ‘megaflows’ helps when you have many flows
The OVS kernel module can only cache a certain number of flow rules
If a flow isn’t found in the kernel cache then the ovs-vswitchd userspace process
is consulted
• This adds latency and can lead to a severe CPU contention bottleneck when there are many
flows on a host
OVS 2.1 has support for ‘megaflows’
• This allows the kernel to cache substantially more flow rules
45. Visibility into Docker Containers
Containers
• Great for application packaging
• Extensive tools for deployment
Virtualization
• Total process isolation
• Complete control
Docker and XenServer
• View container details
• Manage container life span
• Integrated in XenCenter
50. Remote workstations: 1:1 GPU pass-through
Dedicated NVIDIA/AMD/Intel GPU per user/VM, with direct GPU access from the guest VM
• Responsiveness: the VM has direct access to the GPU and includes NVIDIA fast remoting technology
• App performance: full API support including the latest OpenGL, DirectX and CUDA; includes application certifications
• Density: limited by the number of GPUs in the server
• VM portability: cannot migrate the VM to any other node
(Diagram: each virtual machine runs a guest OS with the native GPU driver, virtual desktop apps and a remote protocol, attached to its own GPU beneath the hypervisor.)
51. pgpu, vgpu and gpu-group objects
XenServer automatically creates gpu-group, pgpu and vgpu-type objects for the physical GPUs it discovers on startup
• Example: four GRID K1 GPUs (5:0.0, 6:0.0, 7:0.0, 8:0.0) become pgpus in a “GRID K1” gpu-group, and four GRID K2 GPUs (11:0.0, 12:0.0, 85:0.0, 86:0.0) become pgpus in a “GRID K2” gpu-group
• Each gpu-group has an allocation policy (default: depth-first)
• vgpu-type objects describe the supported profiles (e.g. GRID K100, K120Q, K140Q, K160Q, K180Q, K200, K220Q, K240Q, K260Q, K280Q)
The user creates vgpu objects:
• owned by a specific VM
• associated with a gpu-group
• with a specific vgpu-type
At VM boot, XenServer picks an available pgpu in the group to host the vgpu (for example a GRID K100 vgpu placed on K1 8:0.0, or a GRID K260Q vgpu placed on K2 86:0.0); a CLI sketch follows below.
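The CLI sketch referenced above; the UUIDs are placeholders:
# List the vgpu-types XenServer discovered, then attach a vGPU of one type to a VM
xe vgpu-type-list
xe vgpu-create vm-uuid=<vm uuid> gpu-group-uuid=<gpu-group uuid> vgpu-type-uuid=<vgpu-type uuid>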
52. How GPU Pass-through Works
Identical GPUs in a host auto-create a GPU
group
The GPU Group can be assigned to set of VMs –
each VM will attach to a GPU at VM boottime
When all GPUs in a group are in use, additional
VMs requiring GPUs will not start
GPU and non-GPU VMs can (and should) be
mixed on a host
GPU groups are recognized within a pool
• If Server 1, 2, 3 each have GPU type 1, then VMs
requiring GPU type 1 can be started on any of those
servers
53. Limitations of GPU Pass-through
GPU pass-through binds the VM to the host for the duration of the session
• Restricts XenMotion
Multiple GPU types can exist in a single server
• e.g. high-performance and mid-performance GPUs
VNC console is disabled, so RDP (or similar) is required for access
Fully supported for XenDesktop; best effort for other Windows workloads
HCL is very important
54. NVIDIA GRID architecture: hardware-virtualized GPU
A GRID K1/K2 GPU is partitioned into multiple virtual GPUs; the NVIDIA GRID Virtual GPU Manager running with XenServer handles physical GPU management and state
• Each virtual machine runs a guest OS with the NVIDIA driver, virtual desktop apps and a remote protocol, and is assigned its own virtual GPU
• Responsiveness: the VM has direct access to the GPU and includes NVIDIA fast remoting technology
• App performance: full API support including the latest OpenGL and DirectX; includes application certifications
• Density: limited by the number of virtual GPUs in the system
• VM portability: cannot migrate the VM to any other node
55. Overview of vGPU on XenServer
GRID vGPU enables multiple VMs to share a single physical GPU (GRID K1 or K2)
VMs run an NVIDIA driver stack and get direct access to the GPU
• Supports the same graphics APIs as physical GPUs (DX9/10/11, OGL 4.x)
The NVIDIA GRID Virtual GPU Manager for XenServer runs in dom0, alongside the NVIDIA kernel driver
• Graphics fast path: direct GPU access from guest VMs via per-VM dedicated channels and framebuffer regions, with shared access to the GPU engines
• Hypervisor control interface: host channel registers, framebuffer regions, display, etc.
• Management interface used by Citrix XenDesktop
• An NVIDIA paravirtualized interface connects the guest driver to the GRID Virtual GPU Manager
56. NVIDIA vGPU resource sharing (GRID GPU-enabled server)
The GRID Virtual GPU Manager in XenServer dom0 performs timeshared scheduling of the GPU engines (3D, CE, NVENC, NVDEC) across VMs
Framebuffer
• Each VM gets its own framebuffer region, allocated at VM startup
Channels
• Used to post work to the GPU
• A VM accesses its channels via the GPU Base Address Register (BAR); isolation is enforced by the CPU’s Memory Management Unit (MMU)
GPU engines
• Timeshared among VMs, like process context switching on a single OS
61. Protecting Workloads
Not just for mission critical
applications anymore
Helps manage VM density issues
"Virtual" definition of HA a little
different than physical
Low cost / complexity option to restart
machines in case of failure
62. High Availability Operation
Pool-wide settings
Failure capacity: the number of host failures the pool can tolerate while still carrying out the HA plan
Uses network and storage heartbeat
to verify servers
63. VM Protection Options
Restart Priority
• Do not restart
• Restart if possible
• Restart
Start Order
• Defines a sequence and delay to ensure applications run correctly
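Both options map to VM parameters that can be set from the CLI; a short sketch with example values:
# Restart this VM on host failure, start it first, and wait 30 s before the next VM in the order
xe vm-param-set uuid=<vm uuid> ha-restart-priority=restart order=1 start-delay=30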
64. HA Design – Hot Spares
Simple Design
• Similar to hot spare in disk array
• Guaranteed available
• Inefficient: idle resources
Failure Planning
• If surviving hosts are fully loaded – VMs will be forced to start on spare
• Could lead to restart delays due to resource plugs
• Could lead to performance issues if spare is pool master
65. HA Design – Distributed Capacity
Efficient Design
• All hosts utilized
Failure Planning
• Impacted VMs automatically placed for best fit
• Running VMs undisturbed
• Provides efficient guaranteed availability
66. HA Design – Impact of Dynamic Memory
Enhances Failure Planning
• Define reduced memory which meets SLA
• On restart, some VMs may “squeeze” their memory
• Increases host efficiency
67. High Availability – No Excuses
Shared storage is the hardest part of setup
• Simple wizard can have HA defined in minutes
• Minimally invasive technology
Protects your important workloads
• Reduce on-call support incidents
• Addresses VM density risks
• No performance, workload, configuration penalties
Compatible with resilient application designs
Fault tolerant options exist through ecosystem
69. VHD Benefits
Many SRs implement VDIs as VHD trees
• VHDs are a copy-on-write format for storing virtual disks
• VDIs are the leaves of VHD trees
Interesting VDI operation: snapshot (implemented as VHD “cloning”)
• Example: snapshotting original VDI A produces a read-only parent holding A’s existing data, with A (read-write) and snapshot VDI B (read-only) as its children
70. Storage XenMotion (source and destination)
• “A” represents the VHD of a VM on the source
• The VHD structure (not the contents) of “A” is duplicated on the destination as an empty VHD
72. Storage XenMotion (mirroring)
• A new active child VHD “B” is created above “A” on both the source and the destination
• VM writes are now synchronous to both the source and destination active child VHDs
• The parent VHD “A” (now read-only) is background-copied to the destination
73. Storage XenMotion (completion)
• Once the parent VHD is copied, the VM itself is moved using XenMotion
• The synchronous writes continue until the XenMotion is complete
75. Benefits of VDI Mirroring
Optimization: start with most similar VDI
• Another VDI with the least number of different blocks
• Only transfer blocks that are different
New VDI field: Content ID for each VDI
• Easy way to confirm that different VDIs have identical content
• Preserved across VDI copy, refreshed after VDI attached RW
Worst case is a full copy (common in server virtualization)
Best case occurs when you use VM “gold images” (e.g. with CloudStack)
84. Storage Networks
Use an independent management network (dedicated interface) for storage traffic
• Supports iSCSI multipath
• Bond for redundancy; multipath as best practice (bond creation sketch below)
• Best practice to enable jumbo frames
• Must be consistent across pool members
• 802.3ad LACP provides limited benefit (hash-based distribution)
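The bond creation sketch mentioned above; the PIF UUIDs are placeholders, and LACP also needs matching switch-side configuration:
# Bond two storage NICs on the storage network
xe bond-create network-uuid=<storage network uuid> pif-uuids=<pif1 uuid>,<pif2 uuid> mode=lacp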
85. Guest VM Networks
Single server private network
• No off host access
• Can be used by multiple VMs
External network
• Off host network with 802.1q tagged traffic
• Multiple VLANs can share physical NIC
• Physical switch port must be trunked
Cross host private network
• Off host network with GRE tunnel
• Requires DVSC or Apache CloudStack controller
88. Thick provisioning
With thick provisioning, disk space is allocated
statically.
As virtual machines are created, their virtual disks
utilize the entire available disk size on the physical
storage.
This can result in a large amount of unused
allocated disk space.
A virtual machine created using a 75 GB virtual disk
would consume the entire 75 GB of physical
storage disk space, even if it only requires a quarter
of that.
Thick Provisioning
75 GB Disk
Space Required
75 GB
Allocated, but
unused space
50 GB
Actually Used
25 GB
89. Thin Provisioning
With thin provisioning, disk space is allocated on an “as-needed” basis. As virtual machines are created, their virtual disks are created using only the amount of storage required at that time. Additional disk space is allocated automatically once a virtual machine requires it; the unused storage space remains available for use by other virtual machines.
Example: a virtual machine created with a 75 GB virtual disk that only uses 25 GB consumes only 25 GB of space on the physical storage.
• Space required: 25 GB
• Actually used: 25 GB
• Free space for allocation: 50 GB
90. Thin Provisioning: Sparse Allocation
Sparse allocation is used with thin provisioning. As virtual machines are created, their virtual disks are created using only the amount of storage required at that time. Additional disk space is allocated automatically once a virtual machine requires it; however, if the guest OS allocates blocks at the end of the disk, the intermediate blocks also become allocated.
Example: a virtual machine created with a 75 GB virtual disk that uses 35 GB in two separated blocks (25 GB plus 10 GB) could consume anywhere between 35 GB and 75 GB of space on the physical storage.
• Space required: up to 75 GB
• Actually used: 35 GB (25 GB plus a 10 GB block)
• Allocated, but unused: up to 40 GB
91. XenServer Disk Layouts (Local)
Both layouts reserve a 4 GB dom0 partition and a backup partition on the local disk; the rest backs the storage repository.
Default layout (LVHD)
• The local disk holds an LVM volume group; each virtual machine disk is an LVHD logical volume (thick provisioned)
EXT-based layout
• The local disk holds an EXT file system; each virtual machine disk is a VHD file (thin provisioned), consisting of a VHD header plus the OS partition and file system
92. XenServer Disk Layouts (Shared)
Native iSCSI and Fibre Channel (SAN “raw” disk)
• The LUN holds an LVM volume group; each virtual machine disk is an LVHD logical volume (thick provisioned)
NFS-based storage (NAS volume)
• The NFS share holds one VHD file per virtual disk (xxx.vhd, yyy.vhd, zzz.vhd) in the storage repository
93. Management and Monitoring
Fibre Channel LUN Zoning
Since enterprise SANs consolidate data from multiple servers and operating systems, many types of traffic and data are sent through the fabric or network. With Fibre Channel, to ensure security and dedicated resources, an administrator creates zones and zone sets to restrict access to specified areas. A zone divides the fabric into groups of devices; zone sets are groups of zones, and each zone set represents a different configuration that optimizes the fabric for certain functions.
• WWN: each HBA has a unique World Wide Name (similar to an Ethernet MAC address)
• Node WWN (WWNN): can be shared by some or all ports of a device
• Port WWN (WWPN): necessarily unique to each port
94. Fibre Channel LUN Zoning: FC switch example
Two XenServer pools share a storage array through an FC switch
• Pool1 (hosts Xen1, Xen2): initiator group “Xen1, Xen2” is granted access to LUN0 and LUN1; Zone1 contains the Xen1 WWN, Xen2 WWN and the storage WWN
• Pool2 (host Xen3): initiator group “Xen3” is granted access to LUN2; Zone2 contains the Xen3 WWN and the storage WWN
95. Management and Monitoring
iSCSI Isolation
With iSCSI storage, a similar concept of isolation to Fibre Channel zoning can be achieved by using IP subnets and, if required, VLANs.
• IQN: each storage interface (NIC or iSCSI HBA) is configured with a unique iSCSI Qualified Name
• Target IQN: typically associated with the storage provider interface
• Initiator IQN: configured on the client (XenServer) side
• The IQN format is standardized: iqn.yyyy-mm.{reversed domain name} (e.g. iqn.2001-04.com.acme:storage.tape.sys1.xyz)
(sr-probe sketch below)
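A sketch of using sr-probe to discover iSCSI targets and LUNs before creating the SR; the probe output lists the available IQNs and SCSI IDs:
# First call lists target IQNs; adding the IQN lists the LUNs/SCSI IDs behind it
xe sr-probe type=lvmoiscsi device-config:target=<storage IP>
xe sr-probe type=lvmoiscsi device-config:target=<storage IP> device-config:targetIQN=<target IQN>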
97. Storage multipathing
• Routes storage traffic over multiple physical paths
• Used for redundancy and increased throughput
• Unique logical networks are required for each path
• Available for Fibre Channel and iSCSI
• Uses round-robin load balancing (active-active)
Example topology: the XenServer host (192.168.1.202 / 192.168.2.202) reaches storage controller 1 (192.168.1.200 / 192.168.2.200) and storage controller 2 (192.168.1.201 / 192.168.2.201) over two separate networks. (CLI sketch below.)
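A sketch of enabling multipathing per host from the CLI; the host should be in maintenance mode with its SRs unplugged when changing this:
# Enable dm-multipath handling on the host
xe host-param-set uuid=<host uuid> other-config:multipathing=true
xe host-param-set uuid=<host uuid> other-config:multipathhandle=dmp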
98. Understanding dom0 storage
dom0 isn’t general purpose Linux
• Don’t manage storage locally
• Don’t use software RAID
• Don’t mount extra volumes
• Don’t use dom0 storage as “scratch”
Local storage is automatically an SR
Adding additional local storage
• xe sr-create host-uuid=<host uuid> content-type=user name-label="Disk2" device-config:device=/dev/sdb type=ext
Spanning multiple local storage drives
• xe sr-create host-uuid=<host uuid> content-type=user name-label="Group1" device-config:device=/dev/sdb,/dev/sdc type=ext
100. Snapshot Behavior Varies By
The type of SR in use
• LVM-based SRs use “volume-based” VHD
• NFS and ext SRs use “file-based” VHDs
• Native SRs use capabilities of array
Provisioning type
• Volume-based VHDs are always thick provisioned
• File-based VHDs are always thin provisioned
For LVM-based SR types
• If SR/VM/VDI created in previous XS version, VDIs (volumes) will be RAW
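For the SR types above, a snapshot can be taken and inspected from the CLI; the names are examples:
# Snapshot a VM and list its snapshots
xe vm-snapshot vm=<vm name> new-name-label=pre-upgrade
xe snapshot-list snapshot-of=<vm uuid> params=uuid,name-label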
101. Snapshot (NFS and EXT Local Storage)
Resulting VDI tree and disk utilization
• VHD files are thin provisioned
• VDI A (the parent, 40 GB virtual size) contains the 20 GB of writes made up to the point of snapshot
• VDI B and VDI C (the snapshot and the active clone) are 40 GB virtual size but empty*
• Totals: VDI A: 20 GB, VDI B: 0*, VDI C: 0*
• The snapshot requires no additional space*
* Plus VHD headers
102. Snapshot (Local LVHD, iSCSI or FC SR)
Resulting VDI tree and disk utilization
• Volumes are thick provisioned, but deflated where possible
• VDI A (the read-only parent, 40 GB virtual size) contains the 20 GB written up to the point of snapshot and is deflated to that size
• The active VDI B is fully inflated to 40 GB*; the snapshot VDI C is empty and deflated*
• Totals: VDI A: 20 GB, VDI B: 40 GB*, VDI C: 0*
• The snapshot therefore requires 40 + 20 GB of space
* Plus VHD headers
103. Automated Coalescing Example
1) A VM with two snapshots, C and E: parent A has children C (snapshot) and B; B has children D (active VDI) and E (snapshot)
2) When snapshot C is deleted, A is left with B as its only child
3) Parent B is therefore no longer required and is coalesced into A (A + B), with D and E re-parented to the merged VDI
http://support.citrix.com/article/CTX122978
104. Suspend VM / Checkpoints
Suspend and snapshot checkpoints store the VM’s memory content on storage
The storage selection process
• The SR specified in the pool parameter suspend-image-SR is used
• By default, suspend-image-SR (pool level) is the default storage repository
• If no suspend-image-SR is set at pool level (e.g. no default SR), XenServer falls back to the local SR of the host running the VM
Size of the suspend image is roughly 1.5 x the VM’s memory size
Best practice: configure a shared SR as the suspend image store
• xe pool-param-set uuid=<pool uuid> suspend-image-SR=<shared sr uuid>
105. Snapshot storage utilization
LVM-based VHD (60 GB VDI)
• Read-only parent image (VHD) plus two read-write child images (VHD)
• Each read-write child is inflated to the full 60 GB
File-based VHD (60 GB VDI, 50% allocated: 30 GB of data used)
• Read-only parent image (VHD) holds the 30 GB of data written before cloning
• Each read-write child image (VHD) is thin: its size equals the data written to disk since cloning
107. Integrated Site Recovery
Supports LVM-based SRs only
Replication/mirroring setup is outside the scope of the solution
• Follow the storage vendor’s instructions
• Breaking the replication/mirror is also a manual step
Works with all iSCSI and FC arrays on the HCL
Supports active-active DR
108. Feature Set
Integrated in XenServer and XenCenter
Supports failover and failback
Supports grouping and startup order through vApp functionality
Failover pre-checks
• Power state of the source VM
• Duplicate VMs on the target pool
• SR connectivity
Ability to start VMs paused (e.g. for a dry run)
109. How it Works
Depends on “Portable SR” technology
• Different from the metadata backup/restore functionality
Creates a logical volume on the SR during setup
The logical volume contains
• SR metadata
• VDI metadata for all VDIs stored on the SR
The metadata is read during failover via sr-probe