In today's public and private cloud markets, availability is a critical metric for all cloud service providers. COLO is an ideal application-agnostic solution for non-stop service in the cloud. Our solution can protect user services even from physical network or power interruption, and the switchover is hard for users to perceive (TCP connections are not terminated). In COLO mode, the primary VM (PVM) and secondary VM (SVM) run in parallel. COLO achieves more than a tenfold performance improvement over previous solutions such as Remus. The COLO code has been merged into upstream QEMU, so COLO can be used without any additional patches. In this talk, we cover the COLO implementation in QEMU and Xen and the newly designed COLO-Proxy, discuss problems we met while developing COLO, and report the latest progress from Intel.
4. Non-Stop Service with VM Replication
Virtual Machine (VM) replication
A software solution for business continuity and disaster recovery
through application-agnostic hardware fault tolerance, achieved by
replicating the state of the primary VM (PVM) to a secondary VM (SVM)
on a different physical node.
6. What Is COLO?
Client and server (VM) model
The server (VM) and its clients form a networked request-response system
Clients only care about the responses they get from the VM
COarse-grain LOck-stepping VMs for Non-stop Service
(COLO)
The PVM and SVM execute in parallel
Compare the output packets from the PVM and SVM to detect state divergence
Synchronize the SVM state with the PVM only when their responses (network
packets) are not identical, i.e. when the PVM state no longer matches the SVM state
7. Scenario
Stateless services
For example:
UDP-based applications
Parts of online games, etc.
Stateful services
For example:
TCP-based applications
Live streaming, website services, etc.
COLO is not designed for stateless services but for critical stateful
services. Our solution can protect user services even from physical
network or power interruption.
8. Why COLO Is Better
Compared with continuous VM checkpointing
No buffering-introduced latency
Lower checkpoint frequency
On-demand vs. periodic
Compared with instruction-level lock-stepping
Eliminates the excessive overhead of non-deterministic instruction
execution caused by multi-processor guest memory access
9. COLO History
2013.10  COLO paper published at SoCC'13
2015.7   COLO framework merged into Xen
2018.10  All COLO-related patches merged into Qemu
Now      New features
10. Old COLO Architecture
COarse-grain LOck-stepping Virtual Machine for Non-stop Service
[Architecture diagram: primary node and secondary node, each running Xen with a Domain 0 that hosts Qemu (COLO Frame), Heartbeat, Block Replication, and a kernel-based COLO Proxy (packet comparison and mirroring on the primary; sequence-number and ACK adjustment on the secondary). The nodes are linked by an internal network; each node also connects to the external network for client requests/responses and to its own storage for disk I/O.]
11. New COLO Architecture
COarse-grain LOck-stepping Virtual Machine for Non-stop Service
[Architecture diagram: the same primary/secondary layout, but with the COLO Proxy moved inside Qemu next to the COLO Frame; Heartbeat, Block Replication, the internal network, the external network, and per-node storage are unchanged.]
12. COLO Architecture Difference
Re-developed COLO Proxy
The old COLO proxy was based on the kernel netfilter module and
required manually patching and rebuilding the Domain 0 kernel.
The new COLO proxy code lives in Qemu; no kernel modification is needed.
The code is more generic: the proxy is split into filter-redirector,
filter-mirror, filter-rewriter and colo-compare (see the command-line sketch below).
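As an illustration, a minimal primary-side sketch of that split, following the usage in Qemu/docs/colo-proxy.txt; the IP 3.3.3.3, the ports, the MAC and the IDs are placeholders:

    # Mirror guest TX to the secondary; feed PVM and SVM output into
    # colo-compare; release compared packets back toward the client.
    -netdev tap,id=hn0,vhost=off
    -device e1000,id=e0,netdev=hn0,mac=52:a4:00:12:78:66
    -chardev socket,id=mirror0,host=3.3.3.3,port=9003,server,nowait
    -chardev socket,id=compare1,host=3.3.3.3,port=9004,server,nowait
    -chardev socket,id=compare0,host=3.3.3.3,port=9001,server,nowait
    -chardev socket,id=compare0-0,host=3.3.3.3,port=9001
    -chardev socket,id=compare_out,host=3.3.3.3,port=9005,server,nowait
    -chardev socket,id=compare_out0,host=3.3.3.3,port=9005
    -object filter-mirror,id=m0,netdev=hn0,queue=tx,outdev=mirror0
    -object filter-redirector,id=redire0,netdev=hn0,queue=rx,indev=compare_out
    -object filter-redirector,id=redire1,netdev=hn0,queue=rx,outdev=compare0
    -object colo-compare,id=comp0,primary_in=compare0-0,secondary_in=compare1,outdev=compare_out0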
13. Block Replication (Storage Process)
Write
Pnode
Send the write request to the Snode
Write the request to local storage
Snode
Receive the PVM write request
Read the original data into the SVM cache, then write the
PVM request to storage (copy-on-write)
Write SVM write requests to the SVM cache
Read
Pnode
Read from storage
Snode
Read from the SVM cache, or from storage on an SVM cache miss
Checkpoint
Drop the SVM cache
Failover
Flush the SVM cache to storage
Based on Qemu's quorum, NBD, backup driver and backing files
Detail: Qemu/docs/block-replication.txt
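A minimal drive-configuration sketch of that stack, adapted from Qemu/docs/block-replication.txt; the file names and the if= values are placeholders:

    # Primary: a quorum node with fifo read pattern; an NBD client pointing
    # at the secondary's replication driver is attached at runtime.
    -drive if=virtio,driver=quorum,read-pattern=fifo,id=colo1,vote-threshold=1,\
           children.0.file.filename=primary.raw,\
           children.0.driver=raw

    # Secondary: the replication driver stacks an active disk and a hidden
    # disk (the SVM cache, both qcow2) on top of the exported backing disk.
    -drive if=none,id=colo1,driver=raw,file.filename=secondary.raw
    -drive if=virtio,id=top0,driver=replication,mode=secondary,top-id=top0,\
           file.driver=qcow2,file.file.filename=active_disk.qcow2,\
           file.backing.driver=qcow2,\
           file.backing.file.filename=hidden_disk.qcow2,\
           file.backing.backing=colo1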
[Diagram: the primary VM's disk I/O passes through Block Replication in each node's Qemu; writes are forwarded from the primary node to the secondary node over the internal network, with each node backed by its own storage.]
14. COLO Frame (Memory Sync Process)
• PNode
– Track PVM dirty pages and send them to the Snode.
• Snode
– Receive the PVM dirty pages and buffer them.
– On checkpoint, apply them to update SVM memory (see the monitor-command sketch below).
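A minimal sketch of how this sync is started, following Qemu/docs/COLO-FT.txt; addresses and ports are placeholders:

    # Secondary node: start Qemu waiting for the initial full checkpoint
    qemu-system-x86_64 ... -incoming tcp:0:8888

    # Primary node monitor: enable the COLO capability, then migrate;
    # after the initial sync the migration stays in COLO mode and only
    # dirty pages plus device state are sent at each checkpoint.
    (qemu) migrate_set_capability x-colo on
    (qemu) migrate tcp:<secondary-ip>:8888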
[Diagram: the COLO Frame in each Domain 0 transfers dirty pages and VM state from the primary node to the secondary node over the internal network.]
16. COLO Proxy Design
Detail: Qemu/docs/colo-proxy.txt
Kernel scheme (obsolete):
Based on the kernel TCP/IP stack and the netfilter component.
Better performance but less flexible (requires modifying
netfilter/iptables and the kernel).
Userspace scheme:
Implemented entirely in QEMU.
Based on QEMU's netfilter components and the SLIRP
component.
More flexible.
17. Proxy Design (Kernel Scheme)
Based on the kernel TCP/IP stack and netfilter
Guest-RX
Pnode
Receive a packet from the client
Copy the packet and send the copy to the Snode
Send the packet to the PVM
Snode
Receive the packet from the Pnode
Adjust the packet's ack_seq number
Send the packet to the SVM
Guest-TX
Snode
Receive the packet from the SVM
Adjust the packet's seq number
Send the SVM packet to the Pnode
Pnode
Receive the packet from the PVM
Receive the packet from the Snode
Compare the PVM and SVM packets
Identical: release the packet to the client
Different: trigger a checkpoint, then release the packet to the client
[Diagram: in the kernel scheme, the COLO Proxy sits in each Domain 0 kernel between the external network and Qemu, relaying client requests/responses for the PVM and SVM across the internal network.]
18. Proxy Design (Userspace Scheme)
Based on Qemu's netfilter and SLIRP (userspace TCP/IP stack)
Guest-RX
Pnode
Receive a packet from the client
Copy the packet and send the copy to the Snode
Send the packet to the PVM
Snode
Receive the packet from the Pnode
Adjust the packet's ack_seq number
Send the packet to the SVM
Guest-TX
Snode
Receive the packet from the SVM
Adjust the packet's seq number
Send the SVM packet to the Pnode
Pnode
Receive the packet from the PVM
Receive the packet from the Snode
Compare the PVM and SVM packets
Identical: release the packet to the client
Different: trigger a checkpoint, then release the packet to the client
(A command-line sketch of the secondary-side filters follows below.)
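The matching secondary-side sketch from Qemu/docs/colo-proxy.txt: two filter-redirectors carry traffic to and from the primary, and filter-rewriter performs the seq/ack adjustment described above; addresses, ports and IDs are placeholders:

    # f1 injects the client packets mirrored by the primary into the SVM;
    # f2 sends the SVM's output packets back to colo-compare on the primary;
    # filter-rewriter rewrites TCP seq/ack so SVM connections stay consistent.
    -netdev tap,id=hn0,vhost=off
    -device e1000,netdev=hn0,mac=52:a4:00:12:78:66
    -chardev socket,id=red0,host=3.3.3.3,port=9003
    -chardev socket,id=red1,host=3.3.3.3,port=9004
    -object filter-redirector,id=f1,netdev=hn0,queue=tx,indev=red0
    -object filter-redirector,id=f2,netdev=hn0,queue=rx,outdev=red1
    -object filter-rewriter,id=rew0,netdev=hn0,queue=all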
[Diagram: in the userspace scheme, the COLO Proxy runs inside Qemu in each Domain 0; the kernel only carries traffic between the external network, Qemu, and the internal network.]
19. COLO Internal Connection on Xen
[Diagram: the internal-network connections in a Xen deployment: Heartbeat between the two Domain 0s, the COLO Frame channel, Block Replication, and the COLO Proxy path (packet compare/mirror on the primary; sequence-number/ACK adjustment on the secondary), alongside external-network request/response traffic.]
20. New Features
Integrated COLO heartbeat module
A new heartbeat module in Qemu lets users enable COLO
easily while preserving the external heartbeat interface
(the failover it drives is sketched below).
The internal heartbeat reduces fault-detection and response time.
Enable continuous backup
Continue backing up the VM after a failover.
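For context, a sketch of what the heartbeat interface triggers on failure, per Qemu/docs/COLO-FT.txt: when the primary is declared dead, the secondary is promoted from its monitor:

    # Secondary node monitor, after heartbeat loss is detected:
    (qemu) nbd_server_stop          # stop exporting the replicated disk
    (qemu) x_colo_lost_heartbeat    # fail over; the SVM takes over service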