This talk presents a production-ready automotive virtualization solution based on Xen. The key requirements we focus on are very fast startup and recovery from failure, static virtual machine creation with dedicated resources, and efficient graphics rendering. To reduce boot time, we optimize the Xen startup procedure by initializing the Xen heap and VM memory efficiently and by booting multiple VMs concurrently. We provide a fast recovery mechanism by re-implementing the VM reset feature. We also develop a highly optimized graphics API-forwarding mechanism that supports OpenGL ES APIs up to v3.2. The Khronos CTS pass rate in a guest OS is comparable to Domain0's. Our experiments show that the solution provides reasonable performance for ARM-based automotive systems (hypervisor boot: less than 70 ms; graphics performance: about 96% of Domain0).
XPDDS18: Design and Implementation of Automotive Virtualization Based on Xen - Sung-Min Lee, Samsung Electronics
1. Design and Implementation of Automotive Virtualization Based on Xen
June 21, 2018
Xen Developer and Design Summit
Sung-Min Lee
sung.min.lee@samsung.com
2. 1 / 21
Acknowledgement
Major Contributors
Bokdeuk Jeong, Hanjung Park, Kiwoong Ha, Hyeongseop Shim,
Jinsub Sim, Byungchul So, Min Kang, Junbong Yu,
Hyeonkwon Cho, Sungjin Kim, Hakyoung Kim, Sangbok Han,
Vasily Leonenko, Dmitrii Martynov, Andrew Gazizov,
Sergey Grekhov, Ildar Ismagilov, Imran Navruzbekov,
and colleagues from S.LSI
4. 3 / 21
Why Virtualization in Automotive Industry?
Easy Firmware Update for Automotive System
Enhanced In-Vehicle Security
Cost/Complexity Reduction
Virtualization?
5. 4 / 21
Key Requirements for Automotive Virtualization
Fast Startup
Fault Recovery for Reliable In-Vehicle Services
High Performance Graphics Support for Guest OS
7. 6 / 21
Fast Startup (1/3)
Reducing Memory Initialization Time
Xen Heap Initialization
- Open source Xen inserts pages into the Xen heap one page at a time
- Reduce the number of init function invocations and remove spinlocks by implementing a new initialization function
- Skip scrub_heap_pages()
Static VM Memory Assignment
- Assign specific addresses and sizes for VMs in the device tree and exclude them from Xen heap insertion
- The number of pages to initialize is reduced, and the double-definition overhead (heap init + VM assignment) disappears
[Figure: Xen heap initialization range, 0x8000xxxx to 0xFFFFxxxx]
- Original ver: the heap initialization range includes Dom0 memory and Dom1 memory (double assignment)
- Samsung ver: the heap initialization range is reduced and the double assignment disappears
[Figure: page insertion into the Xen heap and VM memory allocation]
- Original ver: free_heap_pages calls the init function once per page, for the total number of pages; alloc_domheap_pages
- Samsung ver: fastboot_insert_page initializes the total pages as heap without calling the init function many times; domain_direct_assign_pages
- Other labels: Contiguous Page Chunk, Order of 2
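The effect of excluding statically assigned VM memory from heap insertion can be illustrated with a toy model. This is only a sketch in Python, not Xen code: the helper names (`heap_ranges`, `pages`), the 4 KiB page size, and the Dom0/Dom1 addresses are assumptions chosen for illustration.

```python
PAGE_SIZE = 4096  # assumed 4 KiB pages


def heap_ranges(ram, reserved):
    """Return the (start, end) ranges of `ram` left after removing `reserved`."""
    ram_start, ram_end = ram
    ranges = []
    cursor = ram_start
    for start, end in sorted(reserved):
        # Clamp each reserved region to the RAM window.
        start, end = max(start, ram_start), min(end, ram_end)
        if start >= end:
            continue  # region lies entirely outside RAM
        if start > cursor:
            ranges.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < ram_end:
        ranges.append((cursor, ram_end))
    return ranges


def pages(ranges):
    """Count the pages covered by a list of (start, end) ranges."""
    return sum((end - start) // PAGE_SIZE for start, end in ranges)


# Hypothetical layout: 2 GiB of RAM with two statically assigned VM regions.
RAM = (0x8000_0000, 0x1_0000_0000)
DOM0 = (0x9000_0000, 0xB000_0000)
DOM1 = (0xC000_0000, 0xE000_0000)

per_page_calls = pages([RAM])            # one init call per page
ranges = heap_ranges(RAM, [DOM0, DOM1])  # heap ranges after exclusion
range_calls = len(ranges)                # one init call per contiguous range
heap_pages = pages(ranges)               # pages that still need heap init
```

In this hypothetical layout the per-page approach would make 524,288 init calls, while range-based insertion touches three ranges covering half as many pages.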
8. 7 / 21
Fast Startup (2/3)
Reducing DTB Lookup and Creation Time
Relocate DTB Nodes
- Scanning all nodes in the device tree takes a long time when the tree is large
- Group the scattered “chosen” and “memory” nodes into one node and relocate it to the head of the device tree so that it is found faster
Improve DTB Creation Algorithm
- Creating the device tree for the control OS takes most of the time
- The FDT library uses the most naïve algorithm when comparing strings
- Replace it with a better one, the Boyer-Moore algorithm with time complexity O(m+n)
[Figure: Flattened Device Tree Description]
- Original ver: the chosen node and the memory nodes are scattered, so all nodes are visited to find the information nodes
- Samsung ver: a grouped information node and the special node “xen-early-scan-end”, so the scan doesn't have to visit all nodes
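To illustrate why a skip-based search beats naive comparison, here is a minimal Python sketch of the Boyer-Moore-Horspool variant. It is a simplification that uses only a bad-character shift table, so it does not have the linear worst case quoted on the slide, and it is not the code used in the FDT library; the function name and the sample string block are hypothetical.

```python
def horspool_find(text: bytes, pattern: bytes) -> int:
    """Return the index of the first occurrence of pattern in text, or -1."""
    m, n = len(pattern), len(text)
    if m == 0:
        return 0
    # For every byte except the last one in the pattern, store the distance
    # from its rightmost occurrence to the end of the pattern.
    shift = {byte: m - 1 - i for i, byte in enumerate(pattern[:-1])}
    pos = 0
    while pos + m <= n:
        if text[pos:pos + m] == pattern:
            return pos
        # Skip ahead based on the text byte aligned with the pattern's end;
        # bytes that never occur in the pattern allow a full-length skip.
        pos += shift.get(text[pos + m - 1], m)
    return -1


# Hypothetical string block with NUL-separated node names.
strings = b"/soc/memory@80000000\0/chosen\0/cpus\0"
index = horspool_find(strings, b"chosen")
```

The table lets the search jump up to `len(pattern)` bytes at a time instead of advancing one byte per mismatch.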
9. 8 / 21
Fast Startup (3/3)
Assign Big Core and Concurrent VM Creation During the Xen Startup
Assign Big Core for Dom0 Creation
- Exynos8890 uses a little core as the bootstrap processor (CPU0)
- Assign a tasklet to a sleeping big core to create the control OS
- CPU0 waits until domain creation is done
Concurrent VM Creation
- To shorten the time spent in XL for DomU creation, Samsung moves its data structure initialization to Xen startup
- Assign a guest OS creation tasklet to another sleeping big core and create the guest OS simultaneously with the control OS
[Figure: startup timeline, Assign Big Core for Dom0 Creation]
- CPU0 (little core): Scanning DTB, Heap init, Secondary core booting.. → Domain creation: schedule tasklet to big core → wait until dom creation done… → ETC
- CPU4 (big core): Create domain → notify that creation done
[Figure: startup timeline, Concurrent VM Creation]
- CPU0 (little core): Scanning DTB, Heap init, Secondary core booting.. → Domain creation: schedule control OS tasklet to CPU4, schedule guest OS tasklet to CPU5 → wait until dom(s) creation done… → ETC
- CPU4 (big core): Create dom0 → notify that creation done
- CPU5 (big core): Create dom1
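The scheduling pattern in the timelines above (the boot CPU hands each domain creation to a separate worker and waits for all of them) can be sketched with Python threads. This is an analogy only: Xen uses tasklets on physical cores, and `create_domain`, `boot_concurrently`, and the durations are hypothetical stand-ins.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def create_domain(name, work_seconds):
    """Stand-in for domain creation; sleeps to simulate the work."""
    time.sleep(work_seconds)
    return f"{name} created"


def boot_concurrently(domains):
    """Hand each domain creation to its own worker, then wait for all of them."""
    with ThreadPoolExecutor(max_workers=len(domains)) as big_cores:
        # The "boot CPU" schedules one creation task per worker ...
        futures = [big_cores.submit(create_domain, name, secs)
                   for name, secs in domains]
        # ... and blocks until every worker has reported completion.
        return [future.result() for future in futures]


results = boot_concurrently([("dom0", 0.05), ("dom1", 0.05)])
```

With both creations running in parallel, the wait on the boot CPU is bounded by the slower of the two rather than by their sum.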
10. 9 / 21
Fault Recovery (1/3)
Fault Recovery Scenario
[Figure: fault recovery paths for Dom0 and DomU; labels recovered from the diagram]
- Dom0 and DomU: Linux Kernel, Watchdogd (Kick; Set timeout/notification period)
- Xen Tools: Samsung VM Reset Manager (Relaunch VMs)
- Xen: Xen Watchdog Timer, Xen Shutdown Manager
- HW (Exynos8): HW Watchdog Timer, HW Reset
- Crash paths: void panic (const char *fmt, …) → hypercall HYPERVISOR_sched_op(SCHEDOP_shutdown, &r); and void panic (const char *fmt, …) → machine_restart
- Timeout path: 1. Reset Request, 2. Execute resetVM
11. 10 / 21
Fault Recovery (2/3)
Conventional VM Restart vs Our VM Reset
VM Restart
- Destroy a domain and create a new domain
- Free/allocate all resources
- Use hypercalls to destroy and create the domain in Xen
- XEN_DOMCTL_destroydomain
- XEN_DOMCTL_createdomain
VM Reset
- Reset the existing domain
- Destroy devices only
- Reuse almost all resources, such as the vcpu structure, shared_info, p2m table, etc.
- Add a new hypercall to reset the domain in Xen
- XEN_DOMCTL_samsung_reset
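The difference between the two approaches can be shown with a toy model in which a domain is a dictionary of resources. This is a sketch under stated assumptions, not the hypervisor logic: `ToyDomain`, `restart`, `reset`, and the device names are hypothetical, and plain Python objects stand in for real allocations.

```python
REUSABLE = ("vcpu", "shared_info", "p2m")  # kept across a reset (per the slide)
DEVICES = ("vnet", "vblock")               # hypothetical device set


class ToyDomain:
    """A domain modelled as named resources; each object() is one allocation."""

    def __init__(self):
        self.resources = {name: object() for name in REUSABLE + DEVICES}


def restart(domain):
    """Conventional restart: drop the old domain and allocate everything anew."""
    return ToyDomain()


def reset(domain):
    """Reset: keep the domain and its core structures, recreate devices only."""
    for name in DEVICES:
        domain.resources[name] = object()
    return domain


original = ToyDomain()
before = dict(original.resources)
after_reset = reset(original)
after_restart = restart(original)
```

After `reset`, the entries named in `REUSABLE` are the very same objects as before, which is the saving the slide describes; after `restart`, every resource is a fresh allocation.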
12. 11 / 21
Fault Recovery (3/3)
VM Reset Procedure
[Figure: steps ① to ⑦ across DomU, XEN, and Dom0 (Control domain) with XL and XS]
- Trigger (①): DomU Crash with hypercall HYPERVISOR_sched_op(SCHEDOP_shutdown,&r); or Xen Watchdog Timeout
- XEN, domain_shutdown (Xen Shutdown Manager): set reason flags for shutdown, pause vcpus
- XEN: fire send_global_virq(VIRQ_DOM_EXC);
- XS: Release Domain
- Reset flags setting
- XL, libxl_domain_reset: 1. xs_release_domain, 2. Destroy devices
- XL, devices_destroy_cb: request domain reset
- Hypercall: XEN_DOMCTL_samsung_reset
- XEN, domain_reset: reset grant table; reset vcpu, vgic, vtimer; reset shared_info page; reset p2m table; reset event channel
- XL, initiate_domain_create: reload kernel binary, reload device tree from memory, reset CPU registers, xs_introduce_domain
- Hypercall: XEN_DOMCTL_unpausedomain
- DomU unpause: relaunching complete
13. 12 / 21
I/O Virtualization
PV I/O Drivers
- vNet, vStorage, vConsole: use existing open source code
- vInput: provides touch
- vCamera, vAudio: developed from scratch
- Performance enhancement with direct cache operations on foreign pages
- Dom0 (Cluster OS) back ends: vNet BE, vBlock BE, vCamera BE, vAudio BE, vConsole BE, vInput BE
- DomU (IVI OS) front ends: vNet FE, vBlock FE, vCamera FE (as a V4L2 driver), vAudio FE (as an ALSA driver), vConsole FE, vInput FE
Passthrough I/O Drivers
- Virtual clocksources, regulators and pin controllers are provided to DomU for device passthrough
- Dom0 (Cluster OS): Native Device Drivers (Pinctrl, Regulator, Clocksource), vClock BE, vRegulator BE, vPinctrl BE
- DomU (IVI OS): Pass-thru Device Drivers, vClock FE, vRegulator FE, vPinctrl FE
[Figure labels in both diagrams: Xen, I/O device]
14. 13 / 21
Graphics Virtualization (1/3)
Overall Architecture of Graphics Virtualization
[Figure: graphics call path between DomU (IVI OS) and Dom0 (Cluster OS) on Xen, steps ① to ⑨]
- DomU (IVI OS): Application, Display Server, SVGL FE (Library) with Virtual EGL FE, Virtual OpenGL ES FE, Virtual DRI (GBM, Wayland Buf) and Graphics Control Manager (GCM) FE; Kernel with vGPU FE
- Virtual GPU Command Buffer: Encoded Graphics Data to Shared Memory
- Dom0 (Cluster OS): Native Display Server, Samsung Virtual Display Manager (SVDM), Client (Native Window), Samsung Virtual Graphics Library (SVGL) BE with Virtual EGL BE, Virtual OpenGL ES BE, GL Render, RenderThread and GCM BE; Native Graphics Library; Kernel with Native Graphics Driver and vGPU BE
- Flow labels: Graphics APIs, EGL APIs, GLES APIs and Data, Virtual Display, eglCreateWindowSurface [eglCreatePbufferSurface]
15. 14 / 21
Graphics Virtualization (2/3)
Key Features and Performance Optimization
Key Features
- Latest Graphics APIs Support: OpenGL ES v3.2/EGL v1.5
- Shared Memory Between BE and FE With Simple Protocol
- Batch Processing for Transferring APIs
- Selective APIs Transfer to BE
- Adaptive Interrupt/Poll Processing in FE
- Efficient Graphics Buffer Management
Adaptive Interrupt/Poll Processing
- Dynamically Choose Either Interrupt or Poll Based on the Request Rate Between SVGL FE and vGPU FE
- Recalculate “the rate” Every 250 ms
- Performance Improvement Over the Interrupt Method with the Threshold “0.3” in Policy [75 cmd buf flush OPs/250ms]
[Figure: DomU (Application, SVGL FE, DomU Kernel with vGPU FE) and Dom0 (SVDM, Native Graphics Library, Kernel) on Xen, connected through the Virtual GPU Command Buffer. Labels: interrupt, GCM, Monitor, Interrupt/Poll, Read Graphics APIs & Data, Interrupt or Poll]
Rate = # of Buffer Flush OPs/Time
Policy (Threshold: 0.3)
# if 0.3 ≤ the calculated rate
then Polling
else Interrupt
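The policy above fits in a few lines. The sketch below is in Python and assumes the rate is counted in buffer flush operations per millisecond over a 250 ms window, which is what makes 75 flushes correspond to the threshold 0.3; treating a rate exactly at the threshold as polling is also an assumption, and `choose_mode` is a hypothetical name.

```python
WINDOW_MS = 250   # the rate is recalculated every 250 ms
THRESHOLD = 0.3   # flush operations per millisecond (75 flushes / 250 ms)


def choose_mode(flush_ops_in_window, window_ms=WINDOW_MS, threshold=THRESHOLD):
    """Pick "poll" when the flush rate reaches the threshold, else "interrupt"."""
    rate = flush_ops_in_window / window_ms
    # Assumption: a rate exactly equal to the threshold selects polling.
    return "poll" if rate >= threshold else "interrupt"


busy = choose_mode(200)   # heavy rendering load within one window
idle = choose_mode(10)    # mostly idle window
```

Polling avoids per-request interrupt overhead when requests arrive frequently, while interrupts avoid wasted polling cycles when they are rare.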
16. 15 / 21
Graphics Virtualization (3/3)
Efficient Graphics Command Buffer Management
Graphics Command Buffer Transfer
- Accumulate in the buffer the opcode and data of each API whose return type is “void”
- Flush the buffer for “glFlush”, “glFinish” and APIs with a non-void return type
- Flush the buffer when it is full
Variable Size Buffer Allocation
- Dynamically resizable buffer management: initial buffer size is 30 MB
- Expand the buffer pool based on the available buffer size and the size requested by SVGL FE
- Shrink the buffer when the app is terminated
[Figure: command buffer transfer between the Application and SVGL FE in DomU and SVDM and the Native Graphics Library in Dom0, through the Virtual GPU Command Buffer on Xen. Void return type APIs: store opcode and data. glFlush, glFinish, non-void return type APIs, or any type of API when the buffer is full: send accumulated APIs.]
[Figure: variable size buffer allocation. Labels: EGL/GLES API, Buffer Pool Manager, Buffer Allocator, add/remove memchunk (30M chunks), Send event for memory alloc/realloc, Map/Unmap Shared Memory, Foreignmap, Virtual GPU CMD Buffer]
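The accumulate-and-flush rules for the command buffer can be sketched as a small Python class. This is a minimal illustration, not the SVGL implementation: the class name, the `send` callback, the fixed 4-byte opcode size, and the tiny capacity used in the example are all assumptions, and buffer expansion is left out.

```python
OPCODE_SIZE = 4  # assumed fixed size of an encoded opcode, in bytes
FLUSHING_CALLS = ("glFlush", "glFinish")


class CommandBuffer:
    """Batches API calls and hands them to `send` according to the flush rules."""

    def __init__(self, capacity, send):
        self.capacity = capacity
        self.send = send      # callback that transfers one batch to the back end
        self.pending = []
        self.used = 0

    def flush(self):
        if self.pending:
            self.send(list(self.pending))
            self.pending.clear()
            self.used = 0

    def call(self, opcode, data=b"", returns_void=True):
        size = OPCODE_SIZE + len(data)
        if self.used + size > self.capacity:
            self.flush()      # rule: flush when the buffer is full
        self.pending.append((opcode, data))
        self.used += size
        if opcode in FLUSHING_CALLS or not returns_void:
            self.flush()      # rule: glFlush, glFinish, non-void return type


batches = []
buf = CommandBuffer(capacity=16, send=batches.append)
buf.call("glClear")
buf.call("glDrawArrays")
buf.call("glGetError", returns_void=False)  # non-void: flushes all three calls
```

Void calls need no reply, so they can be queued; a non-void call needs its result from the back end, which is why it forces the queued calls out first.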
17. 16 / 21
Evaluation: Startup Performance
Xen Startup Time Changes
- Time measurement unit: msec
- Measured time: start_xen() ~ init_done() [xen/arch/setup.c]
- Stages: Base Line, Reduce Memory Initialization, Improve DTB creation, Bigcore Assigning
- Values shown in the chart: 144, 103
Time Measurement for Sections
- Sections: Scanning DTB, Heap & Hardware Initialization, Page Scrubbing, Domain creation
- Original ver: 10, 471, 685, 313 (total 1,479 msec)
- Samsung ver: 1, 9, 59 (total 69 msec)
- Other values shown in the chart: 64, 16, 1
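As a quick consistency check on the totals above, the per-section values can be summed and the overall speedup computed. The Python snippet below uses only the numbers from the slide; how each value maps to a named section is not assumed.

```python
original_sections_ms = [10, 471, 685, 313]
samsung_sections_ms = [1, 9, 59]

original_total_ms = sum(original_sections_ms)   # matches the 1,479 msec total
samsung_total_ms = sum(samsung_sections_ms)     # matches the 69 msec total
speedup = original_total_ms / samsung_total_ms  # roughly 21x faster startup
```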
18. 17 / 21
Evaluation: VM Fault Recovery Time
VM Restart Time vs. VM Reset Time (sec)
- VM Restart (xl restart): 7.1
- VM Reset (xl reset): 0.97
- Time Measurement (After Optimization): libxl_domain_unpause function [in xl_cmdimpl.c], for both xl restart and xl reset
19. 18 / 21
Evaluation: I/O Performance
Storage and Network Throughput
[Charts compare Dom0 and DomU; each percentage is the ratio of the lower value to the higher one]
Storage Throughput (Seq. Read, Seq. Write)
- 216MB/s vs 228MB/s (94.7%)
- 132MB/s vs 134MB/s (98.5%)
Network Throughput
- 129Mbits/s vs 139Mbits/s (92.8%)
20. 19 / 21
Evaluation: Khronos CTS
Number of Passed Test Cases (TC) by Khronos CTS
- OpenGL ES 2.0: Dom0 1228, DomU 1228
- OpenGL ES 3.0: Dom0 2991, DomU 2991
- OpenGL ES 3.1: Dom0 1392, DomU 1392
- OpenGL ES 3.2: Dom0 2511, DomU 2511
- OpenGL ES EXT: Dom0 261, DomU 261
Total # of Passed TC: 8,383
21. 20 / 21
Evaluation: Graphics Performance
Glmark2-es-wayland 1920x1080 Off-screen
Score (1st, 2nd, 3rd, 4th, 5th, Average)
- Dom0: 660, 662, 663, 662, 662, 661.8
- DomU: 639, 638, 636, 640, 639, 638.4
Ratio of DomU to Dom0 (1st, 2nd, 3rd, 4th, 5th, Average)
- 96.8%, 96.4%, 95.9%, 96.7%, 96.5%, 96.5%
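The ratios follow directly from the scores. The short Python check below recomputes them from the per-run values on the slide, assuming the lower series belongs to DomU, which matches the abstract's "about 96% of Domain0".

```python
dom0_scores = [660, 662, 663, 662, 662]
domu_scores = [639, 638, 636, 640, 639]

dom0_avg = sum(dom0_scores) / len(dom0_scores)
domu_avg = sum(domu_scores) / len(domu_scores)

# Per-run and average DomU/Dom0 ratios, as percentages with one decimal.
ratios = [round(100 * u / d, 1) for u, d in zip(domu_scores, dom0_scores)]
avg_ratio = round(100 * domu_avg / dom0_avg, 1)
```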
22. 21 / 21
Demo
System Environment
- Exynos 8890 Octacore (Big core: 2.28GHz, Little: 1.58GHz)
- 6 GB DRAM
- 32 GB MMC
- Mali T880 MP12
Xen (v4.8)
Cluster OS
(Dashboard UI)
IVI OS
(Genivi Demo Platform)
Exynos 8890