OSv –
Optimizing the Operating System
for Virtual Machines
Avi Kivity, Dor Laor,
Glauber Costa, Pekka Enberg,
Nadav Har'El, Don Marti, Vlad Zolotarov
Cloudius Systems
Problem statement
● Virtual Machines are useful and everywhere.
● A VM runs a guest operating system.
● Usually, guest OS is an existing general-purpose OS,
e.g., Linux.
Can we design a better OS specifically for VMs?
Goals of OSv
OSv: a new OS designed specifically for cloud VMs.
● Run existing cloud applications (Linux executables).
● Run these faster than Linux.
● Explore new APIs for even better performance.
● Use those in a common runtime environment (e.g., JVM)
to also benefit unmodified applications.
Goals of OSv (continued)
● Small image and very quick boot.
– Starting a new VM becomes a viable alternative to
reconfiguring a running one.
● Not tied to specific hypervisor or platform
– 64-bit x86 fully working, 64-bit ARM in progress.
– KVM, Xen, VMware, VirtualBox.
– Amazon EC2 and Google GCE clouds.
Goals of OSv (continued)
● Be a platform for continued research on VM OSs
– Actively developed as open source. http://osv.io/
– Community encourages innovation.
– Small code base compared to Linux.
– Modern programming language: C++11.
– Not limited to particular hypervisor or application
programming language.
– Fully supports SMP guests.
OSv design and implementation
● Process isolation is an important role of traditional OSs.
● Enough for a VM to run a single application
– Already common (“scale-out”).
– Simpler code, eliminate isolation costs.
[Figure: Hardware, Hypervisor, Guest OS and Application layers.]
In the cloud, both hypervisor and guest isolate applications.
OSv design and implementation
● Single application
– Single process, multiple threads. Single address space.
– No protection between user-space and kernel.
● System calls are just function calls (Library OS)
– OSv runs Linux shared objects by implementing an ELF
dynamic linker.
– Calls to glibc ABI are resolved to functions in the OSv kernel.
– Even “system calls”, e.g., read(), are ordinary function calls
with none of the traditional system-call overheads.
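The "system calls are just function calls" idea can be illustrated with a minimal sketch. This is not OSv's linker: `demo_symbol_table`, `demo_kernel_read` and `resolve_read` are hypothetical names, and the table is a toy stand-in for what an ELF dynamic linker consults when the application has an undefined reference to `read`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <unordered_map>

// Hypothetical kernel-side implementation of read(): copies from an
// in-memory "file". Reaching it involves no trap and no mode switch.
inline long demo_kernel_read(int /*fd*/, void* buf, std::size_t len) {
    assert(buf != nullptr);
    static const char file[] = "hello from the kernel";
    const std::size_t n = std::min(len, sizeof(file) - 1);
    std::memcpy(buf, file, n);
    return static_cast<long>(n);
}

using generic_fn = void (*)();                       // type-erased function pointer
using read_fn = long (*)(int, void*, std::size_t);   // signature the caller expects

// Toy symbol table (an assumption for illustration): maps exported names
// to addresses of functions that live in the same address space.
class demo_symbol_table {
public:
    void export_symbol(const std::string& name, generic_fn fn) {
        symbols_[name] = fn;
    }
    // Returns nullptr for names the kernel does not provide.
    generic_fn lookup(const std::string& name) const {
        auto it = symbols_.find(name);
        return it == symbols_.end() ? nullptr : it->second;
    }
private:
    std::unordered_map<std::string, generic_fn> symbols_;
};

// "Relocation": bind the application's reference to read() directly to
// the kernel function, so the later call is an ordinary indirect call.
inline read_fn resolve_read(const demo_symbol_table& table) {
    return reinterpret_cast<read_fn>(table.lookup("read"));
}
```

After `export_symbol("read", ...)`, the pointer returned by `resolve_read` is called like any other function. A name that was never exported, such as `fork`, resolves to `nullptr` in this sketch.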
OSv design and implementation
● Linux compatibility
– To run existing applications, OSv implements most of
the Linux/Glibc ABI.
– Some functions like fork() and exec() are not provided,
as they do not fit OSv's single-application model.
OSv design and implementation
● No spin-locks
– Spin-locks are notorious in VM OSs: they cause the lock-holder
preemption problem.
– Often worked around by para-virtual locks.
– OSv avoids spin-locks entirely.
● Most kernel work is done in threads, which can use a
sleeping mutex.
● The mutex implementation itself does not use a spin-lock.
● The scheduler uses lock-free algorithms.
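As a hedged illustration of what "lock-free" can mean for scheduler data structures, here is a minimal sketch of a multi-producer list that any CPU can push wakeup requests onto and one consumer drains in a single step. The names `wakeup_node` and `lockfree_wakeup_list` are hypothetical, and this is a generic Treiber-style push, not OSv's scheduler code.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical wakeup request: "please make this thread runnable".
struct wakeup_node {
    int thread_id;
    wakeup_node* next;
};

class lockfree_wakeup_list {
public:
    // Callable from any thread. The CAS loop retries only when another
    // push succeeded in the meantime, so some thread always makes
    // progress; nobody waits for a preempted lock holder.
    void push(wakeup_node* n) {
        assert(n != nullptr);
        n->next = head_.load(std::memory_order_relaxed);
        while (!head_.compare_exchange_weak(n->next, n,
                                            std::memory_order_release,
                                            std::memory_order_relaxed)) {
            // On failure, n->next was refreshed with the current head.
        }
    }

    // Called by the single consumer: detaches the whole list at once.
    // Nodes come back newest first.
    wakeup_node* pop_all() {
        return head_.exchange(nullptr, std::memory_order_acquire);
    }

private:
    std::atomic<wakeup_node*> head_{nullptr};
};
```

Because the consumer always takes the entire list with one `exchange`, this sketch sidesteps the ABA problem that a pop-one-node Treiber stack would have to handle.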
OSv design and implementation
● Network channels
– Network stack redesign proposed by Van Jacobson in 2006.
– Reduce locks, lock contention and cache-line bounces.
– Typical network stack:
● Interrupt thread processes packets, executes TCP protocol,
writes to buffer.
● Application thread reads from this buffer.
– Network channels:
● Interrupt collects packets in lock-free “channels”.
● TCP protocol executed by application thread on read().
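The split described above can be sketched with a single-producer, single-consumer ring. This is only an illustration under stated assumptions: `demo_packet`, `demo_channel` and `demo_read` are hypothetical names, and the "protocol processing" is a placeholder comment, not a TCP implementation and not OSv's code.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>

struct demo_packet {
    int seq;  // stand-in for real packet contents
};

// Lock-free channel with exactly one producer (the interrupt side)
// and exactly one consumer (the application thread).
template <std::size_t N>
class demo_channel {
    static_assert(N > 0 && (N & (N - 1)) == 0, "N must be a power of two");
public:
    // Producer side: enqueue only, no protocol work. Returns false
    // (the packet is dropped) when the channel is full.
    bool push(const demo_packet& p) {
        const std::size_t t = tail_.load(std::memory_order_relaxed);
        if (t - head_.load(std::memory_order_acquire) == N) return false;
        slots_[t & (N - 1)] = p;
        tail_.store(t + 1, std::memory_order_release);
        return true;
    }

    // Consumer side: returns false when the channel is empty.
    bool pop(demo_packet& out) {
        const std::size_t h = head_.load(std::memory_order_relaxed);
        if (h == tail_.load(std::memory_order_acquire)) return false;
        out = slots_[h & (N - 1)];
        head_.store(h + 1, std::memory_order_release);
        return true;
    }

private:
    std::array<demo_packet, N> slots_{};
    std::atomic<std::size_t> head_{0};
    std::atomic<std::size_t> tail_{0};
};

// Hypothetical read(): the application thread drains the channel and
// would run the protocol logic itself, in its own context.
inline int demo_read(demo_channel<8>& ch, int* out, int max) {
    assert(out != nullptr);
    int n = 0;
    demo_packet p;
    while (n < max && ch.pop(p)) {
        out[n++] = p.seq;  // placeholder for per-packet protocol processing
    }
    return n;
}
```

Since each index is written by only one side, no lock is needed, and the packet data tends to be touched on the CPU that will consume it, which is the cache-line argument made in the list above.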
OSv design and implementation
● The core of OSv is new code
– Loader, Dynamic linker, Memory management, Thread scheduler,
Synchronization (e.g., mutex, RCU)
– Virtual hardware drivers:
● PC hardware commonly emulated by hypervisors (Keyboard, VGA,
IDE, HPET, etc.)
● Paravirtual network, disk, and clock drivers (virtio, vmxnet3, pvscsi,
etc.)
● Reused existing open-source code when appropriate:
– C library headers and some functions from Musl-libc.
– The ZFS filesystem from FreeBSD.
– Network stack initially imported from FreeBSD.
Beyond the Linux API
● OSv lowers the overhead of the POSIX APIs.
● Some remaining overheads are inherent in the POSIX API. E.g.,
– read() copies data into “userspace” buffer.
– Operations on socket lock it, as same socket can be
accessed from multiple threads.
● Can we improve performance further with new APIs?
Beyond the Linux API - examples
● Zero-copy lock-less network APIs
● Direct-access to page tables
● Shrinker API: dynamic division of all of available memory.
– JVM Balloon – automatically size JVM heap to available
memory, on unmodified JVM.
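The slides do not show the shrinker interface itself, so the following is only a sketch of the general idea: memory consumers register a callback, and under memory pressure the system asks them to give memory back. `demo_shrinker_registry`, `demo_cache` and their methods are hypothetical names, not OSv's API.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

class demo_shrinker_registry {
public:
    // A shrinker is asked to free up to `want` bytes and reports how
    // many bytes it actually released.
    using shrink_fn = std::function<std::size_t(std::size_t want)>;

    void register_shrinker(shrink_fn fn) {
        assert(fn);
        shrinkers_.push_back(std::move(fn));
    }

    // Under memory pressure: ask shrinkers in registration order until
    // `target` bytes are reclaimed or every shrinker has been asked.
    std::size_t reclaim(std::size_t target) {
        std::size_t freed = 0;
        for (auto& shrink : shrinkers_) {
            if (freed >= target) break;
            freed += shrink(target - freed);
        }
        return freed;
    }

private:
    std::vector<shrink_fn> shrinkers_;
};

// Hypothetical memory consumer, e.g. a page cache or a runtime's heap.
struct demo_cache {
    std::size_t bytes;
    std::size_t shrink(std::size_t want) {
        const std::size_t n = std::min(want, bytes);
        bytes -= n;
        return n;
    }
};
```

A consumer would register with something like `registry.register_shrinker([&cache](std::size_t want) { return cache.shrink(want); })`. In this sketch every consumer can keep using free memory until someone else needs it, which is the "dynamic division" idea named in the list above.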
Biggest obstacle to new APIs is adoption
● Can start with modifying runtime environment (JVM).
● All unmodified JVM applications would benefit.
Evaluation
● Compared OSv guest to Fedora 20 guest w/o firewall.
● On KVM host.
● See full details in the paper.
Macro benchmarks
● Memcached. UDP. Single-vCPU guest, loaded with
memaslap (90% get, 10% set)
– OSv throughput 22% better than Linux.
● Memcached reimplemented with packet-filtering API
– OSv throughput 290% better than baseline.
● SPECjvm2008. Suite of CPU/memory intensive Java
workloads. Little use of OS services.
– Can't expect much improvement. Got 0.5%.
– Good correctness test (diverse, checks results).
Micro benchmarks
● Netperf – measure network stack performance.
– TCP single-stream throughput: 24% improvement.
– UDP and TCP r/r latency: 37%-47% reduction.
● Context switch - two threads, alternate waking each other with
pthreads condition variable.
– 3-10 times faster than in Linux.
– As little as 328 ns when two threads on same CPU.
● JVM Balloon – microbenchmark where large heap and large
page cache are needed, but not at the same time.
– OSv 35% faster than Linux.
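The context-switch microbenchmark above (two threads alternately waking each other through a condition variable) can be approximated with the sketch below. It is not the benchmark used for the reported numbers: `demo_pingpong_ns` is a hypothetical name, `std::condition_variable` is used in place of raw pthreads calls, and thread start-up is included in the timing, so only large `rounds` values give a meaningful average.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Returns the average time per hand-off in nanoseconds. Each of the two
// threads performs `rounds` hand-offs, so 2 * rounds wakeups in total.
inline double demo_pingpong_ns(int rounds) {
    assert(rounds > 0);
    std::mutex m;
    std::condition_variable cv;
    bool ping_turn = true;  // whose turn it is; protected by m

    auto player = [&](bool my_turn) {
        for (int i = 0; i < rounds; ++i) {
            std::unique_lock<std::mutex> lk(m);
            // Sleep until the other thread hands the turn over.
            cv.wait(lk, [&] { return ping_turn == my_turn; });
            ping_turn = !my_turn;  // pass the turn back
            cv.notify_one();       // wake the other thread
        }
    };

    const auto start = std::chrono::steady_clock::now();
    std::thread ping(player, true);
    std::thread pong(player, false);
    ping.join();
    pong.join();
    const auto end = std::chrono::steady_clock::now();

    const double total_ns =
        std::chrono::duration<double, std::nano>(end - start).count();
    return total_ns / (2.0 * rounds);
}
```

Pinning both threads to one CPU, as in the "same CPU" figure quoted above, would need platform-specific affinity calls that this sketch leaves out.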
Latest unofficial results
● Experimental, non-release, code...
● Need more verification...
– Cassandra stress test, READ, 4 vcpu, 4 GB ram
● OSv 34% better
– Tomcat, servlet sending fixed response, 128 concurrent
HTTP connections, measure throughput. 4 vcpus, 3GB
● OSv 41% better.
Thank you!
● Come visit us at http://osv.io/
– Github source repository
– Mailing list
– Twitter, @CloudiusSystems, #Osv.
● We invite you to join the OSv open-source project!

OSv at Usenix ATC 2014

  • 1. OSv – Optimizing the Operating System for Virtual Machines Avi Kivity, Dor Laor, Glauber Costa, Pekka Enberg, Nadav Har'El, Don Marti, Vlad Zolotarov Cloudius Systems
  • 2. Problem statement ● Virtual Machines are useful and everywhere. ● A VM runs a guest operating system. ● Usually, guest OS is an existing general-purpose OS, e.g., Linux. Can we design a better OS specifically for VMs?
  • 3. Goals of OSv OSv: a new OS designed specifically for cloud Vms. ● Run existing cloud applications (Linux executables). ● Run these faster than Linux. ● Explore new APIs for even better performance. ● Use those in a common runtime environment (e.g., JVM) to also benefit unmodified applications.
  • 4. Goals of OSv (continued) ● Small image and very quick boot. – Starting a new VM becomes a viable alternative to reconfiguring a running one. ● Not tied to specific hypervisor or platform – 64-bit x86 fully working, 64-bit ARM in progress. – KVM, Xen, VMware, VirtualBox. – Amazon EC2 and Google GCE clouds.
  • 5. Goals of OSv (continued) ● Be a platform for continued research on VM OSs – Actively developed as open source. http://osv.io/ – Community encourages innovation. – Small code base compared to Linux. – Modern programming language: C++11. – Not limited to particular hypervisor or application programming language. – Fully supports SMP guests.
  • 6. OSv design and implementation ● Process isolation is an important role of traditional OSs. ● ● Enough for VM to run single application – Already common (“scale-out”). – Simpler code, eliminate isolation costs. Hardware Hypervisor Application Guest OS In the cloud, both hypervisor and guest isolate applications.
  • 7. OSv design and implementation ● Single application – Single process, multiple threads. Single address space. – No protection between user-space and kernel. ● System calls are just function calls (Library OS) – OSv runs Linux shared objects by implementing an ELF dynamic linker. – Calls to glibc ABI are resolved to functions in the OSv kernel. – Even “system calls”, e.g., read(), are ordinary function calls with none of the traditional system-call overheads.
  • 8. OSv design and implementation ● Linux compatibility – To run existing applications, OSv implements most of the Linux/Glibc ABI. – Some functions like fork() and exec() are not provided, as they do not fit OSv's single-application model.
  • 9. OSv design and implementation ● No spin-locks – Spin-locks are notorious in VM OSs: they cause the lock-holder preemption problem. – Often worked around by para-virtual locks. – OSv avoids spin-locks entirely. ● Most kernel work is done in threads, which can use a sleeping mutex. ● The mutex implementation does not use a spin-lock. ● The scheduler uses lock-free algorithms.
  • 10. OSv design and implementation ● Network channels – Network stack redesign proposed by Van Jacobson in 2006. – Reduce locks, lock contention and cache-line bounces. – Typical network stack: ● Interrupt thread processes packets, executes TCP protocol, writes to buffer. ● Application thread reads from this buffer. – Network channels: ● Interrupt collects packets in lock-free “channels”. ● TCP protocol executed by application thread on read().
  • 11. OSv design and implementation ● The core of OSv is new code – Loader, Dynamic linker, Memory management, Thread scheduler, Synchronization (e.g., mutex, RCU) – Virtual hardware drivers: ● PC hardware commonly emulated by hypervisors (Keyboard, VGA, IDE, HPET, etc.) ● Paravirtual network, disk, and clock drivers (virtio, vmxnet3, pvscsi, etc.) ● Reused existing open-source code when appropriate: – C library headers and some functions from Musl-libc. – The ZFS filesystem from FreeBSD. – Network stack initially imported from FreeBSD.
  • 12. Beyond the Linux API ● OSv lowers the overhead of the POSIX APIs. ● Some remaining overheads are inherent in the POSIX API. E.g., – read() copies data into a “userspace” buffer. – Operations on a socket lock it, as the same socket can be accessed from multiple threads. ● Can we improve performance further with new APIs?
  • 13. Beyond the Linux API - examples ● Zero-copy lock-less network APIs ● Direct access to page tables ● Shrinker API: dynamic division of all available memory. – JVM Balloon – automatically size JVM heap to available memory, on unmodified JVM. ● Biggest obstacle to new APIs is adoption – Can start with modifying runtime environment (JVM). – All unmodified JVM applications would benefit.
  • 14. Evaluation ● Compared OSv guest to Fedora 20 guest w/o firewall. ● On KVM host. ● See full details in the paper.
  • 15. Macro benchmarks ● Memcached. UDP. Single-vCPU guest, loaded with memaslap (90% get, 10% set) – OSv throughput 22% better than Linux. ● Memcached reimplemented with packet-filtering API – OSv throughput 290% better than baseline. ● SPECjvm2008. Suite of CPU/memory intensive Java workloads. Little use of OS services. – Can't expect much improvement. Got 0.5%. – Good correctness test (diverse, checks results).
  • 16. Micro benchmarks ● Netperf – measure network stack performance. – TCP single-stream throughput: 24% improvement. – UDP and TCP r/r latency: 37%-47% reduction. ● Context switch - two threads, alternate waking each other with pthreads condition variable. – 3-10 times faster than in Linux. – As little as 328 ns when two threads on same CPU. ● JVM Balloon – microbenchmark where large heap and large page cache are needed, but not at the same time. – OSv 35% faster than Linux.
  • 17. Latest unofficial results ● Experimental, non-release, code... ● Need more verification... – Cassandra stress test, READ, 4 vCPUs, 4 GB RAM ● OSv 34% better – Tomcat, servlet sending fixed response, 128 concurrent HTTP connections, measure throughput. 4 vCPUs, 3 GB RAM ● OSv 41% better.
  • 18. Thank you! ● Come visit us at http://osv.io/ – Github source repository – Mailing list – Twitter, @CloudiusSystems, #Osv. ● We invite you to join the OSv open-source project!
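
The network-channel idea from slide 10 (the interrupt side only collects packets in a lock-free channel, and the application thread does the protocol work when it calls read()) can be illustrated with a single-producer/single-consumer ring buffer. This is only a minimal sketch under stated assumptions, not OSv's actual code: the class name `channel`, its `push`/`pop` methods, and the power-of-two capacity are all hypothetical choices made for illustration.

```cpp
#include <array>
#include <atomic>
#include <cstddef>

// Hypothetical single-producer/single-consumer ring (not taken from OSv).
// Producer: the interrupt side, which only enqueues raw packets.
// Consumer: the application thread, which dequeues them inside read()
// and would then run the TCP processing itself.
template <typename T, std::size_t N>
class channel {
    static_assert(N >= 2 && (N & (N - 1)) == 0, "N must be a power of two");
public:
    // Called only by the producer. Returns false when the ring is full.
    bool push(const T& item) {
        std::size_t tail = tail_.load(std::memory_order_relaxed);
        std::size_t head = head_.load(std::memory_order_acquire);
        if (tail - head == N) {
            return false;                      // no free slot
        }
        slots_[tail & (N - 1)] = item;         // fill the slot first...
        tail_.store(tail + 1, std::memory_order_release);  // ...then publish it
        return true;
    }

    // Called only by the consumer. Returns false when the ring is empty.
    bool pop(T& out) {
        std::size_t head = head_.load(std::memory_order_relaxed);
        std::size_t tail = tail_.load(std::memory_order_acquire);
        if (head == tail) {
            return false;                      // nothing queued
        }
        out = slots_[head & (N - 1)];          // read the slot first...
        head_.store(head + 1, std::memory_order_release);  // ...then free it
        return true;
    }

private:
    std::array<T, N> slots_{};
    std::atomic<std::size_t> head_{0};  // next slot to read, written by consumer
    std::atomic<std::size_t> tail_{0};  // next slot to write, written by producer
};
```

Because each index is written by exactly one side, neither side takes a lock, which matches the slide's goal of reducing locks, lock contention and cache-line bounces. A real network channel would also need a way to wake a reader blocked in read(), which this sketch leaves out.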