Andrey Vagin <avagin@openvz.org>
1 June 2013, Moscow
Linux Containers
Fedora Virtualization Day
Different types of Virtualization
● Virtual Machines
– Emulation (QEMU)
– Paravirtualization (Xen)
– Hardware Virtualization (KVM, ESX)
● OS Level Virtualization
– Containers (Linux Containers, Solaris Zones, BSD Jails)
Virtual Machine (VM)
[Diagram: Hardware, Hypervisor, and four guests, each with its own Virtual HW, Kernel, and Apps]
Containers (CT)
[Diagram: Hardware, one Host Kernel, and four containers, each with its own Namespaces and Apps]
- chroot() on steroids
Comparison: VMs vs CTs
● VMs: one real HW, many virtual HW, many OSs
● CTs: one real HW, one kernel, many userspace instances
● Full control of the guest OS
● Native performance: [almost] no overhead
● High density
● KSM (Kernel SamePage Merging)
● Use resources on demand
● Dynamic resource allocation
● Naturally share pages
● Depends on hardware (VT-x, VT-d, EPT, etc.)
● Not all functionality is virtualized
● Flexibility
Evolution of Operating Systems
● Multitask: many processes
● Multiuser: many users
● Multicontainer: many containers
Containers (CT)
Cgroups
– control resources
● cpu, cpuacct, cpuset
● blkio
● memory
● net_cls
Namespaces
– isolate environments
● MNT
● PID
● NET
● IPC
● User
● UTS
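As a rough illustration of the namespace list above: on Linux, each process exposes its namespaces as symbolic links under /proc/<pid>/ns/, with targets such as pid:[4026531836], and two processes share a namespace when the targets match. The sketch below assumes a Linux /proc; the helper names (parse_ns_link, read_namespaces, shared_namespaces) are hypothetical and not part of any tool named in the slides.

```python
import os
import re

# Link targets under /proc/<pid>/ns/ look like "pid:[4026531836]".
_NS_LINK = re.compile(r"^([a-z_]+):\[(\d+)\]$")


def parse_ns_link(target):
    """Split a namespace link target into (type, inode number)."""
    match = _NS_LINK.match(target)
    if match is None:
        raise ValueError(f"not a namespace link: {target!r}")
    return match.group(1), int(match.group(2))


def read_namespaces(pid="self"):
    """Return {entry name: inode} for one process (needs a Linux /proc)."""
    ns_dir = f"/proc/{pid}/ns"
    result = {}
    for name in os.listdir(ns_dir):
        _, inode = parse_ns_link(os.readlink(os.path.join(ns_dir, name)))
        result[name] = inode
    return result


def shared_namespaces(a, b):
    """Names whose namespace inode is identical in both mappings."""
    return sorted(name for name in a if name in b and a[name] == b[name])


if __name__ == "__main__":
    for name, inode in sorted(read_namespaces().items()):
        print(name, inode)
```

Comparing read_namespaces() for a process on the host with one inside a container would typically show differing inodes for the isolated namespace types.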
How to execute CT
All allowed by default
● unshare, nsenter
● Systemd Lightweight Containers
● LXC
● Libvirt LXC
All restricted by default
● OpenVZ (vzctl-core) (FC19)
vzctl - perform various operations on a container
# yum install -y vzctl-core
# vzctl create 101 --ostemplate fedora-15
# vzctl start 101
# vzctl exec 101 ps axf
  PID TTY      STAT   TIME COMMAND
    1 ?        Ss     0:00 init
11830 ?        Ss     0:00 syslogd -m 0
11897 ?        Ss     0:00 /usr/sbin/sshd
11943 ?        Ss     0:00 xinetd -stayalive -pidfile ...
12218 ?        Ss     0:00 sendmail: accepting connections
12265 ?        Ss     0:00 sendmail: Queue runner@01:00:00
13362 ?        Ss     0:00 /usr/sbin/httpd
13363 ?        S      0:00  \_ /usr/sbin/httpd
..............................................
 6416 ?        Rs     0:00 ps axf
# vzctl stop 101
# vzctl destroy 101
OpenVZ-kernel-only features
● Ploop (snapshots, backups, different formats)
● Second-level quota
● More functional memory accounting
● PFCache (memory deduplication, I/O operation savings)
● Better isolation compared with FC19 (which lacks userns)
Questions?
http://openvz.org
Andrey Vagin <avagin@openvz.org>
CRIU - Checkpoint/Restore in User-space
What is C/R and how can it be used?
C/R is the ability to save the state of processes and restore it later.
Usage scenarios:
– Failure recovery
– Live migration
– Reboot-less upgrade
– Speeding up slow-booting services
– HPC issues
History
● Berkeley Lab Checkpoint/Restart (BLCR) (2003)
– Load a kernel module and link with a library
● DMTCP: Distributed MultiThreaded CheckPointing (2004-2006)
– Preload a library
● OpenVZ (2005)
– OpenVZ kernel
● Linux Checkpoint/Restart by Oren Laadan (2008)
– A non-mainline kernel
● CRIU (2011)
[Timeline: BLCR 2003, OpenVZ 2005, DMTCP 2007, Linux C/R 2008, CRIU 2011]
How does this work?
[Diagram: crtools between the kernel objects and process tree (name-spaces, files, sockets, pipes) and the image files]
Kernel interfaces
[Diagram: Dump and Restore over the kernel interfaces: syscalls, netlink, /proc/, ptrace]
Dump
● Parasite code
– Receive file descriptors
– Dump memory content
– prctl(), sigaction, pending signals, timers, etc.
● Ptrace
– Freeze processes
– Inject parasite code
● Netlink
– Get information about sockets, netns
● Procfs
– /proc/PID/maps, /proc/PID/map_files/, /proc/PID/status, /proc/PID/mountinfo
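To make the procfs part of the dump step more concrete, here is a minimal sketch of reading /proc/PID/maps, whose lines have the form "start-end perms offset dev inode [path]". It illustrates the file format only and is not how crtools itself parses the file; Mapping, parse_maps_line, and read_maps are hypothetical names.

```python
from typing import NamedTuple


class Mapping(NamedTuple):
    start: int
    end: int
    perms: str
    offset: int
    dev: str
    inode: int
    path: str


def parse_maps_line(line):
    """Parse one line of /proc/<pid>/maps into a Mapping."""
    # The first five fields are fixed; the path is optional and may
    # contain spaces, so split at most five times.
    fields = line.rstrip().split(None, 5)
    addr, perms, offset, dev, inode = fields[:5]
    path = fields[5] if len(fields) > 5 else ""
    start, end = (int(part, 16) for part in addr.split("-"))
    return Mapping(start, end, perms, int(offset, 16), dev, int(inode), path)


def read_maps(pid="self"):
    """Return all mappings of a process (needs a Linux /proc)."""
    with open(f"/proc/{pid}/maps") as maps_file:
        return [parse_maps_line(line) for line in maps_file]
```

Anonymous mappings have an empty path and inode 0; for shared file mappings, /proc/PID/map_files/ gives a link to the exact file behind each address range.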
Restore
● Collect shared objects
● Restore name-spaces
● Create a process tree
– Restore SID, PGID
– Restore objects that should be inherited
● Files, sockets, pipes, ...
● Restore per-task properties.
● Restore memory
● Call sigreturn
● Awesome
Interesting points
● How to restore shared objects?
– Send file descriptors via unix sockets
– Map files from /proc/self/map_files/ to restore anonymous shared mappings
● How to restore memory mappings at the correct addresses?
– Map a new code block and a stack
– Unmap crtools' mappings
– Remap the task's mappings to the correct addresses
● How to resume a process?
– Create a signal frame
– Call sigreturn()
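The "send file descriptors via unix sockets" step relies on SCM_RIGHTS ancillary messages. The sketch below shows the mechanism in a single process for brevity, assuming Python 3.9+ on a Unix system (socket.send_fds and socket.recv_fds); in a real restore the sender and receiver would be different processes, and pass_fd_demo is a hypothetical name.

```python
import os
import socket


def pass_fd_demo():
    """Send the write end of a pipe over a Unix socket pair and use the copy."""
    sender, receiver = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    read_end, write_end = os.pipe()
    try:
        # SCM_RIGHTS ancillary data carries the descriptor; at least one
        # byte of ordinary data has to accompany it.
        socket.send_fds(sender, [b"x"], [write_end])
        _, fds, _, _ = socket.recv_fds(receiver, 1, 1)
        received = fds[0]
        # The received descriptor is a new number referring to the same
        # open pipe, so data written to it shows up on read_end.
        os.write(received, b"hello")
        os.close(received)
        return os.read(read_end, 5)
    finally:
        os.close(read_end)
        os.close(write_end)
        sender.close()
        receiver.close()
```

Because the receiver gets a reference to the same open file rather than a fresh open, state such as the file offset stays shared, which is what restoring a shared object requires.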
Kernel impact
● ~140 patches merged, ~10 patches in flight
● ~11 new features appeared, ~2 new features to come
New features in the kernel
● Parasite code injection (by Tejun Heo)
– Read task state that currently only the task itself can retrieve
● The kcmp() system call
– Helps check which kernel objects are shared between processes
● Proc map_files directory
– Find out exactly which file is mapped
– Mapping sharing info
● A bunch of prctl extensions
– Set various private stuff on task/mm objects (C/R-only feature)
● Last-pid sysctl
– Restore a task with the desired PID value
● TCP repair mode
– Read the internal state of a TCP connection and reconstruct it from scratch on a freshly created socket
● Socket information dumping via netlink (sock_diag)
– Extensible engine for retrieving socket state
● Virtual net device indexes
– Allow restoring network devices in a namespace
● Socket peeking offset
– Allows peeking at socket queues (reading without removing data from the queue)
● Task memory tracking
– Incremental snapshots, online migration
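"Reading without removing data from the queue" can be seen with the long-standing MSG_PEEK flag. The sketch below uses plain MSG_PEEK on a socket pair and does not use the peek-offset socket option (SO_PEEK_OFF) that the slide refers to, which additionally lets successive peeks advance through the queue. peek_then_read is a hypothetical name.

```python
import socket


def peek_then_read(payload=b"queued data"):
    """Peek at queued bytes, then read them for real."""
    writer, reader = socket.socketpair()
    try:
        writer.sendall(payload)
        # MSG_PEEK returns the data but leaves it in the receive queue.
        peeked = reader.recv(len(payload), socket.MSG_PEEK)
        # A normal recv() afterwards still sees the same bytes.
        consumed = reader.recv(len(payload))
        return peeked, consumed
    finally:
        writer.close()
        reader.close()
```

A checkpoint tool needs this property because dumping a socket's queue must not change what the application will read after it resumes.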
What is already supported?
– x86_64 architecture
– Process tree linkage
– Multi-threaded apps
– All kinds of memory mappings
– Terminals, groups, sessions
– Open files (shared and unlinked)
– Established TCP connections
– Unix sockets, Packet sockets
– Name-spaces (net, mount, ipc)
– Non-POSIX files (epoll, inotify)
– Pipes, FIFOs, IPC, ...
– ARM architecture
– Pending signals
– TCP time-stamps
– Iterative snapshots
– VDSO
– LXC and OpenVZ containers
In flight
– POSIX timers
– Convert OpenVZ images
How is CRIU tested?
● ZDTM – a set of unit-tests
● Real-life applications
– Apache, Nginx
– MySQL, MongoDB, Oracle
– Make && gcc
– Tar & gzip
– Screen
– Java
– LXC
– VNC server + GUI applications
Future plans (Feb, 2013)
● Support all kinds of kernel objects
● Merge all in-flight patches in the mainstream kernel
● Integrate CRIU with OpenVZ and LXC utilities
● Iterative migration
– Migrate memory content before freezing applications
● Integration in distributions
– CRIU was accepted into Fedora 19
How to use
● ./crtools dump -t pid [<options>]
– checkpoint a process/tree identified by pid
● ./crtools restore -t pid [<options>]
– restore a process/tree identified by pid
● ./crtools show (-D dir)|(-f file) [<options>]
– show dump file(s) contents
● ./crtools check
– check whether the kernel support is up to date
● ./crtools exec -t pid <syscall-string>
– execute a system call in another task
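For scripting, the command forms above could be wrapped in a small helper. This is only a sketch: it builds argument lists for the dump, restore, and check actions exactly as written on the slide, passes any further options through untouched, and does not run anything itself. crtools_argv is a hypothetical name.

```python
def crtools_argv(action, pid=None, options=(), binary="./crtools"):
    """Build an argument list for the dump, restore, or check actions."""
    if action not in ("dump", "restore", "check"):
        raise ValueError(f"unsupported action: {action}")
    argv = [binary, action]
    if action != "check":
        # dump and restore identify the process tree with -t pid.
        if pid is None:
            raise ValueError(f"{action} needs a pid")
        argv += ["-t", str(pid)]
    # Extra options are appended as given; none are assumed here.
    argv += list(options)
    return argv
```

The result can be handed to subprocess.run(..., check=True) on a machine where crtools is built; running crtools_argv("check") first is a cheap way to find out whether the kernel has the required features.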
Checkpoint/restore of a VNC server.
Questions?
http://criu.org

Fedora Virtualization Day: Linux Containers & CRIU

  • 1. Andrey Vagin <avagin@openvz.org> ● 1 June 2013, Moscow< Linux Containers Fedora Virtualization Day
  • 2. 2 Different types of Virtualization ● Virtual Machines – Emulation (qemu) – Paravirtualization (XEN) – Hardware Virtualization (KVM, ESX) ● OS Level Virtualization – Containers (Linux Containers, Solaris Zones, BSD Jails)
  • 3. 3 Virtual Machine (VM) Hardware Hypervisor Virtual HW Kernel Apps Virtual HW Kernel Apps Virtual HW Kernel Apps Virtual HW Kernel Apps
  • 5. 5
  • 6. 7 Comparison VM-s vs CT-s ● One real HW, many virtual HW, many OS-s. ● One real HW, one kernel, many userspace instances ● Full control on the guest OS ● Native performance: [almost] no overhead ● High density ● KSM (Kernel SamePage Merging) ● Use resources on demand ● Dynamic resource allocation ● Naturally share pages ● Depends on hardware (VT-x, VT-d, EPT, etc) ● Not all functionality are virtualized ● Flexibility
  • 7. 8
  • 8. 9
  • 9. 10 Evolution of Operating System ● Multitask many processes ● Multiuser many users ● Multicontainer many containers
  • 10. 11 Containers (CT) Cgroups – control resources ● cpu, cpuacct, cpuset ● blkio ● memory ● net_cls Namespaces – isolate environments ● MNT ● PID ● NET ● IPC ● User ● UTS
  • 11. 12 How to execute CT All allowed by default ● unshare, nsenter ● Systemd Lightweight Containers ● LXC ● Libvirt LXC All restricted by default ● OpenVZ (vzctl-core) (FC19)
  • 12. vzctl - perform various operations on a container
        # yum install -y vzctl-core
        # vzctl create 101 --ostemplate fedora-15
        # vzctl start 101
        # vzctl exec 101 ps ax
          PID TTY  STAT  TIME COMMAND
            1 ?    Ss    0:00 init
        11830 ?    Ss    0:00 syslogd -m 0
        11897 ?    Ss    0:00 /usr/sbin/sshd
        11943 ?    Ss    0:00 xinetd -stayalive -pidfile ...
        12218 ?    Ss    0:00 sendmail: accepting connections
        12265 ?    Ss    0:00 sendmail: Queue runner@01:00:00
        13362 ?    Ss    0:00 /usr/sbin/httpd
        13363 ?    S     0:00  \_ /usr/sbin/httpd
        ..............................................
         6416 ?    Rs    0:00 ps axf
        # vzctl stop 101
        # vzctl destroy 101
  • 13. OpenVZ kernel only features ● Ploop (snapshots, backups, different formats) ● Second-level quota ● More functional memory accounting ● PFCache (memory deduplication, fewer I/O operations) ● Better isolation than FC19, which lacks userns
  • 15. Andrey Vagin <avagin@openvz.org> ● CRIU - Checkpoint/Restore in User-space
  • 16. What is C/R and how can it be used? C/R is the ability to save the state of processes and restore it later. Usage scenarios: – Failure recovery – Live migration – Reboot-less upgrade – Speeding up slow-boot services – HPC issues
  • 17. History ● Berkeley Lab Checkpoint/Restart (BLCR) (2003) – Load a kernel module and link with a library ● DMTCP: Distributed MultiThreaded CheckPointing (2004-2006) – Preload a library ● OpenVZ (2005) – OpenVZ kernel ● Linux Checkpoint/Restart by Oren Laadan (2008) – A non-mainline kernel ● CRIU (2011) [Timeline: BLCR 2003, OpenVZ 2005, DMTCP 2007, Linux C/R 2008, CRIU 2011]
  • 18. How does this work? [Diagram: crtools sits between kernel objects (process tree, name-spaces, files, sockets, pipes) and image files]
  • 20. Dump ● Parasite code – Receive file descriptors – Dump memory content – prctl(), sigaction, pending signals, timers, etc. ● Ptrace – Freeze processes – Inject parasite code ● Netlink – Get information about sockets, netns ● Procfs – /proc/PID/maps, /proc/PID/map_files/, /proc/PID/status, /proc/PID/mountinfo
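
As an illustration of the procfs part of the dump, here is a minimal sketch that parses /proc/PID/maps into structured records. It assumes the usual whitespace-separated line layout (address range, permissions, offset, device, inode, optional path). The `Mapping` type and the function names are hypothetical, and this is not how crtools itself is implemented.

```python
from typing import NamedTuple


class Mapping(NamedTuple):
    start: int   # first address of the mapping
    end: int     # address just past the mapping
    perms: str   # e.g. "r-xp"
    offset: int  # offset into the backing file
    dev: str     # "major:minor" of the backing device
    inode: int   # 0 for anonymous mappings
    path: str    # empty for anonymous mappings


def parse_maps_line(line):
    """Parse one line of /proc/PID/maps (assumed six-column layout)."""
    fields = line.split(None, 5)
    addr, perms, offset, dev, inode = fields[:5]
    path = fields[5].strip() if len(fields) > 5 else ""
    start, end = (int(part, 16) for part in addr.split("-"))
    return Mapping(start, end, perms, int(offset, 16), dev, int(inode), path)


def read_maps(pid="self"):
    """Read and parse every mapping of a process (Linux only)."""
    with open(f"/proc/{pid}/maps") as maps_file:
        return [parse_maps_line(line) for line in maps_file if line.strip()]
```

A dumper would walk such records to decide which regions are file-backed and which are anonymous memory whose contents must be saved.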
  • 21. Restore ● Collect shared objects ● Restore name-spaces ● Create a process tree – Restore SID, PGID – Restore objects that should be inherited (files, sockets, pipes, ...) ● Restore per-task properties ● Restore memory ● Call sigreturn ● Awesome [Diagram labels: Namespaces, Processes]
  • 22. Interesting moments ● How to restore shared objects? – Send file descriptors via unix sockets – Map files from /proc/self/map_files/ to restore anonymous shared mappings ● How to restore memory mappings at the correct addresses? – Map a new code block and a stack – Unmap crtools' mappings – Remap the task's mappings to the correct addresses ● How to resume a process? – Create a signal frame – Call sigreturn()
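
The "send file descriptors via unix sockets" step can be sketched as follows. This is a minimal illustration, assuming Python 3.9 or later on a Unix system, where `socket.send_fds` and `socket.recv_fds` wrap SCM_RIGHTS ancillary messages. For brevity both ends live in one process; in a restore they would be different tasks of the tree. The helper names are hypothetical.

```python
import os
import socket


def pass_fd(sock, fd):
    """Send one descriptor over a unix socket as SCM_RIGHTS ancillary data."""
    # At least one byte of ordinary data has to travel with the descriptor.
    socket.send_fds(sock, [b"F"], [fd])


def take_fd(sock):
    """Receive one descriptor; the kernel installs it under a new number."""
    _data, fds, _flags, _addr = socket.recv_fds(sock, 1, 1)
    return fds[0]


def demo():
    """Pass the write end of a pipe across a socket pair and use the copy."""
    left, right = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    read_end, write_end = os.pipe()
    received = None
    try:
        pass_fd(left, write_end)
        received = take_fd(right)
        # Writing through the received descriptor reaches the same pipe.
        os.write(received, b"hello")
        return os.read(read_end, 5)
    finally:
        for fd in (read_end, write_end, received):
            if fd is not None:
                os.close(fd)
        left.close()
        right.close()
```

The received descriptor refers to the same open file as the original, which is the property a restorer needs when several tasks must end up sharing one file, socket, or pipe.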
  • 23. Kernel impact ● ~140 patches merged ● ~10 patches in flight ● ~11 new features appeared ● ~2 new features to come
  • 24. New features in a kernel ● Parasite code injection (by Tejun Heo) – Read task state that a task can otherwise retrieve only about itself ● The kcmp() system call – Helps to check which kernel objects are shared between processes ● Proc map_files directory – Find out exactly which file is mapped – Mapping sharing info ● A bunch of prctl extensions – Set various private fields on task/mm objects (C/R-only feature) ● Last-pid sysctl – Restore a task with the desired PID value
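
The last-pid sysctl can be sketched like this. The example assumes the sysctl is exposed as /proc/sys/kernel/ns_last_pid (the slide does not give the path) and that the kernel hands the next created task the stored value plus one. The function name is hypothetical, writing the real file needs sufficient privileges, and a real restorer would also have to guard against other tasks forking in between, which this sketch does not attempt.

```python
# Assumed location of the last-pid sysctl; not named on the slide.
NS_LAST_PID = "/proc/sys/kernel/ns_last_pid"


def request_next_pid(desired_pid, sysctl_path=NS_LAST_PID):
    """Ask the kernel to give desired_pid to the next task it creates.

    Writes desired_pid - 1 to the sysctl and returns the value written.
    """
    if desired_pid < 2:
        raise ValueError("desired_pid must be at least 2")
    last_pid = desired_pid - 1
    with open(sysctl_path, "w") as sysctl:
        sysctl.write(str(last_pid))
    return last_pid
```

A caller would invoke `request_next_pid(pid)` immediately before forking the task that should come back with that PID.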
  • 25. New features in a kernel ● TCP repair mode – Read the internal state of a TCP connection and reconstruct it from scratch on a freshly created socket ● Socket information dumping via netlink (sock_diag) – Extensible engine for retrieving socket state ● Virtual net device indexes – Allow restoring network devices in a namespace ● Socket peeking offset – Allows peeking at socket queues (reading without removing data from the queue) ● Task memory tracking – Incremental snapshots, online migration
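
Reading a socket queue without removing data can be illustrated with the standard MSG_PEEK flag. This is a minimal sketch of plain peeking only: MSG_PEEK always starts at the head of the queue, and the peeking offset from the slide, which lets a reader move past data it has already peeked, is not shown. The function name is hypothetical.

```python
import socket


def peek_then_read(payload=b"queued data"):
    """Peek at queued data, then read it, to show that peeking keeps it."""
    sender, receiver = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        sender.sendall(payload)
        # MSG_PEEK returns the data but leaves it in the receive queue.
        peeked = receiver.recv(len(payload), socket.MSG_PEEK)
        # An ordinary recv still sees the same bytes and consumes them.
        consumed = receiver.recv(len(payload))
        return peeked, consumed
    finally:
        sender.close()
        receiver.close()
```

For a dump this matters because the checkpointed task must keep its queued data after the dumper has looked at it.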
  • 26. What is already supported? – x86_64 architecture – Process tree linkage – Multi-threaded apps – All kinds of memory mappings – Terminals, groups, sessions – Open files (shared and unlinked) – Established TCP connections – Unix sockets, packet sockets – Name-spaces (net, mount, ipc) – Non-POSIX files (epoll, inotify) – Pipes, FIFOs, IPC, ... – ARM architecture – Pending signals – TCP time-stamps – Iterative snapshots – VDSO – LXC and OpenVZ containers ● In flight – POSIX timers – Convert OpenVZ images
  • 27. How is CRIU tested? ● ZDTM – a set of unit tests ● Real-life applications – Apache, Nginx – MySQL, MongoDB, Oracle – Make && gcc – Tar & gzip – Screen – Java – LXC – VNC server + GUI applications
  • 28. Future plans (Feb, 2013) ● Support all kinds of kernel objects ● Merge all in-flight patches into the mainline kernel ● Integrate CRIU with OpenVZ and LXC utilities ● Iterative migration – Migrate memory content before freezing applications ● Integration in distributions – CRIU was accepted into Fedora 19
  • 29. How to use ● ./crtools dump -t pid [<options>] – checkpoint a process/tree identified by pid ● ./crtools restore -t pid [<options>] – restore a process/tree identified by pid ● ./crtools show (-D dir)|(-f file) [<options>] – show the contents of dump file(s) ● ./crtools check – check whether the kernel support is up to date ● ./crtools exec -t pid <syscall-string> – execute a system call in another task
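
The dump and restore commands above could be driven from a script along these lines. This is a minimal sketch that only builds the argument vectors. The slide shows `-D dir` only for `show`; passing it to `dump` and `restore` to select the image directory is an assumption here, and the function name, PID, and directory are hypothetical.

```python
import shlex


def crtools_cmd(action, pid, images_dir=None, binary="./crtools"):
    """Build the argv for './crtools dump -t pid' or './crtools restore -t pid'."""
    if action not in ("dump", "restore"):
        raise ValueError("action must be 'dump' or 'restore'")
    argv = [binary, action, "-t", str(pid)]
    if images_dir is not None:
        # Assumption: -D selects the image directory for dump/restore too.
        argv += ["-D", str(images_dir)]
    return argv


if __name__ == "__main__":
    # Hypothetical PID and directory, printed rather than executed.
    print(shlex.join(crtools_cmd("dump", 1234, "/tmp/ckpt")))
    print(shlex.join(crtools_cmd("restore", 1234, "/tmp/ckpt")))
```

The resulting lists can be handed to `subprocess.run` on a machine where crtools is built and the caller has the required privileges.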

Editor's notes

  1. BLCR uses a kernel module and doesn't checkpoint sockets, SysV IPC, zombies, etc. Applications must be linked with a library and executed via a helper. DMTCP uses a helper too, but doesn't require a kernel module. C/R in OpenVZ is used to checkpoint, restore, and migrate OpenVZ containers. It requires the OpenVZ kernel. Linux C/R is very similar to OpenVZ C/R. It is used for checkpoint/restore of LXC. CRIU combines the approaches of all these projects. It will work on a pure upstream kernel. It is able to dump a task without any preparation.