Live migrating a container:
pros, cons and gotchas
Pavel Emelyanov
Principal engineer @ Virtuozzo

Live migrating a container: pros, cons and gotchas
Monday, November 16 • 17:20 - 18:05

Pavel Emelyanov
Principal Engineer, Odin
Principal engineer on the Odin Server Virtualization team, creator and maintainer of the CRIU project. Joined Parallels in 2004 as a junior Linux kernel developer and later became kernel team leader. Now works on the architecture of the Odin Server products. Pavel tweets at @xemulp.

http://dockerconeu2015.sched.org/event/62e6d2ea7380442a48fafaeee26c9842

Published in: Software

  1. Live migrating a container: pros, cons and gotchas
     Pavel Emelyanov, Principal engineer @ Virtuozzo
  2. Agenda
     • Why you might want to live migrate a container
     • Why (and how) to avoid live migration
     • Why is container live migration so complex
  3. Migration in a nutshell
     • Save state
     • Copy state
     • Restore from state
  4. Why you might want to live migrate a container
     • Spectacular
     • Load balancing
     • Updating kernel
       – Can avoid live migration, just C/R
     • Updating or replacing hardware
  5. Why to avoid live migration
  6. How to avoid live migration
     • Balance network traffic
     • Microservices
     • Crash-driven updates
     • Planned downtime
  7. Making live migration live
     • State saving, transferring and restoring happen with tasks frozen
       – “Scatter” container is too complex
     • Save state quickly
     • Big transfer should not be done at that time
     • Restore from state quickly
  8. Making live migration live
     • Speeding up save/restore is a long-running task
     • Big memory transfer should not be done while tasks are frozen
     • Memory pre-copy
     • Memory post-copy
  9. Pre-copy
     • Copy memory while tasks are running
     • Track memory changes
     • goto again
  10. Pre-copy
      • Pros:
        – Safe: once migrated, source node can disappear
      • Cons:
        – Unpredictable: iterations may take long
        – Non-guaranteed: “dirty” memory next round may remain big
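
The pre-copy loop on slides 9 and 10 can be illustrated with a toy model. This is only a sketch, not CRIU code: the name `simulate_precopy`, its parameters, and the assumption that the pages dirtied during a round are a fixed fraction of the pages sent in that round are all invented for illustration.

```python
def simulate_precopy(total_pages, dirty_ratio, freeze_threshold, max_rounds):
    """Toy model of iterative memory pre-copy.

    total_pages      -- pages in the container's memory (hypothetical number)
    dirty_ratio      -- assumed fraction of a round's pages dirtied meanwhile
    freeze_threshold -- stop iterating once this few pages remain dirty
    max_rounds       -- give up iterating after this many rounds

    Returns (pages sent in each live round, pages left for the frozen phase).
    """
    to_send = total_pages          # round 0 copies everything
    sent_per_round = []
    for _ in range(max_rounds):
        if to_send <= freeze_threshold:
            break                  # small enough to copy with tasks frozen
        sent_per_round.append(to_send)
        # Tasks keep running while we copy, so some pages become dirty again.
        to_send = min(total_pages, int(to_send * dirty_ratio))
    return sent_per_round, to_send


# A workload that dirties half of what was just sent converges quickly...
print(simulate_precopy(1000, 0.5, 100, 10))
# ...while one that dirties everything never does ("non-guaranteed").
print(simulate_precopy(1000, 1.0, 100, 10))
```

In the second call the dirty set never shrinks, so the loop only ends because of `max_rounds`, and the whole memory still has to be moved while tasks are frozen.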
  11. Post-copy
      • Migrate all but memory
      • Turn on “network swap” on destination
  12. Post-copy
      • Pros:
        – Predictable: time to migrate can be well estimated
      • Cons:
        – Unsafe: src node death means death of container on destination
        – Application slows down after migration
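
The “network swap” idea from slides 11 and 12 can be sketched as memory that is fetched from the source on first touch. This is a simplified model, not how CRIU or userfaultfd are programmed: the class `LazyMemory` and the `fetch_page` callback are hypothetical names, and a plain dictionary stands in for the source node.

```python
class LazyMemory:
    """Destination-side memory that pulls each page from the source on first access."""

    def __init__(self, fetch_page):
        self._fetch_page = fetch_page   # hypothetical callback: page number -> bytes
        self._local = {}                # pages that have already arrived
        self.remote_fetches = 0         # how many accesses had to go over the network

    def read(self, page_no):
        if page_no not in self._local:
            # "Page fault": the page is still on the source node.
            self._local[page_no] = self._fetch_page(page_no)
            self.remote_fetches += 1
        return self._local[page_no]


# A dictionary plays the role of the source node's memory.
source_pages = {n: bytes([n]) * 4096 for n in range(4)}
mem = LazyMemory(source_pages.__getitem__)

mem.read(2)            # first touch: fetched remotely (slow)
mem.read(2)            # second touch: served locally (fast)
print(mem.remote_fetches)
```

The model also shows both drawbacks from slide 12: every first access pays a remote fetch, and if `source_pages` disappears before a page was fetched, `read` fails for that page with no way to recover.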
  13. Live migration at length
      • Memory pre-copy (iteratively, optional)
      • Freeze + Save state
      • Copy state
      • Restore from state + Unfreeze and resume
      • Memory post-copy (optional)
  14. Gotchas: VM vs. container
  15. Things to deal with
      • VM
        – Environment: virtual hardware, paravirt
        – CPU
        – Memory
      • Container
        – Environment: cgroups, namespaces
        – Processes and other animals
        – Memory
  16. Memory pre-copy
      • VM
        – All memory at hand
        – Plain address space
      • Container
        – Memory
          ● is scattered across the processes
          ● may or may not be shared
          ● may or may not be mapped to disk files
  17. Save state
      • VM
        – Hardware state
          ● Tree of ~100 objects
          ● Fixed amount of data per object
      • Container
        – State of all objects
          ● Graph of up to ~1000 objects
          ● Each has a different amount of data and a different reading API
  18. Restore from state
      • VM
        – Copy memory in place, write state into devices
      • Container
        – Creation of many small objects
        – Not all have a sane API for creation
          ● Creation sequence can be non-trivial
  19. Memory post-copy
      • userfaultfd from Andrea Arcangeli
      • VM
        – Merged into Linux 4.3
      • Container
        – Non-cooperative work of uffd monitor and client, needs further patching
  20. And we also need this, this and this!
      • Check for CPU compatibility
      • Check and load necessary kernel modules (iptables, filesystems)
      • Non-shared filesystem should be copied
      • Roll back on source node if something fails in between
        – Keep tasks frozen after dump, kill after restore
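
The roll-back rule on slide 20 (keep tasks frozen after dump, kill them only after a successful restore) can be sketched as a small control flow. This is an illustration under stated assumptions, not P.Haul's actual code: `migrate_frozen_phase` and its five callbacks are hypothetical names.

```python
def migrate_frozen_phase(dump, copy, restore, resume_source, kill_source):
    """Run dump -> copy -> restore, rolling back to the source on any failure.

    All five arguments are hypothetical zero-argument callbacks:
      dump          -- save state, leaving the source tasks frozen
      copy          -- transfer the saved state to the destination
      restore       -- recreate the container on the destination
      resume_source -- unfreeze the source tasks (roll-back path)
      kill_source   -- kill the frozen source tasks (success path)

    Returns True if the container now runs on the destination.
    """
    try:
        dump()
        copy()
        restore()
    except Exception:
        # Source tasks were only frozen, never killed, so we can resume them.
        resume_source()
        return False
    # Only now is it safe to get rid of the original.
    kill_source()
    return True
```

Because the source is killed only after `restore` returns, a failure at any earlier step leaves a container that can be resumed where it was frozen.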
  21. Implementation
      • CRIU
        – Save & restore state
        – Memory pre/post copy
      • P.Haul
        – Checks
        – Orchestrate all C/R steps
        – Deal with filesystem
  22. P.Haul goals
      • Provide an engine for container live migration using CRIU
      • Perform necessary pre-checks (e.g. CPU compatibility)
      • Organize memory pre-copy and/or post-copy
      • Take care of file-system migration (if needed)
  23. Under the hood
      Sequence diagram between src and dst, each running docker -d, p.haul and CRIU.
      Labels: migrate; check (CPUs, kernels); pre-dump; memory; dump; other images; restore; lazy mem; FS copy; done; kill.
      Phases: pre-copy, freeze time, post-copy.
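
The pre-dump, dump and restore steps named on slide 23 roughly correspond to CRIU command lines. The sketch below only builds the argument lists and does not run anything. The helper name `criu_command_plan`, the directory layout (`pre0`, `pre1`, ..., `final`) and the choice of options are assumptions for illustration; a real container needs more options than shown, and the flags should be checked against `criu --help` for the installed version.

```python
def criu_command_plan(pid, images_root, pre_dump_rounds):
    """Build (but do not run) CRIU command lines for an iterative migration.

    pid             -- root task of the process tree to migrate
    images_root     -- directory holding one image subdirectory per step (assumed layout)
    pre_dump_rounds -- number of memory pre-copy iterations
    """
    plan = []
    parent = None  # name of the previous image directory, if any
    for i in range(pre_dump_rounds):
        name = f"pre{i}"
        cmd = ["criu", "pre-dump", "--tree", str(pid),
               "--images-dir", f"{images_root}/{name}"]
        if parent is not None:
            # Path is given relative to the current images directory.
            cmd += ["--prev-images-dir", f"../{parent}"]
        plan.append(cmd)
        parent = name

    # Final dump: tasks stay frozen afterwards so the source can roll back.
    dump = ["criu", "dump", "--tree", str(pid),
            "--images-dir", f"{images_root}/final", "--leave-stopped"]
    if parent is not None:
        dump += ["--prev-images-dir", f"../{parent}", "--track-mem"]
    plan.append(dump)

    # Runs on the destination after the image directories have been copied.
    plan.append(["criu", "restore", "--images-dir", f"{images_root}/final"])
    return plan


for command in criu_command_plan(1234, "/tmp/images", 2):
    print(" ".join(command))
```

Copying the image directories and the filesystem between nodes, as well as the CPU and kernel checks, are left out here; in the talk those are P.Haul's job.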
  24. More info
      • http://criu.org
      • http://criu.org/P.Haul
      • criu@openvz.org
      • +CriuOrg / @__criu__
      • https://github.com/xemul/(criu|p.haul)
  25. Thank you!
      Pavel Emelyanov, @__criu__, xemul@openvz.org
