VastSky
Cluster storage system for XCP

Apr. 28th, 2010
VA Linux Systems Japan K.K.
Hirokazu Takahashi
Tomoaki Sato
Takashi Yamamoto

Xen Summit AMD 2010

What is VastSky all about?

VastSky is a cluster storage system made up of many servers and disks, from which VastSky Manager creates logical volumes for VMs.
VMs can run directly on VastSky volumes, which XCP can control.
VastSky is scalable and highly available, and performs well.

XCP

[Figure: architecture diagram. Labels: XCP, XML-RPC request, VastSky Manager, control, create, VM, server, agent, logical volumes, storage pool.]

Announcement
 The VastSky source code is now publicly available at http://sf.net/projects/vastsky/
  Some more work is needed before the first release.

Basic Design
A logical volume is a set of several mirrored disks, each of which consists of several physical disk chunks on different servers.
  The logical volume won’t lose its data even if a physical disk or a storage server in the storage pool breaks.
  All I/O requests, including reads, writes and even re-synchronization requests of mirrored devices, are distributed across all the physical disks.

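
The layout described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not VastSky's actual allocator: the names `Chunk` and `build_volume`, the default of two copies per mirrored disk, and the "most free chunks first" placement rule are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    server: str  # storage server that holds the chunk
    disk: str    # physical disk on that server
    slot: int    # chunk position on the disk


def build_volume(pool, n_mirrored_disks, copies=2):
    """Return a logical volume as a list of mirrored disks.

    pool maps a server name to its list of free Chunks. Every mirrored
    disk gets `copies` chunks, each taken from a different server, so
    losing one disk or one server never removes all copies.
    (`copies=2` is an assumption; the slides only say "several".)
    """
    volume = []
    for _ in range(n_mirrored_disks):
        # Prefer servers with the most free chunks to keep usage even.
        candidates = sorted(
            (s for s in pool if pool[s]),
            key=lambda s: (-len(pool[s]), s),
        )
        if len(candidates) < copies:
            raise RuntimeError("not enough servers with free chunks")
        volume.append([pool[s].pop() for s in candidates[:copies]])
    return volume


# Three storage servers, one disk each, four free chunks per disk.
pool = {
    name: [Chunk(name, "sda", i) for i in range(4)]
    for name in ("ss1", "ss2", "ss3")
}
volume = build_volume(pool, n_mirrored_disks=3, copies=2)
for mirror in volume:
    print([f"{c.server}:{c.disk}#{c.slot}" for c in mirror])
```

Because each mirrored disk draws its chunks from different servers, a volume built this way touches every server in the pool, which is what lets its I/O spread over all the physical disks.
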
How a logical volume is made

[Figure: a logical volume built from the storage pool. Labels: logical volume, storage pool (physical disks), server.]

Good performance
 All I/O operations are done in the Linux kernel without any VastSky Manager interaction.
 I/O loads of logical volumes, which can be extremely unbalanced, are equalized across the physical disks.
 I/O requests to rebuild mirrored devices are also distributed across many physical disks.

Load balancing of read/write requests

[Figure: read/write requests from several logical volumes spread over the storage pool. Labels: logical volumes, storage pool (physical disks), read/write.]

Mirrored disk recovery
A mirrored disk does not have a dedicated spare disk.
When VastSky detects that one of the physical disk chunks of a mirrored disk has caused an error, VastSky Manager allocates a new chunk from the storage pool and assigns it to the mirrored disk as a spare.
  The manager schedules when the spare is assigned, so that two or more re-sync operations won’t work on the same physical disk.
The mirrored disk starts re-synchronizing the disk chunks right after the spare is assigned.

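
The scheduling rule above (no two re-sync operations on the same physical disk) could look like the following greedy sketch. The slides do not describe the manager's real policy, so the function name `schedule_resyncs`, the job tuple layout, and the choice to count both the source disk and the spare's disk as "in use" are assumptions.

```python
def schedule_resyncs(pending, busy=frozenset()):
    """Pick the re-sync jobs that may start now.

    pending: list of (mirror_id, source_disk, spare_disk) tuples.
    busy:    disks already used by a running re-sync.
    Returns (started, deferred). No physical disk appears in more than
    one started job; deferred jobs wait for a later scheduling pass.
    """
    in_use = set(busy)
    started, deferred = [], []
    for job in pending:
        _mirror, source, spare = job
        if source in in_use or spare in in_use:
            deferred.append(job)  # would collide with another re-sync
        else:
            in_use.update((source, spare))
            started.append(job)
    return started, deferred


pending = [
    ("m1", "ss1:sda", "ss3:sdb"),
    ("m2", "ss2:sda", "ss3:sdb"),  # same spare disk as m1
    ("m3", "ss2:sdb", "ss4:sda"),
]
started, deferred = schedule_resyncs(pending)
print("start now:", [j[0] for j in started])
print("wait:     ", [j[0] for j in deferred])
```
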
Mirrored disk recovery

[Figure: a mirrored disk (a part of a logical volume) made of physical disk chunks, one of which is BROKEN.]
1. Detect the error.
2. Allocate a chunk from the storage pool.
3. Delete the failed disk and assign the chunk as a spare.

Load balancing when re-synchronizing the mirrored devices
  When a physical disk breaks, VastSky tries to rebuild all the mirrored disks related to that physical disk simultaneously, since its disk chunks belong to different mirrored disks.
    No need to rebuild if the disk chunk is unused.

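
A rough sketch of such a rebuild plan follows. It is not taken from the VastSky code: `plan_rebuild`, its dictionary arguments and the "fewest rebuilds so far" tie-break are hypothetical, and a real allocator would also have to keep the spare off the servers that hold the surviving copies.

```python
def plan_rebuild(failed_disk, chunk_owner, free_chunks):
    """Decide where each affected mirrored disk gets its spare chunk.

    chunk_owner: {(disk, slot): mirror_id or None}; None marks an
                 unused chunk, which needs no rebuild.
    free_chunks: {disk: number of free chunks on that disk}.
    Returns {mirror_id: disk chosen for the spare}.
    """
    free = {d: n for d, n in free_chunks.items() if d != failed_disk and n > 0}
    assigned = {d: 0 for d in free}  # rebuilds placed on each disk so far
    plan = {}
    for (disk, _slot), mirror in sorted(chunk_owner.items()):
        if disk != failed_disk or mirror is None:
            continue  # healthy disk, or unused chunk on the failed disk
        candidates = [d for d in free if free[d] > 0]
        if not candidates:
            raise RuntimeError("storage pool has no free chunk left")
        # Spread spares so the rebuilds hit different physical disks.
        target = min(candidates, key=lambda d: (assigned[d], d))
        free[target] -= 1
        assigned[target] += 1
        plan[mirror] = target
    return plan


chunk_owner = {
    ("d1", 0): "mA",
    ("d1", 1): "mB",
    ("d1", 2): None,   # unused chunk: skipped
    ("d1", 3): "mC",
    ("d5", 0): "mA",   # surviving copy on a healthy disk
}
free_chunks = {"d1": 0, "d2": 3, "d3": 3, "d4": 3}
plan = plan_rebuild("d1", chunk_owner, free_chunks)
print(plan)
```

With three affected mirrored disks and three disks holding free chunks, each rebuild lands on a different disk, so the re-sync traffic runs in parallel instead of queuing on one spare.
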
Load balancing when re-synchronizing the mirrored devices

[Figure: one physical disk in the storage pool has crashed and several spares are in use. Labels: logical volumes, storage pool (physical disks), spare, crashed.]

Scalable
 Each volume can handle its I/O operations independently.
   VastSky Manager is not involved in them.
 Servers can be added to the system dynamically.

How to set up
VastSky should be installed together with VM management software such as XCP, which takes care of the VM life cycle.
Network redundancy should be implemented outside VastSky, for example with a bonding device.
Hardware health checking should also be done outside VastSky, and ideally it can tell VastSky which server or disk should be removed.
The current implementation of VastSky requires HA cluster software to detect that its manager is down and restart it.

API
VastSky supports an XML-RPC interface and a command-line interface (CUI) with operations such as:
  Define a logical volume.
  Attach the logical volume to a specified server.
  Detach the volume.
  Notify which disk or server has gone.
  Add a new server or a physical disk.
  Delete the server or the physical disk.

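
To give a feel for what an XML-RPC request for one of these operations might look like, the sketch below encodes and decodes a call with Python's standard `xmlrpc.client` module. The method name `defineLogicalVolume` and its parameters are invented for illustration; the slides do not give VastSky's real method names or argument formats.

```python
import xmlrpc.client

# Hypothetical operation and arguments, NOT VastSky's documented API.
METHOD = "defineLogicalVolume"
ARGS = {"name": "vm1-root", "size_gb": 20}

# Serialize the call into the XML body an XML-RPC client would POST.
request_xml = xmlrpc.client.dumps((ARGS,), methodname=METHOD)
print(request_xml)

# A server decodes the same body back into a method name and parameters.
params, method = xmlrpc.client.loads(request_xml)
print(method, params)
```

In practice a client would not build the XML by hand: `xmlrpc.client.ServerProxy`, pointed at the manager's endpoint, turns attribute access into calls like this one.
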
ToDo (1)
 XCP integration.
  Under development.
 Improve scalability.
  Network-topology-aware volume allocation. When creating a new logical volume, physical disk chunks should be allocated from storage servers close to the server that owns the logical volume.
 Logical volume expansion feature.
 Snapshot feature for guest volumes.

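
Topology-aware allocation, as proposed above, might be approximated like this. Everything here is a sketch under assumptions: `pick_servers`, the `distance` callback, and the rack-based hop count in the example are hypothetical and not part of VastSky.

```python
def pick_servers(head, free_chunks, distance, copies=2):
    """Choose `copies` distinct storage servers for one mirrored disk.

    head:        server that owns (attaches) the logical volume.
    free_chunks: {storage_server: number of free chunks}.
    distance:    function (head, storage_server) -> network cost.
    Nearest servers win; ties go to the server with more free chunks.
    """
    candidates = [s for s, n in free_chunks.items() if n > 0]
    if len(candidates) < copies:
        raise RuntimeError("not enough storage servers with free chunks")
    candidates.sort(key=lambda s: (distance(head, s), -free_chunks[s], s))
    return candidates[:copies]


# Example topology (assumed): cost 0 inside a rack, 1 across racks.
rack = {"head1": "r1", "ss1": "r1", "ss2": "r2", "ss3": "r1", "ss4": "r2"}


def hops(a, b):
    return 0 if rack[a] == rack[b] else 1


free_chunks = {"ss1": 5, "ss2": 9, "ss3": 2, "ss4": 0}
chosen = pick_servers("head1", free_chunks, hops)
print(chosen)
```

Note the trade-off this exposes: keeping every copy close to the head server shortens the I/O path but can put all copies in one rack, so a real policy would probably balance distance against failure domains.
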
Ideas for implementing the volume snapshot feature
 Use dm-snap. It is the easiest way, but it is slow.
 Write a completely new implementation, as Parallax does, but it will take a long time.
 Use OCFS2, which has rich features, but it will be a bit heavy.

An idea of using OCFS2
 If you place only one VM's volume in an OCFS2 filesystem on a logical volume on a head server, you can obtain:
  A better snapshot mechanism using reflink, a new OCFS2 feature.
  Thin provisioning.
  The volume can still be moved to another server.

Place only one guest's volume in an ocfs2 filesystem

[Figure: Server A runs VM1 and VM2; Server B runs VM2 (after migration) and VM3. Each VM's volume and its snapshots sit in a separate OCFS2 filesystem on its own logical volume, created from the physical storage pool. Labels: migrate, snap, create, "can be extended".]

ToDo (2)
 Shared storage for VMs, which some types of active/active clustering software require. The key issue is how to rebuild the mirrored devices.
  How to determine which server should take on the job of rebuilding a mirrored device.
  Make the rebuilding job and write access to the device mutually exclusive.
 Fast VM deployment and cloning. This can be done by combining the “shared storage” and “snapshot” features.

Fast VM deployment

[Figure: two VMs, each on its own logical volume that is a copy-on-write (cow) snapshot of a shared read-only image in the storage pool, e.g. one with CentOS installed.]

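
The copy-on-write idea behind fast deployment can be shown with a toy block store. This is a conceptual sketch only: `CowVolume` is a hypothetical name, and the feature itself was still on the ToDo list, so nothing here reflects an existing VastSky implementation.

```python
class CowVolume:
    """Block-level copy-on-write view over a shared read-only image."""

    def __init__(self, base):
        self._base = base   # shared golden image, never modified
        self._delta = {}    # block number -> data written by this VM

    def read(self, block):
        # Blocks this VM has written come from its delta, the rest
        # from the shared base image.
        if block in self._delta:
            return self._delta[block]
        return self._base[block]

    def write(self, block, data):
        # Writes never touch the base image; they go to the private delta.
        self._delta[block] = data


# One installed image (a tuple, so it cannot be changed), two new VMs.
golden = (b"bootloader", b"kernel", b"rootfs")
vm1 = CowVolume(golden)
vm2 = CowVolume(golden)

vm1.write(2, b"rootfs+vm1-config")
print(vm1.read(2))  # vm1 sees its own change
print(vm2.read(2))  # vm2 still sees the golden image
```

Deploying a VM this way costs only an empty delta, which is why combining a shared read-only image with snapshots makes cloning fast.
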
ToDo (3)
 Make one server able to manage both VMs and many physical disks.
   Do you really want this feature?
 Improve the disk chunk allocation algorithm.
   Make it aware of disk performance.
 Graceful server termination.
   Copies of the chunks on the server should be prepared before the termination.
 Make VastSky Manager able to run in a VM.
   This needs some trick: the information VastSky needs to create the VM's volume is stored in that very volume.

Roadmap
 First version release
  XCP integration
  Make it stable
  Performance test
  Write documents
  The target date is this coming June.
 Second version and after
  The rest of the ToDo list. What should we do first?

Thank you!
Vastsky xen summit20100428

  • 1. VastSky Cluster storage system for XCP Apr. 28th, 2010 VA Linux Systems Japan K.K. Hirokazu Takahashi Tomoaki Sato Takashi Yamamoto Xen Summit AMD 2010
  • 2. What is VastSky all about? VastSky is a cluster storage system  made up of a lot of servers and  disks, from which VastSky Manager  creates logical volumes for VMs VMs can directly run on VastSky,  which XCP can control VastSky is scalable, high availabile and has a good performance Xen Summit AMD 2010
  • 3. XCP ate XML r e q PC cr e ues -R t VM VM VM VM VM VM server agent agent control VastSky Manager logical volumes l tro ate n co cre agent agent server agent storage pool Xen Summit AMD 2010
  • 4. Announcement The code of VastSky has become  open at  http://sf.net/projects/vastsky/ Some more work is needed to be done  before the first release. Xen Summit AMD 2010
  • 5. Basic Design A logical volume is a set of several  mirrored disks, each of which  consists of several physical disk  chunks on different servers. The logical volume won’t lose its data  whether a physical disk or a storage  server in the storage pool has broken. All I/O requests, including read,  write and even re‐synchronizing  requests of mirrored devices will be  distributed to all the physical disks.  Xen Summit AMD 2010
  • 6. The way of making a logical volume: [Diagram. Labels: logical volume, storage pool (physical disks), server, server.]
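
The layout described in the Basic Design slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not VastSky's actual allocator: the names `Chunk`, `StoragePool`, and `make_logical_volume` are hypothetical, and the "prefer the disk with the most free chunks" policy is an assumed way to spread load. The one rule taken from the slides is that the chunks of a mirrored disk sit on different servers.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    """One physical disk chunk (hypothetical model, not VastSky's)."""
    server: str
    disk: str
    index: int


class StoragePool:
    """Pool of free chunks contributed by the storage servers."""

    def __init__(self, chunks):
        self.free = set(chunks)

    def allocate(self, exclude_servers=frozenset()):
        """Take one free chunk that is not on any excluded server."""
        candidates = [c for c in self.free if c.server not in exclude_servers]
        if not candidates:
            raise RuntimeError("no free chunk on a suitable server")
        free_per_disk = Counter((c.server, c.disk) for c in self.free)
        # Assumed policy: prefer the disk with the most free chunks,
        # with a deterministic tie-break on server, disk, and index.
        best = max(
            candidates,
            key=lambda c: (free_per_disk[(c.server, c.disk)],
                           c.server, c.disk, -c.index),
        )
        self.free.remove(best)
        return best


def make_logical_volume(pool, mirrored_disks, replicas):
    """Build a volume as a list of mirrored disks (lists of chunks)."""
    volume = []
    for _ in range(mirrored_disks):
        mirror = []
        for _ in range(replicas):
            # Replicas of one mirrored disk must live on different servers.
            used = {c.server for c in mirror}
            mirror.append(pool.allocate(exclude_servers=used))
        volume.append(mirror)
    return volume


# Three servers with one disk each, four chunks per disk.
chunks = [Chunk(s, "sda", i) for s in ("srv1", "srv2", "srv3") for i in range(4)]
pool = StoragePool(chunks)
volume = make_logical_volume(pool, mirrored_disks=3, replicas=2)
```

Because no two replicas of a mirrored disk share a server, losing a single disk or a single server leaves at least one copy of every mirrored disk intact.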
  • 7. Good performance: All I/O operations are done in the Linux kernel without any VastSky Manager interaction. The I/O loads of logical volumes, which can be extremely unbalanced, are equalized across the physical disks. I/O requests to rebuild mirrored devices are also distributed across many physical disks.
  • 8. Load balancing of read/write requests: [Diagram. Labels: logical volumes, storage pool (physical disks), read/write.]
  • 9. Load balancing of read/write requests: [Diagram. Labels: logical volumes, storage pool (physical disks), read/write.]
  • 10. Mirrored disk recovery: A mirrored disk doesn’t have its own spare disk. When VastSky detects that one of the physical disk chunks of a mirrored disk has caused an error, VastSky Manager allocates a new chunk from the storage pool and assigns it to the mirrored disk as a spare. The manager schedules when the spare is assigned, so that two or more re-sync operations won’t work on the same physical disk. The mirrored disk starts re-synchronizing its disk chunks right after the spare is assigned.
  • 11. Mirrored disk recovery: [Diagram of a mirrored disk (a part of a logical volume), its physical disk chunks, one of them BROKEN, and the storage pool. Steps: 1. Detect the error. 2. Allocate a chunk. 3. Delete the failed disk and assign the chunk as a spare.]
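
The scheduling constraint from the recovery slides (no two re-sync operations on the same physical disk at once) can be illustrated with a greedy grouping into rounds. This is a sketch under assumptions, not the manager's real scheduler: `schedule_resyncs` and the job tuple layout are hypothetical, and it assumes that both the surviving source disk and the disk holding the new spare count as "in use" by a re-sync.

```python
def schedule_resyncs(jobs):
    """Group re-sync jobs into rounds that never share a physical disk.

    jobs: list of (mirror_id, source_disk, spare_disk) tuples
          (hypothetical layout chosen for this sketch).
    Returns a list of rounds; the jobs within one round can run in parallel.
    """
    rounds = []
    for job in jobs:
        _mirror_id, source_disk, spare_disk = job
        for rnd in rounds:
            # A job fits into a round only if neither of its disks is busy.
            if source_disk not in rnd["busy"] and spare_disk not in rnd["busy"]:
                rnd["jobs"].append(job)
                rnd["busy"].update((source_disk, spare_disk))
                break
        else:
            # No existing round can take the job, so open a new one.
            rounds.append({"jobs": [job], "busy": {source_disk, spare_disk}})
    return [rnd["jobs"] for rnd in rounds]


jobs = [
    ("m1", "d1", "d4"),
    ("m2", "d2", "d5"),
    ("m3", "d1", "d6"),  # shares source disk d1 with m1, so it must wait
]
plan = schedule_resyncs(jobs)
```

Here `m1` and `m2` land in the first round and `m3` is deferred to a second round, because it would otherwise read from the same disk as `m1`.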
  • 12. Load balancing when re-synchronizing the mirrored devices: When a physical disk breaks, VastSky tries to rebuild all the mirrored disks related to that physical disk simultaneously, since its disk chunks belong to different mirrored disks. There is no need to rebuild a disk chunk that is unused.
  • 13. Load balancing when re-synchronizing the mirrored devices: [Diagram. Labels: storage pool (physical disks), logical volumes, crashed, spare, spare, spare.]
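
A small sketch of the point made in the two slides above: the chunks of one failed disk belong to different mirrored disks, so several rebuilds can start at once, and unused chunks are skipped. The mapping `chunk_owner` and the function `mirrors_to_rebuild` are hypothetical names for illustration, not VastSky data structures.

```python
def mirrors_to_rebuild(chunk_owner, failed_disk):
    """Return the mirrored disks that lost a chunk on the failed disk.

    chunk_owner maps (disk, chunk_index) to a mirror id, or to None when
    the chunk is unused (assumed representation for this sketch).
    """
    affected = {
        mirror
        for (disk, _index), mirror in chunk_owner.items()
        if disk == failed_disk and mirror is not None  # skip unused chunks
    }
    return sorted(affected)


chunk_owner = {
    ("disk1", 0): "mirror-a",
    ("disk1", 1): "mirror-b",
    ("disk1", 2): None,        # unused chunk: nothing to rebuild
    ("disk2", 0): "mirror-a",
    ("disk2", 1): "mirror-c",
}
todo = mirrors_to_rebuild(chunk_owner, "disk1")
```

Each mirror in `todo` receives its own spare chunk, so the rebuild traffic is spread over several disks instead of one dedicated spare.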
  • 14. Scalable: Each volume handles its I/O operations independently, without involving VastSky Manager. Servers can be added to the system dynamically.
  • 15. How to set up: VastSky should be installed together with VM management software such as XCP, which takes care of the VM life cycle. Network redundancy should be implemented outside VastSky, for example with a bonding device. Hardware health checks should also run outside VastSky and, ideally, tell VastSky which server or disk should be removed. The current implementation of VastSky requires HA cluster software to detect that its manager is down and restart it.
  • 16. API: VastSky supports an XML-RPC interface and a CUI with operations such as: define a logical volume; attach the logical volume to a specified server; detach the volume; notify which disk or server has gone; add a new server or a physical disk; delete a server or a physical disk.
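
The slides do not give the XML-RPC method names or parameters, so the following only shows what such requests could look like on the wire, using Python's standard `xmlrpc.client` module. The method names (`defineLogicalVolume`, `attachLogicalVolume`, `detachLogicalVolume`) and their arguments are assumptions made up for this sketch, not the VastSky API.

```python
import xmlrpc.client


def build_request(method, *params):
    """Serialize one XML-RPC call into its request body (no network I/O)."""
    return xmlrpc.client.dumps(params, methodname=method)


# Hypothetical method names and arguments, for illustration only.
requests = [
    build_request("defineLogicalVolume", "vm1-root", 20),
    build_request("attachLogicalVolume", "vm1-root", "server-a"),
    build_request("detachLogicalVolume", "vm1-root"),
]

for body in requests:
    # loads() decodes a request body back into (params, methodname).
    params, method = xmlrpc.client.loads(body)
    print(method, params)
```

With a real manager endpoint, the same calls would normally go through `xmlrpc.client.ServerProxy`, using whatever method names the VastSky documentation defines.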
  • 17. ToDo (1): XCP integration (under development). Improve scalability. Network-topology-aware volume allocation: when creating a new logical volume, physical disk chunks should be allocated from storage servers close to the server that owns the logical volume. Logical volume expansion feature. Snapshot feature for guest volumes.
  • 18. Ideas for implementing the volume snapshot feature: Use dm-snap, which is the easiest way but is slow. Write a completely new implementation, as Parallax does, which will take a long time. Use OCFS2, which has rich features but will be a bit heavy.
  • 19. An idea of using OCFS2: If you place only one VM's volume in an OCFS2 filesystem on a logical volume on a head server, you obtain: a better snapshot mechanism using reflink, a new OCFS2 feature; thin provisioning; and a volume that can still be moved to another server.
  • 20. Place only one guest's volume in an OCFS2 filesystem: [Diagram. Labels: Server A, Server B, VM1, VM2, VM3, migrate, OCFS2, VM1's volume, VM2's volume, VM3's volume, snap, logical volume, can be extended, create, physical storage pool.]
  • 21. ToDo (2): Shared storage for VMs, which some types of active/active clustering software require. The key issue is how to rebuild the mirrored devices: how to determine which server should take the job of rebuilding a mirrored device, and how to make the rebuilding job and write access to the device mutually exclusive. Fast VM deployment and cloning, which can be done by combining the “shared storage” and “snapshot” features.
  • 22. Fast VM deployment: [Diagram. Labels: VM, logical volume, snapshot, cow, read only (e.g. CentOS installed), storage pool.]
  • 23. ToDo (3): Make one server able to manage both VMs and many physical disks. (Do you really want this feature?) Improve the disk chunk allocation algorithm and make it aware of disk performance. Graceful server termination: copies of the chunks on the server should be prepared before the termination. Make VastSky Manager able to run in a VM. This needs some trick, because the information VastSky needs to create the VM's volume is stored in that same volume.
  • 24. Roadmap: First version release: XCP integration, make it stable, performance tests, write documents. The target date is this coming June. Second version and after: the rest of the ToDo list. What should we do first?
  • 25. Thank you!