SlideShare ist ein Scribd-Unternehmen logo
1 von 46
Simple Regenerating Codes:
Network Coding for Cloud
Storage
Dimitris S. Papailiopoulos, Jianqiang Luo,
Alexandros G. Dimakis, Cheng Huang, and Jin Li

INFOCOM 2012

Presented by Tangkai
Index
 About the author
 Introduction
 SRC
 Simulations
 Conclusion
About the author
   Jianqiang Luo
    ◦ Experience
        Senior Software Engineer @ EMC
        Received PhD, Wayne State University
        Intern @ Microsoft, Data Domain
        Team Leader @ Actuate
        Received MS, SJTU
    ◦ Specialties
      Working on distributed storage systems during
       PhD
       Performance profiling.
About the author
   Alexandros G. Dimakis
    ◦ Assistant Professor
      Dept of EE – Systems, USC

    ◦ Research interests:
      Communications, signal processing and
       networking.


    ◦ INFOCOM 2012 - 2
    ◦ Erasure code MDS MSR MBR etc
About the author
   Cheng Huang
    ◦ Education
      Microsoft Research
      Ph.D. Washington University
      B.S. and M.S. EE Dept, SJTU

    ◦ Research interest
      cloud services, internet measurements, erasure
       correction codes, distributed storage systems, peer-to-
       peer streaming, networking and multimedia
       communications.

    ◦ INFOCOM 2011
      Public DNS System and Global Traffic Management
      Estimating the Performance of Hypothetical Cloud
       Service Deployments: A Measurement-Based Approach
About the author
   Jin Li
    ◦ Experience
       Microsoft Research
       BS/MS/PhD THU (within 7 years)
       计算机普及要从娃娃抓起


    ◦ Title
       IEEE Fellow
       GLOBECOM/ICME/ACM MM Chair
Index
 About the author
 Introduction
 SRC
 Simulations
 Conclusion
Introduction
   Background
    ◦ We have come into BIG DATA ERA!
      Digital Universe 1.8 ZB (=1.8e9 TB)
      Several PBs photo stored on Facebook
      14.1PB data stored on Taobao (2010)


    ◦ Data security is IMPORTANT
      Free from unwanted actions of unauthorized
       users.
      Free from data loss caused by destructive
       forces
Introduction
   Background
    ◦ Recovery
       rare exception -> regular operation
         GFS[1]:
           Hundreds or even thousands of machines
           Inexpensive commodity parts
           High concurrency/IO
    ◦ High failure tolerance, both for
       High availability and to prevent data loss
[1] S. Ghemawat, H. Gobioff, and S.-T. Leung, “The Google file system,” in
SOSP ’03: Proc. of the 19th ACM Symposium on Operating Systems
Principles, 2003.
Introduction
   Background
    ◦ Erasure coding > replication
      1. redundancy level, reliability
      2. reliability, storage cost
    ◦ Some applications
      Cloud storage systems
      Archival storage
      Peer-to-peer storage systems
Introduction
   Erasure coding: MDS            n=3                n=4
                       k=2
         File or                    A                  A
          data          A
         object

     A             B                B                  B

                        B

                                  A+B                A+B


                             (3,2) MDS code,
                               (single parity)      A+2B
                              used in RAID 5
                                                   (4,2) MDS
                                                 code. Tolerates
                                                  any 2 failures
                                                 Used in RAID 6
Introduction
                 Erasure coding vs. Replica[3]erasure code
                                        (4,2) MDS
                                             Replication        (any 2 suffice to recover)

            File or                              A                      A
             data              A
            object


                                                 A                      B
                                                           vs
                               B

                                                 B                    A+B



                                                 B                   A+2B

[3]A. G. Dimakis, P. G. Godfrey, Y. Wu, M. J. Wainwright, and K. Ramchandran,“Network
coding for distributed storage systems,” in IEEE Trans. on Inform. Theory, vol. 56, pp.
Introduction
                 Erasure coding vs. Replica[3]erasure code
                                        (4,2) MDS
                                                  Replication    (any 2 suffice to recover)

            File or                                    A                 A
             data                  A
            object


                                                       A                 B
                        Erasure coding is introducing redundancy in an optimal way.
                                                                 vs
                                    B      Very useful in practice
                      i.e. Reed-Solomon codes, Fountain Codes, (LT and Raptor)…
                                                       B               A+B



                                                       B              A+2B

[3]A. G. Dimakis, P. G. Godfrey, Y. Wu, M. J. Wainwright, and K. Ramchandran,“Network
coding for distributed storage systems,” in IEEE Trans. on Inform. Theory, vol. 56, pp.
Introduction
   Metrics
    ◦ Storage per node (α)
    ◦ Repair Bandwidth per single node repair
      (γ)
    ◦ Disk Accesses per single node repair (d)
    ◦ Effective Coding Rate (R)

   Contribution
    ◦ High R, Small d
    ◦ Low repair computation complexity
Index
 About the author
 Introduction
 SRC
 Simulations
 Conclusion
SRC
   SRC: Simple Regenerating Codes
    ◦ Regenerating Codes
      address the issue of rebuilding (also called
       repairing) lost encoded fragments from existing
       encoded fragments. This issue arises in
       distributed storage systems where
       communication to maintain encoded
       redundancy is a problem.
SRC
    Object
        Requirement I: (n, k) property
            MDS[2]




[2] Alexandros G. Dimakis, Kannan Ramchandran, Yunnan
Wu, Changho Suh:
A Survey on Network Codes for Distributed Storage. in Proceedings of the
SRC
 ◦ MDS
SRC
    Requirement II: efficient exact repair
     ◦ Efficient: Low complexity
     ◦ Exact repair (vs. functional repair)[3] :
        1. [demands]Data have to stay in systematic
         form
        2. [complexity]Updating repairing-decoding
         rules-> additional overhead
        3. [security] dynamic repairing-and-decoding
         rules observed by eavesdroppers ->
         information leakage
[2] Changho Suh, Kannan Ramchandran: Exact Regeneration Codes
for Distributed Storage Repair Using Interference Alignment. in IEEE
TRANSACTIONS ON INFORMATION THEORY, VOL. 57, NO. 3, MARCH
SRC
   Solution

    ◦ MDS codes are used to provide reliability
      to meets Requirement I

    ◦ simple XORs applied over the MDS coded
      packets provide efficient exact repair to
      meets Requirement II
SRC
   Construction
SRC
   Repair
(n,k,2)-SRC
   Code Construction
    ◦ File f , of size M = 2k
    ◦ Split into 2 parts

    ◦ 1. 2 independent (n,k)-MDS encoding

    ◦ 2. Generating a parity sum vector using
      XOR
(n,k,2)-SRC
   Distribution
    ◦ 3n chunks in n storage nodes
(n,k,2)-SRC
   Repair
(n,k,f)-SRC
   General Code Construction
    ◦ File f , of size M = fk
    ◦ Cut into f parts

    ◦ 1. f independent (n,k)-MDS encoding

    ◦ 2. Generating a parity sum vector using
      XOR
(n,k,f)-SRC
   Distribution
    ◦ (f+1)n chunks in n storage nodes
(n,k,f)-SRC
   Repair
(n,k,f)-SRC
   Theorem
    ◦ Effective Coding Rate (R)



      SRC is a fraction f/f+1 of the coding rate of an
       (n, k) MDS code, hence is upper bounded
(n,k,f)-SRC
   Theorem
    ◦ Effective Coding Rate (R)
(n,k,f)-SRC
   Theorem
    ◦ Storage per node (α)

    ◦ Repair Bandwidth per single node repair
      (γ)

    ◦ Disk Accesses per single node repair (d)
      Seek time
(n,k,f)-SRC
   Theorem
    ◦ Disk Accesses per single node repair (d)
      Starting with f disk accesses for the first chunk
       repair
(n,k,f)-SRC
   Theorem
    ◦ Disk Accesses per single node repair (d)



      each additional chunk repair requires an
       additional disk access
(n,k,f)-SRC
   Comparasion
(n,k,f)-SRC
   Asymptotics of the SRC -> MDS
    ◦ let the degree of parities f grow as a
      function of k

    ◦ Repair Bandwidth per single node repair
      (γ)



    ◦ Effective Coding Rate (R)
Index
 About the author
 Introduction
 SRC
 Simulations
 Conclusion
Simulations
   Simulator Introduction
    ◦ One master, other storage server.
    ◦ Chunks form the smallest accessible data
      units and in our system are set to be
      64MB

   Simulator Validation
    ◦   16 machines
    ◦   1Gbps network.
    ◦   410GB data per machine
    ◦   Approximately 6400 chunks
Simulations
   Simulator Validation
    ◦ matches very well, when the percentile is
      below 95
Simulations
   Storage Cost Analysis
    ◦ 3-way replication as baseline
Simulations
   Repair Performance
    ◦ Calculated on time
    ◦ Highlights: Scalability
Simulations
   Degraded Read Performance
    ◦ The only difference is after a chunk is
      repaired, we do not write it back.
Simulations
   Data Reliability Analysis
    ◦ simple Markov model to estimate the
      reliability
    ◦ 5 years /1PB data /
    ◦ 30 min for replica / 15 min for SRC
Simulations
   Data Reliability Analysis
      Several order of magnitude of reliablity
      Scalability
Index
 About the author
 Introduction
 SRC
 Simulations
 Conclusion
Conclusions
   Highlight
    ◦ R-S
      Low IO/bandwidth -> scalability
    ◦ replica
      High reliability
      Decent repair/degraded read performance
Critical Thinking
 Simulation
 (n, k)as n grows, erasure
  performance is weaker
 Compare
    ◦ MSR?
    ◦ Exact?
    ◦ Implementation - > Simulation

Weitere ähnliche Inhalte

Was ist angesagt?

Iaetsd implementation of power efficient iterative logarithmic multiplier usi...
Iaetsd implementation of power efficient iterative logarithmic multiplier usi...Iaetsd implementation of power efficient iterative logarithmic multiplier usi...
Iaetsd implementation of power efficient iterative logarithmic multiplier usi...
Iaetsd Iaetsd
 
Aes cryptography algorithm based on intelligent blum blum-shub prn gs publica...
Aes cryptography algorithm based on intelligent blum blum-shub prn gs publica...Aes cryptography algorithm based on intelligent blum blum-shub prn gs publica...
Aes cryptography algorithm based on intelligent blum blum-shub prn gs publica...
zaidinvisible
 
129966862758614726[1]
129966862758614726[1]129966862758614726[1]
129966862758614726[1]
威華 王
 

Was ist angesagt? (19)

Hardware Implementations of RS Decoding Algorithm for Multi-Gb/s Communicatio...
Hardware Implementations of RS Decoding Algorithm for Multi-Gb/s Communicatio...Hardware Implementations of RS Decoding Algorithm for Multi-Gb/s Communicatio...
Hardware Implementations of RS Decoding Algorithm for Multi-Gb/s Communicatio...
 
Iaetsd implementation of power efficient iterative logarithmic multiplier usi...
Iaetsd implementation of power efficient iterative logarithmic multiplier usi...Iaetsd implementation of power efficient iterative logarithmic multiplier usi...
Iaetsd implementation of power efficient iterative logarithmic multiplier usi...
 
Design and Implementation of an Embedded System for Software Defined Radio
Design and Implementation of an Embedded System for Software Defined RadioDesign and Implementation of an Embedded System for Software Defined Radio
Design and Implementation of an Embedded System for Software Defined Radio
 
Ecc cipher processor based on knapsack algorithm
Ecc cipher processor based on knapsack algorithmEcc cipher processor based on knapsack algorithm
Ecc cipher processor based on knapsack algorithm
 
Aes cryptography algorithm based on intelligent blum blum-shub prn gs publica...
Aes cryptography algorithm based on intelligent blum blum-shub prn gs publica...Aes cryptography algorithm based on intelligent blum blum-shub prn gs publica...
Aes cryptography algorithm based on intelligent blum blum-shub prn gs publica...
 
Design of Reversible Sequential Circuit Using Reversible Logic Synthesis
Design of Reversible Sequential Circuit Using Reversible Logic SynthesisDesign of Reversible Sequential Circuit Using Reversible Logic Synthesis
Design of Reversible Sequential Circuit Using Reversible Logic Synthesis
 
Hardware implementation of (63, 51) bch encoder and decoder for wban using lf...
Hardware implementation of (63, 51) bch encoder and decoder for wban using lf...Hardware implementation of (63, 51) bch encoder and decoder for wban using lf...
Hardware implementation of (63, 51) bch encoder and decoder for wban using lf...
 
Design and implementation of log domain decoder
Design and implementation of log domain decoder Design and implementation of log domain decoder
Design and implementation of log domain decoder
 
Watermarking of JPEG2000 Compressed Images with Improved Encryption
Watermarking of JPEG2000 Compressed Images with Improved EncryptionWatermarking of JPEG2000 Compressed Images with Improved Encryption
Watermarking of JPEG2000 Compressed Images with Improved Encryption
 
Research Paper
Research PaperResearch Paper
Research Paper
 
Reduced Complexity Maximum Likelihood Decoding Algorithm for LDPC Code Correc...
Reduced Complexity Maximum Likelihood Decoding Algorithm for LDPC Code Correc...Reduced Complexity Maximum Likelihood Decoding Algorithm for LDPC Code Correc...
Reduced Complexity Maximum Likelihood Decoding Algorithm for LDPC Code Correc...
 
Reduced Energy Min-Max Decoding Algorithm for Ldpc Code with Adder Correction...
Reduced Energy Min-Max Decoding Algorithm for Ldpc Code with Adder Correction...Reduced Energy Min-Max Decoding Algorithm for Ldpc Code with Adder Correction...
Reduced Energy Min-Max Decoding Algorithm for Ldpc Code with Adder Correction...
 
Cryptoghraphy
CryptoghraphyCryptoghraphy
Cryptoghraphy
 
129966862758614726[1]
129966862758614726[1]129966862758614726[1]
129966862758614726[1]
 
Rc6 algorithm
Rc6 algorithmRc6 algorithm
Rc6 algorithm
 
Performance Analysis of Steepest Descent Decoding Algorithm for LDPC Codes
Performance Analysis of Steepest Descent Decoding Algorithm for LDPC CodesPerformance Analysis of Steepest Descent Decoding Algorithm for LDPC Codes
Performance Analysis of Steepest Descent Decoding Algorithm for LDPC Codes
 
IRJET- FPGA Implementation of Image Encryption and Decryption using Fully Hom...
IRJET- FPGA Implementation of Image Encryption and Decryption using Fully Hom...IRJET- FPGA Implementation of Image Encryption and Decryption using Fully Hom...
IRJET- FPGA Implementation of Image Encryption and Decryption using Fully Hom...
 
DESIGN OF SOFT VITERBI ALGORITHM DECODER ENHANCED WITH NON-TRANSMITTABLE CODE...
DESIGN OF SOFT VITERBI ALGORITHM DECODER ENHANCED WITH NON-TRANSMITTABLE CODE...DESIGN OF SOFT VITERBI ALGORITHM DECODER ENHANCED WITH NON-TRANSMITTABLE CODE...
DESIGN OF SOFT VITERBI ALGORITHM DECODER ENHANCED WITH NON-TRANSMITTABLE CODE...
 
International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)
 

Andere mochten auch

The Performance of MapReduce: An In-depth Study
The Performance of MapReduce: An In-depth StudyThe Performance of MapReduce: An In-depth Study
The Performance of MapReduce: An In-depth Study
Kevin Tong
 
TCP-FIT: An Improved TCP Congestion Control Algorithm and its Performance
TCP-FIT: An Improved TCP Congestion Control Algorithm and its PerformanceTCP-FIT: An Improved TCP Congestion Control Algorithm and its Performance
TCP-FIT: An Improved TCP Congestion Control Algorithm and its Performance
Kevin Tong
 

Andere mochten auch (20)

Enabling data integrity protection in regenerating coding-based cloud storage...
Enabling data integrity protection in regenerating coding-based cloud storage...Enabling data integrity protection in regenerating coding-based cloud storage...
Enabling data integrity protection in regenerating coding-based cloud storage...
 
Ieeepro techno solutions ieee dotnet project - nc cloud applying network co...
Ieeepro techno solutions   ieee dotnet project - nc cloud applying network co...Ieeepro techno solutions   ieee dotnet project - nc cloud applying network co...
Ieeepro techno solutions ieee dotnet project - nc cloud applying network co...
 
140320702029 maurya ppt
140320702029 maurya ppt140320702029 maurya ppt
140320702029 maurya ppt
 
The Performance of MapReduce: An In-depth Study
The Performance of MapReduce: An In-depth StudyThe Performance of MapReduce: An In-depth Study
The Performance of MapReduce: An In-depth Study
 
臺灣閩南語推薦用字700字表
臺灣閩南語推薦用字700字表臺灣閩南語推薦用字700字表
臺灣閩南語推薦用字700字表
 
臺灣閩南語推薦用字第二批
臺灣閩南語推薦用字第二批臺灣閩南語推薦用字第二批
臺灣閩南語推薦用字第二批
 
臺灣閩南語羅馬字拼音方案使用手冊
臺灣閩南語羅馬字拼音方案使用手冊臺灣閩南語羅馬字拼音方案使用手冊
臺灣閩南語羅馬字拼音方案使用手冊
 
全球最佳外派目的地 新加坡居冠台灣第8
全球最佳外派目的地 新加坡居冠台灣第8全球最佳外派目的地 新加坡居冠台灣第8
全球最佳外派目的地 新加坡居冠台灣第8
 
漢語間統計式機器翻譯語料處理-用臺灣閩南語示範
漢語間統計式機器翻譯語料處理-用臺灣閩南語示範漢語間統計式機器翻譯語料處理-用臺灣閩南語示範
漢語間統計式機器翻譯語料處理-用臺灣閩南語示範
 
Transport methods in 3DTV--A Survey
Transport methods in 3DTV--A SurveyTransport methods in 3DTV--A Survey
Transport methods in 3DTV--A Survey
 
走入現代生活的台灣諺語
走入現代生活的台灣諺語走入現代生活的台灣諺語
走入現代生活的台灣諺語
 
花宅聚落數位典藏執行簡報20081124
花宅聚落數位典藏執行簡報20081124花宅聚落數位典藏執行簡報20081124
花宅聚落數位典藏執行簡報20081124
 
Analysis of Adaptive Streaming for Hybrid CDN/P2P Live Video Systems
Analysis of Adaptive Streaming for Hybrid CDN/P2P Live Video SystemsAnalysis of Adaptive Streaming for Hybrid CDN/P2P Live Video Systems
Analysis of Adaptive Streaming for Hybrid CDN/P2P Live Video Systems
 
談莫札特的歌劇《女人皆如此》
談莫札特的歌劇《女人皆如此》談莫札特的歌劇《女人皆如此》
談莫札特的歌劇《女人皆如此》
 
閩南俚語
閩南俚語閩南俚語
閩南俚語
 
TCP-FIT: An Improved TCP Congestion Control Algorithm and its Performance
TCP-FIT: An Improved TCP Congestion Control Algorithm and its PerformanceTCP-FIT: An Improved TCP Congestion Control Algorithm and its Performance
TCP-FIT: An Improved TCP Congestion Control Algorithm and its Performance
 
Parte 1 - Linux ed i sistemi embedded per le reti (di Andrea Tassi)
Parte 1 - Linux ed i sistemi embedded per le reti (di Andrea Tassi)Parte 1 - Linux ed i sistemi embedded per le reti (di Andrea Tassi)
Parte 1 - Linux ed i sistemi embedded per le reti (di Andrea Tassi)
 
Real-Coded Extended Compact Genetic Algorithm based on Mixtures of Models
Real-Coded Extended Compact Genetic Algorithm based on Mixtures of ModelsReal-Coded Extended Compact Genetic Algorithm based on Mixtures of Models
Real-Coded Extended Compact Genetic Algorithm based on Mixtures of Models
 
On Extended Compact Genetic Algorithm
On Extended Compact Genetic AlgorithmOn Extended Compact Genetic Algorithm
On Extended Compact Genetic Algorithm
 
Lecture
LectureLecture
Lecture
 

Ähnlich wie Simple regenerating codes: Network Coding for Cloud Storage

Big data analytics K.Kiruthika II-M.Sc.,Computer Science Bonsecours college f...
Big data analytics K.Kiruthika II-M.Sc.,Computer Science Bonsecours college f...Big data analytics K.Kiruthika II-M.Sc.,Computer Science Bonsecours college f...
Big data analytics K.Kiruthika II-M.Sc.,Computer Science Bonsecours college f...
Kiruthikak14
 
Key-value databases in practice Redis @ DotNetToscana
Key-value databases in practice Redis @ DotNetToscanaKey-value databases in practice Redis @ DotNetToscana
Key-value databases in practice Redis @ DotNetToscana
Matteo Baglini
 
Hw09 Hadoop Based Data Mining Platform For The Telecom Industry
Hw09   Hadoop Based Data Mining Platform For The Telecom IndustryHw09   Hadoop Based Data Mining Platform For The Telecom Industry
Hw09 Hadoop Based Data Mining Platform For The Telecom Industry
Cloudera, Inc.
 
Data Driven Innovation with Amazon Web Services
Data Driven Innovation with Amazon Web ServicesData Driven Innovation with Amazon Web Services
Data Driven Innovation with Amazon Web Services
Amazon Web Services
 
OrientDB the graph database
OrientDB the graph databaseOrientDB the graph database
OrientDB the graph database
artem_orobets
 
OrientDB the graph database
OrientDB the graph databaseOrientDB the graph database
OrientDB the graph database
Artem Orobets
 

Ähnlich wie Simple regenerating codes: Network Coding for Cloud Storage (20)

Multi core k means
Multi core k meansMulti core k means
Multi core k means
 
Data Grids with Oracle Coherence
Data Grids with Oracle CoherenceData Grids with Oracle Coherence
Data Grids with Oracle Coherence
 
call for papers, research paper publishing, where to publish research paper, ...
call for papers, research paper publishing, where to publish research paper, ...call for papers, research paper publishing, where to publish research paper, ...
call for papers, research paper publishing, where to publish research paper, ...
 
Module 2 network and computer security
Module 2 network and computer securityModule 2 network and computer security
Module 2 network and computer security
 
Big data analytics K.Kiruthika II-M.Sc.,Computer Science Bonsecours college f...
Big data analytics K.Kiruthika II-M.Sc.,Computer Science Bonsecours college f...Big data analytics K.Kiruthika II-M.Sc.,Computer Science Bonsecours college f...
Big data analytics K.Kiruthika II-M.Sc.,Computer Science Bonsecours college f...
 
07784576
0778457607784576
07784576
 
Accelerate Reed-Solomon coding for Fault-Tolerance in RAID-like system
Accelerate Reed-Solomon coding for Fault-Tolerance in RAID-like systemAccelerate Reed-Solomon coding for Fault-Tolerance in RAID-like system
Accelerate Reed-Solomon coding for Fault-Tolerance in RAID-like system
 
International Journal of Computer Science, Engineering and Applications (IJCSEA)
International Journal of Computer Science, Engineering and Applications (IJCSEA)International Journal of Computer Science, Engineering and Applications (IJCSEA)
International Journal of Computer Science, Engineering and Applications (IJCSEA)
 
Dryad Paper Review and System Analysis
Dryad Paper Review and System AnalysisDryad Paper Review and System Analysis
Dryad Paper Review and System Analysis
 
Ag32224229
Ag32224229Ag32224229
Ag32224229
 
MongoDB World 2018: MongoDB for High Volume Time Series Data Streams
MongoDB World 2018: MongoDB for High Volume Time Series Data StreamsMongoDB World 2018: MongoDB for High Volume Time Series Data Streams
MongoDB World 2018: MongoDB for High Volume Time Series Data Streams
 
Robust video data hiding using forbidden zone
Robust video data hiding using forbidden zoneRobust video data hiding using forbidden zone
Robust video data hiding using forbidden zone
 
Key-value databases in practice Redis @ DotNetToscana
Key-value databases in practice Redis @ DotNetToscanaKey-value databases in practice Redis @ DotNetToscana
Key-value databases in practice Redis @ DotNetToscana
 
Computing Outside The Box September 2009
Computing Outside The Box September 2009Computing Outside The Box September 2009
Computing Outside The Box September 2009
 
Hw09 Hadoop Based Data Mining Platform For The Telecom Industry
Hw09   Hadoop Based Data Mining Platform For The Telecom IndustryHw09   Hadoop Based Data Mining Platform For The Telecom Industry
Hw09 Hadoop Based Data Mining Platform For The Telecom Industry
 
Distribute Storage System May-2014
Distribute Storage System May-2014Distribute Storage System May-2014
Distribute Storage System May-2014
 
Redis Day TLV 2018 - 10 Reasons why Redis should be your Primary Database
Redis Day TLV 2018 - 10 Reasons why Redis should be your Primary DatabaseRedis Day TLV 2018 - 10 Reasons why Redis should be your Primary Database
Redis Day TLV 2018 - 10 Reasons why Redis should be your Primary Database
 
Data Driven Innovation with Amazon Web Services
Data Driven Innovation with Amazon Web ServicesData Driven Innovation with Amazon Web Services
Data Driven Innovation with Amazon Web Services
 
OrientDB the graph database
OrientDB the graph databaseOrientDB the graph database
OrientDB the graph database
 
OrientDB the graph database
OrientDB the graph databaseOrientDB the graph database
OrientDB the graph database
 

Kürzlich hochgeladen

Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
Joaquim Jorge
 

Kürzlich hochgeladen (20)

Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdf
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 

Simple regenerating codes: Network Coding for Cloud Storage

  • 1. Simple Regenerating Codes: Network Coding for Cloud Storage Dimitris S. Papailiopoulos, Jianqiang Luo, Alexandros G. Dimakis, Cheng Huang, and Jin Li INFOCOM 2012 Presented by Tangkai
  • 2. Index  About the author  Introduction  SRC  Simulations  Conclusion
  • 3. About the author  Jianqiang Luo ◦ Experience  Senior Software Engineer @ EMC  Received PhD, Wayne State University  Intern @ Microsoft, Data Domain  Team Leader @ Actuate  Received MS, SJTU ◦ Specialties  Working on distributed storage systems during PhD Performance profiling.
  • 4. About the author  Alexandros G. Dimakis ◦ Assistant Professor Dept of EE – Systems, USC ◦ Research interests:  Communications, signal processing and networking. ◦ INFOCOM 2012 - 2 ◦ Erasure code MDS MSR MBR etc
  • 5. About the author  Cheng Huang ◦ Education  Microsoft Research  Ph.D. Washington University  B.S. and M.S. EE Dept, SJTU ◦ Research interest  cloud services, internet measurements, erasure correction codes, distributed storage systems, peer-to- peer streaming, networking and multimedia communications. ◦ INFOCOM 2011  Public DNS System and Global Traffic Management  Estimating the Performance of Hypothetical Cloud Service Deployments: A Measurement-Based Approach
  • 6. About the author  Jin Li ◦ Experience  Microsoft Research  BS/MS/PhD THU (within 7 years)  计算机普及要从娃娃抓起 ◦ Title  IEEE Fellow  GLOBECOM/ICME/ACM MM Chair
  • 7. Index  About the author  Introduction  SRC  Simulations  Conclusion
  • 8. Introduction  Background ◦ We have come into BIG DATA ERA!  Digital Universe 1.8 ZB (=1.8e9 TB)  Several PBs photo stored on Facebook  14.1PB data stored on Taobao (2010) ◦ Data security is IMPORTANT  Free from unwanted actions of unauthorized users.  Free from data loss caused by destructive forces
  • 9. Introduction  Background ◦ Recovery  rare exception -> regular operation  GFS[1]:  Hundreds or even thousands of machines  Inexpensive commodity parts  High concurrency/IO ◦ High failure tolerance, both for  High availability and to prevent data loss [1] S. Ghemawat, H. Gobioff, and S.-T. Leung, “The Google file system,” in SOSP ’03: Proc. of the 19th ACM Symposium on Operating Systems Principles, 2003.
  • 10. Introduction  Background ◦ Erasure coding > replication  1. redundancy level, reliability  2. reliability, storage cost ◦ Some applications  Cloud storage systems  Archival storage  Peer-to-peer storage systems
  • 11. Introduction  Erasure coding: MDS n=3 n=4 k=2 File or A A data A object A B B B B A+B A+B (3,2) MDS code, (single parity) A+2B used in RAID 5 (4,2) MDS code. Tolerates any 2 failures Used in RAID 6
  • 12. Introduction  Erasure coding vs. Replica[3]erasure code (4,2) MDS Replication (any 2 suffice to recover) File or A A data A object A B vs B B A+B B A+2B [3]A. G. Dimakis, P. G. Godfrey, Y. Wu, M. J. Wainwright, and K. Ramchandran,“Network coding for distributed storage systems,” in IEEE Trans. on Inform. Theory, vol. 56, pp.
  • 13. Introduction  Erasure coding vs. Replica[3]erasure code (4,2) MDS Replication (any 2 suffice to recover) File or A A data A object A B Erasure coding is introducing redundancy in an optimal way. vs B Very useful in practice i.e. Reed-Solomon codes, Fountain Codes, (LT and Raptor)… B A+B B A+2B [3]A. G. Dimakis, P. G. Godfrey, Y. Wu, M. J. Wainwright, and K. Ramchandran,“Network coding for distributed storage systems,” in IEEE Trans. on Inform. Theory, vol. 56, pp.
  • 14. Introduction  Metrics ◦ Storage per node (α) ◦ Repair Bandwidth per single node repair (γ) ◦ Disk Accesses per single node repair (d) ◦ Effective Coding Rate (R)  Contribution ◦ High R, Small d ◦ Low repair computation complexity
  • 15. Index  About the author  Introduction  SRC  Simulations  Conclusion
  • 16. SRC  SRC: Simple Regenerating Codes ◦ Regenerating Codes  address the issue of rebuilding (also called repairing) lost encoded fragments from existing encoded fragments. This issue arises in distributed storage systems where communication to maintain encoded redundancy is a problem.
  • 17. SRC  Object  Requirement I: (n, k) property  MDS[2] [2] Alexandros G. Dimakis, Kannan Ramchandran, Yunnan Wu, Changho Suh: A Survey on Network Codes for Distributed Storage. in Proceedings of the
  • 19. SRC  Requirement II: efficient exact repair ◦ Efficient: Low complexity ◦ Exact repair (vs. functional repair)[3] :  1. [demands]Data have to stay in systematic form  2. [complexity]Updating repairing-decoding rules-> additional overhead  3. [security] dynamic repairing-and-decoding rules observed by eavesdroppers -> information leakage [2] Changho Suh, Kannan Ramchandran: Exact Regeneration Codes for Distributed Storage Repair Using Interference Alignment. in IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 57, NO. 3, MARCH
  • 20. SRC  Solution ◦ MDS codes are used to provide reliability to meets Requirement I ◦ simple XORs applied over the MDS coded packets provide efficient exact repair to meets Requirement II
  • 21. SRC  Construction
  • 22. SRC  Repair
  • 23. (n,k,2)-SRC  Code Construction ◦ File f , of size M = 2k ◦ Split into 2 parts ◦ 1. 2 independent (n,k)-MDS encoding ◦ 2. Generating a parity sum vector using XOR
  • 24. (n,k,2)-SRC  Distribution ◦ 3n chunks in n storage nodes
  • 25. (n,k,2)-SRC  Repair
  • 26. (n,k,f)-SRC  General Code Construction ◦ File f , of size M = fk ◦ Cut into f parts ◦ 1. f independent (n,k)-MDS encoding ◦ 2. Generating a parity sum vector using XOR
  • 27. (n,k,f)-SRC  Distribution ◦ (f+1)n chunks in n storage nodes
  • 28. (n,k,f)-SRC  Repair
  • 29. (n,k,f)-SRC  Theorem ◦ Effective Coding Rate (R)  SRC is a fraction f/f+1 of the coding rate of an (n, k) MDS code, hence is upper bounded
  • 30. (n,k,f)-SRC  Theorem ◦ Effective Coding Rate (R)
  • 31. (n,k,f)-SRC  Theorem ◦ Storage per node (α) ◦ Repair Bandwidth per single node repair (γ) ◦ Disk Accesses per single node repair (d)  Seek time
  • 32. (n,k,f)-SRC  Theorem ◦ Disk Accesses per single node repair (d)  Starting with f disk accesses for the first chunk repair
  • 33. (n,k,f)-SRC  Theorem ◦ Disk Accesses per single node repair (d)  each additional chunk repair requires an additional disk access
  • 34. (n,k,f)-SRC  Comparasion
  • 35. (n,k,f)-SRC  Asymptotics of the SRC -> MDS ◦ let the degree of parities f grow as a function of k ◦ Repair Bandwidth per single node repair (γ) ◦ Effective Coding Rate (R)
  • 36. Index  About the author  Introduction  SRC  Simulations  Conclusion
  • 37. Simulations  Simulator Introduction ◦ One master, other storage server. ◦ Chunks form the smallest accessible data units and in our system are set to be 64MB  Simulator Validation ◦ 16 machines ◦ 1Gbps network. ◦ 410GB data per machine ◦ Approximately 6400 chunks
  • 38. Simulations  Simulator Validation ◦ matches very well, when the percentile is below 95
  • 39. Simulations  Storage Cost Analysis ◦ 3-way replication as baseline
  • 40. Simulations  Repair Performance ◦ Calculated on time ◦ Highlights: Scalability
  • 41. Simulations  Degraded Read Performance ◦ The only difference is after a chunk is repaired, we do not write it back.
  • 42. Simulations  Data Reliability Analysis ◦ simple Markov model to estimate the reliability ◦ 5 years /1PB data / ◦ 30 min for replica / 15 min for SRC
  • 43. Simulations  Data Reliability Analysis  Several order of magnitude of reliablity  Scalability
  • 44. Index  About the author  Introduction  SRC  Simulations  Conclusion
  • 45. Conclusions  Highlight ◦ R-S  Low IO/bandwidth -> scalability ◦ replica  High reliability  Decent repair/degraded read performance
  • 46. Critical Thinking  Simulation  (n, k)as n grows, erasure performance is weaker  Compare ◦ MSR? ◦ Exact? ◦ Implementation - > Simulation