Byzantine Fault-Tolerant MapReduce
        in Cloud-of-Clouds
       Joint work with: Miguel Correia, Marcelo Pasin,
     Alysson Bessani, Fernando Ramos, Paulo Verissimo
                   Presenter: Pedro Costa


                         Navtalk
Motivation
• How to count the number of words on the
  Internet?
• How to do it with the help of a cloud-of-clouds
  (i.e., several clouds)?
• Guarantee integrity and availability of data




Outline
• Introduction
   – MapReduce programming model
   – Fault tolerance in Cloud-of-clouds
   – 3 problems for Basic scheme
• Our approach
   – Byzantine fault-tolerant MapReduce in clouds-of-clouds
• Evaluation




MAPREDUCE AND FAULTS


What is MapReduce?
• Programming model + execution environment
   • Introduced by Google in 2004
   • Used for processing large data sets using clusters of servers
   • A few implementations available, used by many companies
• Hadoop MapReduce, an open-source MapReduce implementation from Apache
   • The most widely used, and the one we have been using
   • Includes HDFS, a distributed file system for large files




MapReduce basic idea
[Figure: a file with all the words on the Internet is processed by tasktracker servers in the Map phase, which emits <word,1> pairs, and then by tasktracker servers in the Reduce phase, which emits <word,n> pairs.]

Job tracker detects and recovers crashed map/reduce tasks

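The word-count idea above can be sketched in a few lines of plain Python. This is only an illustration of the <word,1> to <word,n> flow, not Hadoop code; the function names `map_phase`, `reduce_phase`, and `word_count` are hypothetical and everything runs in a single process.

```python
from collections import defaultdict


def map_phase(split):
    """Emit a <word, 1> pair for every word in one input split."""
    return [(word, 1) for word in split.split()]


def reduce_phase(pairs):
    """Sum the counts per word, producing <word, n> pairs."""
    counts = defaultdict(int)
    for word, one in pairs:
        counts[word] += one
    return dict(counts)


def word_count(splits):
    """Run the map phase on every split, then reduce all emitted pairs."""
    pairs = []
    for split in splits:
        pairs.extend(map_phase(split))
    return reduce_phase(pairs)


if __name__ == "__main__":
    print(word_count(["the cloud of clouds", "the cloud"]))
```

In a real deployment the map and reduce calls would run as separate tasks on different task trackers; the sketch only shows the data flow.
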
MapReduce components
[Figure: a Wordcount job whose tasks run on task trackers (TT): TT1, TT2, TT3, TT1, TT3.]




But there are more faults…
• Problem: Accidental faults may affect the correctness of the results
  of MapReduce
    • Task corruptions: memory errors, chipset errors, …
    • Cloud outages: MapReduce job interruptions
                     (as reported in popular clouds)

• Our goal:
    • Guarantee integrity and availability (despite task corruptions and
      cloud outages)
    • Develop a new model to compute MapReduce in cloud-of-clouds
    • Commercially feasible?
        Yes, but out of scope of this presentation. See
        Tobias Kurze et al., "Cloud Federation," in Proceedings of the 2nd International
        Conference on Cloud Computing, GRIDs, and Virtualization (CLOUD COMPUTING
        2011).

Byzantine fault-tolerant MapReduce
• Basic idea: replicate tasks in different clouds and vote on the
  results returned by the replicas
   • The set of clouds forms a cloud, hence cloud-of-clouds
   • Inputs are initially stored in all clouds (i.e., not our problem)
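
The voting step can be sketched as follows. This is a minimal sketch under the assumption that a result is accepted once at least t+1 replicas report the same value (with 2t+1 replicas and at most t faulty clouds, the correct replicas always reach that threshold); the function name `vote` is hypothetical.

```python
from collections import Counter


def vote(results, t):
    """Return the value reported by at least t+1 replicas, or None if there is none.

    `results` holds one hashable value (for example an output digest) per replica.
    """
    if not results:
        return None
    value, count = Counter(results).most_common(1)[0]
    return value if count >= t + 1 else None


if __name__ == "__main__":
    # t = 1: three replicas, one of them corrupted.
    print(vote(["part-0", "part-0", "corrupt"], t=1))
```

If no value reaches t+1 matching replies, the sketch returns `None` and leaves the decision to the caller.
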


[Figure: task replicas running in Cloud 1, Cloud 2, and Cloud 3.]




System model
• Client is correct (not part of MapReduce)
• Clouds: up to t clouds can arbitrarily corrupt all tasks and
  other modules they execute
• Why use t and not f? t≤f

• Next:
   • Basic BFT MapReduce scheme
   • 3 problems of the Basic scheme
   • Our approach: Full BFT MapReduce scheme




MapReduce: Map perspective

[Figure: map tasks in official MapReduce compared with cloud-of-clouds, where the map task replicas run in different clouds.]




MapReduce: Reduce perspective

[Figure: reduce tasks in official MapReduce compared with cloud-of-clouds, where the reduce task replicas run in different clouds.]

But we can do better.

Improvements over basic version
• 3 problems have arisen
   • Computation problem
   • Communication problem
   • Job execution control problem


• 3 solutions: our BFT MapReduce can be thought of as the
  basic version plus the following mechanisms:
   • Deferred execution (computation problem)
   • Digest communication (communication problem)
   • Distributed Job tracker (job execution control problem)


Problem 1: computation


[Figure: three replicas of the map task (split 0) and three replicas of the reduce task (part 0), with the replicas in different clouds.]

Tasks are executed 2t+1 times

Solution 1: Deferred execution
• Faults are uncommon, so executing every task 2t+1 times is usually unnecessary
• Job tracker replicates each task across t+1 clouds (the other t stay on standby)
• If results differ or one cloud stops, request 1 more replica (up to t more)
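
A minimal sketch of deferred execution, assuming a hypothetical callable `run_task(cloud)` that returns a hashable result (or raises `RuntimeError` if the cloud has stopped). Replicas are launched one after another here for simplicity; in practice the first t+1 would run in parallel. None of these names come from the original system.

```python
from collections import Counter


def deferred_execute(run_task, clouds, t):
    """Start with t+1 clouds; add one standby cloud at a time until t+1 results match.

    Returns (agreed_value_or_None, number_of_replicas_launched).
    """
    candidates = list(clouds[: 2 * t + 1])
    if len(candidates) < 2 * t + 1:
        raise ValueError("need at least 2t+1 clouds")
    outputs = Counter()
    launched = 0
    while launched < len(candidates):
        cloud = candidates[launched]
        launched += 1
        try:
            outputs[run_task(cloud)] += 1
        except RuntimeError:
            pass  # assumption: a stopped cloud surfaces as RuntimeError
        if launched < t + 1:
            continue  # still inside the initial batch of t+1 replicas
        if outputs:
            value, count = outputs.most_common(1)[0]
            if count >= t + 1:
                return value, launched
    return None, launched


if __name__ == "__main__":
    # No faults, t = 1: only 2 of the 3 clouds are used.
    print(deferred_execute(lambda cloud: "part-0", ["c1", "c2", "c3"], t=1))
```

In the fault-free case only t+1 replicas run; a standby cloud is used only when the first results disagree or a cloud stops.
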


[Figure: two replicas of the map task (split 0) and two replicas of the reduce task (part 0).]



Problem 2: communication


[Figure: replicas of the map task (split 0) and of the reduce task (part 0) in different clouds.]

All this communication goes through the Internet (delay, cost)!

Solution 2: Transferring Digests
• Reduce tasks must fetch the map task outputs
• Intra-cloud fetch: the output is fetched normally
• Inter-cloud fetch: only a hash of the output is fetched (the key idea)
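
The digest comparison can be sketched as below. SHA-256 is an assumption made for illustration (the slides only say "hash"), and the function names `digest` and `accept_map_output` are hypothetical. The reduce task reads the full output from its own cloud and compares its digest with the digests received from other clouds.

```python
import hashlib


def digest(output: bytes) -> str:
    """Hash a map output; SHA-256 is an illustrative choice, not the original one."""
    return hashlib.sha256(output).hexdigest()


def accept_map_output(local_output: bytes, remote_digests, t: int) -> bool:
    """Accept the locally fetched output if t+1 replicas (local one included) agree."""
    local = digest(local_output)
    matches = 1 + sum(1 for d in remote_digests if d == local)
    return matches >= t + 1


if __name__ == "__main__":
    output = b"cloud\t2\nthe\t2\n"
    # Only the 64-character digest crosses the Internet, not the output itself.
    print(accept_map_output(output, [digest(output)], t=1))
```
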


[Figure: the reduce task (part 0) fetches the full output of split 0 from the same cloud and only digests of split 0 from the other clouds.]
Problem 3: Job execution control
• The job tracker controls all task executions in the task trackers in
  all clouds
• If the job tracker is in one cloud, separated from many task
  trackers by the Internet:
   • Communication is slow
   • Large timeouts are needed to detect task tracker failures
   • …and it is a single point of failure (as in MR and Hadoop MR)




Solution 3: Job execution control
[Figure: the Client and the VJT at the top; below them, three job trackers, each with three task trackers.]


EVALUATION


Setup and Test
Platform configuration
• 3 clouds
• Each cloud has 3 nodes
• 1 job tracker (JT) and 3 task trackers (TT) in each cloud
• All JTs are interconnected

Job submitted (Wordcount)
• Input data: 26 chunks of 64 MB (total 1.5 GB)
• Map tasks: 26
• Reduce tasks: 120, 180, 360, 400

Number of reduce tasks executed (no faults, t=1)

Nr. reduce tasks   Job duration (Official)   Job duration (CoC)   Diff
120                00:15:35                  00:17:13             00:02:35
180                00:19:35                  00:21:36             00:02:01
360                00:31:12                  00:33:30             00:02:18
400                00:33:37                  00:36:24             00:02:47
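
The overhead of the cloud-of-clouds (CoC) runs can be recomputed from the two duration columns. The sketch below is only arithmetic on the values in the table; the helper names are hypothetical. For the 180, 360, and 400 rows the computed difference equals the reported Diff. For the 120 row the listed durations differ by 00:01:38 while the table reports 00:02:35, and the source does not say which value is intended.

```python
def to_seconds(hms):
    """Convert an HH:MM:SS string to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def overhead(official, coc):
    """Extra seconds taken by the CoC run compared with the official run."""
    return to_seconds(coc) - to_seconds(official)


# (reduce tasks, official duration, CoC duration, reported diff), copied from the table.
ROWS = [
    (120, "00:15:35", "00:17:13", "00:02:35"),
    (180, "00:19:35", "00:21:36", "00:02:01"),
    (360, "00:31:12", "00:33:30", "00:02:18"),
    (400, "00:33:37", "00:36:24", "00:02:47"),
]

if __name__ == "__main__":
    for tasks, official, coc, reported in ROWS:
        diff = overhead(official, coc)
        percent = 100 * diff / to_seconds(official)
        print(tasks, diff, to_seconds(reported), f"{percent:.1f}%")
```
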
Task details
                                Map duration   Reduce duration
Official                        00:06:47       00:13:18
BFT Cloud-of-clouds (1 view)    00:07:08       00:14:46

[Figure: timelines of map tasks and reduce tasks for each configuration.]




Conclusions
• Our method guarantees integrity and availability despite task
  corruptions and cloud outages
• BFT MapReduce in cloud-of-clouds is feasible!
   • No need to execute in all 2t+1 clouds
   • Only digests sent through the Internet (no “big data”)
   • Control job execution within each cloud




                          Thank you

Bft mr-clouds-of-clouds-discco2012 - navtalk

  • 1. Byzantine Fault-Tolerant MapReduce in Cloud-of-Clouds Joint work with: Miguel Correia, Marcelo Pasin, Alysson Bessani, Fernando Ramos, Paulo Verissimo Presenter: Pedro Costa Navtalk
  • 2. Motivation • How to count the number of words in the internet? • How to do it with the help of a cloud-of-clouds (ie, several clouds) • Guarantee integrity and availability of data 2
  • 3. Outline • Introduction – MapReduce programming model – Fault tolerance in Cloud-of-clouds – 3 problems for Basic scheme • Our approach – Byzantine fault-tolerant MapReduce in clouds-of-clouds • Evaluation 3
  • 5. What is MapReduce? • Programming model + execution environment • Introduced by Google in 2004 • Used for processing large data sets using clusters of servers • A few implementations available, used by many companies • Hadoop MapReduce, an open-source MapReduce of Apache • The most used, the one we have been using • Includes HDFS, a distributed file system for large files 5
  • 6. MapReduce basic idea A file with all the words on the Internet Map Phase <word,1> <word,n> Reduce Phase Tasktracker servers Tasktracker servers Job tracker detects and recovers crashed map/reduce tasks 6
  • 7. MapReduce components [Figure: a Wordcount job whose tasks run on task trackers (TT) TT1, TT2, and TT3]
  • 8. But there are more faults… • Problem: accidental faults may affect the correctness of the results of MapReduce • Task corruptions: memory errors, chipset errors, … • Cloud outages: MapReduce job interruptions (as reported in popular clouds) • Our goal: • Guarantee integrity and availability (despite task corruptions and cloud outages) • Develop a new model to compute MapReduce in a cloud-of-clouds • Commercially feasible? Yes, but out of scope of this presentation • Reference: Tobias Kurze et al., Cloud federation. In Proceedings of the 2nd International Conference on Cloud Computing, GRIDs, and Virtualization (CLOUD COMPUTING 2011).
  • 9. Byzantine fault-tolerant MapReduce • Basic idea: replicate tasks in different clouds and vote on the results returned by the replicas • The set of clouds forms a cloud, hence cloud-of-clouds • Inputs are initially stored in all clouds (i.e., not our problem) [Figure: Cloud 1, Cloud 2, Cloud 3]
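A minimal sketch of the voting step, assuming the usual rule that a result is accepted once t+1 replicas report the same output (at most t clouds are faulty, so at least one of those t+1 is correct). The function name `vote` and the string-encoded outputs are hypothetical; the slides do not specify how results are compared.

```python
from collections import Counter


def vote(replica_outputs, t):
    """Return the output reported by at least t+1 replicas, or None.

    replica_outputs holds one hashable result per replica (assumption:
    each replica ran in a different cloud).
    """
    if not replica_outputs:
        return None
    value, matches = Counter(replica_outputs).most_common(1)[0]
    return value if matches >= t + 1 else None


# With t = 1, three replicas are enough to outvote one corrupted result.
accepted = vote(["the:2", "the:2", "the:9"], t=1)
```

When no output reaches t+1 matching copies, `vote` returns `None` and the caller would have to obtain more replica results before deciding.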
  • 10. System model • Client is correct (not part of MapReduce) • Clouds: up to t clouds can arbitrarily corrupt all tasks and other modules they execute • Why use t and not f? t ≤ f • Next: • Basic BFT MapReduce scheme • 3 problems of the Basic scheme • Our approach: Full BFT MapReduce scheme
  • 11. MapReduce: Map perspective [Figure: Official vs. Cloud-of-Clouds, with replicas in different clouds]
  • 12. MapReduce: Reduce perspective [Figure: Official vs. Cloud-of-Clouds, with replicas in different clouds] • But we can do better.
  • 13. Improvements over the basic version • 3 problems have arisen: • Computation problem • Communication problem • Job execution control problem • 3 solutions: our BFT MapReduce can be thought of as the basic version plus the following mechanisms: • Deferred execution (computation problem) • Digest communication (communication problem) • Distributed job tracker (job execution control problem)
  • 14. Problem 1: computation • Tasks are executed 2t+1 times [Figure: split 0 and part 0 replicated in different clouds]
  • 15. Solution 1: Deferred execution • Computation problem is uncommon • Job Tracker replicates tasks across t+1 clouds (t in standby) • If results differ or one cloud stops, request 1 more (up to t) [Figure: split 0 and part 0 executed in t+1 clouds]
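Deferred execution can be sketched as a loop that starts with t+1 clouds and brings in one standby cloud at a time. This is an illustration under stated assumptions, not the authors' implementation: `run_task` is a hypothetical callable that returns a task's output for a given cloud, or `None` if that cloud stopped, and a result is accepted once t+1 outputs match.

```python
from collections import Counter


def deferred_execution(run_task, clouds, t):
    """Run a task on t+1 clouds first; use standby clouds only if needed.

    Returns (accepted_output, number_of_executions). accepted_output is
    None if 2t+1 executions still give no t+1 matching outputs.
    """
    if len(clouds) < 2 * t + 1:
        raise ValueError("need 2t+1 clouds to tolerate t faulty ones")

    # Optimistic start: only t+1 replicas, the other t clouds stay in standby.
    outputs = [run_task(cloud) for cloud in clouds[: t + 1]]

    while True:
        tally = Counter(out for out in outputs if out is not None)
        if tally:
            value, matches = tally.most_common(1)[0]
            if matches >= t + 1:
                return value, len(outputs)
        if len(outputs) >= 2 * t + 1:
            return None, len(outputs)
        # Results differ or a cloud stopped: request one more execution.
        outputs.append(run_task(clouds[len(outputs)]))


# Hypothetical outcomes for t = 1: cloud c2 returns a corrupted result.
results = {"c1": "ok", "c2": "bad", "c3": "ok"}
value, executions = deferred_execution(results.get, ["c1", "c2", "c3"], t=1)
```

In the fault-free case the loop returns after t+1 executions, which is where the saving over always running 2t+1 replicas comes from.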
  • 16. Problem 2: communication • All this communication goes through the Internet (delay, cost)! [Figure: split 0 and part 0 exchanged between replicas in different clouds]
  • 17. Solution 2: Transferring digests • Reduces must fetch the map task outputs • Intra-cloud fetch: output fetched normally • Inter-cloud fetch: only a hash of the output is fetched (key idea) [Figure: part 0 fetches split 0 from the same cloud and digests of split 0 from other clouds]
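The digest idea can be sketched as follows. The slide only says "hash"; SHA-256 is an assumption here, as are the helper names `digest`, `fetch`, and `validate` and the rule that the local output is accepted once it and the remote digests give t+1 matching copies.

```python
import hashlib


def digest(output: bytes) -> str:
    """Hash of a map output (SHA-256 is an assumed choice of hash)."""
    return hashlib.sha256(output).hexdigest()


def fetch(map_output: bytes, same_cloud: bool):
    """Intra-cloud fetch returns the full output, inter-cloud only its digest."""
    return map_output if same_cloud else digest(map_output)


def validate(local_output: bytes, remote_digests, t: int) -> bool:
    """Accept the local output if it and the remote digests give t+1 matches."""
    local = digest(local_output)
    matches = 1 + sum(1 for d in remote_digests if d == local)
    return matches >= t + 1


# A reduce task holds the full output from its own cloud and one remote digest.
local_output = fetch(b"the\t2\n", same_cloud=True)
remote = [fetch(b"the\t2\n", same_cloud=False)]
ok = validate(local_output, remote, t=1)
```

Only the fixed-size hex digest crosses the Internet, regardless of how large the map output is.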
  • 18. Problem 3: Job execution control • Job tracker controls all task executions in the task trackers in all clouds • If the job tracker is in one cloud, separated from many task trackers by the Internet: • Communication is slow • Large timeouts are needed for detecting task tracker failures • …and it is a single point of failure (this is the case in MR and Hadoop MR)
  • 19. Solution 3: Job execution control [Figure: Client, VJT, and several Job Trackers, each with its own Task Trackers]
  • 21. Setup and Test • Platform configuration: • 3 clouds • Each cloud has 3 nodes • 1 JT and 3 TT for each cloud • All JTs are interconnected • Job submitted (Wordcount): • Input data: 26 chunks of 64 MB (total 1.5 GB) • Map tasks: 26 • Reduce tasks: 120, 180, 360, 400
  • 22. Number of reduce tasks executed (no faults, t=1)

        Nr. reduce tasks | Job duration (Official) | Job duration (CoC) | Diff
        120              | 00:15:35                | 00:17:13           | 00:02:35
        180              | 00:19:35                | 00:21:36           | 00:02:01
        360              | 00:31:12                | 00:33:30           | 00:02:18
        400              | 00:33:37                | 00:36:24           | 00:02:47
  • 23. Task details • Official: map duration 00:06:47, reduce duration 00:13:18 • BFT Cloud-of-clouds (1 view): map duration 00:07:08, reduce duration 00:14:46 [Figure: timelines of map tasks and reduce tasks for each case]
  • 24. Conclusions • Our method guarantees integrity and availability despite task corruptions and cloud outages • BFT MapReduce in cloud-of-clouds is feasible! • No need to execute in all 2t+1 clouds • Only digests are sent through the Internet (no “big data”) • Job execution is controlled within each cloud • Thank you