Characterizing Fault Tolerance of Genetic
       Algorithms in Desktop Grid Systems

   Daniel Lombraña González, Juan Luis Jiménez Laredo,
   Francisco Fernández de Vega, Juan Julián Merelo Guervós

                                          April 8, 2010

D. Lombraña, JJ. Jiménez, F. Fernández, JJ. Merelo   Evocop 2010
Outline




1      Introduction


2      Motivation


3      Methodology


4      Experiments and Results


5      Conclusions




Introduction





Parallel Genetic Algorithms (PGA)




        Evolutionary Algorithms (EAs) sometimes require long
        execution times.
        One solution is to use:
                parallel computing and
                distributed platforms.





Parallel algorithms can be run in





Failures in distributed platforms




        Distributed platforms are prone to errors.
        Failures are expected events rather than catastrophic
        exceptions.





Fault Tolerance




  Fault Tolerance
  is the ability of a system to behave in a well-defined manner
  once a failure occurs.






  Different techniques have been developed to cope with failures:

        Redundancy,
                S. Ghosh. Distributed systems: an algorithmic approach. Chapman & Hall/CRC, 2006.

        Checkpointing,
                E. Elnozahy, L. Alvisi, Y. Wang, and D. Johnson. A survey of rollback-recovery protocols in
                message-passing systems. ACM Computing Surveys (CSUR), 34(3):375–408, 2002.

        Rejuvenation frameworks,
                A. T. Tai and K. S. Tso. A performability-oriented software rejuvenation framework for distributed
                applications. In DSN ’05, pages 570–579, Washington, DC, USA, 2005. IEEE Computer Society.

        etc.








  Using a fault-tolerance technique requires modifying:
        the application, and sometimes even
        the parallel algorithm.
  This modification can be a heavy burden for the
  developer.




Motivation





Parallel EAs and Fault Tolerance




  To the best of our knowledge
  there has been little research about the fault tolerance features
  of PEAs in general and of PGA applications in particular.





Previous Works



        We first studied the fault-tolerant nature of Parallel
        Genetic Programming (PGP) on:
                real-world Desktop Grid systems,
        concluding that PGP is fault-tolerant by default.
                Daniel Lombraña González, Francisco Fernández de Vega, and Henri Casanova.
                Characterizing fault tolerance in genetic programming.
                Future Generation Computer Systems, 2010.
                DOI: 10.1016/j.future.2010.02.006.
                Daniel Lombraña González, Francisco Fernández de Vega, and Henri Casanova.
                Characterizing fault tolerance in genetic programming.
                In Workshop on Bio-Inspired Algorithms for Distributed Systems,
                pages 1–10. Barcelona, Spain, 2009. ISBN 978-1-60558-564-2.





Proposal




  Based on this insight,
  this work builds on the previous ones and extends the
  study of fault tolerance in EAs to PGAs, using the same
  methodology.




Methodology





Master-Worker





Desktop Grid platforms (DGs)




        DGs exhibit large numbers of failures.
        DG failure behavior has been studied in the literature.
        DGs are low-cost compared to clusters of
        comparable scale.
        PGA applications are loosely coupled and thus
        well-suited to DGs.





Desktop Grid Platforms




        DGs are very promising for PGA applications, and
        their high failure rate makes them a good test case for
        studying and characterizing the fault tolerance of
        PGAs.





Experiments




       To characterize the fault-tolerant nature of PGAs, we
       run two kinds of experiments:
               runs in a failure-free environment, and
               simulations that replay failure traces from real-world DG
               platforms.





DG traces




        We perform simulations of DG platforms and of host
        availability based on three real-world traces:
                entrfin,
                ucb,
                xwtr.








           Trace      Hosts   Venue                   Time
           Entrfin    275     San Diego               1.0 months
           Ucb        85      UC Berkeley             1.5 months
           Xwtr       100     Université Paris-Sud    1.0 months





Using the traces




        We consider two cases:
                hosts that become unavailable never become available
                again (worst-case assumption), and
                complete host churn (unavailable hosts can be
                re-acquired later).
        Each case is run for two different days of each trace.
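
The two cases can be illustrated with a small sketch. It assumes a trace is stored as one row per time step with one boolean per host, which is a hypothetical representation and not the format of the entrfin, ucb, or xwtr traces. The worst-case variant is derived by never letting a host return once it has been seen unavailable.

```python
from typing import List

Trace = List[List[bool]]  # trace[t][h] is True when host h is available at step t


def without_return(trace: Trace) -> Trace:
    """Worst-case variant: a host that becomes unavailable never comes back."""
    if not trace:
        return []
    alive = [True] * len(trace[0])
    result = []
    for step in trace:
        # A host stays alive only if it has been available at every step so far.
        alive = [a and up for a, up in zip(alive, step)]
        result.append(list(alive))
    return result


def hosts_available(trace: Trace) -> List[int]:
    """Number of available hosts at each time step."""
    return [sum(step) for step in trace]


# Hypothetical 3-host, 4-step trace (not taken from any real trace).
toy = [
    [True, True, True],
    [True, False, True],
    [True, True, False],
    [False, True, True],
]

print(hosts_available(toy))                  # host churn: [3, 2, 2, 2]
print(hosts_available(without_return(toy)))  # no return:  [3, 2, 1, 0]
```

With churn the number of hosts fluctuates, while the no-return variant can only decrease over time.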





Host availability for 1 day of the ucb trace


   [Figure: number of available computers (y-axis, 0 to 25) versus time step
   (x-axis, 0 to 300); series: Original Trace, Trace without return]




Experiments and Results





Problems




       We conduct experiments with a 3-trap instance:
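\[
\mathrm{trap}(u(\vec{x})) =
\begin{cases}
\dfrac{a}{z}\,\bigl(z - u(\vec{x})\bigr), & \text{if } u(\vec{x}) \le z \\[4pt]
\dfrac{b}{l - z}\,\bigl(u(\vec{x}) - z\bigr), & \text{otherwise}
\end{cases}
\tag{1}
\]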
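
As a minimal sketch of Equation (1), the function below evaluates one trap block and sums the blocks of a bit string. It takes u as the number of ones in a block and l as the block length. The slide does not give the values of a, b, and z; the defaults used here (a = 2, b = 3, z = 2 for l = 3) are illustrative assumptions only.

```python
from typing import Sequence


def trap(u: int, a: float, b: float, z: int, l: int) -> float:
    """Equation (1): deceptive slope towards u = 0, optimum at u = l."""
    if u <= z:
        return (a / z) * (z - u)
    return (b / (l - z)) * (u - z)


def trap_fitness(bits: Sequence[int], k: int = 3,
                 a: float = 2.0, b: float = 3.0, z: int = 2) -> float:
    """Sum of trap values over consecutive blocks of k bits.

    a, b and z are assumed values, not taken from the slides.
    """
    if len(bits) % k != 0:
        raise ValueError("individual length must be a multiple of k")
    return sum(trap(sum(bits[i:i + k]), a, b, z, k)
               for i in range(0, len(bits), k))


print(trap_fitness([1] * 30))  # global optimum under these assumptions: 30.0
print(trap_fitness([0] * 30))  # deceptive attractor under these assumptions: 20.0
```

Under these assumed parameters, a block of all zeros scores higher than a block with one or two ones, which is what makes the function deceptive for a GA.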





GA Parameters for 3-Trap instance



             Trap instance
               Size of sub-function (k)       3
               Number of sub-functions (m)    10
               Individual length (L)          30

             GA settings
               GA                             GGA
               Population size                3000
               Selection of parents           Binary tournament
               Recombination                  Uniform crossover, pc = 1.0
               Mutation                       Bit-flip mutation, pm = 1/L
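
The settings above can be read as one generational step, sketched below. The operator names and the structure of `next_generation` are hypothetical. Full generational replacement without elitism is an assumption, and OneMax (`sum`) stands in for the trap fitness to keep the example self-contained.

```python
import random
from typing import Callable, List

Individual = List[int]


def binary_tournament(pop: List[Individual], fit: List[float],
                      rng: random.Random) -> Individual:
    """Pick two individuals at random and return the fitter one."""
    i, j = rng.randrange(len(pop)), rng.randrange(len(pop))
    return pop[i] if fit[i] >= fit[j] else pop[j]


def uniform_crossover(p1: Individual, p2: Individual,
                      rng: random.Random) -> Individual:
    """Each gene is copied from either parent with equal probability."""
    return [a if rng.random() < 0.5 else b for a, b in zip(p1, p2)]


def bit_flip(ind: Individual, pm: float, rng: random.Random) -> Individual:
    """Flip each bit independently with probability pm."""
    return [1 - g if rng.random() < pm else g for g in ind]


def next_generation(pop: List[Individual],
                    fitness_fn: Callable[[Individual], float],
                    rng: random.Random, pc: float = 1.0) -> List[Individual]:
    """One generational step: the offspring replace the whole population."""
    length = len(pop[0])
    fit = [fitness_fn(ind) for ind in pop]
    children = []
    for _ in range(len(pop)):
        p1 = binary_tournament(pop, fit, rng)
        p2 = binary_tournament(pop, fit, rng)
        child = uniform_crossover(p1, p2, rng) if rng.random() < pc else list(p1)
        children.append(bit_flip(child, 1.0 / length, rng))  # pm = 1/L
    return children


rng = random.Random(0)
# Small population for illustration; the slides use 3000 individuals of length 30.
population = [[rng.randint(0, 1) for _ in range(30)] for _ in range(50)]
population = next_generation(population, sum, rng)
print(len(population), len(population[0]))
```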





Population size vs. generation


   [Figure: population size in individuals (left axis, 0 to 4000) and % of loss
   (right axis, 0 to 100) versus generations (x-axis, 0 to 50);
   series: entrfin 1, entrfin 2, ucb 1, ucb 2, xwtr 1, xwtr 2]





Obtained Fitness for 3-Trap Day1


                                            Error Free fitness = 23.56
              Trace           Fitness    Wilcoxon Test                    Significantly different?
              Entrfin           23.30     W = 6093, p-value = 0.002688              yes
              Entrfin 10%       23.47     W = 5408.5, p-value = 0.2535              no
              Entrfin 20%       23.48     W = 5360, p-value = 0.3137                no
              Entrfin 30%       23.49     W = 5283.5, p-value = 0.4271              no
              Entrfin 40%       23.57     W = 4923.5, p-value = 0.8286              no
              Entrfin 50%       23.59     W = 4910.5, p-value = 0.7994              no
              Ucb              23.22     W = 6453, p-value = 6.877e-05             yes
              Ucb 10%          23.27     W = 6098.5, p-value = 0.002753            yes
              Ucb 20%          23.37     W = 5837.5, p-value = 0.02051             yes
              Ucb 30%          23.40     W = 5664, p-value = 0.06588               no
              Ucb 40%          23.51     W = 5186.5, p-value = 0.6004              no
              Ucb 50%          23.42     W = 5623, p-value = 0.08335               no
              Xwtr             23.56     W = 5056, p-value = 0.8748                no
              Xwtr 10%         23.57     W = 4923.5, p-value = 0.8286              no
              Xwtr 20%         23.68     W = 4474, p-value = 0.1245                no
              Xwtr 30%         23.73     W = 4259.5, p-value = 0.02812             yes
              Xwtr 40%         23.68     W = 4502, p-value = 0.1466                no
              Xwtr 50%         23.71     W = 4356.5, p-value = 0.05817             no





Obtained fitness for 3-Trap Day2


                                            Error Free fitness = 23.56
              Trace           Fitness    Wilcoxon Test                     Significantly different?
              Entrfin           23.57     W = 4979.5, p-value = 0.9546               no
              Entrfin 10%       23.69     W = 4397.5, p-value = 0.07682              no
              Entrfin 20%       23.67     W = 4522.5, p-value = 0.1645               no
              Entrfin 30%       23.70     W = 4405, p-value = 0.08086                no
              Entrfin 40%       23.69     W = 4453.5, p-value = 0.11                 no
              Entrfin 50%       23.75     W = 4162.5, p-value = 0.01234              yes
              Ucb             23.09      W = 6672.5, p-value = 7.486e-06            yes
              Ucb 10%         23.12      W = 6826, p-value = 6.647e-07              yes
              Ucb 20%         23.14      W = 6654, p-value = 7.223e-06              yes
              Ucb 30%         23.26      W = 6371, p-value = 0.0001507              yes
              Ucb 40%         23.37      W = 5893.5, p-value = 0.01316              yes
              Ucb 50%         23.32      W = 6108, p-value = 0.002166               yes
              Xwtr            23.60      W = 4806, p-value = 0.5791                 no
              Xwtr 10%        23.62      W = 4765, p-value = 0.5002                 no
              Xwtr 20%        23.69      W = 4453.5, p-value = 0.11                 no
              Xwtr 30%        23.60      W = 4806, p-value = 0.5791                 no
              Xwtr 40%        23.63      W = 4688.5, p-value = 0.3695               no
              Xwtr 50%        23.77      W = 4065.5, p-value = 0.004877             yes





Obtained fitness with host-churn



                                               Table: Day1
                                           Error Free fitness = 23.56
                Trace      Fitness    Wilcoxon Test                      Significantly different?
                Entrfin      23.52     W = 5222, p-value = 0.5322                  no
                Ucb         21.31     W = 9708.5, p-value < 2.2e-16               yes
                Xwtr        23.64     W = 4640, p-value = 0.2982                  no




                                               Table: Day2
                                            Error Free fitness = 23.56
                 Trace     Fitness     Wilcoxon Test                     Significantly different?
                 Entrfin     23.58      W = 4931, p-value = 0.8452                 no
                 Ucb        23.03      W = 7038.5, p-value = 4.588e-08            yes
                 Xwtr        23.7      W = 4405, p-value = 0.08086                no
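
The W and p-value columns in these tables can be reproduced in spirit with a two-sample Wilcoxon rank-sum test. The sketch below is a pure-Python version using midranks, a tie-corrected normal approximation, and a continuity correction. Several points are assumptions: the slides do not state the significance level (0.05 is consistent with the yes/no column), the number of runs per configuration, or the software used.

```python
import math
from collections import Counter
from typing import Sequence, Tuple


def rank_sum_test(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Return (W, two-sided p-value) for samples x and y.

    W is the rank sum of x minus n1 * (n1 + 1) / 2.
    """
    n1, n2 = len(x), len(y)
    combined = sorted(list(x) + list(y))
    # Assign midranks: tied values share the average of their positions.
    ranks = {}
    i = 0
    while i < len(combined):
        j = i
        while j < len(combined) and combined[j] == combined[i]:
            j += 1
        ranks[combined[i]] = (i + 1 + j) / 2.0
        i = j
    w = sum(ranks[v] for v in x) - n1 * (n1 + 1) / 2.0
    n = n1 + n2
    tie_term = sum(t ** 3 - t for t in Counter(combined).values())
    sigma = math.sqrt(n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1))))
    diff = w - n1 * n2 / 2.0
    if sigma == 0.0 or diff == 0.0:
        return w, 1.0
    z = (diff - math.copysign(0.5, diff)) / sigma  # continuity correction
    return w, math.erfc(abs(z) / math.sqrt(2.0))


ALPHA = 0.05  # assumed significance level

# Hypothetical fitness samples, not the experimental data.
error_free = [24.0, 23.0, 25.0, 22.0, 24.5, 23.5]
with_failures = [21.0, 20.5, 22.5, 21.5, 20.0, 19.5]
w, p = rank_sum_test(with_failures, error_free)
print(w, p, "yes" if p < ALPHA else "no")
```

Under the null hypothesis W is centred on n1 * n2 / 2. The reported values cluster around 5000, which would match 100 runs per configuration, although the slides do not confirm that number.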




Conclusions





Summary of Results




       PGA applications are fault-tolerant by nature in DG
       platforms.
       PGAs exhibit the well-known fault-tolerance property of
       graceful degradation on DG platforms.
       We provided a new method to mitigate the effect of failures
       by increasing the initial population.
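
The arithmetic behind this mitigation can be sketched as follows. It reads the 10% to 50% rows of the result tables as increases over the base population of 3000, which is an interpretation of the slides and not something they state explicitly. The helper names are hypothetical.

```python
BASE_POPULATION = 3000  # population size from the GA settings


def inflated_size(base: int, extra_pct: int) -> int:
    """Initial population after adding extra_pct percent more individuals."""
    return base + base * extra_pct // 100


def loss_pct(initial: int, current: int) -> float:
    """Share of the initial population lost so far, in percent."""
    return 100.0 * (initial - current) / initial


def tolerated_loss_pct(base: int, initial: int) -> float:
    """Loss an inflated population can absorb before it drops below base."""
    return loss_pct(initial, base)


for extra in (0, 10, 20, 30, 40, 50):
    size = inflated_size(BASE_POPULATION, extra)
    print(extra, size, round(tolerated_loss_pct(BASE_POPULATION, size), 1))
```

For example, a 50% larger initial population (4500 individuals) can lose a third of its members before it falls below the error-free size of 3000.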





Conclusions




        We have studied and characterized the behavior of PGA
        applications running in distributed platforms with high
        failure rates.
        We have tested the PGA fault-tolerance using three
        real-world DG traces.
        Our main conclusion is that PGA inherently provides
        graceful degradation.





Questions


                         daniellg@unex.es
                      juanlu@geneura.ugr.es
                         fcofdez@unex.es
                     jmerelo@geneura.ugr.es

           Icons from Tango Desktop project and Gnome Desktop (Creative Commons & GPL License)




Characterizing Fault Tolerance of Genetic Algorithms in Desktop Grid Systems

  • 1. Characterizing Fault Tolerance of Genetic Algorithms in Desktop Grid Systems ˜ ´ ´ Daniel Lombrana Gonzalez Juan Luis Jimenez Laredo ´ ´ Francisco Fernandez de Vega Juan Julian Merelo ´ Guervos April 8, 2010 ˜ ´ D. Lombrana, JJ. Jimenez, F. Fernandez, JJ. Merelo Evocop 2010
  • 2. Outline 1 Introduction 2 Motivation 3 Methodology 4 Experiments and Results 5 Conclusions ˜ ´ D. Lombrana, JJ. Jimenez, F. Fernandez, JJ. Merelo Evocop 2010
  • 3. Introduction ˜ ´ D. Lombrana, JJ. Jimenez, F. Fernandez, JJ. Merelo Evocop 2010
  • 4. Introduction Parallel Genetic Algorithms (PGA) Sometimes Evolutionary Algorithms (EAs) require large execution times. One solution is to use: Parallel Computing and Distributed Platforms. ˜ ´ D. Lombrana, JJ. Jimenez, F. Fernandez, JJ. Merelo Evocop 2010
  • 5. Introduction Parallel algorithms can be run in ˜ ´ D. Lombrana, JJ. Jimenez, F. Fernandez, JJ. Merelo Evocop 2010
  • 6. Introduction Parallel algorithms can be run in ˜ ´ D. Lombrana, JJ. Jimenez, F. Fernandez, JJ. Merelo Evocop 2010
  • 7. Introduction Failures in distributed platforms Distributed platforms are prone to errors. Failures are expected events rather than catastrophic exceptions. ˜ ´ D. Lombrana, JJ. Jimenez, F. Fernandez, JJ. Merelo Evocop 2010
  • 8. Introduction Fault Tolerance Fault Tolerance is the ability of a system to behave in a well-defined manner once a failure occurs. ˜ ´ D. Lombrana, JJ. Jimenez, F. Fernandez, JJ. Merelo Evocop 2010
  • 9. Introduction Fault Tolerance Different techniques have been developed to cope with failures: Redundancy, S. Ghosh. Distributed systems: an algorithmic approach. Chapman & Hall/CRC, 2006. Checkpointing, E. Elnozahy, L. Alvisi, Y. Wang, and D. Johnson. A survey of rollback-recovery protocols in message-passing systems. ACM Computing Surveys (CSUR), 34(3):375–408, 2002. Rejuvenation frameworks, A. T. Tai and K. S. Tso. A performability-oriented software rejuvenation framework for distributed applications. In DSN ’05, pages 570–579, Washington, DC, USA, 2005. IEEE Computer Society. etc. ˜ ´ D. Lombrana, JJ. Jimenez, F. Fernandez, JJ. Merelo Evocop 2010
  • 10. Introduction Fault Tolerance The use of a fault tolerance technique mandates that: the application has to be modified, and even the parallel algorithm. Thus, this modification can represent a heavy burden for the developer. ˜ ´ D. Lombrana, JJ. Jimenez, F. Fernandez, JJ. Merelo Evocop 2010
  • 11. Motivation ˜ ´ D. Lombrana, JJ. Jimenez, F. Fernandez, JJ. Merelo Evocop 2010
  • 12. Motivation Parallel EAs and Fault Tolerance To the best of our knowledge there has been little research about the fault tolerance features of PEAs in general and of PGA applications in particular. ˜ ´ D. Lombrana, JJ. Jimenez, F. Fernandez, JJ. Merelo Evocop 2010
  • 13. Motivation Previous Works We firstly studied the Fault-Tolerance nature of Parallel Genetic Programming (PGP) on: Real World Desktop Grid Systems. Concluding that PGP is fault-tolerant by default. ˜ ´ ´ Daniel Lombrana Gonzalez, Francisco Fernandez de Vega, and Henri Casanova. Characterizing fault tolerance in genetic programming. Future Generation Computer Systems, 2010. DOI: 10.1016/j.future.2010.02.006. ˜ ´ ´ Daniel Lombrana Gonzalez, Francisco Fernandez de Vega, and Henri Casanova. Characterizing fault tolerance in genetic programming. In Workshop on Bio-Inspired Algorithms for Distributed Systems, pages 1–10. Barcelona, Spain, 2009. ISBN 978-1-60558-564-2. ˜ ´ D. Lombrana, JJ. Jimenez, F. Fernandez, JJ. Merelo Evocop 2010
  • 14. Motivation Proposal Based on this insight This work builds on top of the previous ones, and extends the study of fault-tolerance in EAs to PGAs, using the same methodology. ˜ ´ D. Lombrana, JJ. Jimenez, F. Fernandez, JJ. Merelo Evocop 2010
  • 15. Methodology ˜ ´ D. Lombrana, JJ. Jimenez, F. Fernandez, JJ. Merelo Evocop 2010
  • 16. Methodology Master-Worker ˜ ´ D. Lombrana, JJ. Jimenez, F. Fernandez, JJ. Merelo Evocop 2010
  • 17. Methodology Desktop Grid platforms (DGs) DGs exhibit large numbers of failures. DGs failure behavior has been studied in literature. DGs are low-cost when compared to clusters of comparable scale. And, PGA applications are loosely coupled and thus well-suited to DGs. ˜ ´ D. Lombrana, JJ. Jimenez, F. Fernandez, JJ. Merelo Evocop 2010
  • 18. Methodology Desktop Grid Platforms DGs are very promising for PGA applications, and their high failure rate make them a great test case for studying and characterizing the fault tolerance abilities of PGA. ˜ ´ D. Lombrana, JJ. Jimenez, F. Fernandez, JJ. Merelo Evocop 2010
  • 19. Methodology: Experiments. To characterize the fault-tolerant nature of PGAs we run two kinds of experiments: one in a failure-free environment, and one that simulates failures by replaying traces from real-world DG platforms.
  • 20. Methodology: DG traces. We perform simulations of DG platforms and of host availability based on three real-world traces: entrfin, ucb, xwtr.
  • 21. Methodology: DG traces.
    Trace     Hosts   Venue                  Time
    Entrfin   275     San Diego              1.0 months
    Ucb       85      UC Berkeley            1.5 months
    Xwtr      100     Université Paris-Sud   1.0 months
  • 22. Methodology: Using the traces. We consider two cases: hosts that become unavailable never become available again (worst-case assumption), and complete host churn (unavailable hosts can be re-acquired afterwards). Both cases are evaluated on two different days of each trace.
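The worst-case variant of a trace can be derived mechanically from the original one. The following is a minimal sketch, assuming a trace is stored as a per-time-step list of booleans (one flag per host); this representation and the function names are illustrative assumptions, not the format of the actual trace files.

```python
from typing import List

Trace = List[List[bool]]  # trace[t][h] is True when host h is available at step t


def without_return(trace: Trace) -> Trace:
    """Derive the worst-case trace: a host that fails never comes back.

    Assumption: hosts that are unavailable at the first time step are
    treated as never available.
    """
    if not trace:
        return []
    alive = list(trace[0])
    result = []
    for step in trace:
        # A host stays alive only if it was alive so far and is available now.
        alive = [a and s for a, s in zip(alive, step)]
        result.append(list(alive))
    return result


def hosts_available(trace: Trace) -> List[int]:
    """Count the available hosts at each time step."""
    return [sum(step) for step in trace]


# Toy trace with 3 hosts and 4 time steps (hypothetical data).
original = [
    [True, True, True],
    [True, False, True],
    [True, True, False],
    [True, True, True],
]
worst_case = without_return(original)
print(hosts_available(original))    # [3, 2, 2, 3]
print(hosts_available(worst_case))  # [3, 2, 1, 1]
```

With complete host churn the original trace is replayed unchanged, so only the worst case needs this transformation.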
  • 23. Methodology: [Figure: Host availability for 1 day of the ucb trace. x-axis: Time Step; y-axis: Computers; series: Original Trace, Trace without return.]
  • 24. Experiments and Results
  • 25. Experiments and Results: Problems. We conduct experiments with a 3-trap instance:
    \[
    \mathrm{trap}(u(\vec{x})) =
    \begin{cases}
    \dfrac{a}{z}\,\bigl(z - u(\vec{x})\bigr), & \text{if } u(\vec{x}) \le z \\[6pt]
    \dfrac{b}{l - z}\,\bigl(u(\vec{x}) - z\bigr), & \text{otherwise}
    \end{cases}
    \tag{1}
    \]
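The trap function on slide 25 is piecewise linear in u, which the sketch below treats as the number of ones in a sub-function: it pays (a/z)(z - u) up to the slope change z and (b/(l - z))(u - z) beyond it. The slide does not give the values of a, b and z, so the defaults used here (a = 2, b = 3, z = 2 for l = 3) are illustrative assumptions, as are the function names.

```python
from typing import Sequence


def trap(u: int, a: float, b: float, z: int, l: int) -> float:
    """Trap value for a sub-function of length l containing u ones."""
    if u <= z:
        return a / z * (z - u)
    return b / (l - z) * (u - z)


def trap_fitness(bits: Sequence[int], k: int = 3,
                 a: float = 2.0, b: float = 3.0, z: int = 2) -> float:
    """Sum of trap values over consecutive, non-overlapping blocks of k bits.

    The parameter values a, b and z are assumptions for illustration only.
    """
    if len(bits) % k != 0:
        raise ValueError("individual length must be a multiple of k")
    total = 0.0
    for start in range(0, len(bits), k):
        u = sum(bits[start:start + k])  # number of ones in this block
        total += trap(u, a, b, z, k)
    return total


# With the assumed parameters, each block scores 2, 1, 0, 3 for u = 0..3.
print([trap(u, 2.0, 3.0, 2, 3) for u in range(4)])  # [2.0, 1.0, 0.0, 3.0]
print(trap_fitness([1] * 30))  # 30.0
print(trap_fitness([0] * 30))  # 20.0
```

Under these assumed values the all-zeros string is a deceptive local optimum, which is what makes trap instances hard for a GA.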
  • 26. Experiments and Results: GA Parameters for 3-Trap instance.
    Trap instance
      Size of sub-function (k)       3
      Number of sub-functions (m)    10
      Individual length (L)          30
    GA settings
      GA                             GGA
      Population size                3000
      Selection of parents           Binary tournament
      Recombination                  Uniform crossover, pc = 1.0
      Mutation                       Bit-flip mutation, pm = 1/L
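The operators listed in the parameter table can be sketched as follows. This is a minimal, sequential illustration of binary tournament selection, uniform crossover (pc = 1.0) and bit-flip mutation (pm = 1/L); it is not the authors' implementation, the function names are hypothetical, and the fitness used in the usage lines (`sum`, i.e. counting ones) is only a placeholder.

```python
import random
from typing import Callable, List, Tuple

Individual = List[int]


def binary_tournament(population: List[Individual],
                      fitness: Callable[[Individual], float],
                      rng: random.Random) -> Individual:
    """Pick two distinct individuals at random and return the fitter one."""
    first, second = rng.sample(population, 2)
    return first if fitness(first) >= fitness(second) else second


def uniform_crossover(p1: Individual, p2: Individual, rng: random.Random,
                      pc: float = 1.0) -> Tuple[Individual, Individual]:
    """Swap each gene between the parents with probability 0.5."""
    if rng.random() >= pc:  # never true for pc = 1.0
        return list(p1), list(p2)
    c1, c2 = [], []
    for g1, g2 in zip(p1, p2):
        if rng.random() < 0.5:
            g1, g2 = g2, g1
        c1.append(g1)
        c2.append(g2)
    return c1, c2


def bit_flip_mutation(ind: Individual, rng: random.Random) -> Individual:
    """Flip each bit independently with probability pm = 1/L."""
    pm = 1.0 / len(ind)
    return [1 - g if rng.random() < pm else g for g in ind]


def next_generation(population: List[Individual],
                    fitness: Callable[[Individual], float],
                    rng: random.Random) -> List[Individual]:
    """Generational replacement: offspring fully replace the parents."""
    offspring: List[Individual] = []
    while len(offspring) < len(population):
        p1 = binary_tournament(population, fitness, rng)
        p2 = binary_tournament(population, fitness, rng)
        c1, c2 = uniform_crossover(p1, p2, rng)
        offspring.append(bit_flip_mutation(c1, rng))
        if len(offspring) < len(population):
            offspring.append(bit_flip_mutation(c2, rng))
    return offspring


# Small demonstration: 20 individuals of length 30 (the slides use 3000).
rng = random.Random(0)
population = [[rng.randint(0, 1) for _ in range(30)] for _ in range(20)]
population = next_generation(population, sum, rng)
```

In the master-worker setting of the slides, fitness evaluation would be farmed out to workers; that distribution step is omitted here.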
  • 27. Experiments and Results: [Figure: Population size vs. generation. x-axis: Generations; left y-axis: Individuals; right y-axis: % of Loss; series: entrfin 1, ucb 1, xwtr 1, entrfin 2, ucb 2, xwtr 2.]
  • 28. Experiments and Results: Obtained fitness for 3-Trap, Day 1. Error-free fitness = 23.56.
    Trace         Fitness   Wilcoxon test                     Significantly different?
    Entrfin       23.30     W = 6093, p-value = 0.002688      yes
    Entrfin 10%   23.47     W = 5408.5, p-value = 0.2535      no
    Entrfin 20%   23.48     W = 5360, p-value = 0.3137        no
    Entrfin 30%   23.49     W = 5283.5, p-value = 0.4271      no
    Entrfin 40%   23.57     W = 4923.5, p-value = 0.8286      no
    Entrfin 50%   23.59     W = 4910.5, p-value = 0.7994      no
    Ucb           23.22     W = 6453, p-value = 6.877e-05     yes
    Ucb 10%       23.27     W = 6098.5, p-value = 0.002753    yes
    Ucb 20%       23.37     W = 5837.5, p-value = 0.02051     yes
    Ucb 30%       23.40     W = 5664, p-value = 0.06588       no
    Ucb 40%       23.51     W = 5186.5, p-value = 0.6004      no
    Ucb 50%       23.42     W = 5623, p-value = 0.08335       no
    Xwtr          23.56     W = 5056, p-value = 0.8748        no
    Xwtr 10%      23.57     W = 4923.5, p-value = 0.8286      no
    Xwtr 20%      23.68     W = 4474, p-value = 0.1245        no
    Xwtr 30%      23.73     W = 4259.5, p-value = 0.02812     yes
    Xwtr 40%      23.68     W = 4502, p-value = 0.1466        no
    Xwtr 50%      23.71     W = 4356.5, p-value = 0.05817     no
  • 29. Experiments and Results: Obtained fitness for 3-Trap, Day 2. Error-free fitness = 23.56.
    Trace         Fitness   Wilcoxon test                     Significantly different?
    Entrfin       23.57     W = 4979.5, p-value = 0.9546      no
    Entrfin 10%   23.69     W = 4397.5, p-value = 0.07682     no
    Entrfin 20%   23.67     W = 4522.5, p-value = 0.1645      no
    Entrfin 30%   23.70     W = 4405, p-value = 0.08086       no
    Entrfin 40%   23.69     W = 4453.5, p-value = 0.11        no
    Entrfin 50%   23.75     W = 4162.5, p-value = 0.01234     yes
    Ucb           23.09     W = 6672.5, p-value = 7.486e-06   yes
    Ucb 10%       23.12     W = 6826, p-value = 6.647e-07     yes
    Ucb 20%       23.14     W = 6654, p-value = 7.223e-06     yes
    Ucb 30%       23.26     W = 6371, p-value = 0.0001507     yes
    Ucb 40%       23.37     W = 5893.5, p-value = 0.01316     yes
    Ucb 50%       23.32     W = 6108, p-value = 0.002166      yes
    Xwtr          23.60     W = 4806, p-value = 0.5791        no
    Xwtr 10%      23.62     W = 4765, p-value = 0.5002        no
    Xwtr 20%      23.69     W = 4453.5, p-value = 0.11        no
    Xwtr 30%      23.60     W = 4806, p-value = 0.5791        no
    Xwtr 40%      23.63     W = 4688.5, p-value = 0.3695      no
    Xwtr 50%      23.77     W = 4065.5, p-value = 0.004877    yes
  • 30. Experiments and Results: Obtained fitness with host churn.
    Table: Day 1. Error-free fitness = 23.56.
    Trace     Fitness   Wilcoxon test                     Significantly different?
    Entrfin   23.52     W = 5222, p-value = 0.5322        no
    Ucb       21.31     W = 9708.5, p-value < 2.2e-16     yes
    Xwtr      23.64     W = 4640, p-value = 0.2982        no
    Table: Day 2. Error-free fitness = 23.56.
    Trace     Fitness   Wilcoxon test                     Significantly different?
    Entrfin   23.58     W = 4931, p-value = 0.8452        no
    Ucb       23.03     W = 7038.5, p-value = 4.588e-08   yes
    Xwtr      23.7      W = 4405, p-value = 0.08086       no
  • 31. Conclusions
  • 32. Conclusions: Summary of Results. PGA applications are fault-tolerant by nature in DG platforms. PGAs exhibit the well-known fault-tolerance property of graceful degradation in DG platforms. We provided a new method to mitigate the effect of failures by increasing the initial population.
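The mitigation idea of starting with a larger population can be illustrated with simple arithmetic. The sketch below assumes that individuals are spread evenly across hosts and that a lost host takes its share of the population with it; this loss model, the function names and the host counts are illustrative assumptions rather than the simulator used for the experiments.

```python
from typing import List, Sequence


def enlarged_population(base_size: int, extra_percent: int) -> int:
    """Initial population size after adding extra_percent more individuals."""
    return base_size * (100 + extra_percent) // 100


def population_per_generation(initial_size: int,
                              hosts_per_generation: Sequence[int]) -> List[int]:
    """Population size per generation under a proportional loss model.

    Assumption: the population shrinks in proportion to the number of
    hosts still available relative to the first generation.
    """
    first = hosts_per_generation[0]
    if first <= 0:
        raise ValueError("at least one host must be available initially")
    return [initial_size * h // first for h in hosts_per_generation]


hosts = [20, 20, 15, 10]  # hypothetical host counts per generation
print(population_per_generation(3000, hosts))
# [3000, 3000, 2250, 1500]
print(population_per_generation(enlarged_population(3000, 30), hosts))
# [3900, 3900, 2925, 1950]
```

In this toy model the enlarged run still loses the same fraction of individuals, but it keeps more of them in absolute terms at every generation.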
  • 33. Conclusions: Conclusions. We have studied and characterized the behavior of PGA applications running on distributed platforms with high failure rates. We have tested PGA fault tolerance using three real-world DG traces. Our main conclusion is that PGAs inherently provide graceful degradation.
  • 34. Conclusions: Questions. daniellg@unex.es, juanlu@geneura.ugr.es, fcofdez@unex.es, jmerelo@geneura.ugr.es. Icons from Tango Desktop project and Gnome Desktop (Creative Commons & GPL License).