Social Interactions around Cross-System Bug Fixings:
The Case of FreeBSD and OpenBSD
Gerardo Canfora, Luigi Cerulo, Marta Cimitile, Massimiliano Di Penta
dipenta@unisannio.it
Context
  Source code is often reused across different systems
    Unixes (FreeBSD, OpenBSD, Linux)
    Office applications (NeoOffice, OpenOffice)
    Desktop environment apps (KDE or GNOME apps)
  Maintenance may require propagating bug fixes across systems
    We call this “Cross System Bug Fixing” (CSBF)


  Example:
     FreeBSD, 1996/01/19, file ip_icmp.h:
       – “Added definitions for ICMP router discovery. Reviewed by:
         wollman”
     OpenBSD, 1996/08/02, file ip_icmp.h:
       – “ICMP Router Discovery definitions; from FreeBSD”
What we propose
  A method to track CSBFs
  A study on the social characteristics
   and development activity of
   CSBF committers
    degree, betweenness, brokerage
    commits, lines changed
Detecting CSBF - I
  Step 1: mining cross-referencing commits
    openbsd, atphy.c,2008/09/25 20:47:16,brad,
     Add a driver for the Attansic F1 PHY. From FreeBSD via
     kevlo@
  Step 2: mine commits previously performed on files
   with the same name in the other system
    freebsd,atphy.c,2008/05/19 01:12:10,yongari,
     Add Attansic/Atheros F1 PHY driver.
    openbsd, atphy.c,2008/09/25 20:47:16,brad,
     Add a driver for the Attansic F1 PHY. From FreeBSD via
     kevlo@
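Steps 1 and 2 can be sketched roughly as follows. This is a minimal illustration, not the authors' tool: it assumes log lines in the comma-separated form shown on the slide (system, file, timestamp, committer, note), and the names `Commit`, `parse`, `is_cross_reference` and `candidates` are hypothetical. Matching the other system's name in the note is only one possible way to detect a cross-reference.

```python
import re
from datetime import datetime
from typing import NamedTuple


class Commit(NamedTuple):
    system: str
    file: str
    when: datetime
    author: str
    note: str


# Assumption: a cross-referencing note names the other system explicitly.
OTHER_SYSTEM = {"openbsd": "FreeBSD", "freebsd": "OpenBSD"}


def parse(line: str) -> Commit:
    """Parse 'system,file,timestamp,committer,note' (the note may contain commas)."""
    system, file, when, author, note = [p.strip() for p in line.split(",", 4)]
    stamp = datetime.strptime(when, "%Y/%m/%d %H:%M:%S")
    return Commit(system.lower(), file, stamp, author, note)


def is_cross_reference(c: Commit) -> bool:
    """Step 1: does the commit note mention the other system?"""
    pattern = rf"\b{OTHER_SYSTEM[c.system]}\b"
    return re.search(pattern, c.note, re.IGNORECASE) is not None


def candidates(ref: Commit, log: list[Commit]) -> list[Commit]:
    """Step 2: earlier commits on a same-named file in the other system."""
    return [c for c in log
            if c.system != ref.system and c.file == ref.file and c.when < ref.when]


log = [
    parse("freebsd,atphy.c,2008/05/19 01:12:10,yongari, "
          "Add Attansic/Atheros F1 PHY driver."),
    parse("openbsd, atphy.c,2008/09/25 20:47:16,brad, "
          "Add a driver for the Attansic F1 PHY. From FreeBSD via kevlo@"),
]
referring = [c for c in log if is_cross_reference(c)]
for ref in referring:
    print(ref.author, "<-", [c.author for c in candidates(ref, log)])
```

On the two commits from the slide, only the OpenBSD commit by brad is flagged as referring, and the FreeBSD commit by yongari is its single candidate origin.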
Detecting CSBF - II
  Step 3: compute file similarity with clone detection
    CCFinder
    Threshold: at least 10% of cloned lines
  Step 4: take the previous change with the highest
   textual similarity in the commit note
    Use of Vector Space models
    Cosine similarity; threshold (0.20) to filter out unrelated
     commits

  Example (cosine similarity = 0.72):
    “Add Attansic/Atheros F1 PHY driver.”
    “Add a driver for the Attansic F1 PHY. From FreeBSD via kevlo@”
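Step 4 can be illustrated with a small vector space model. The sketch below is an assumption-laden simplification: it uses raw term frequencies with no stop-word removal or term weighting, so it will not reproduce the 0.72 reported on the slide, whose exact weighting scheme is not stated. The function names are hypothetical; only the 0.20 threshold comes from the slide.

```python
import math
import re
from collections import Counter


def tokens(note):
    """Lower-case alphanumeric terms of a commit note."""
    return re.findall(r"[a-z0-9]+", note.lower())


def cosine(a, b):
    """Cosine similarity between two notes using raw term frequencies."""
    va, vb = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(va[t] * vb[t] for t in va)
    norm = (math.sqrt(sum(v * v for v in va.values()))
            * math.sqrt(sum(v * v for v in vb.values())))
    return dot / norm if norm else 0.0


def best_match(note, earlier_notes, threshold=0.20):
    """Return (similarity, note) for the most similar earlier note, or None
    when no candidate reaches the threshold."""
    scored = [(cosine(note, n), n) for n in earlier_notes]
    return max((s for s in scored if s[0] >= threshold), default=None)


referring = "Add a driver for the Attansic F1 PHY. From FreeBSD via kevlo@"
earlier = [
    "Add Attansic/Atheros F1 PHY driver.",
    "Fix typo in comment.",
]
print(best_match(referring, earlier))
```

With these two candidate notes, the Attansic driver commit is selected and the unrelated note falls below the threshold.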
Building Committers' Network
  We extract communication from mailing
   lists
    Bug fixing mailing lists
  Heuristic similar to that of Bird et al.
   [2006] to reconcile inconsistent names /
   email addresses
    Also used to map committer IDs to mailing list
     names / email addresses
  Nodes of the network are labeled as:
    Committers / other mailing list contributors
    CSBF committers
Empirical Study
 Goal: analyze the phenomenon of CSBFs
 Purpose: understanding its relevance with
  respect to the social characteristics of the
  involved developers
 Context: CVS repositories and mailing list
  archives of FreeBSD and OpenBSD
   Period: 1993-2009 (FreeBSD), 1998-2009
    (OpenBSD)
   Commits: 119,000 (FreeBSD), 70,000 (OpenBSD)
Research Questions
  RQ1: How do the source code committers
   and contributors of the two systems
   overlap?
  RQ2: How frequent is the phenomenon of
   CSBFs?
  RQ3: Who are the contributors involved in
   CSBFs?
  RQ4: Are mailing list contributors involved
   in CSBFs more active than others?
RQ1 – Team overlap
                                             FreeBSD   OpenBSD   Both
  Committers                                     383       211     26
  Mailing list contributors                     8035      3843    359
  Committers and mailing list contributors       213       122     17


  The two projects have fewer than 10% of their
  contributors in common →
  the FreeBSD and OpenBSD development
  teams are largely distinct
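The overlap can be checked with a few lines of arithmetic. This sketch assumes that the FreeBSD and OpenBSD columns already include the people counted under Both, and that overlap is measured against the union of the two teams; the slide does not state which denominator was used. Measured against the smaller (OpenBSD) team alone, the shares would be higher.

```python
def overlap_share(a, b, both):
    """Share of people active in both projects, relative to the union."""
    return both / (a + b - both)


# (FreeBSD, OpenBSD, Both) counts taken from the table above.
rows = {
    "Committers": (383, 211, 26),
    "Mailing list contributors": (8035, 3843, 359),
    "Committers and mailing list contributors": (213, 122, 17),
}
shares = {name: overlap_share(*counts) for name, counts in rows.items()}
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```

Under these assumptions the shares come out at about 4.6%, 3.1% and 5.3%, all below 10%.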
RQ2 – Commit filtering
                       FreeBSD   OpenBSD
  Referring commits        439       933
  Cloned files             133       296
  Linked commits            59       120

  At the end of the filtering, not that many remain, but...
RQ2 – Cloned lines in CSBF files




  [Figure: percentage of cloned lines in C source files and in header files]
  Percentage smaller for .h files
  Use of preprocessor conditionals to make header files
   system-dependent
    #if defined(__FreeBSD__)
RQ3 – CSBF Graph (excerpt)
Blue/cyan: FreeBSD
Red/orange: OpenBSD
Yellow: common
RQ3: social characteristics
  Importance in terms of
    (in/out) degree: number of (incoming/outgoing)
     communication links
    Betweenness: number of communications for which the
     node lies on the shortest path
  Brokerage metrics: useful to analyze the
   communication between two clusters

  [Figure: three brokerage roles for a node B]
    B is a coordinator
    B is a gatekeeper
    B is a representative
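The three brokerage roles can be counted from a directed communication graph. The sketch below follows the usual Gould and Fernandez definitions, which is an assumption about what the slide's figure depicts: for an open path A → B → C (no direct link A → C), B is a coordinator when all three belong to the same group, a gatekeeper when only A is outside B's group, and a representative when only C is outside it. The function name and the toy data are hypothetical.

```python
from collections import Counter


def brokerage_roles(edges, group):
    """Count coordinator / gatekeeper / representative roles per broker.

    edges: iterable of (sender, receiver) pairs
    group: dict mapping each node to its cluster (e.g. its project)
    """
    edge_set = {(s, t) for s, t in edges if s != t}
    successors = {}
    for s, t in edge_set:
        successors.setdefault(s, set()).add(t)

    counts = {}
    for a, b in edge_set:
        for c in successors.get(b, ()):
            if c == a or (a, c) in edge_set:
                continue  # only open paths a -> b -> c are brokered
            if group[a] == group[b] == group[c]:
                role = "coordinator"
            elif group[b] == group[c]:
                role = "gatekeeper"      # b lets outside information in
            elif group[a] == group[b]:
                role = "representative"  # b sends information out
            else:
                continue  # other roles are not covered on the slide
            counts.setdefault(b, Counter())[role] += 1
    return counts


# Toy example with made-up node names.
group = {"f1": "FreeBSD", "f2": "FreeBSD", "f3": "FreeBSD", "o1": "OpenBSD"}
edges = [("f1", "f2"), ("f2", "f3"), ("o1", "f2"), ("f2", "o1")]
print(brokerage_roles(edges, group))
```

In the toy graph, f2 acts once in each role: it relays f1 to f3 (coordinator), o1 to f3 (gatekeeper), and f1 to o1 (representative).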
RQ3 – social characteristics
  [Bar chart: CSBF contributors vs. others, compared on Degree,
   In-degree, Out-degree, Betweenness / 1000, Coordinator / 10,
   Gatekeeper, Representative]
  All differences statistically significant
  High effect size (Cohen's d > 1)
  Contributors involved in CSBF have a higher importance in
   the communication and in the flow of communication
   between systems
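For reference, Cohen's d is the difference between two group means divided by their pooled standard deviation. The sketch below uses the common pooled-variance form, which is assumed rather than confirmed by the slides, and the sample values are made up purely for illustration; they are not data from the study.

```python
import math
from statistics import mean, variance


def cohens_d(x, y):
    """Cohen's d with a pooled standard deviation (sample variances)."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    return (mean(x) - mean(y)) / math.sqrt(pooled_var)


# Made-up degree values for two hypothetical groups of contributors.
csbf = [12, 15, 14, 16, 13]
others = [8, 9, 7, 10, 6]
d = cohens_d(csbf, others)
print(f"d = {d:.2f}")
```

By the usual rule of thumb, values of d around 0.8 or above are read as a large effect.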
RQ3 – committers with highest
social metrics
RQ4 – change activity of CSBF
committers and others
  [Bar charts: LOC added/removed and number of commits, for CSBF
   committers vs. others, in FreeBSD and OpenBSD]
    All differences statistically significant
    High effect size (Cohen's d ∼ 1)
    Contributors involved in CSBF are more active
     than others
Conclusions and Work-in-Progress
  We proposed a method to mine CSBFs
  We reported a study on FreeBSD and OpenBSD where:
    The development teams are almost disjoint
    There is a small, though not negligible, portion of CSBFs
    Committers involved in CSBF have
     – Higher social importance
     – Higher brokerage level
     – Higher activity in source code commits
  Work-in-progress:
    Better approaches to identify implicit CSBF, tracking and
     linking changes occurring on both systems
    More extensive study on less obvious cases

Dipenta msr2011-csbf
