Exploring the Influence of Identifier Names
                             on Code Quality:
                            an empirical study

          Simon Butler, Michel Wermelinger, Yijun Yu and Helen Sharp

                                       Centre for Research in Computing
                                           The Open University, UK


                                   CSMR, Madrid, 18 March 2010





Simon Butler et al. (Open Univ., UK)   The Influence of Identifiers on Code Quality   CSMR’10   1 / 13
Introduction


Identifier names
        primary source of concepts in source code
        crucial to program comprehension and readability
        reflect cognitive processes

A wider influence?
        connection between readability and defects (Buse & Weimer)

Research Question
                  ‘What is the influence of identifier name quality
                             on source code quality?’



Evaluating Identifier Name Quality


Relf’s Identifier Naming Style Guidelines
        21 guidelines for Ada & Java
        evaluated empirically
        focus on typography of names
        simple approach to use of natural language

Applying the Guidelines
        adapted 9 guidelines as naming flaw indicators
               length: too few/many words/characters
               typographical conventions: capitalization, type encoding
               natural language: English and extended dictionaries




Evaluating Code Quality

Static analysis
        FindBugs
               Java specific static analysis tool
               Identifies a range of priority 1 and 2 bug patterns
               Google: most identified issues required correction

Metrics
        Readability
               human-trained layout metric (Buse & Weimer)
        Cyclomatic Complexity
               to measure branching complexity
        Maintainability Index
               based on LOC, cyclomatic complexity, Halstead volume (Welker et al.)
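The slide names the three inputs to the Maintainability Index but not the formula. A minimal sketch, assuming the commonly cited three-metric form MI = 171 - 5.2 ln(V) - 0.23 G - 16.2 ln(LOC); the study may have used a different variant, and the function and parameter names here are illustrative only:

```python
import math


def maintainability_index(halstead_volume: float,
                          cyclomatic_complexity: float,
                          lines_of_code: float) -> float:
    """Three-metric Maintainability Index (assumed variant, see lead-in).

    MI = 171 - 5.2 * ln(V) - 0.23 * G - 16.2 * ln(LOC)
    Lower values indicate code that is harder to maintain.
    """
    if halstead_volume <= 0 or lines_of_code <= 0:
        raise ValueError("Halstead volume and LOC must be positive")
    return (171.0
            - 5.2 * math.log(halstead_volume)
            - 0.23 * cyclomatic_complexity
            - 16.2 * math.log(lines_of_code))


# Hypothetical method: V = 1000, G = 10, LOC = 100
example_mi = maintainability_index(1000.0, 10, 100)
```

With these hypothetical inputs the index falls below 65, the cut-off the methodology uses for less-maintainable methods.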


Methodology




Data Collection
        8 mature FLOSS Java projects from different domains
        each with 1-12 thousand methods
        computed metrics and extracted names from source code
        ran FindBugs on corresponding bytecode




Methodology

Naming Quality
        Names split into hard words on typographical boundaries
               NullPointerException is split into {Null, Pointer, Exception}
               MOUSE_EVENT_MASK is split into {MOUSE, EVENT, MASK}
        Extended dictionaries created with unrecognised hard words
               built dictionaries for words used in 3, 5 or 10 unique identifiers
        Identifier names analysed for compliance with each guideline
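The splitting and dictionary-building steps can be sketched as follows. This is an illustration under assumptions, not the authors' tooling: the regular expression, the function names, and the reading of "used in 3, 5 or 10 unique identifiers" as "at least that many" are all hypothetical.

```python
import re
from collections import defaultdict

# Alternatives, in order: an upper-case run not followed by a lower-case
# letter (acronyms, constants), a capitalised or lower-case word, a digit run.
# Underscores and other separators match nothing and are therefore dropped.
_HARD_WORD = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|[0-9]+")


def split_hard_words(identifier: str) -> list[str]:
    """Split an identifier on typographical boundaries (case, underscore, digits)."""
    return _HARD_WORD.findall(identifier)


def extended_dictionary(identifiers, base_dictionary, min_identifiers):
    """Collect unrecognised hard words that occur in enough unique identifiers.

    base_dictionary holds lower-case words; min_identifiers is the threshold
    (3, 5 or 10 on the slide), assumed here to mean "at least".
    """
    seen_in = defaultdict(set)
    for name in set(identifiers):
        for word in split_hard_words(name):
            lower = word.lower()
            if lower not in base_dictionary:
                seen_in[lower].add(name)
    return {word for word, names in seen_in.items()
            if len(names) >= min_identifiers}


words = split_hard_words("NullPointerException")
constant_words = split_hard_words("MOUSE_EVENT_MASK")
extra = extended_dictionary(
    ["tmpFile", "tmpDir", "TMP_NAME", "fileName"],  # hypothetical identifiers
    {"file", "dir", "name"},                        # hypothetical base dictionary
    3,
)
```

Here `tmp` is not in the base dictionary but appears in three distinct identifiers, so it would enter the "Extended 3" dictionary.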

Code Quality
        binary classification of methods into
               with/without FindBugs priority 1 (or 2) warnings
               readability below/above 0.5
               cyclomatic complexity below/above 6 (or 10)
               maintainability index below/above 65
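The binary classification above can be sketched as a set of threshold checks. The thresholds come from the slide; the record type, field names, and the handling of values exactly on a boundary (other than complexity >= 10, which a later slide title states) are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MethodMetrics:
    """Hypothetical per-method record; field names are illustrative."""
    findbugs_priorities: tuple   # priorities of the warnings reported for the method
    readability: float           # readability score
    cyclomatic_complexity: int
    maintainability_index: float


def classify(method: MethodMetrics,
             findbugs_priority: int = 2,
             complexity_threshold: int = 10) -> dict:
    """Return one boolean label per code-quality measure."""
    return {
        "has_findbugs_warning": findbugs_priority in method.findbugs_priorities,
        "less_readable": method.readability < 0.5,
        "more_complex": method.cyclomatic_complexity >= complexity_threshold,
        "less_maintainable": method.maintainability_index < 65,
    }


labels = classify(MethodMetrics((2,), 0.31, 14, 58.2))
clean = classify(MethodMetrics((), 0.80, 3, 92.0))
```

Each label can then be cross-tabulated against the presence or absence of a given naming flaw in the method.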


Statistical Analysis

Null hypothesis: independent distributions
        χ² test applied to assess independence of identifier flaws and:
               FindBugs warnings
               less readable methods
               less maintainable methods
               more complex methods
        null hypothesis was rejected if p < 0.05
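For a 2×2 table of methods (flaw present or absent against a quality label present or absent), the test of independence can be sketched without external libraries. This is a minimal sketch, assuming Pearson's statistic without continuity correction; the slide does not say which variant was applied. The counts are the JFreeChart figures quoted on this slide.

```python
import math


def chi_squared_2x2(a: int, b: int, c: int, d: int) -> tuple:
    """Pearson chi-squared test of independence for the table [[a, b], [c, d]].

    Returns (statistic, p_value). With one degree of freedom the upper tail
    of the chi-squared distribution is erfc(sqrt(x / 2)).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs at least one observation")
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Rows: methods with / without the flaw; columns: with / without warnings.
statistic, p_value = chi_squared_2x2(103, 2925, 37, 5165)
reject_null = p_value < 0.05
```

For these counts the null hypothesis of independence is rejected at the 5% level.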

Guidelines as classifiers?
        Applied diagnostic test evaluation used in medicine
        Compared each guideline vs reference classifiers
        Example: JFreeChart, Non-Dictionary Words flaw vs FindBugs priority 2 warnings

                                          methods with warnings    methods without warnings
               methods with flaw                           103                        2925
               methods without flaw                         37                        5165

               sensitivity = 103 ÷ (103 + 37) = 0.74
               specificity = 5165 ÷ (2925 + 5165) = 0.64
               AUC = 0.69
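The figures in the JFreeChart example above can be reproduced with a few lines. The sensitivity and specificity formulas are as on the slide. Taking the AUC of a single-threshold binary classifier as the mean of the two is an assumption about how the value was obtained, although it does yield the 0.69 shown.

```python
def diagnostic_scores(tp: int, fp: int, fn: int, tn: int) -> tuple:
    """Sensitivity, specificity and single-point AUC for a binary classifier.

    tp: flawed methods with warnings      fp: flawed methods without warnings
    fn: unflawed methods with warnings    tn: unflawed methods without warnings
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    # Area under a ROC curve through (0,0), the operating point, and (1,1).
    auc = (sensitivity + specificity) / 2
    return sensitivity, specificity, auc


sensitivity, specificity, auc = diagnostic_scores(tp=103, fp=2925, fn=37, tn=5165)
rounded = (round(sensitivity, 2), round(specificity, 2), round(auc, 2))
```

An AUC of 0.5 corresponds to a classifier no better than chance, which gives a reference point for the values in the result tables.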


Non-Dictionary Words Flaw




Identifier flaws and FindBugs priority 2 warnings




                  Identifier flaw                Ant     Cactus    Freemind    Hibernate    JasperReports    jEdit    JFreeChart    Tomcat
                  Capitalisation Anomaly          .62               .62           –               –                                 .57
                  Excessive Words                 .55    .55                    .58               –
                  External Underscores                     *                      *               *                     *
                  Long Identifier                         .59                    .57               –
                  Naming Convention Anomaly
                  Number of Words                 .56               .59           –                                   .55           .55
                  Numeric Identifier                        *          *                                         *                     *
                  Short Identifier Name            .56    .58        .62           –                                   .56           .57
                  Type Encoding                            *                      *                                                   *
                  Non-Dictionary Words            .60    .64        .62                           –          .63      .69           .59
                    Extended 3                    .64    .66        .59                                      .63                    .59
                    Extended 5                    .64    .65        .64           –                          .63      .72           .59
                    Extended 10                   .63    .64        .64           –                          .61      .72           .61

                  Less-readable                   .67               .67         .67               –                   .66           .68

                  Key: * = no flaw. Significance levels reported: p < 0.001, p < 0.05, p >= 0.05 (their cell markers are not preserved in this text).




Identifier flaws and Cyclomatic Complexity >= 10




                  Identifier flaw                Ant     Cactus    Freemind    Hibernate    JasperReports    jEdit    JFreeChart    Tomcat
                  Capitalisation Anomaly         .67     .72       .63         .64          .66              .61      .73           .75
                  Excessive Words                .55     .55       .58         .65          .58                       .60
                  External Underscores                     *                     *            *                         *
                  Long Identifier                         .56       .57         .68          .66                       .58           .57
                  Naming Convention Anomaly                                                                  .55
                  Number of Words                .55     .61       .57         .60          .64                       .58           .59
                  Numeric Identifier                        *         *                                         *                      *
                  Short Identifier Name           .63     .65       .57         .62          .62              .55      .60           .62
                  Type Encoding                            *                     *                                                    *
                  Non-Dictionary Words           .67     .70       .67         .74          .70              .64      .78           .76
                    Extended 3                   .69     .70       .61         .73          .68              .64      .75           .75
                    Extended 5                   .70     .69       .65         .75          .73              .66      .82           .76
                    Extended 10                  .70     .70       .66         .76          .74              .66      .81           .77

                  Key: * = no flaw. Significance levels reported: p < 0.001, p < 0.05, p >= 0.05 (their cell markers are not preserved in this text).




Identifier flaws and Less-Readable methods




                  Identifier flaw                Ant     Cactus    Freemind    Hibernate    JasperReports    jEdit    JFreeChart    Tomcat
                  Capitalisation Anomaly         .62     .55       .61         .60          .62              .62      .63           .66
                  Excessive Words                                  .59         .58          .61                       .57
                  External Underscores                     *                     *            *                         *
                  Long Identifier                         .56       .58         .60          .58                       .56           .56
                  Naming Convention Anomaly
                  Number of Words                                  .56                      .60                                     .55
                  Numeric Identifier                        *         *                                          *                     *
                  Short Identifier Name                                                                                              .57
                  Type Encoding                            *                     *                                                    *
                  Non-Dictionary Words           .65     .56       .61         .66          .65              .65      .62           .68
                    Extended 3                   .62               .56         .58                           .62      .60           .65
                    Extended 5                   .64               .57         .60                           .63      .63           .66
                    Extended 10                  .65     .56       .58         .63                           .65      .63           .68

                  Key: * = no flaw. Significance levels reported: p < 0.001, p < 0.05, p >= 0.05 (their cell markers are not preserved in this text).




Identifier flaws and Less-Maintainable methods




                  Identifier flaw                Ant     Cactus    Freemind    Hibernate    JasperReports    jEdit    JFreeChart    Tomcat
                  Capitalisation Anomaly         .78     .78       .76         .67          .67              .64      .81           .77
                  Excessive Words                .59     .58       .67         .68          .62              .57      .63           .55
                  External Underscores                     *                     *            *              .57        *
                  Long Identifier                 .57     .68       .67         .73          .71              .57      .61           .58
                  Naming Convention Anomaly      .55                           .57          .56              .55
                  Number of Words                .57     .61       .62         .62          .65              .56      .59           .60
                  Numeric Identifier                        *         *                                         *                      *
                  Short Identifier Name           .59     .65       .62         .65          .66              .56      .61           .63
                  Type Encoding                            *                     *                                                    *
                  Non-Dictionary Words           .76     .77       .79         .82          .72              .72      .80           .78
                    Extended 3                   .81     .76       .69         .83          .72              .71      .84           .80
                    Extended 5                   .82     .76       .75         .85          .78              .74      .85           .80
                    Extended 10                  .80     .77       .77         .85          .80              .74      .84           .80

                  Key: * = no flaw. Significance levels reported: p < 0.001, p < 0.05, p >= 0.05 (their cell markers are not preserved in this text).




Conclusions


We found:
        Poor quality identifier names are associated with:
               more complex
               less readable
               less maintainable
               potentially more buggy code
        Natural language content of identifier names is a classifier for source
        code quality
        Identifier name length is a classifier for complexity and maintainability
        Opposite associations occur only in commercialised projects, suggesting
        differences between open source and commercial code




The influence of identifiers on code quality

  • 1. Exploring the Influence of Identifier Names on Code Quality: an empirical study Simon Butler, Michel Wermelinger, Yijun Yu and Helen Sharp Centre for Research in Computing The Open University, UK CSMR, Madrid, 18 March 2010 Centre for Research in Computing Simon Butler et al. (Open Univ., UK) The Influence of Identifiers on Code Quality CSMR’10 1 / 13
  • 2. Introduction Identifier names primary source of concepts in source code crucial to program comprehension and readability reflect cognitive processes A wider influence? connection between readability and defects (Buse & Weimer) Research Question ‘What is the influence of identifier name quality on source code quality?’ Simon Butler et al. (Open Univ., UK) The Influence of Identifiers on Code Quality CSMR’10 2 / 13
  • 3. Evaluating Identifier Name Quality
       Relf’s Identifier Naming Style Guidelines:
         21 guidelines for Ada & Java
         evaluated empirically
         focus on typography of names
         simple approach to use of natural language
       Applying the Guidelines:
         adapted 9 guidelines as naming flaw indicators
           length: too few/many words/characters
           typographical conventions: capitalization, type encoding
           natural language: English and extended dictionaries
  • 4. Evaluating Code Quality
       Static analysis:
         FindBugs
           Java-specific static analysis tool
           identifies a range of priority 1 and 2 bug patterns
           Google: most identified issues required correction
       Metrics:
         Readability: human-trained layout metric (Buse & Weimer)
         Cyclomatic Complexity: to measure branching complexity
         Maintainability Index: based on LOC, cyclomatic complexity, Halstead volume (Welker et al.)
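The slide names the three inputs to the Maintainability Index but not the formula. As a minimal sketch, assuming the widely cited three-metric form (171 − 5.2·ln V − 0.23·G − 16.2·ln LOC) and applying it to a single method, the metric could be computed as below. The function names are hypothetical, and the study may have used a different variant; the cut-off of 65 is the one listed on the methodology slide.

```python
import math


def maintainability_index(halstead_volume: float,
                          cyclomatic_complexity: float,
                          loc: float) -> float:
    """Three-metric Maintainability Index; higher means more maintainable.

    Assumption: the classic form 171 - 5.2 ln(V) - 0.23 G - 16.2 ln(LOC).
    The slides do not say which variant the study used.
    """
    return (171.0
            - 5.2 * math.log(halstead_volume)
            - 0.23 * cyclomatic_complexity
            - 16.2 * math.log(loc))


def is_less_maintainable(mi: float, threshold: float = 65.0) -> bool:
    """Binary classification: a method below the threshold is 'less maintainable'."""
    return mi < threshold


# Hypothetical inputs for a small method and a large, branch-heavy method.
small = maintainability_index(halstead_volume=100, cyclomatic_complexity=2, loc=10)
large = maintainability_index(halstead_volume=5000, cyclomatic_complexity=25, loc=200)
print(is_less_maintainable(small), is_less_maintainable(large))
```

The logarithms mean that size and volume dominate for short methods, which is one reason a fixed threshold gives only a coarse classification.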
  • 5. Methodology
       Data Collection:
         8 mature FLOSS Java projects from different domains
         each with 1-12 thousand methods
         computed metrics and extracted names from source code
         ran FindBugs on corresponding bytecode
  • 6. Methodology
       Naming Quality:
         names split into hard words on typographical boundaries
           NullPointerException is split into {Null, Pointer, Exception}
           MOUSE_EVENT_MASK is split into {MOUSE, EVENT, MASK}
         extended dictionaries created with unrecognised hard words
           built dictionaries for words used in 3, 5 or 10 unique identifiers
         identifier names analysed for compliance with each guideline
       Code Quality:
         binary classification of methods into
           with/without FindBugs priority 1 (or 2) warnings
           readability below/above 0.5
           cyclomatic complexity below/above 6 (or 10)
           maintainability index below/above 65
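The name splitting and extended-dictionary steps on this slide can be sketched as follows. This is an illustration under stated assumptions, not the authors' tool: the regular expression, the function names, the handling of digits, and the reading of "used in 3, 5 or 10 unique identifiers" as "at least that many" are all assumptions.

```python
import re
from collections import defaultdict

# Hard words: a capital run not followed by lower case (e.g. "XML", "MOUSE"),
# a capitalised or lower-case word, or a run of digits. Underscores and other
# separators are simply skipped.
_HARD_WORD = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|[0-9]+")


def split_hard_words(identifier):
    """Split an identifier on typographical boundaries."""
    return _HARD_WORD.findall(identifier)


def extended_dictionary(identifiers, base_words, min_identifiers):
    """Unrecognised hard words used in at least `min_identifiers` unique names.

    Assumption: digit-only tokens are ignored and matching is case-insensitive.
    """
    seen_in = defaultdict(set)
    for name in set(identifiers):
        for word in split_hard_words(name):
            lower = word.lower()
            if not lower.isdigit() and lower not in base_words:
                seen_in[lower].add(name)
    return {w for w, names in seen_in.items() if len(names) >= min_identifiers}


def has_non_dictionary_word(identifier, dictionary):
    """Flag a name if any non-numeric hard word is missing from the dictionary."""
    return any(w.lower() not in dictionary
               for w in split_hard_words(identifier) if not w.isdigit())


# Hypothetical base dictionary and identifiers.
base = {"get", "set", "name", "value"}
names = ["getCfgName", "setCfgValue", "cfgValue", "getTmpName"]
extended = base | extended_dictionary(names, base, min_identifiers=3)
print(split_hard_words("NullPointerException"))
print(has_non_dictionary_word("getTmpName", extended))
```

Here "cfg" enters the extended dictionary because it occurs in three distinct identifiers, while "tmp" occurs in only one and is still flagged.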
  • 7. Statistical Analysis
       Null hypothesis: independent distributions
         χ² test applied to assess independence of identifier flaws and:
           FindBugs warnings
           less readable methods
           less maintainable methods
           less complex methods
         null hypothesis was rejected if p < 5%
       Guidelines as classifiers?
         applied diagnostic test evaluation used in medicine
         compared each guideline vs reference classifiers
       Example: JFreeChart, FindBugs Priority Two Warnings, Non-Dictionary Words
                                 methods with warnings   methods without warnings
         methods with flaw               103                      2925
         methods without flaw             37                      5165
         sensitivity = 103 ÷ (103 + 37) = 0.74
         specificity = 5165 ÷ (2925 + 5165) = 0.64
         AUC = 0.69
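The arithmetic of the JFreeChart example can be reproduced with a short sketch. Two points are assumptions rather than details from the slides: the AUC is taken as the area under a single-point ROC curve, (sensitivity + specificity) / 2, which does reproduce the reported 0.69; and the χ² statistic is the plain Pearson form for a 2×2 table without continuity correction. The function names are hypothetical.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Share of methods with warnings that the flaw also marks."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Share of methods without warnings that the flaw leaves unmarked."""
    return true_neg / (true_neg + false_pos)


def single_point_auc(sens: float, spec: float) -> float:
    """Area under the ROC curve through (0,0), (1 - spec, sens), (1,1)."""
    return (sens + spec) / 2


def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square for the table [[a, b], [c, d]], no continuity correction."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Counts from the slide: rows are methods with/without the flaw,
# columns are methods with/without priority 2 warnings.
flaw_warn, flaw_clean = 103, 2925
ok_warn, ok_clean = 37, 5165

sens = sensitivity(flaw_warn, ok_warn)
spec = specificity(ok_clean, flaw_clean)
auc = single_point_auc(sens, spec)
chi2 = chi_square_2x2(flaw_warn, flaw_clean, ok_warn, ok_clean)

# 3.841 is the chi-square critical value for 1 degree of freedom at the 5% level.
print(round(sens, 2), round(spec, 2), round(auc, 2), chi2 > 3.841)
```

A statistic above 3.841 corresponds to rejecting independence at the 5% level used in the study; the rounded sensitivity, specificity and AUC match the values on the slide.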
  • 8. Non-Dictionary Words Flaw
  • 9. Identifier flaws and FindBugs priority 2 warnings
       Projects: JasperReports, JFreeChart, Hibernate, Freemind, Tomcat, Cactus, jEdit, Ant
       Capitalisation Anomaly: .62 .62 – – .57
       Excessive Words: .55 .55 .58 –
       External Underscores: * * * *
       Long Identifier: .59 .57 –
       Naming Convention Anomaly: (no values listed)
       Number of Words: .56 .59 – .55 .55
       Numeric Identifier: * * * *
       Short Identifier Name: .56 .58 .62 – .56 .57
       Type Encoding: * * *
       Non-Dictionary Words: .60 .64 .62 – .63 .69 .59
       Extended 3: .64 .66 .59 .63 .59
       Extended 5: .64 .65 .64 – .63 .72 .59
       Extended 10: .63 .64 .64 – .61 .72 .61
       Less-readable: .67 .67 .67 – .66 .68
       Key: p < 0.001; p < 0.05; p >= 0.05; * No flaw
  • 10. Identifier flaws and Cyclomatic Complexity >= 10
       Projects: JasperReports, JFreeChart, Hibernate, Freemind, Tomcat, Cactus, jEdit, Ant
       Capitalisation Anomaly: .67 .72 .63 .64 .66 .61 .73 .75
       Excessive Words: .55 .55 .58 .65 .58 .60
       External Underscores: * * * *
       Long Identifier: .56 .57 .68 .66 .58 .57
       Naming Convention Anomaly: .55
       Number of Words: .55 .61 .57 .60 .64 .58 .59
       Numeric Identifier: * * * *
       Short Identifier Name: .63 .65 .57 .62 .62 .55 .60 .62
       Type Encoding: * * *
       Non-Dictionary Words: .67 .70 .67 .74 .70 .64 .78 .76
       Extended 3: .69 .70 .61 .73 .68 .64 .75 .75
       Extended 5: .70 .69 .65 .75 .73 .66 .82 .76
       Extended 10: .70 .70 .66 .76 .74 .66 .81 .77
       Key: p < 0.001; p < 0.05; p >= 0.05; * No flaw
  • 11. Identifier flaws and Less-Readable methods
       Projects: JasperReports, JFreeChart, Hibernate, Freemind, Tomcat, Cactus, jEdit, Ant
       Capitalisation Anomaly: .62 .55 .61 .60 .62 .62 .63 .66
       Excessive Words: .59 .58 .61 .57
       External Underscores: * * * *
       Long Identifier: .56 .58 .60 .58 .56 .56
       Naming Convention Anomaly: (no values listed)
       Number of Words: .56 .60 .55
       Numeric Identifier: * * * *
       Short Identifier Name: .57
       Type Encoding: * * *
       Non-Dictionary Words: .65 .56 .61 .66 .65 .65 .62 .68
       Extended 3: .62 .56 .58 .62 .60 .65
       Extended 5: .64 .57 .60 .63 .63 .66
       Extended 10: .65 .56 .58 .63 .65 .63 .68
       Key: p < 0.001; p < 0.05; p >= 0.05; * No flaw
  • 12. Identifier flaws and Less-Maintainable methods
       Projects: JasperReports, JFreeChart, Hibernate, Freemind, Tomcat, Cactus, jEdit, Ant
       Capitalisation Anomaly: .78 .78 .76 .67 .67 .64 .81 .77
       Excessive Words: .59 .58 .67 .68 .62 .57 .63 .55
       External Underscores: * * * .57 *
       Long Identifier: .57 .68 .67 .73 .71 .57 .61 .58
       Naming Convention Anomaly: .55 .57 .56 .55
       Number of Words: .57 .61 .62 .62 .65 .56 .59 .60
       Numeric Identifier: * * * *
       Short Identifier Name: .59 .65 .62 .65 .66 .56 .61 .63
       Type Encoding: * * *
       Non-Dictionary Words: .76 .77 .79 .82 .72 .72 .80 .78
       Extended 3: .81 .76 .69 .83 .72 .71 .84 .80
       Extended 5: .82 .76 .75 .85 .78 .74 .85 .80
       Extended 10: .80 .77 .77 .85 .80 .74 .84 .80
       Key: p < 0.001; p < 0.05; p >= 0.05; * No flaw
  • 13. Conclusions
       We found:
         Poor quality identifier names are associated with:
           more complex
           less readable
           less maintainable
           potentially more buggy code
         Natural language content of identifier names is a classifier for source code quality
         Identifier name length is a classifier for complexity and maintainability
         Opposite associations only in commercialised projects, suggesting differences between open source and commercial code