26th International Conference on Software Engineering (ICSE), Edinburgh, 28.05.2004




Mining Version Histories
to Guide Software Changes

Thomas Zimmermann
(with Peter Weißgerber, Stephan Diehl, and Andreas Zeller)
Lehrstuhl Softwaretechnik
Universität des Saarlandes, Saarbrücken
Extending ECLIPSE Preferences                      1/10


Your task: Extend ECLIPSE with a new preference.




Preferences are stored in the field fKeys[].
Extending ECLIPSE Preferences                           2/10


What else do you need to change?

Which of the 27,000 files
             20,000 classes
             200,000 methods of ECLIPSE?

Program analysis.
   fKeys[] and initDefaults() use the same variables.

    – Usage does not induce change.
    – Usage can be detected only within program code;
      ECLIPSE has 12,000 non-JAVA files.

Learning from history.
   Programmers who changed fKeys[] also changed…
Guiding the Programmer     3/10




 A) The user inserts a new preference into the field fKeys[].

 B) ROSE suggests locations for further changes, e.g. the function initDefaults().
From CVS to Transactions                                     4/10


The ECLIPSE CVS archive has more than 47,000 transactions.
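
CVS records each file check-in separately, so transactions have to be reconstructed from the log. The slides do not spell out how this is done; the sketch below shows one common heuristic, assuming that check-ins by the same author with the same log message, each at most a fixed number of seconds after the previous one, belong together. The Commit record, the function name, and the window of 200 seconds are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


@dataclass
class Commit:
    """One per-file check-in as it might be parsed from a CVS log (hypothetical record)."""
    author: str
    message: str
    timestamp: int  # seconds since some epoch
    path: str


def group_into_transactions(commits: List[Commit], window: int = 200) -> List[Set[str]]:
    """Group per-file check-ins into transactions using a sliding time window.

    A check-in joins the most recent group with the same author and message
    if it happened at most `window` seconds after that group's last check-in.
    """
    transactions: List[Set[str]] = []
    # (author, message) -> [timestamp of last check-in, set of paths]
    open_groups: Dict[Tuple[str, str], list] = {}
    for c in sorted(commits, key=lambda c: c.timestamp):
        key = (c.author, c.message)
        group = open_groups.get(key)
        if group is not None and c.timestamp - group[0] <= window:
            group[0] = c.timestamp      # slide the window forward
            group[1].add(c.path)
        else:
            paths: Set[str] = {c.path}
            open_groups[key] = [c.timestamp, paths]
            transactions.append(paths)
    return transactions


commits = [
    Commit("alice", "add preference", 100, "A.java"),
    Commit("alice", "add preference", 150, "plugin.properties"),
    Commit("bob", "fix typo", 160, "B.java"),
    Commit("alice", "add preference", 1000, "C.java"),
]
transactions = group_into_transactions(commits)
```

Here the first two check-ins by alice form one transaction, while her check-in at second 1000 falls outside the window and starts a new one.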




Mining Association Rules                                                        5/10


ROSE takes all transactions as input:

     T42    =   {   fKeys[],   initDefaults(),   …,   plugin.properties, …}
    T752    =   {   fKeys[],   initDefaults(),   …,   plugin.properties, …}
   T9872    =   {   fKeys[],   initDefaults(),   …,   plugin.properties, …}
  T11386    =   {   fKeys[],   initDefaults(),   …}
  T20814    =   {   fKeys[],   initDefaults(),   …,   plugin.properties,   …}
  T30989    =   {   fKeys[],   initDefaults(),   …,   plugin.properties,   …}
  T41999    =   {   fKeys[],   initDefaults(),   …,   plugin.properties,   …}
  T47423    =   {   fKeys[],   initDefaults(),   …,   plugin.properties,   …}
            .
            .
            .

ROSE mines association rules from these transactions:

           { fKeys[], initDefaults() } ⇒ { plugin.properties }
                 [Support 7, Confidence 7/8 = 0.875]
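
The support and confidence figures can be recomputed from the eight transactions listed above. This is a minimal sketch, not ROSE's implementation: the function name is hypothetical, and each transaction is reduced to the three named entities, leaving out the elided "…" members.

```python
from typing import FrozenSet, Iterable, List, Tuple


def support_and_confidence(
    transactions: List[FrozenSet[str]],
    antecedent: Iterable[str],
    consequent: Iterable[str],
) -> Tuple[int, float]:
    """Return (support, confidence) of the rule antecedent => consequent.

    support    = number of transactions containing antecedent and consequent
    confidence = support / number of transactions containing the antecedent
    """
    lhs = frozenset(antecedent)
    both = lhs | frozenset(consequent)
    n_lhs = sum(1 for t in transactions if lhs <= t)
    support = sum(1 for t in transactions if both <= t)
    confidence = support / n_lhs if n_lhs else 0.0
    return support, confidence


# Seven transactions contain all three entities; T11386 lacks plugin.properties.
full = frozenset({"fKeys[]", "initDefaults()", "plugin.properties"})
partial = frozenset({"fKeys[]", "initDefaults()"})
transactions = [full, full, full, partial, full, full, full, full]

support, confidence = support_and_confidence(
    transactions, {"fKeys[]", "initDefaults()"}, {"plugin.properties"}
)
print(support, confidence)  # 7 0.875
```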
Effective Mining                                                   6/10


The classical association mining approach is to mine all rules:

 – Helpful in understanding general patterns.
 – Requires high support thresholds (more than 2^n possible rules).
 – Takes time to compute (3 days and more).


Alternative — mine only matching rules on demand:

Constraints on antecedent. Mine only rules which are related
  to the situation Σ, e.g. Σ ⇒ X
Single consequent rules. Mine only rules which have a
   singleton as consequent, e.g. Σ ⇒ {x}

Average runtime of a query: 0.5 seconds.
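
The two constraints can be sketched as follows: only transactions that contain the whole situation Σ are inspected, and each other entity x in them becomes a candidate rule Σ ⇒ {x}. This is an illustration under stated assumptions, not ROSE's actual algorithm; the function name and the min_support and min_confidence parameters are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Set, Tuple


def mine_on_demand(
    transactions: List[Set[str]],
    situation: Iterable[str],
    min_support: int = 1,
    min_confidence: float = 0.0,
) -> List[Tuple[str, int, float]]:
    """Mine rules situation => {x} and return (x, support, confidence) triples.

    Results are ranked by confidence, then support, then name.
    """
    sigma = frozenset(situation)
    # Constraint on the antecedent: keep only transactions containing all of sigma.
    matching = [t for t in transactions if sigma <= t]
    if not matching:
        return []
    # Single-consequent rules: count each co-changed entity on its own.
    counts = Counter(x for t in matching for x in t - sigma)
    rules = []
    for entity, support in counts.items():
        confidence = support / len(matching)
        if support >= min_support and confidence >= min_confidence:
            rules.append((entity, support, confidence))
    rules.sort(key=lambda r: (-r[2], -r[1], r[0]))
    return rules


history = [
    {"fKeys[]", "initDefaults()", "plugin.properties"},
    {"fKeys[]", "initDefaults()"},
    {"fKeys[]", "plugin.properties"},
    {"other()"},
]
rules = mine_on_demand(history, {"fKeys[]"})
```

Because the query touches only the transactions that match the situation, the work per query stays small compared with enumerating all rules up front.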
Precision vs. Recall                                                7/10


[Figure: two overlapping sets, "What ROSE finds" and "What it should find". The overlap is the correct predictions; the rest of the first set is false positives, the rest of the second set is false negatives.]

Precision. How many of the returned entities are relevant?
   High precision = few false positives
Recall. How many relevant entities are returned?
   High recall = few false negatives
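
Both measures follow directly from the two sets. A minimal sketch, with a hypothetical function name and made-up entity names in the example; returning 1.0 for an empty set is a convention assumed here, not something the slides state.

```python
from typing import Iterable, Tuple


def precision_recall(returned: Iterable[str], relevant: Iterable[str]) -> Tuple[float, float]:
    """Compute (precision, recall) for one set of suggestions.

    precision = |returned ∩ relevant| / |returned|
    recall    = |returned ∩ relevant| / |relevant|
    """
    returned_set, relevant_set = set(returned), set(relevant)
    correct = returned_set & relevant_set
    # Assumed convention: an empty denominator counts as a perfect score.
    precision = len(correct) / len(returned_set) if returned_set else 1.0
    recall = len(correct) / len(relevant_set) if relevant_set else 1.0
    return precision, recall


# Three suggestions, two of them correct; four entities actually changed.
returned = {"initDefaults()", "plugin.properties", "unrelated()"}
relevant = {"initDefaults()", "plugin.properties", "docs.html", "plugin.xml"}
precision, recall = precision_recall(returned, relevant)
```

In this example one false positive gives a precision of 2/3, and two false negatives give a recall of 0.5.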
Evaluation                                                       8/10


The programmer has changed one single entity.
Can ROSE suggest other entities that should be changed?

Granularity           Entities                       Files
Project      Recall  Precision  Top3      Recall  Precision  Top3
ECLIPSE       0.15     0.26     0.53       0.17     0.26     0.54
GCC           0.28     0.39     0.89       0.44     0.42     0.87
GIMP          0.12     0.25     0.91       0.27     0.26     0.90
JBOSS         0.16     0.38     0.69       0.25     0.37     0.64
JEDIT         0.07     0.16     0.52       0.25     0.22     0.68
KOFFICE       0.08     0.17     0.46       0.24     0.26     0.67
POSTGRES      0.13     0.23     0.59       0.23     0.24     0.68
PYTHON        0.14     0.24     0.51       0.24     0.36     0.60
Average       0.15     0.26     0.64       0.26     0.30     0.70


ROSE predicts 15% of all changed entities (files: 26%).
In 64% of all transactions, ROSE’s topmost three suggestions
contain a correct entity (files: 70%).
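
The Top3 column can be read as the share of queries whose three highest-ranked suggestions include at least one entity that really was changed. The sketch below computes that share for a list of queries; the function name and the sample data are hypothetical, and how queries are aggregated per transaction is not specified in the slides.

```python
from typing import List, Sequence, Set, Tuple


def top_k_likelihood(queries: List[Tuple[Sequence[str], Set[str]]], k: int = 3) -> float:
    """Fraction of queries whose first k suggestions contain a correct entity.

    Each query is a pair (ranked_suggestions, expected_entities).
    """
    if not queries:
        return 0.0
    hits = sum(
        1
        for ranked, expected in queries
        if any(s in expected for s in ranked[:k])
    )
    return hits / len(queries)


queries = [
    (["a", "b", "c", "d"], {"c"}),   # hit: "c" is ranked third
    (["a", "b", "c", "d"], {"d"}),   # miss: "d" is ranked fourth
    (["x"], {"x", "y"}),             # hit
    ([], {"z"}),                     # miss: nothing was suggested
]
likelihood = top_k_likelihood(queries)
```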
Challenges                                                9/10


Further Data Sources.
   Test outcomes, Mailing lists, Newsgroups, Chat logs
   How do we leverage these sources?
Further Analyses.
   Program analysis, Sequence analysis, Clustering
   How do we integrate different analyses?
From Locations to Actions.
   You have extended fKeys[] with UI_SPLINES;
   ROSE suggests:
         Insert store.setDefaults(UI_SPLINES, false);
         in function initDefaults();
   The user can accept this at the touch of one button.
   How much can we learn from history?
Conclusion                                                     10/10


– ROSE detects coupling between non-program entities
  (e.g. programs and documentation).
– ROSE effectively guides users along related changes.
– In 64% of all transactions, ROSE’s topmost three
  suggestions contain a correct entity (files: 70%).
– Research has just begun to exploit non-program artefacts:
   – Similar results by A. Ying (2004); A. Hassan (2004);
     and J. Sayyad-Shirabad (2003).
   – ICSE Workshop on Mining Software Repositories, 2004.

– ROSE will be available as an ECLIPSE plug-in in Fall 2004:
          http://www.st.cs.uni-sb.de/softevo/

Infrared simulation and processing on Nvidia platformsYoss Cohen
 
Français Patch Tuesday - Avril
Français Patch Tuesday - AvrilFrançais Patch Tuesday - Avril
Français Patch Tuesday - AvrilIvanti
 
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Nikki Chapple
 
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...BookNet Canada
 
Digital Tools & AI in Career Development
Digital Tools & AI in Career DevelopmentDigital Tools & AI in Career Development
Digital Tools & AI in Career DevelopmentMahmoud Rabie
 
Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...
Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...
Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...Jeffrey Haguewood
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesThousandEyes
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsRavi Sanghani
 
Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Kaya Weers
 
Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical InfrastructureVarsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructureitnewsafrica
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentPim van der Noll
 

Kürzlich hochgeladen (20)

Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
 
Accelerating Enterprise Software Engineering with Platformless
Accelerating Enterprise Software Engineering with PlatformlessAccelerating Enterprise Software Engineering with Platformless
Accelerating Enterprise Software Engineering with Platformless
 
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
 
Landscape Catalogue 2024 Australia-1.pdf
Landscape Catalogue 2024 Australia-1.pdfLandscape Catalogue 2024 Australia-1.pdf
Landscape Catalogue 2024 Australia-1.pdf
 
A Glance At The Java Performance Toolbox
A Glance At The Java Performance ToolboxA Glance At The Java Performance Toolbox
A Glance At The Java Performance Toolbox
 
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptxGenerative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
 
QCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architecturesQCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architectures
 
Microservices, Docker deploy and Microservices source code in C#
Microservices, Docker deploy and Microservices source code in C#Microservices, Docker deploy and Microservices source code in C#
Microservices, Docker deploy and Microservices source code in C#
 
Infrared simulation and processing on Nvidia platforms
Infrared simulation and processing on Nvidia platformsInfrared simulation and processing on Nvidia platforms
Infrared simulation and processing on Nvidia platforms
 
Français Patch Tuesday - Avril
Français Patch Tuesday - AvrilFrançais Patch Tuesday - Avril
Français Patch Tuesday - Avril
 
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
 
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
 
Digital Tools & AI in Career Development
Digital Tools & AI in Career DevelopmentDigital Tools & AI in Career Development
Digital Tools & AI in Career Development
 
Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...
Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...
Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and Insights
 
Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)
 
Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical InfrastructureVarsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
 

Mining Version Histories to Guide Software Changes

  • 1. Mining Version Histories to Guide Software Changes. Thomas Zimmermann (with Peter Weißgerber, Stephan Diehl, and Andreas Zeller), Lehrstuhl Softwaretechnik, Universität des Saarlandes, Saarbrücken. 26th International Conference on Software Engineering (ICSE), Edinburgh, 28 May 2004.
  • 3. Extending ECLIPSE Preferences. Your task: extend ECLIPSE with a new preference. Preferences are stored in the field fKeys[].
  • 6. Extending ECLIPSE Preferences. What else do you need to change? Which of the 27,000 files, 20,000 classes, and 200,000 methods of ECLIPSE?
      Program analysis. fKeys[] and initDefaults() use the same variables.
        – Usage does not induce change.
        – Usage can be detected only within program code; ECLIPSE has 12,000 non-JAVA files.
      Learning from history. Programmers who changed fKeys[] also changed…
  • 7. Guiding the Programmer.
      A) The user inserts a new preference into the field fKeys[].
      B) ROSE suggests locations for further changes, e.g. the function initDefaults().
  • 8. From CVS to Transactions. The ECLIPSE CVS archive has more than 47,000 transactions.
  • 11. Mining Association Rules. ROSE takes all transactions as input:
      T42    = { fKeys[], initDefaults(), …, plugin.properties, … }
      T752   = { fKeys[], initDefaults(), …, plugin.properties, … }
      T9872  = { fKeys[], initDefaults(), …, plugin.properties, … }
      T11386 = { fKeys[], initDefaults(), … }
      T20814 = { fKeys[], initDefaults(), …, plugin.properties, … }
      T30989 = { fKeys[], initDefaults(), …, plugin.properties, … }
      T41999 = { fKeys[], initDefaults(), …, plugin.properties, … }
      T47423 = { fKeys[], initDefaults(), …, plugin.properties, … }
      . . .
      ROSE mines association rules from these transactions:
      { fKeys[], initDefaults() } ⇒ { plugin.properties } [Support 7, Confidence 7/8 = 0.875]
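The support and confidence figures on slide 11 can be reproduced with a few lines of Python. This is only a minimal sketch, not ROSE's implementation: the function name `support_and_confidence` is hypothetical, and the elided "…" entries of each transaction are simply left out, so the eight transactions are reduced to the three entities the slide names.

```python
from typing import FrozenSet, Iterable, List, Tuple


def support_and_confidence(
    transactions: List[FrozenSet[str]],
    antecedent: Iterable[str],
    consequent: Iterable[str],
) -> Tuple[int, float]:
    """Return (support, confidence) for the rule antecedent => consequent.

    Support is the number of transactions containing both sides of the rule;
    confidence is that number divided by the number of transactions that
    contain the antecedent.
    """
    lhs = frozenset(antecedent)
    both = lhs | frozenset(consequent)
    n_lhs = sum(1 for t in transactions if lhs <= t)
    n_both = sum(1 for t in transactions if both <= t)
    confidence = n_both / n_lhs if n_lhs else 0.0
    return n_both, confidence


# Seven transactions touch plugin.properties; one (T11386 on the slide) does not.
with_props = frozenset({"fKeys[]", "initDefaults()", "plugin.properties"})
without_props = frozenset({"fKeys[]", "initDefaults()"})
transactions = [with_props] * 7 + [without_props]

support, confidence = support_and_confidence(
    transactions, {"fKeys[]", "initDefaults()"}, {"plugin.properties"}
)
print(support, confidence)  # 7 0.875
```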
  • 13. Effective Mining. The classical association mining approach is to mine all rules:
        – Helpful in understanding general patterns.
        – Requires high support thresholds (more than 2^n possible rules).
        – Takes time to compute (3 days and more).
      Alternative: mine only matching rules on demand.
        – Constraints on antecedent. Mine only rules which are related to the situation Σ, e.g. Σ ⇒ X.
        – Single consequent rules. Mine only rules which have a singleton as consequent, e.g. Σ ⇒ {x}.
      Average runtime of a query: 0.5 seconds.
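The on-demand idea from slide 13 can be sketched as follows: fix the antecedent to the current situation Σ, look only at transactions that contain Σ, and emit one rule Σ ⇒ {x} per co-changed entity x. The function name `mine_on_demand` and the `min_support` and `min_confidence` defaults are assumptions made for illustration; the slides do not state which thresholds ROSE uses, and this sketch does a plain linear scan rather than whatever indexing ROSE relies on.

```python
from collections import Counter
from typing import FrozenSet, Iterable, List, Tuple


def mine_on_demand(
    transactions: List[FrozenSet[str]],
    situation: Iterable[str],
    min_support: int = 1,          # hypothetical default
    min_confidence: float = 0.5,   # hypothetical default
) -> List[Tuple[str, int, float]]:
    """Return rules situation => {x} as (x, support, confidence) tuples,
    ranked by confidence, then support, then name."""
    sigma = frozenset(situation)
    # Constraint on the antecedent: only transactions containing sigma matter.
    matching = [t for t in transactions if sigma <= t]
    if not matching:
        return []
    # Single-consequent rules: count each entity that co-changed with sigma.
    counts = Counter(x for t in matching for x in t - sigma)
    rules = []
    for entity, support in counts.items():
        confidence = support / len(matching)
        if support >= min_support and confidence >= min_confidence:
            rules.append((entity, support, confidence))
    rules.sort(key=lambda r: (-r[2], -r[1], r[0]))
    return rules


history = (
    [frozenset({"fKeys[]", "initDefaults()", "plugin.properties"})] * 7
    + [frozenset({"fKeys[]", "initDefaults()"})]
    + [frozenset({"README", "build.xml"})]  # unrelated change, ignored by the query
)
for entity, support, confidence in mine_on_demand(history, {"fKeys[]"}):
    print(f"{{fKeys[]}} => {{{entity}}}  support={support} confidence={confidence}")
```

Because only the transactions matching Σ are inspected, the cost of a query depends on how often Σ was changed rather than on the number of possible rules.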
  • 14. Precision vs. Recall.
      [Figure labels: What ROSE finds; What it should find; False positives; False negatives; Correct prediction]
      Precision. How many of the returned entities are relevant? High precision = few false positives.
      Recall. How many relevant entities are returned? High recall = few false negatives.
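The two measures on slide 14 translate directly into set operations. The sketch below is illustrative only: the function name and the example entities `foo()`, `bar()`, and `baz()` are made up, and returning 0.0 for an empty set is a convention chosen here, not one taken from the slides.

```python
from typing import Iterable, Tuple


def precision_recall(suggested: Iterable[str], actual: Iterable[str]) -> Tuple[float, float]:
    """Precision: share of suggested entities that were really changed.
    Recall: share of really changed entities that were suggested."""
    suggested_set, actual_set = set(suggested), set(actual)
    correct = suggested_set & actual_set
    # Returning 0.0 for an empty denominator is an assumption of this sketch.
    precision = len(correct) / len(suggested_set) if suggested_set else 0.0
    recall = len(correct) / len(actual_set) if actual_set else 0.0
    return precision, recall


# Three suggestions, two of them correct; four entities actually changed.
p, r = precision_recall(
    suggested={"initDefaults()", "plugin.properties", "foo()"},
    actual={"initDefaults()", "plugin.properties", "bar()", "baz()"},
)
print(round(p, 3), r)  # 0.667 0.5
```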
  • 16. Evaluation. The programmer has changed one single entity. Can ROSE suggest other entities that should be changed?
      Granularity:         Entities                   Files
      Project      Recall  Precision  Top3    Recall  Precision  Top3
      ECLIPSE      0.15    0.26       0.53    0.17    0.26       0.54
      GCC          0.28    0.39       0.89    0.44    0.42       0.87
      GIMP         0.12    0.25       0.91    0.27    0.26       0.90
      JBOSS        0.16    0.38       0.69    0.25    0.37       0.64
      JEDIT        0.07    0.16       0.52    0.25    0.22       0.68
      KOFFICE      0.08    0.17       0.46    0.24    0.26       0.67
      POSTGRES     0.13    0.23       0.59    0.23    0.24       0.68
      PYTHON       0.14    0.24       0.51    0.24    0.36       0.60
      Average      0.15    0.26       0.64    0.26    0.30       0.70
      ROSE predicts 15% of all changed entities (files: 26%). In 64% of all transactions, ROSE’s topmost three suggestions contain a correct entity (files: 70%).
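The Top3 column on slide 16 reports how often the three highest-ranked suggestions contain at least one correct entity. A minimal sketch of such a measure, assuming each evaluation case is a pair of a ranked suggestion list and the set of entities that were actually changed (the name `top3_hit_rate` and the sample data are hypothetical, and the exact evaluation protocol behind the table is not given on the slides):

```python
from typing import Iterable, List, Sequence, Tuple


def top3_hit_rate(cases: List[Tuple[Sequence[str], Iterable[str]]]) -> float:
    """Fraction of cases whose three topmost suggestions contain a correct entity."""
    if not cases:
        return 0.0
    hits = sum(1 for ranked, actual in cases if set(ranked[:3]) & set(actual))
    return hits / len(cases)


cases = [
    (["a", "b", "c"], {"b"}),       # hit: "b" is among the top three
    (["a", "b", "c", "d"], {"d"}),  # miss: "d" is only ranked fourth
]
print(top3_hit_rate(cases))  # 0.5
```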
  • 19. Challenges.
      Further Data Sources. Test outcomes, mailing lists, newsgroups, chat logs. How do we leverage these sources?
      Further Analyses. Program analysis, sequence analysis, clustering. How do we integrate different analyses?
      From Locations to Actions. You have extended fKeys[] with UI_SPLINES; ROSE suggests: insert store.setDefaults(UI_SPLINES, false); in function initDefaults(). The user can accept this at the touch of one button. How much can we learn from history?
  • 20. Conclusion.
      – ROSE detects coupling between non-program entities (e.g. programs and documentation).
      – ROSE effectively guides users along related changes.
      – In 64% of all transactions, ROSE’s topmost three suggestions contain a correct entity (files: 70%).
      – Research has just begun to exploit non-program artefacts:
          · Similar results by A. Ying (2004); A. Hassan (2004); and J. Sayyad-Shirabad (2003).
          · ICSE Workshop on Mining Software Repositories, 2004.
      – ROSE will be available as an ECLIPSE plug-in in Fall 2004: http://www.st.cs.uni-sb.de/softevo/