This document discusses how to improve experiments in software engineering (SE) to better enable transferring lessons learned across different contexts. It notes that a lack of transfer is a major issue, leading to instability and non-reproducibility of results. The document recommends several approaches to improve transfer, including filtering data by relevance or variance, transforming data using techniques like principal component analysis, and using ensembles of models. It argues that past issues were partly due to obsessing over raw data dimensions and sharing single models, rather than combinations of human and automated analysis. With new technologies, a truer picture can emerge to understand what factors generally influence outcomes across varying conditions.
5. WHAT’S AT STAKE?
• “Transfer” is a core scientific issue
• Lack of transfer of causal effects is the scandal of SE
• Replication in empirical SE is rare
• Conclusion instability
• It all depends.
• The full stop syndrome
• The result?
• A funding crisis
11. Q: HOW TO TRANSFER LESSONS LEARNED?
Ignore most of the data
• relevancy filtering: Turhan ESEj’09; Peters TSE’13
• variance filtering: Kocaguneli TSE’12, TSE’13
• performance similarities: He ESEM’13
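As a rough illustration of relevancy filtering, the sketch below keeps only those rows of imported ("foreign") data that are among the k nearest neighbors of some local row. It is a minimal sketch in the spirit of nearest-neighbor filtering, not the exact procedure of the papers cited above; the function names, the value of k, and the toy data are all assumptions made for this example.

```python
import math

def euclidean(a, b):
    """Straight-line distance between two equal-length numeric rows."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def relevancy_filter(foreign_rows, local_rows, k=2):
    """Keep only the foreign rows that are among the k nearest
    neighbors of at least one local row (hypothetical helper)."""
    keep = set()
    for local in local_rows:
        # Rank foreign rows by their distance to this local row.
        ranked = sorted(range(len(foreign_rows)),
                        key=lambda i: euclidean(foreign_rows[i], local))
        keep.update(ranked[:k])
    # Return the survivors in their original order.
    return [foreign_rows[i] for i in sorted(keep)]

# Toy data (invented for illustration): two numeric features per project.
foreign = [(1.0, 1.0), (1.2, 0.9), (9.0, 9.5), (10.0, 10.0), (0.5, 1.5)]
local = [(1.0, 1.0), (1.1, 1.0)]

selected = relevancy_filter(foreign, local, k=2)
print(selected)
```

Most of the foreign data is ignored: only the rows that look like the local context survive and would be used for training.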
Contort the data
• spectral learning (working in PCA space or some other rotation): Menzies, TSE’13; Nam, ICSE’13
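To make "working in PCA space" concrete, the sketch below finds the first principal component of a small table by power iteration on its covariance matrix, then projects each row onto that component. It assumes plain PCA on numeric features and is not the specific transformation used in the cited papers; the function names and the toy data are invented for this example.

```python
import math

def first_principal_component(rows, iterations=100):
    """Return (column means, unit vector of the first principal component),
    estimated by power iteration on the sample covariance matrix."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    v = [1.0] * d  # arbitrary start vector
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:  # start vector has no overlap with any variance direction
            break
        v = [x / norm for x in w]
    return means, v

def project(rows, means, v):
    """Coordinate of each row along the component v (after centering)."""
    return [sum((r[j] - means[j]) * v[j] for j in range(len(v))) for r in rows]

# Toy data (invented): the second feature is exactly twice the first,
# so one rotated dimension carries all of the variation.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
means, component = first_principal_component(data)
scores = project(data, means, component)
print(component, scores)
```

Two raw dimensions collapse to a single rotated dimension; downstream learners would then work on `scores` rather than on the raw columns.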
Build a bickering committee
• Ensembles: Minku, PROMISE’12
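A "bickering committee" can be sketched as several simple models that each make their own prediction, with the committee reporting the median. This is a minimal sketch of a generic ensemble, not the method of the cited paper; the three member models, the helper names, and the size/effort data are assumptions made for this example.

```python
from statistics import mean, median

def mean_model(train):
    """Always predicts the average target seen in training."""
    avg = mean(y for _, y in train)
    return lambda x: avg

def nearest_neighbor_model(train):
    """Predicts the target of the training row whose x is closest."""
    return lambda x: min(train, key=lambda row: abs(row[0] - x))[1]

def linear_model(train):
    """Ordinary least-squares line through (x, y) pairs."""
    xs = [x for x, _ in train]
    ys = [y for _, y in train]
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    return lambda x: slope * x + intercept

def committee(models):
    """Combine member predictions by taking their median."""
    return lambda x: median(m(x) for m in models)

# Toy training data (invented): (project size, effort).
train = [(1, 10), (2, 20), (3, 30), (4, 40)]
members = [mean_model(train), nearest_neighbor_model(train), linear_model(train)]
predict = committee(members)

votes = [m(3.2) for m in members]  # the members disagree with each other
print(votes, predict(3.2))
```

The median is one of several reasonable combination rules; it is used here because a single badly wrong member cannot drag the committee's answer far.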
16. WHAT’S CHANGED?
Mark of the old novice:
• Mostly manual analysis
• Obsesses on all the raw data
• Shares “the” model (the only, the single)
• E.g. “Depth of inheritance is ‘the’ most important predictor for defects.”
Mark of the new expert:
• Manual and automatic analysis
• Combinations of Human + AI:
• Each offering input and insights to the other
• Filters most of the data, transforms the rest
• Shares analysis methods
• Cost-effective methods for generating local lessons
12/1/2011
Most probably wrong
17. NOT EXTERNAL VALIDITY BUT “META-EXTERNAL VALIDITY”
No pair programming, CMM5, agile programming, etc.
But conclusion stability, generality
10/11/2013
18. With new data mining technologies, a true picture emerges, where we can see what is going on
SO THERE IS HOPE