Mining source code for structural regularities
Kim Mens, Andy Kellens, Gabriela Arevalo, Angela Lozano
Problem Context
• Context:
  • Programmers often use regularities: coding conventions, design patterns, crosscutting concerns...
  • Enforcing such regularities facilitates maintenance, evolution & comprehension
  • Regularities are often not fully respected (implicit, no support)
• Goal: provide automated (tool) support to discover source code (ir)regularities
Formal Concept Analysis (FCA)
Characterizing mammals: objects (rows) and attributes (columns)

            4 legged   hair covered   intelligent   marine   thumbed
cats           *            *
dogs           *            *
dolphins                                   *           *
gibbons                     *              *                    *
humans                                     *                    *
whales                                     *           *
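As a rough illustration of what FCA computes from such a table, the sketch below enumerates the formal concepts (maximal groups of objects with their shared attributes) of the mammal context by brute force. The encoding of the table, the function names, and the brute-force strategy are assumptions made for this example; they are not taken from the talk or from any FCA tool.

```python
from itertools import combinations

# Mammal context as read from the slide: object -> set of attributes.
CONTEXT = {
    "cats": {"4 legged", "hair covered"},
    "dogs": {"4 legged", "hair covered"},
    "dolphins": {"intelligent", "marine"},
    "gibbons": {"hair covered", "intelligent", "thumbed"},
    "humans": {"intelligent", "thumbed"},
    "whales": {"intelligent", "marine"},
}
ATTRIBUTES = sorted(set().union(*CONTEXT.values()))


def extent(attrs):
    """Objects that have every attribute in attrs."""
    return frozenset(o for o, owned in CONTEXT.items() if set(attrs) <= owned)


def intent(objs):
    """Attributes shared by every object in objs."""
    return frozenset(a for a in ATTRIBUTES if all(a in CONTEXT[o] for o in objs))


def formal_concepts():
    """Brute force: close every attribute subset into an (extent, intent) pair."""
    found = set()
    for size in range(len(ATTRIBUTES) + 1):
        for subset in combinations(ATTRIBUTES, size):
            objs = extent(subset)
            found.add((objs, intent(objs)))
    return found


if __name__ == "__main__":
    # Print concepts from the most general (most objects) to the most specific.
    for objs, attrs in sorted(formal_concepts(), key=lambda c: -len(c[0])):
        print(sorted(objs), "<->", sorted(attrs))
```

Brute force is only workable for a handful of attributes; real contexts with thousands of properties need a dedicated lattice construction algorithm.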
FCA: Lessons learnt from mining aspects
• DISADVANTAGES
  • almost combinatorial amount of results
  • does not detect exceptional cases
  • description of the concept
  • redundancy
  • requires traversal heuristics (ad-hoc)
• ADVANTAGES
  • shared properties = hint of concept specification
  • does not require a priori knowledge
Rules Notation
Concept1 (k) --n m%--> (l) Concept2
should be read as:
• n elements in Concept1 also appear in Concept2
• m% of the elements in Concept1 are also in Concept2 (a.k.a. confidence)
• k elements in Concept1 are NOT in Concept2
• l elements in Concept2 are NOT in Concept1
Visually: Concept1 and Concept2 are two overlapping sets, with k elements only in Concept1, n elements in the overlap, and l elements only in Concept2.
• Special case: when m = 100%
Concept1 (0) ==n 100%==> (l) Concept2
Visually: Concept1 (n elements) lies entirely inside Concept2, which has l further elements.
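The four numbers in the notation follow directly from set operations on the two concepts' elements. The sketch below is a minimal illustration; the function names are hypothetical, and the sets are the marine and intelligent mammals from the example context used in this deck.

```python
def rule_stats(concept1, concept2):
    """Return (k, n, m, l) for the rule Concept1 (k) --n m%--> (l) Concept2."""
    concept1, concept2 = set(concept1), set(concept2)
    n = len(concept1 & concept2)   # elements in both concepts
    k = len(concept1 - concept2)   # elements of Concept1 missing from Concept2
    l = len(concept2 - concept1)   # elements of Concept2 missing from Concept1
    m = 100.0 * n / len(concept1) if concept1 else 0.0  # confidence in percent
    return k, n, m, l


def format_rule(name1, concept1, name2, concept2):
    """Render a rule; the double arrow marks the 100% (implication) case."""
    k, n, m, l = rule_stats(concept1, concept2)
    left, right = ("==", "==>") if k == 0 and n > 0 else ("--", "-->")
    return f"{name1} ({k}) {left}{n} ({m:.0f}%){right} ({l}) {name2}"


MARINE = {"dolphins", "whales"}
INTELLIGENT = {"dolphins", "whales", "humans", "gibbons"}

if __name__ == "__main__":
    print(format_rule("Marine", MARINE, "Intelligent", INTELLIGENT))
    # prints: Marine (0) ==2 (100%)==> (2) Intelligent
```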
Implications
Marine (0) ==2 (100%)==> (2) Intelligent
• Marine: dolphins, whales
• Intelligent: humans, gibbons (plus the two marine mammals)
• Marine mammals are intelligent (closed world)
• Subset relation in the lattice

Associations
Hair covered (1) --2 (66%)--> (0) 4 legged
• Hair covered: gibbons, cats, dogs
• 4 legged: cats, dogs
• Most hair-covered mammals are 4 legged; gibbons aren't
• Superset or siblings in the lattice
CASES
• Case 1: IntensiVE
  • Tool to validate if regularities documented through several views are respected
  • Smalltalk
  • 270 classes; 2729 methods
• Case 2: Freecol
  • Colonization game (graphic, multiplayer)
  • Java
  • 382 classes; 3252 methods
OUR Approach
• Objects = classes, & attributes =
  • (K) has keyword (in class name)
  • (I) implements a particular method / message
  • (H) in hierarchy of class
• E.g. FreeColAction, a class with the methods getId() and toXMLImpl():
  • K>Action, H>FreeColAction, I>getId, I>toXMLImpl
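To make the K / H / I encoding concrete, here is a small sketch that derives such attributes for a class. The `CLASSES` table is a hypothetical stand-in for whatever a source code extractor would provide, and splitting CamelCase names into keywords is an assumption: the slide only shows K>Action for FreeColAction, so the authors' actual keyword extraction may be more selective.

```python
import re

# Hypothetical class descriptors: name -> (superclass or None, method names).
CLASSES = {
    "FreeColAction": (None, ["getId", "toXMLImpl"]),
    "MapboardAction": ("FreeColAction", ["getId", "shouldBeEnabled"]),
}


def keywords(class_name):
    """Split a CamelCase name into words (assumed definition of 'keyword')."""
    return re.findall(r"[A-Z]+(?![a-z])|[A-Z][a-z0-9]+|[a-z0-9]+", class_name)


def hierarchy(class_name):
    """The class itself followed by all of its known ancestors."""
    chain = []
    while class_name is not None:
        chain.append(class_name)
        class_name = CLASSES[class_name][0]
    return chain


def attributes(class_name):
    """FCA attributes of a class, written K>, H>, I> as on the slide."""
    attrs = {f"K>{word}" for word in keywords(class_name)}
    attrs |= {f"H>{ancestor}" for ancestor in hierarchy(class_name)}
    attrs |= {f"I>{method}" for method in CLASSES[class_name][1]}
    return attrs


if __name__ == "__main__":
    print(sorted(attributes("FreeColAction")))
```

Each class then becomes one row of the FCA context, with these strings as its columns.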
algorithm
Source code
→ Eliminate irrelevant SCEs (source code entities)
→ Extract FCA context
→ Calculate concept lattice
→ Calculate implication & association rules (Confidence >= 50% and Support >= 3)
→ Simplify rules
→ Eliminate redundant rules
→ Group by overlapping elements
→ Rule groups
eliminate SCEs
• Irrelevant entities:
  • Object class
  • Test classes & methods
rule calculation
• Calculate implications:
  • traverse child-parent relations of key nodes*
• Calculate associations:
  • traverse all parent-child relations of key nodes*
  • traverse all pairs of key nodes* that have no connection in the lattice
• Filter relations with confidence & support below thresholds
  • confidence ≤ 75% and support ≥ 3
* those that add attributes or objects to the lattice
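The threshold filter can be sketched as follows. This is an illustration under stated assumptions: support is taken to be the number of matches, confidence is matches divided by the size of the condition, and the default thresholds are the 50% and 3 shown in the pipeline figure (the bullet text quotes a different confidence figure, so both are left as parameters). The `Rule` class and its field names are hypothetical; the two sample rules reuse numbers that appear elsewhere in this deck.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    condition: str
    conclusion: str
    exc_condition: int    # k: elements only in the condition
    matches: int          # n: elements in both sets
    exc_conclusion: int   # l: elements only in the conclusion

    @property
    def confidence(self):
        """Fraction of the condition's elements that also satisfy the conclusion."""
        return self.matches / (self.matches + self.exc_condition)


def passes_thresholds(rule, min_confidence=0.5, min_support=3):
    """Assumes support = number of matches; thresholds are adjustable."""
    return rule.confidence >= min_confidence and rule.matches >= min_support


RULES = [
    Rule("I>'initialize'", "H>'FreeColPanel'", 7, 36, 9),  # from the Freecol results
    Rule("Hair covered", "4 legged", 1, 2, 0),             # from the mammal example
]
kept = [rule for rule in RULES if passes_thresholds(rule)]

if __name__ == "__main__":
    for rule in kept:
        print(rule.condition, "->", rule.conclusion, f"({rule.confidence:.0%})")
```

Under these assumptions the mammal rule is dropped for lack of support (2 matches), while the initialize rule is kept.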
simplify rules
GOAL: Eliminate redundant properties of a rule
H>IVEditorNodeFigure H>IVEditorFigure ----> K>'IVEditor' K>'Figure'
can be simplified using the implication
H>IVEditorNodeFigure ====> H>IVEditorFigure
(IVEditorNodeFigure sits below IVEditorFigure in the class hierarchy), because H>IVEditorFigure does not add information to the rule.
simplify rules
Hierarchy implication (IVEditorNodeFigure sits below IVEditorFigure):
H>IVEditorNodeFigure ==100%==> H>IVEditorFigure
Rules over the same properties:
H>IVEditorNodeFigure, H>IVEditorFigure ----> K>'IVEditor', K>'Figure'
K>'IVEditor', K>'Figure' ----> H>IVEditorNodeFigure, H>IVEditorFigure
K>'IVEditor', K>'Figure', H>IVEditorFigure ----> H>IVEditorNodeFigure
K>'IVEditor', K>'Figure', H>IVEditorNodeFigure ----> H>IVEditorFigure
Another example:
H>Intensional.IVIntensiVEAction (0) ==24 (100%)==> (0) I>#undoAction, H>Classifications2.AbstractAction, I>#performAction
simplify rules: Priority to apply implications
Suppose
(R) a,b,c,e,f → h
(X) b → c
(Y) a → b
(Z) f → b,e
Is there an order to apply several implications to remove redundancies from R?
simplify rules: Priority to apply implications
(X) b → c
(Y) a → b
(Z) f → b,e
Candidate orders for applying the implications to (R) a,b,c,e,f → h:
• Y, X, Z
• Y, Z, X
• Z, X, Y
• Z, Y, X
• X, Y, Z
• X, Z, Y
simplify rules: Priority to apply implications
A ≤ B (A is applied before B) if
A.condition ⊆ B.conclusion ∨ B.conclusion ⊆ A.conclusion
With (X) b → c, (Y) a → b, (Z) f → b,e and (R) a,b,c,e,f → h:
X 1st:
• X ≤ Z because b ⊆ b,e (X.condition ⊆ Z.conclusion)
• X ≤ Y because b ⊆ b (X.condition ⊆ Y.conclusion)
• applying X: (XtoR) a,b,e,f → h (c is implied by b and is dropped)
Z 2nd:
• Z ≤ Y because b ⊆ b,e (Y.conclusion ⊆ Z.conclusion)
• applying Z: (Zto(XtoR)) a,f → h (b and e are implied by f and are dropped)
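The ordering criterion can be tried out on the X, Y, Z example. The sketch below is an illustration, not the authors' implementation: `precedes` encodes the A ≤ B test as stated on the slide, while `priority_order` (a simple selection loop) and `apply_in_order` (drop implied properties whenever an implication's condition is still present in the rule) are assumed semantics chosen for this example.

```python
from itertools import permutations

# name -> (condition, conclusion), taken from the slide's example
IMPLICATIONS = {
    "X": (frozenset("b"), frozenset("c")),
    "Y": (frozenset("a"), frozenset("b")),
    "Z": (frozenset("f"), frozenset("be")),
}
R_CONDITION = frozenset("abcef")  # (R) a,b,c,e,f -> h


def precedes(a, b):
    """A <= B: A.condition is within B.conclusion, or B.conclusion within A.conclusion."""
    (a_cond, a_concl), (_, b_concl) = IMPLICATIONS[a], IMPLICATIONS[b]
    return a_cond <= b_concl or b_concl <= a_concl


def priority_order(names):
    """Repeatedly pick an implication that no other remaining one strictly precedes."""
    remaining, order = list(names), []
    while remaining:
        pick = next(
            (n for n in remaining
             if not any(precedes(o, n) and not precedes(n, o)
                        for o in remaining if o != n)),
            remaining[0],  # fallback if the relation contains a cycle
        )
        order.append(pick)
        remaining.remove(pick)
    return order


def apply_in_order(condition, order):
    """Assumed semantics: if an implication's condition is present, drop what it implies."""
    for name in order:
        cond, concl = IMPLICATIONS[name]
        if cond <= condition:
            condition = condition - (concl - cond)
    return condition


if __name__ == "__main__":
    print("priority order:", priority_order(["X", "Y", "Z"]))
    for order in permutations("XYZ"):
        print(order, "->", sorted(apply_in_order(R_CONDITION, order)))
```

Under these assumptions only the orders that start with X reduce the condition of R to a,f; every other order leaves c behind, because b has already been removed by the time X is tried.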
eliminate rules
• Eliminate unrelated sets
K>'Colopedia' (0) ==7 (100%)==> (86) I>'actionPerformed'
For a rule of the form
a (exc. condition) --(matches)--> (exc. conclusion) b
the criterion is
$$\frac{1}{2}\left(\frac{\text{exc. condition}}{\text{condition size}} + \frac{\text{exc. conclusion}}{\text{conclusion size}}\right) = \frac{1}{2}\left(\frac{\text{exc. condition}}{\text{exc. condition} + \text{matches}} + \frac{\text{exc. conclusion}}{\text{exc. conclusion} + \text{matches}}\right) < 0.25$$
i.e. the average number of exceptions is below a quarter of any of the sets
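The criterion is easy to evaluate numerically. The sketch below assumes that a rule is kept when the average exception ratio is strictly below 0.25 and eliminated otherwise; the function names are hypothetical, and rules are assumed to have at least one match so that neither set is empty.

```python
def average_exception_ratio(exc_condition, matches, exc_conclusion):
    """Mean of the exception fractions of the condition set and the conclusion set."""
    condition_size = exc_condition + matches
    conclusion_size = exc_conclusion + matches
    return (exc_condition / condition_size + exc_conclusion / conclusion_size) / 2


def is_related(exc_condition, matches, exc_conclusion, limit=0.25):
    """Assumed reading: keep the rule when the average exception ratio is below the limit."""
    return average_exception_ratio(exc_condition, matches, exc_conclusion) < limit


if __name__ == "__main__":
    # K>'Colopedia' (0) ==7 (100%)==> (86) I>'actionPerformed'
    print(round(average_exception_ratio(0, 7, 86), 3), is_related(0, 7, 86))
    # H>'MapboardAction' (1) --50 (98%)--> (3) I>'getId'
    print(round(average_exception_ratio(1, 50, 3), 3), is_related(1, 50, 3))
```

Under this reading the Colopedia rule would be filtered out despite its 100% confidence, because almost all classes implementing actionPerformed are unrelated to Colopedia.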
eliminate rules
• Similar rules that conclude SubClass or SuperClass:
  Property>X --matchesSub--> (exc. conclusion sub) H>SubClass
  Property>X --matchesSuper--> (exc. conclusion super) H>SuperClass
• eliminate the super rule if it just adds noise (--matches, ++exceptions): the criterion relates (matchesSub) / (condition size) to (matchesSuper) / (condition size), and (exc. conclusion sub) to (exc. conclusion super), with a factor of 0.9 in each comparison
  deleted K>'Classification' (0) --5 (100%)--> (24) H>Classifications2.AbstractClassification
  K>'Classification' (0) --5 (100%)--> (0) H>Classifications2.Classification
• eliminate the sub rule if it has lower confidence: $\frac{\text{matchesSub}}{\text{condition size}} < \frac{\text{matchesSuper}}{\text{condition size}}$
  I>'shouldBeEnabled' (0) --46 (100%)--> (7) H>'FreeColAction'
  deleted I>'shouldBeEnabled' (2) --44 (96%)--> (7) H>'MapboardAction'
eliminate rules
• Concluding the root class of the application does not add any information:
  Property>X ----> H>RootClass
• When having converse pairs of rules,
  Property>X ----> Property>Y
  Property>Y ----> Property>X
  • prefer the one with better confidence.
  • If similar confidence, prefer the one whose condition has stronger semantics:
    H > I, H > K, I = K, U > K, R > K, U = R
    e.g. H ----> I rather than I ----> H
group rules
• Rules that share at least 85% of their matches are grouped together
  • threshold comes from analysis of results (might change depending on the case study)
• They represent common properties of a set of source code entities
• These groups are ordered by number of matches
Diagram: an example group over the properties K>Action, H>FreeColAction, H>MapboardAction, I>getId and I>getActionPerformed.
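A grouping step along these lines could look like the sketch below. The talk does not say how "share 85% of their matches" is measured or how groups are formed, so both the overlap measure (shared matches relative to the larger rule) and the greedy grouping strategy are assumptions; the rule names and class names are placeholders, not data from the case studies.

```python
def overlap(a, b):
    """Assumed measure: shared matches relative to the larger of the two rules."""
    return len(a & b) / max(len(a), len(b))


def group_rules(rule_matches, threshold=0.85):
    """rule_matches: rule name -> set of matching classes. Greedy grouping."""
    groups = []
    for name, matches in rule_matches.items():
        for group in groups:
            # Join the first group whose members all overlap enough with this rule.
            if all(overlap(matches, rule_matches[other]) >= threshold for other in group):
                group.append(name)
                break
        else:
            groups.append([name])
    # Order groups by the number of distinct matches they cover, largest first.
    groups.sort(key=lambda g: len(set().union(*(rule_matches[n] for n in g))),
                reverse=True)
    return groups


# Placeholder data: three rules and the (hypothetical) classes they match.
EXAMPLE = {
    "rule_small": {"Other1", "Other2", "Other3"},
    "rule_a": {f"Class{i}" for i in range(20)},
    "rule_b": {f"Class{i}" for i in range(1, 20)},  # shares 19 of 20 with rule_a
}

if __name__ == "__main__":
    print(group_rules(EXAMPLE))
```

Because the threshold is a parameter, it can be tuned per case study, which matches the remark that 85% came from analysing the results.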
RESULTS: Reduction of information to process
• IntensiVE [270 classes; 2729 methods]
  • Concepts: 1289 / Relations: 4390
  • Rules: 325 / Groups: 50
• Freecol [382 classes; 3252 methods]
  • Concepts: 1261 / Relations: 5149
  • Rules: 134 / Groups: 42
RESULTS: Rules Freecol
• K → H = 2 rules
  • The concept described by the keyword is confined to classes in the hierarchy, e.g. K>'Mission' → H>'Mission'
• K → K = 4 rules
  • Combined words Free+Col, Trade+Route, Free+Col+Menu, etc.
• K → I = 8 rules
  • Classes named *Keyword* should implement the method I, e.g. K>'Info' → I>'update', K>'Action' → I>'actionPerformed', K>'Thread' → I>'run', K>'Mission' → I>'doMission', etc.
RESULTS:
Rules Freecol
• H → K = 9 rules
  • Classes in the hierarchy can be described by the keyword, e.g.
    • H>'ReportPanel' → K>'Panel', K>'Report'
    • H>'NetworkRequestHandler' → K>'Handler'
    • H>'InputHandler' → K>'Input'
    • H>'OptionUpdater' → K>'UI'
    • H>'TradeItem' → K>'Item'
RESULTS:
Rules Freecol
• H → I = 28 rules

• Classes in the hierarchy should implement the method. e.g.	

• H>'NetworkRequestHandler'→ I>'handle'	

• H>'OptionMap' → I>'addDefaultOptions'	

• H>'Location' → I>'getGoodsContainer'→ I>'getLocationName'	

• H>'MapIterator' → I>'nextPosition'	

• H>'MapboardAction' → I>'getId' → I>'shouldBeEnabled'	

• H>'OptionUpdater' → I>'updateOption',I>'unregister'	

• H>'TradeItem' → I>'makeTrade'	

• H>'PersistentObject'→ I>'readFromXMLImpl' → I>'toXML'
RESULTS:
Rules Freecol
• I → I = 71 rules

• Implementation protocols. e.g.	

• I>'getColony'→ I>'getXMLElementTagName'→ I>'toXMLImpl'
→I>'readFromXMLImpl'	


• I>'installUI' → I>'createUI'	

• I>'getTransportDestination'→ I>'doMission' →I>'dispose'	

• I>'contains' →I>'add' →I>'newTurn' →I>'remove' 	

• I>'toXML' →I>'getXMLElementTagName' →I>'readFromXMLImpl' 	

• I>'requestFocus' → I>'actionPerformed' →I>'initialize'	

• I>'setOwner' → I>'newTurn' → I>'getTile'	

• I>'setName'→ I>'getName'
RESULTS:
Groups Freecol
• Classes in the hierarchy FreeColAction are named *Action*, and tend to implement getId and actionPerformed
  • 50 matches, 6 rules
  • H>'MapboardAction' (1) --50 (98%)--> (3) I>'getId'
  • H>'FreeColAction' (0) ==53 (100%)==> (3) K>'Action'
  • H>'FreeColAction' (3) --50 (94%)--> (43) I>'actionPerformed'
  • I>'getId' (3) --50 (94%)--> (43) I>'actionPerformed'
  • I>'getId', H>'FreeColAction' (0) --52 (100%)--> (4) K>'Action'
  • K>'Action' (5) --51 (91%)--> (42) I>'actionPerformed'
RESULTS:
Groups Freecol
• Most of the classes that implement initialize belong to the hierarchy of FreeColPanel
  • initialize prepares a panel to be displayed
  • 36 matches, 1 rule
  • I>'initialize' (7) --36 (84%)--> (9) H>'FreeColPanel'
• Classes that implement toXMLImpl also implement getXMLElementTagName
  • toXMLImpl writes an XML representation of the object to a stream.
  • getXMLElementTagName gets the tag name that represents the object
  • Exception is FreeColAction, which is the XML root
  • 44 matches, 1 rule
  • I>'toXMLImpl' (1) --44 (98%)--> (16) I>'getXMLElementTagName'
RESULTS: IntensiVE
• Regularities documented & found
  • Interface
    • Action protocol & undoable protocol
    • Compilation
    • Relation evaluators
    • Cache / Save / Remove on definitions
    • Intension Editors (partial protocol)
    • Instantiable views
    • Constraint editors
    • Evaluators
  • Naming convention
    • Unit testing, View hierarchy
  • Interface + Naming convention
    • Quantifiers (naming + partial interface)
RESULTS: IntensiVE
• Regularities found & NOT documented
  • Interfaces
    • IntensiVE Explorer Visualization
    • Checkable entities
    • Fuzzy quantifiers
    • Query generation for visual querying
    • Context-menu in visual query language
    • Figure rendering
    • Special classifications
  • Naming conventions
    • Figures, Exceptions, Visualization, Classifications, Exceptions to views, Result pairs, Reporters.
  • Interfaces + Naming conventions
    • Starbrowser shells
CONCLUSIONS
• Use FCA with objects = source code entities	

• As attributes = several types of properties	

• Calculate implications	

• to mine for intension of regularity rather than extension	

• not just entities that match regularity but explicit
specification of regularity	


• Allow for variations and irregularities = association rules	

• To overcome previous pitfalls and make regularities explicit
THREATS & LIMITATIONS
• Redundant information
  • There are groups that are subsets of other groups
• All results are correct but...
  • some regularities found might be due to chance and not a conscious development decision
  • interpretation of results requires assuming a closed world
• Usefulness is subjective
  • i.e. separating useful from useless results
• Data analyzed could be more semantic
CURRENT & FUTURE WORK:
• ...Running the same case studies mixing the results of Classes and Methods
• ...Comparing the regularities found with those previously documented in IntensiVE
• Calculate which percentage of the irregularities of a group are indeed errors
• Use the results to guide the developer while adding or modifying SCEs
• Use a similar approach to mine feature dependencies

 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 

Mining source code for structural regularities (SATTOSE2010)

  • 1. Mining source code for structural regularities Kim Mens, Andy Kellens, Gabriela Arevalo, Angela Lozano
  • 2. Problem Context
      • Context:
          • Programmers often use regularities: coding conventions, design patterns, crosscutting concerns...
          • Enforcing such regularities facilitates maintenance, evolution & comprehension
          • Regularities are often not fully respected (implicit, no support)
      • Goal: provide automated (tool) support to discover source code (ir)regularities
  • 4. FCA: Lessons learnt from mining aspects
      • DISADVANTAGES
          • almost combinatorial amount of results
          • does not detect exceptional cases
          • redundancy
          • requires traversal heuristics (ad-hoc)
      • ADVANTAGES
          • description of the concept
          • shared properties = hint of concept specification
          • does not require a priori knowledge
  • 5. Rules Notation
      Concept1 (k) --n m%--> (l) Concept2
      should be read as:
      • n elements in Concept1 also appear in Concept2
      • m% of the elements in Concept1 are also in Concept2 (a.k.a. confidence)
      • k elements in Concept1 are NOT in Concept2
      • l elements in Concept2 are NOT in Concept1
      Visually: Concept1 and Concept2 overlap, with k elements only in Concept1, n shared, and l only in Concept2.
      • Special case: when m = 100%
      Concept1 (0) ==n 100%==> (l) Concept2
      Visually: Concept1 (n elements) lies inside Concept2, which has l further elements.
  • 6. Implications
      Marine (0) ==2 (100%)==> (2) Intelligent
      • Marine mammals (dolphins, whales) are intelligent, as are humans and gibbons (closed world)
      • Subset relation in lattice
     Associations
      Hair covered (1) --2 (66%)--> (0) 4 legged
      • Most hair-covered mammals (cats, dogs) are 4 legged; gibbons aren't
      • Superset or siblings in the lattice
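The notation and the two mammal rules above can be reproduced with plain set operations. This is a minimal sketch: the function name `describe_rule` is hypothetical, the percentage is truncated to match the 66% shown on the slide, and Concept1 is assumed to be non-empty.

```python
def describe_rule(name1, set1, name2, set2):
    """Return (k, n, confidence, l, text) for the rule set1 -> set2."""
    n = len(set1 & set2)        # elements in both concepts
    k = len(set1 - set2)        # in Concept1 but not in Concept2
    l = len(set2 - set1)        # in Concept2 but not in Concept1
    confidence = n / len(set1)  # fraction of Concept1 that is also in Concept2
    # Implications (no exceptions on the left) use '==', associations use '--'.
    left, right = ("==", "==>") if k == 0 else ("--", "-->")
    text = f"{name1} ({k}) {left}{n} ({int(confidence * 100)}%){right} ({l}) {name2}"
    return k, n, confidence, l, text


marine = {"dolphins", "whales"}
intelligent = {"dolphins", "whales", "humans", "gibbons"}
hair_covered = {"gibbons", "cats", "dogs"}
four_legged = {"cats", "dogs"}

print(describe_rule("Marine", marine, "Intelligent", intelligent)[4])
print(describe_rule("Hair covered", hair_covered, "4 legged", four_legged)[4])
```

The first call yields an implication (k = 0) and the second an association with one exception, gibbons.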
  • 7. CASES
      • Case 1: IntensiVE
          • Tool to validate if regularities documented through several views are respected
          • Smalltalk
          • 270 classes; 2729 methods
      • Case 2: Freecol
          • Colonization game (graphic, multiplayer)
          • Java
          • 382 classes; 3252 methods
  • 8. OUR Approach
      • Objects = Classes, & Attributes =
          • (K) has keyword (in class name)
          • (I) implements a particular method / message
          • (H) in hierarchy of class
      • E.g. FreeColAction (a class with methods getId() and toXMLImpl()):
          • K>Action, H>FreeColAction, I>getId, I>toXMLImpl
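One way to picture how a class becomes a row of the FCA context is the sketch below. The helper `class_attributes`, the explicit keyword vocabulary, and the way ancestors and methods are passed in are all assumptions for illustration; the slides do not say how keywords are actually extracted from class names.

```python
def class_attributes(class_name, ancestors, methods, vocabulary):
    """Build the K/H/I attribute set for one class (hypothetical helper)."""
    attrs = set()
    # (K) keyword occurs in the class name; the vocabulary is an assumed input.
    attrs |= {f"K>{word}" for word in vocabulary if word in class_name}
    # (H) the class belongs to its own hierarchy and to those of its ancestors.
    attrs |= {f"H>{cls}" for cls in [class_name, *ancestors]}
    # (I) methods / messages the class implements.
    attrs |= {f"I>{method}" for method in methods}
    return attrs


vocabulary = ["Action", "Panel", "Mission"]  # illustrative keywords only
context = {
    "FreeColAction": class_attributes(
        "FreeColAction", ancestors=[], methods=["getId", "toXMLImpl"],
        vocabulary=vocabulary,
    ),
}
print(sorted(context["FreeColAction"]))
```

With this input the row for FreeColAction contains exactly the four attributes listed in the example above.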
  • 9. Algorithm
      Source Code → Eliminate irrelevant SCEs → Extract FCA context → Calculate concept lattice → Calculate implication & association rules (Confidence >= 50% and Support >= 3) → Simplify rules → Eliminate redundant rules → Group by overlapping elements → Rule groups
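To make the "Calculate concept lattice" step concrete, here is a naive sketch that enumerates all formal concepts (extent, intent) of a small context. The toy context reuses the mammal sets from the Implications slide and leaves out the "thumbed" attribute; the brute-force enumeration over attribute subsets is an assumption chosen for readability, not the algorithm the authors used, and it only scales to tiny contexts.

```python
from itertools import combinations

# object -> attributes it has (toy context, 'thumbed' omitted)
context = {
    "cats": {"4 legged", "hair covered"},
    "dogs": {"4 legged", "hair covered"},
    "dolphins": {"intelligent", "marine"},
    "whales": {"intelligent", "marine"},
    "gibbons": {"hair covered", "intelligent"},
    "humans": {"intelligent"},
}
all_attributes = set().union(*context.values())


def extent(attrs):
    """All objects that have every attribute in attrs."""
    return frozenset(o for o, a in context.items() if attrs <= a)


def intent(objs):
    """All attributes shared by every object in objs."""
    shared = [context[o] for o in objs]
    return frozenset(set.intersection(*shared)) if shared else frozenset(all_attributes)


def concepts():
    """Close every attribute subset; the distinct closures are the concepts."""
    found = set()
    for size in range(len(all_attributes) + 1):
        for combo in combinations(sorted(all_attributes), size):
            objs = extent(set(combo))
            found.add((objs, intent(objs)))
    return found


for objs, attrs in sorted(concepts(), key=lambda c: len(c[0])):
    print(sorted(objs), sorted(attrs))
```

Each printed pair is one node of the lattice; ordering the nodes by inclusion of their extents gives the child-parent relations that the rule calculation traverses.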
  • 10. eliminate SCEs
      • Irrelevant entities:
          • Object class
          • Test classes & methods
  • 11. rule calculation
      • Calculate implications:
          • traverse child-parent relations of key nodes*
      • Calculate associations:
          • traverse all parent-child relations of key nodes*
          • traverse all pairs of key nodes* that have no connection in the lattice
      • Filter relations with confidence & support below thresholds
          • confidence ≤ 75% and support ≥ 3
      * those that add attributes or objects to the lattice
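The threshold filter can be sketched as follows, using the values shown in the pipeline (Confidence >= 50% and Support >= 3). The `Rule` class is hypothetical, and treating support as the number of matches n is an assumption; the slides do not define support explicitly.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    condition: str
    conclusion: str
    matches: int        # n: elements in both concepts
    exc_condition: int  # k: elements of the condition missing from the conclusion

    @property
    def confidence(self):
        # Share of the condition's elements that also satisfy the conclusion.
        return self.matches / (self.matches + self.exc_condition)


def keep(rule, min_confidence=0.5, min_support=3):
    """Assumed filter: support is taken to be the number of matches."""
    return rule.confidence >= min_confidence and rule.matches >= min_support


rules = [
    Rule("I>'initialize'", "H>'FreeColPanel'", matches=36, exc_condition=7),
    Rule("Hair covered", "4 legged", matches=2, exc_condition=1),
]
print([r.condition for r in rules if keep(r)])
```

The mammal rule is dropped here only because it has two matches, which illustrates why a support threshold removes rules backed by very few entities.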
  • 12. simplify rules
      • GOAL: Eliminate redundant properties of a rule
          H>IVEditorNodeFigure H>IVEditorFigure ----> K>'IVEditor' K>'Figure'
          H>IVEditorNodeFigure ====> H>IVEditorFigure
      • H>IVEditorFigure can be dropped from the first rule because, given the implication from IVEditorNodeFigure, IVEditorFigure does not add information to the rule
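A minimal sketch of this simplification step, assuming implications are available as (premise, implied) pairs of single properties. The function name `drop_implied` is hypothetical.

```python
def drop_implied(properties, implications):
    """Remove each property that another property in the same set already implies."""
    kept = set(properties)
    for premise, implied in implications:
        if premise in kept and implied != premise:
            kept.discard(implied)
    return kept


implications = [("H>IVEditorNodeFigure", "H>IVEditorFigure")]
condition = {"H>IVEditorNodeFigure", "H>IVEditorFigure"}
print(sorted(drop_implied(condition, implications)))
```

Applied to the condition of the rule above, only H>IVEditorNodeFigure remains, since membership in that hierarchy already implies membership in the hierarchy of IVEditorFigure.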
  • 13. simplify rules
      • Given H>IVEditorNodeFigure ==100%==> H>IVEditorFigure:
          H>IVEditorNodeFigure, H>IVEditorFigure ----> K>'IVEditor', K>'Figure'
          K>'IVEditor', K>'Figure' ----> H>IVEditorNodeFigure, H>IVEditorFigure
          K>'IVEditor', K>'Figure', H>IVEditorFigure ----> H>IVEditorNodeFigure
          K>'IVEditor', K>'Figure', H>IVEditorNodeFigure ----> H>IVEditorFigure
      • H>Intensional.IVIntensiVEAction (0) ==24 (100%)==> (0) I>#undoAction, H>Classifications2.AbstractAction, I>#performAction
  • 14. simplify rules: Priority to apply implications
      • Suppose
          (R) a,b,c,e,f → h
          (X) b → c
          (Y) a → b
          (Z) f → b,e
      • Is there an order to apply several implications to remove redundancies from R?
  • 15. simplify rules: Priority to apply implications
      • (X) b → c   (Y) a → b   (Z) f → b,e   (R) a,b,c,e,f → h
      • Candidate orders, each applied to (R) a,b,c,e,f → h:
          • Y, X, Z
          • Y, Z, X
          • Z, X, Y
          • Z, Y, X
          • X, Y, Z
          • X, Z, Y
  • 16. simplify rules: Priority to apply implications
      • A ≤ B when A.condition ⊆ B.conclusion ∨ B.conclusion ⊆ A.conclusion
      • (X) b → c   (Y) a → b   (Z) f → b,e   (R) a,b,c,e,f → h
      • X 1st:
          • X ≤ Z because b ⊆ b,e (X.condition ⊆ Z.conclusion)
          • X ≤ Y because b ⊆ b (X.condition ⊆ Y.conclusion)
          • applying X: (XtoR) a,b,c,e,f → h
      • Z 2nd:
          • Z ≤ Y because b ⊆ b,e (Y.conclusion ⊆ Z.conclusion)
          • applying Z: (Zto(XtoR)) a,b,c,e,f → h
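The ordering relation can be tried out on the X, Y, Z example with a short sketch. Two points are assumptions rather than the authors' stated procedure: implications are ranked by how many other implications they precede, and an implication is applied only while its whole condition is still present in the rule, removing the properties it concludes.

```python
def precedes(a, b):
    """A ≤ B: A.condition ⊆ B.conclusion, or B.conclusion ⊆ A.conclusion."""
    return a["if"] <= b["then"] or b["then"] <= a["then"]


def order(implications):
    """Assumed ranking: apply first the implications that precede the most others."""
    def rank(name):
        return sum(
            precedes(implications[name], implications[other])
            for other in implications if other != name
        )
    return sorted(implications, key=rank, reverse=True)


def apply_in_order(condition, implications, sequence):
    """Drop concluded properties, applying each implication only if it still holds."""
    remaining = set(condition)
    for name in sequence:
        imp = implications[name]
        if imp["if"] <= remaining:
            remaining -= imp["then"] - imp["if"]
    return remaining


implications = {
    "X": {"if": {"b"}, "then": {"c"}},
    "Y": {"if": {"a"}, "then": {"b"}},
    "Z": {"if": {"f"}, "then": {"b", "e"}},
}
R = {"a", "b", "c", "e", "f"}

sequence = order(implications)
print(sequence, sorted(apply_in_order(R, implications, sequence)))
print(["Z", "X", "Y"], sorted(apply_in_order(R, implications, ["Z", "X", "Y"])))
```

Under these assumptions the computed order puts X first and Z second, as on the slide, and leaves fewer properties in the condition of R than an order that applies Z before X, because Z removes the b that X needs in order to apply.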
  • 17. eliminate rules
      • Eliminate unrelated sets
          K>'Colopedia' (0) ==7 (100%)==> (86) I>'actionPerformed'
      • For a rule  a (exc. condition) --(matches)→ (exc. conclusion) b,  the mean exception ratio
          $$\frac{1}{2}\left(\frac{\text{exc. condition}}{\text{condition size}} + \frac{\text{exc. conclusion}}{\text{conclusion size}}\right) = \frac{1}{2}\left(\frac{\text{exc. condition}}{\text{exc. condition} + \text{matches}} + \frac{\text{exc. conclusion}}{\text{exc. conclusion} + \text{matches}}\right)$$
        is compared against 0.25, i.e. the average number of exceptions is below a quarter of any of the sets
  • 18. eliminate rules
      • Similar rules that conclude SubClass or SuperClass:
          Property>X --matchesSub--> (exc. conclusion sub) H>SubClass
          Property>X --matchesSuper--> (exc. conclusion super) H>SuperClass
      • Eliminate the super rule if it just adds noise (--matches, ++exceptions); the ratios (matchesSub) / (matchesSuper) and (exc. conclusion sub) / (exc. conclusion super) are each checked against 0.9:
          deleted: K>'Classification' (0) --5 (100%)--> (24) H>Classifications2.AbstractClassification
                   K>'Classification' (0) --5 (100%)--> (0) H>Classifications2.Classification
      • Eliminate the sub rule if it has lower confidence, comparing (matchesSub) / (condition size) with (matchesSuper) / (condition size):
                   I>'shouldBeEnabled' (0) --46 (100%)--> (7) H>'FreeColAction'
          deleted: I>'shouldBeEnabled' (2) --44 (96%)--> (7) H>'MapboardAction'
  • 19. eliminate rules
      • Rules that conclude the root of the classes in the app do not add any information:
          Property>X ----> H>RootClass
      • When having converse pairs of rules,
          Property>X ----> Property>Y
          Property>Y ----> Property>X
          • prefer the one with better confidence.
          • If the confidence is similar, prefer the one whose condition has **stronger semantics**:
            H > I, H > K, I = K, U > K, R > K, U = R
  • 20. group rules
      • Rules that share at least 85% of their matches are grouped together
          • threshold comes from analysis of results (might change depending on the case study)
      • They represent common properties of a set of source code entities
      • These groups are ordered by number of matches
      • Figure: overlapping rules over I>getID, H>FreeColAction, I>getActionPerformed, K>Action, H>FreeColAction, H>MapboardAction, I>getId
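The grouping step might look like the sketch below. Several choices are assumptions, since the slides only give the 85% threshold: overlap is measured against the larger of the two match sets, each rule is compared with the first rule of an existing group, and the rule names and integer class identifiers are made-up data. Match sets are assumed non-empty, which the support threshold already guarantees.

```python
def overlap(a, b):
    """Shared matches relative to the larger set (assumed definition)."""
    return len(a & b) / max(len(a), len(b))


def group_rules(rule_matches, threshold=0.85):
    """Greedy grouping: join the first group whose seed rule overlaps enough."""
    groups = []
    for name, matches in rule_matches.items():
        for group in groups:
            if overlap(matches, group["matches"]) >= threshold:
                group["rules"].append(name)
                break
        else:
            # No sufficiently similar group: this rule seeds a new one.
            groups.append({"rules": [name], "matches": set(matches)})
    # Groups are reported by decreasing number of matches.
    return sorted(groups, key=lambda g: len(g["matches"]), reverse=True)


# Hypothetical rules mapped to the ids of the classes they match.
rule_matches = {
    "r1": set(range(20)),
    "r2": set(range(18)),        # shares 18 of 20 matches with r1 (90%)
    "r3": set(range(100, 105)),  # unrelated classes
}
for group in group_rules(rule_matches):
    print(group["rules"], len(group["matches"]))
```

A different overlap measure (for example relative to the smaller set) would merge more rules, which is one reason the threshold may need tuning per case study.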
  • 21. RESULTS: Reduction of information to process
      • IntensiVE [270 classes; 2729 methods]
          • Concepts: 1289 / Relations: 4390
          • Rules: 325 / Groups: 50
      • Freecol [382 classes; 3252 methods]
          • Concepts: 1261 / Relations: 5149
          • Rules: 134 / Groups: 42
  • 22. RESULTS: Rules Freecol
      • K → H = 2 rules
          • The concept described by the keyword is confined to classes in the hierarchy
            K>'Mission' → H>'Mission'
      • K → K = 4 rules
          • Combined words: Free+Col, Trade+Route, Free+Col+Menu, etc.
      • K → I = 8 rules
          • Classes named *Keyword* should implement the method I
            e.g. K>'Info' → I>'update', K>'Action' → I>'actionPerformed', K>'Thread' → I>'run', K>'Mission' → I>'doMission', etc.
  • 23. RESULTS: Rules Freecol
      • H → K = 9 rules
          • Classes in the hierarchy can be described by the keyword, e.g.
              • H>'ReportPanel' → K>'Panel' K>'Report'
              • H>'NetworkRequestHandler' → K>'Handler'
              • H>'InputHandler' → K>'Input'
              • H>'OptionUpdater' → K>'UI'
              • H>'TradeItem' → K>'Item'
  • 24. RESULTS: Rules Freecol
      • H → I = 28 rules
          • Classes in the hierarchy should implement the method, e.g.
              • H>'NetworkRequestHandler' → I>'handle'
              • H>'OptionMap' → I>'addDefaultOptions'
              • H>'Location' → I>'getGoodsContainer' → I>'getLocationName'
              • H>'MapIterator' → I>'nextPosition'
              • H>'MapboardAction' → I>'getId' → I>'shouldBeEnabled'
              • H>'OptionUpdater' → I>'updateOption', I>'unregister'
              • H>'TradeItem' → I>'makeTrade'
              • H>'PersistentObject' → I>'readFromXMLImpl' → I>'toXML'
  • 25. RESULTS: Rules Freecol
      • I → I = 71 rules
          • Implementation protocols, e.g.
              • I>'getColony' → I>'getXMLElementTagName' → I>'toXMLImpl' → I>'readFromXMLImpl'
              • I>'installUI' → I>'createUI'
              • I>'getTransportDestination' → I>'doMission' → I>'dispose'
              • I>'contains' → I>'add' → I>'newTurn' → I>'remove'
              • I>'toXML' → I>'getXMLElementTagName' → I>'readFromXMLImpl'
              • I>'requestFocus' → I>'actionPerformed' → I>'initialize'
              • I>'setOwner' → I>'newTurn' → I>'getTile'
              • I>'setName' → I>'getName'
  • 26. RESULTS: Groups Freecol
      • Classes in the hierarchy FreeColAction are named *Action*, and tend to implement getId and actionPerformed
      • 50 matches, 6 rules
          • H>'MapboardAction' (1) --50 (98%)--> (3) I>'getId'
          • H>'FreeColAction' (0) ==53 (100%)==> (3) K>'Action'
          • H>'FreeColAction' (3) --50 (94%)--> (43) I>'actionPerformed'
          • I>'getId' (3) --50 (94%)--> (43) I>'actionPerformed'
          • I>'getId', H>'FreeColAction' (0) --52 (100%)--> (4) K>'Action'
          • K>'Action' (5) --51 (91%)--> (42) I>'actionPerformed'
  • 27. RESULTS: Groups Freecol
      • Most of the classes that implement initialize belong to the hierarchy of FreeColPanel
          • initialize prepares a panel to be displayed
          • 36 matches, 1 rule
              • I>'initialize' (7) --36 (84%)--> (9) H>'FreeColPanel'
      • Classes that implement toXMLImpl also implement getXMLElementTagName
          • toXMLImpl writes an XML representation of the object to a stream
          • getXMLElementTagName gets the tag name that represents the object
          • The exception is FreeColAction, which is the XML root
          • 44 matches, 1 rule
              • I>'toXMLImpl' (1) --44 (98%)--> (16) I>'getXMLElementTagName'
  • 28. RESULTS: IntensiVE
      • Regularities documented & found
          • Interface
              • Action protocol & undoable protocol
              • Compilation
              • Relation evaluators
              • Cache / Save / Remove on definitions
              • Intension Editors (partial protocol)
              • Instantiable views
              • Constraint editors
              • Evaluators
          • Naming convention
              • Unit testing, View hierarchy
          • Interface + Naming convention
              • Quantifiers (naming + partial interface)
  • 29. RESULTS: IntensiVE
      • Regularities found & NOT documented
          • Interfaces
              • IntensiVE Explorer Visualization
              • Checkable entities
              • Fuzzy quantifiers
              • Query generation for visual querying
              • Context-menu in visual query language
              • Figure rendering
              • Special classifications
          • Naming conventions
              • Figures, Exceptions, Visualization, Classifications, Exceptions to views, Result pairs, Reporters
          • Interfaces + Naming conventions
              • Starbrowser shells
  • 30. CONCLUSIONS
      • Use FCA with objects = source code entities
      • As attributes = several types of properties
      • Calculate implications
          • to mine for intension of regularity rather than extension
          • not just entities that match regularity but explicit specification of regularity
      • Allow for variations and irregularities = association rules
      • To overcome previous pitfalls and make regularities explicit
  • 31. THREATS & LIMITATIONS
      • Redundant information
          • There are groups that are subsets of other groups
      • All results are correct but...
          • some regularities found might be due to chance and not a conscious development decision
          • interpreting the results requires assuming a closed world
      • Usefulness is subjective
          • i.e. separating useful from useless results
      • The data analyzed could be more semantic
  • 32. CURRENT & FUTURE WORK:
      • ...Running the same case studies mixing the results of Classes and Methods
      • ...Comparing the regularities found with those previously documented in IntensiVE
      • Calculate what percentage of the irregularities of a group are indeed errors
      • Use the results to guide the developer while adding or modifying SCEs
      • Use a similar approach to mine feature dependencies