The negative impact of smells on the quality of software systems has been empirically investigated in several studies. This has highlighted the need for approaches to identify and remove smells. While approaches to remove smells have investigated the use of both structural and conceptual information extracted from source code, approaches to identify smells are based on structural information only. In this paper, we bridge the gap by analyzing to what extent conceptual information, extracted using textual analysis techniques, can be used to identify smells in source code. The proposed textual-based approach for detecting smells in source code, named TACO (Textual Analysis for Code smell detectiOn), has been instantiated for detecting the Long Method smell and has been evaluated on three Java open source projects. The results indicate that TACO is able to detect between 50% and 77% of the smell instances with a precision ranging between 63% and 67%. In addition, the results show that TACO identifies smells that are not identified by approaches based solely on structural information.
1. Textual Analysis for Code Smell Detection
Fabio Palomba, University of Salerno
dibt.unimol.it/fpalomba
ICSE 2015 SRC
May 22, 2015
Florence, Italy
2. Code smells are symptoms of poor design or implementation choices
Code Smell
[Martin Fowler]
3. Code smells increase change- and fault-proneness
Khomh et al. - EMSE 2012
Code smells increase maintenance costs
Banker et al. - Communications of the ACM 1993
Code smells hinder comprehensibility
Abbes et al. - CSMR 2011
4. Several approaches have been proposed, exploiting different kinds of information
Structural information
Historical analysis
5. Is it possible to detect smells using conceptual information?
The Third Dimension
6. The case of Long Method detection
TACO - Textual Analysis for Code Smell Detection
"A Long Method is a method in which there is the implementation of a main functionality together with auxiliary functions that should be managed in different methods."
[Martin Fowler]
7. The case of Long Method detection
method m_i
9. The case of Long Method detection
method m_i -> [extract blocks] -> method blocks -> [extract identifiers, extract comments] -> pruned method blocks
10. The case of Long Method detection
method m_i -> [extract blocks] -> method blocks -> [extract identifiers, extract comments] -> pruned method blocks -> [compute similarity] -> similarity matrix
11. The case of Long Method detection
method m_i -> [extract blocks] -> method blocks -> [extract identifiers, extract comments] -> pruned method blocks -> [compute similarity] -> similarity matrix
If there exist blocks with similarity < t, then a Long Method is detected
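The pipeline on the slides above (extract blocks, keep identifiers and comments, compute block-to-block similarity, compare against a threshold t) can be sketched in a few lines. This is only an illustration under stated assumptions, not the TACO implementation: blocks are taken to be groups of lines separated by blank lines, similarity is plain term-frequency cosine similarity, and the function names, the stop-word list, and the default threshold `t=0.2` are all hypothetical.

```python
import math
import re
from collections import Counter
from itertools import combinations

# Hypothetical, deliberately tiny stop list (a few English words and Java keywords).
STOP_WORDS = {"the", "a", "of", "to", "and", "in", "for", "is",
              "int", "double", "string", "return", "if", "else", "new"}


def split_blocks(method_source):
    """Assumption: a block is a run of lines delimited by blank lines."""
    return [b for b in re.split(r"\n\s*\n", method_source) if b.strip()]


def extract_terms(block):
    """Collect words from identifiers and comments, split camelCase, lowercase."""
    terms = []
    for word in re.findall(r"[A-Za-z]+", block):
        terms += re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])", word)
    return Counter(t.lower() for t in terms if t.lower() not in STOP_WORDS)


def cosine(a, b):
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def is_long_method(method_source, t=0.2):
    """Flag the method if any pair of blocks has similarity below t."""
    vectors = [extract_terms(b) for b in split_blocks(method_source)]
    return any(cosine(a, b) < t for a, b in combinations(vectors, 2))


# Two blocks that talk about unrelated concepts (user input vs. invoicing).
mixed = """// read user name from input
String userName = input.readUserName();

// compute invoice total price
double invoiceTotal = price * quantity;
"""

# Two blocks that share the same vocabulary (user name).
cohesive = """// read user name
String userName = readUserName();

// validate user name
validateUserName(userName);
"""

print(is_long_method(mixed))     # True: the blocks share no terms
print(is_long_method(cohesive))  # False: the blocks are textually similar
```

The rule is intentionally independent of method size: a short method whose blocks share no vocabulary is flagged, while a long but textually cohesive one is not.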
12. The case of Long Method detection - Case study
Systems analyzed: Apache Cassandra, Apache Xerces, Eclipse Core
13. The case of Long Method detection - Case study
Systems analyzed: Apache Cassandra, Apache Xerces, Eclipse Core
[Chart: precision, recall, and F-measure for each system]
14. The case of Long Method detection - Case study
Systems analyzed: Apache Cassandra, Apache Xerces, Eclipse Core
[Chart: precision, recall, and F-measure for each system]
Compared with the DECOR approach [Moha et al.]
15. The case of Long Method detection - Case study
Systems analyzed: Apache Cassandra, Apache Xerces, Eclipse Core
[Chart: precision, recall, and F-measure for each system]
Compared with the DECOR approach [Moha et al.]
Overall F-measure: TACO 63%, DECOR 51%
16. The case of Long Method detection - Case study
Systems analyzed: Apache Cassandra, Apache Xerces, Eclipse Core
[Chart: precision, recall, and F-measure for each system]
Compared with the DECOR approach [Moha et al.]
Overall F-measure: TACO 63%, DECOR 51%
TACO is highly complementary to DECOR on two of the analyzed systems
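For readers less familiar with the metric, the F-measure is the harmonic mean of precision and recall. The sketch below computes it and applies it to the extremes of the precision (63% to 67%) and recall (50% to 77%) ranges reported in the abstract. Pairing the two lower and the two upper ends is an assumption made only to bound the value; the per-system pairs are not given here.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (both in the range 0..1)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical pairings of the range endpoints from the abstract.
lower_bound = f_measure(0.63, 0.50)
upper_bound = f_measure(0.67, 0.77)

print(f"{lower_bound:.3f}")  # 0.558
print(f"{upper_bound:.3f}")  # 0.717
```

Because the harmonic mean is dominated by the smaller of the two values, a detector with high precision but low recall (or the reverse) still scores a low F-measure.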
17. A practical example
Class: CompletionEngine - Eclipse Core
Method: findTypesAndPackages()
Goal: Discover the classes and the packages of a given project
18. A practical example
Class: CompletionEngine - Eclipse Core
Method: findTypesAndPackages()
Goal: Discover the classes and the packages of a given project
This method has 65 lines of code
19. A practical example
Class: CompletionEngine - Eclipse Core
Method: findTypesAndPackages()
Goal: Discover the classes and the packages of a given project
This method has 65 lines of code
A static approach cannot detect a Long Method
20. A practical example
Class: CompletionEngine - Eclipse Core
Method: findTypesAndPackages()
Goal: Discover the classes and the packages of a given project
This method has 65 lines of code
A static approach cannot detect a Long Method
But the method manages more than one responsibility
It actually is a Long Method!
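The method name in this example already hints at its two responsibilities. Splitting identifiers into their component words is a common preprocessing step in textual analysis of source code, and the sketch below shows one way to do it. The function `split_identifier` is hypothetical and is not taken from TACO.

```python
import re


def split_identifier(identifier):
    """Split a camelCase or snake_case identifier into lowercase terms."""
    terms = []
    for part in re.split(r"_+", identifier):
        # Capitalized or lowercase words, acronyms, and digit runs.
        terms += re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])|\d+", part)
    return [t.lower() for t in terms]


print(split_identifier("findTypesAndPackages"))
# ['find', 'types', 'and', 'packages']
print(split_identifier("CompletionEngine"))
# ['completion', 'engine']
```

Here the terms "types" and "packages" surface as separate concepts, which is the kind of conceptual signal a size metric such as lines of code does not capture.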
21. Is it possible to detect smells using conceptual information?
The Third Dimension
Yes, it is!
22. Summarizing
Textual analysis is useful for smell detection
23. Summarizing
Textual analysis is useful for smell detection
Is TACO suitable for detecting other smells?
What about a hybrid technique for detecting smells?
Coming soon…