SlideShare a Scribd company logo
1 of 29
Mining Source Code Improvement
Patterns from Similar Code Review
Yuki Ueda1, Takashi Ishio1, Akinori Ihara2,
Kenichi Matsumoto1
1Nara Institute of Science and Technology
2Wakayama University
13th International Workshop on Software Clones (IWSC’19)
Background Approach Result Summary
Contents
• Goal:Reduce Code Review Cost
• Approach:Code Improvement Pattern Detection
That Appeared Review
• Evaluation: Measure Patterns’ Frequency and
Accuracy
2
Background Approach Result Summary
Code review process:
Reviewers suggest code fix
Patch
Author
Reviewer Project
3
- i=key
+ i=dic[“key”]
Patch
Background
(1) Submit
Background Approach Result Summary
Code review process:
Reviewers suggest code fix
Patch
Author
Reviewer Project
4
- i=key
+ i=dic[“key”]
Patch
You should fix
(1) Submit
(2) Review, Fix suggestion
Background
Background Approach Result Summary
Code review process:
Reviewers suggest code fix
5
- i=key
+ i=dic[“key”]
- i=key
+ i_=_dic[“KEY”]
(3) Integrate
Patch
Author
Reviewer Project(1) Submit
(2) Review, Fix suggestion
Reviewed Patch
(Integrated Patch)
Pre-Review Patch
(Initial Patch)
Background
Background Approach Result Summary
Problem:
Reviewers need to check several times
6
- i=key
+ i=dic[“key”]
- i=key
+ i_=_dic[“KEY”]
(2)〜(n) Review,Fix suggestion
(n) Integrate
Patch
Author
Reviewer Project(1) Submit
Reviewed Patch
(Integrated Patch)
Pre-Review Patch
(Initial Patch)
String should be lower
Waste space
Background
Background Approach Result Summary
Goal:
Reduce Similar Review Automatically
7
Auto Review
System
(2) Review,Fix suggestion
(3) Review
request
Patch
Author
Reviewer(1) Submit
Similar patch is fixed in the
past like..
Background
Background Approach Result Summary
Approach:
Detect Pattern from Reviewed Patch Diff
8
”key” , it will be “KEY”
Pattern:
i=dic[“key”] i=dic[“KEY”]
Dataset:
i=dic[“key”] i=dic[“KEY”]i=dic[“key”]
Pre-Review Patch
i=dic[“KEY”]
Reviewed Patch
Approach
If patch has
Detect
Background Approach Result Summary
Approach:
Detect Pattern from Reviewed Patch Diff
9
Patch
Author
Auto Review
System
print(“key”)
print(“KEY”)
”key” , it will be “KEY”
Pattern:
If patch has
Use
Dataset:
i=dic[“key”] i=dic[“KEY”]i=dic[“key”] i=dic[“KEY”]i=dic[“key”]
Pre-Review Patch
i=dic[“KEY”]
Reviewed Patch
Approach
Background Approach Result Summary
Detect Code Improved Pattern (1/2):
Divide Patch Diff to Chunk
10
- if i␣==␣0:
+ if i==0:
break
- i=dic[“key”]
+ i=dic.get(“key”)
- i=dic[“key”]
+ i=dic.get(“key”)
- if i␣==␣0:
+ if i==0:
Approach
Background Approach Result Summary
Detect Code Improved Pattern (1/2):
Get Pattern by Sequential Pattern Mining
11
- i=dic[“key”]
+ i=dic.get(“key”)
- [i=dic - [ + .get(i=dic
- [i=dic
i=dic
- [ + .get(i=dic “key”
- ]
+ )
Length 2 Length3 Length4
Approach
Background Approach Result Summary
Detect Code Improved Pattern (1/2):
Get Pattern by Sequential Pattern Mining
12
- i=dic[“key”]
+ i=dic.get(“key”)
- [i=dic - [ + .get(i=dic
- [i=dic - ]
i=dic + )
- [ + .get(i=dic “key”
Length 2 Length3 Length4
Keep Frequently Appeared and Longer Patterns
Approach
Background Approach Result Summary
Pattern Evaluation
13
i=dic + .get( - ]
Appeared Time:
+ )(e.g. Pattern
i=dic
.get(
]
)
Pre-Reviewed Patches that have
Reviewed Patches that have
)
Number of Patch Pairs
Approach
Background Approach Result Summary
Pattern Evaluation
14
Appeared Time:
.get( )
Pre-Reviewed Patches that have
Reviewed Patches that have
Number of Patch Pairs
i=dic[“key”]
i=dic.get(“key”)
i=dic[”KEY”]
Count
NOT Count
e.g.):
Pre-Reviewed Patch
i=dic + .get( - ] + )(e.g. Pattern )
i=dic ]
Approach
Background Approach Result Summary
15
Appeared Time:
.get( )
Pre-Reviewed Patches that have
Reviewed Patches that have
Number of Patch Pairs
Accuracy:
.get( )
Pre-Reviewed Patches that have
Reviewed Patches that have
Ratio of Patch Pairs
Pattern Evaluation
i=dic ]
i=dic ]
i=dic + .get( - ] + )(e.g. Pattern )
Approach
Background Approach Result Summary
Target
16
Project OpenStack
Language Python3
Time Period 2011-2016
# Patches 173,749
# Chunks for Detect Pattern 555,050
# Chunks for Evaluate Pattern 61,673
Result
Background Approach Result Summary
8 Frequently Appeared Pattern
17
self.stbout() self.stubs.Set()
Why?: Support for OpenStacks‘ library dependency changes
Result
Background Approach Result Summary
8 Frequently Appeared Pattern
18
assertEquals() assertEqual()
Why?: Support for Python 2 to 3 changes
self.stbout() self.stubs.Set()
Why?: Support for OpenStacks‘ library dependency changes
xrange() range()
Result
Background Approach Result Summary
8 Frequently Appeared Pattern
19
assertEquals() assertEqual()
Why?: Support for Python 2 to 3 changes
assertTrue(x in array)
Why?: Improve readability
assertIn(x, array)
xrange() range()
self.stbout() self.stubs.Set()
Why?: Support for OpenStacks‘ library dependency changes
Result
Background Approach Result Summary
8 Frequently Appeared Pattern
20
assertEquals() assertEqual()
Why?: Support for Python2 to 3 changes
assertTrue(x in array)
Why?: Improve readability
assertIn(x, array)
- xrange() + range()
self.stbout() self.stubs.Set()
Why?: Support for OpenStacks‘ library dependency changes
Thresholds:
Appeared time > 300
Accuracy > 10%
Total 8 patterns
Cover: 32.3% (19,940/ 61,673) similar patches
Accuracy: 45.9%
Result
Background Approach Result Summary
Patterns are discussed on StackOverflow
21
- assertEquals() + assertEqual()
Why?: Support for Python2 to 3 changes
- assertTrue(x in array)
Why?: Improve readability
+ assertIn(x, array)
- xrange() + range()
- self.stbout() + self.stubs.Set()
Why?: Support for OpenStacks‘ library dependency changes
Result
Background Approach Result Summary
修正しました
For Automatically Code Review:
Work as GitHub Bot
22
Patch authorBot
I fixed
Reviewer
OK
Sample URL: https://github.com/Ikuyadeu/ExtentionTest/pull/9
Result
Background Approach Result Summary
vs Other Tool (1 / 2)
Static Analysis Tool
FOO=0 foo_=_0
23
Bad name
Waste
space
Static Analysis Tool (pylint)
Fix based on Language
Other tools: ESlint, Pmd, checkstyle
Result
Background Approach Result Summary
vs Other Tool (1 / 2)
Static Analysis Tool
FOO=0 foo_=_0
24
Static Analysis Tool (pylint)
Fix based on Language
This research:
Project-specific
changes
self.stbout()
xrange()
self.stubs.Set()
range()
Old library
dependency
Language
definition
Result
Background Approach Result Summary
vs Other Tool (2 / 2)
25
Choose best rule set from large rule set
• Invalid-name
• Bad-continuation
• Wrong-import-order
• Invalid-name
• Bad-continuation
• Wrong-import-order
IntelliCode
Result
Background Approach Result Summary
vs Other Tool (2 / 2)
26
Choose best rule set from large rule set
Find NEW pattern set from history
• Invalid-name
• Bad-continuation
• Wrong-import-order
• Invalid-name
• Bad-continuation
• Wrong-import-order
• disk2disk_api
• stubs.Set2stub_out
• assert-equals2equal
IntelliCode
This Study
Result
Background Approach Result Summary
vs Other Tool (2 / 2)
27
Choose best rule set from large rule set
Find NEW pattern set from history
• Invalid-name
• Bad-continuation
• Wrong-import-order
• Invalid-name
• Bad-continuation
• Wrong-import-order
• disk2disk_api
• stubs.Set2stub_out
• assert-equals2equal
IntelliCode
This Study
Support project-specific problem
Support change of environment
Result
Background Approach Result Summary
Future Work
• Which pattern should bot choose?
Most appeared pattern, High accuracy pattern
• Compare with Other Projects and Languages’
Patterns
• Evaluate by submitting pull request, and get ratio of
Accepted / Submitted pull request
28
Summary
📧ueda.yuki.un7@is.naist.jp

More Related Content

What's hot

What's hot (8)

C++20 Key Features Summary
C++20 Key Features SummaryC++20 Key Features Summary
C++20 Key Features Summary
 
SCons an Introduction
SCons an IntroductionSCons an Introduction
SCons an Introduction
 
[H3 2012] 오픈소스로 개발 실력 쌓기
[H3 2012] 오픈소스로 개발 실력 쌓기[H3 2012] 오픈소스로 개발 실력 쌓기
[H3 2012] 오픈소스로 개발 실력 쌓기
 
20220716_만들면서 느껴보는 POP
20220716_만들면서 느껴보는 POP20220716_만들면서 느껴보는 POP
20220716_만들면서 느껴보는 POP
 
Logging system of Android
Logging system of AndroidLogging system of Android
Logging system of Android
 
Arte na idade média
Arte na idade médiaArte na idade média
Arte na idade média
 
Spring data
Spring dataSpring data
Spring data
 
All about Context API
All about Context APIAll about Context API
All about Context API
 

Similar to Mining Source Code Improvement Patterns from Similar Code Review Works

Reducing Redundancies in Multi-Revision Code Analysis
Reducing Redundancies in Multi-Revision Code AnalysisReducing Redundancies in Multi-Revision Code Analysis
Reducing Redundancies in Multi-Revision Code Analysis
Sebastiano Panichella
 
Reverse engineering and theory building v3
Reverse engineering and theory building v3Reverse engineering and theory building v3
Reverse engineering and theory building v3
ClarkTony
 

Similar to Mining Source Code Improvement Patterns from Similar Code Review Works (20)

Mining Source Code Improvement Patterns from Similar Code Review Works
Mining Source Code Improvement Patterns from Similar Code Review WorksMining Source Code Improvement Patterns from Similar Code Review Works
Mining Source Code Improvement Patterns from Similar Code Review Works
 
Scientific Software Development
Scientific Software DevelopmentScientific Software Development
Scientific Software Development
 
Expanding the idea of static analysis from code check to other development pr...
Expanding the idea of static analysis from code check to other development pr...Expanding the idea of static analysis from code check to other development pr...
Expanding the idea of static analysis from code check to other development pr...
 
Static analysis: Around Java in 60 minutes
Static analysis: Around Java in 60 minutesStatic analysis: Around Java in 60 minutes
Static analysis: Around Java in 60 minutes
 
Reducing Redundancies in Multi-Revision Code Analysis
Reducing Redundancies in Multi-Revision Code AnalysisReducing Redundancies in Multi-Revision Code Analysis
Reducing Redundancies in Multi-Revision Code Analysis
 
Delivering High Quality Elixir Code using Gitlab
Delivering High Quality Elixir Code using GitlabDelivering High Quality Elixir Code using Gitlab
Delivering High Quality Elixir Code using Gitlab
 
Does static analysis need machine learning?
Does static analysis need machine learning?Does static analysis need machine learning?
Does static analysis need machine learning?
 
Devry CIS 355A Full Course Latest
Devry CIS 355A Full Course LatestDevry CIS 355A Full Course Latest
Devry CIS 355A Full Course Latest
 
Tech talks#6: Code Refactoring
Tech talks#6: Code RefactoringTech talks#6: Code Refactoring
Tech talks#6: Code Refactoring
 
Introduction to Aspect Oriented Programming
Introduction to Aspect Oriented ProgrammingIntroduction to Aspect Oriented Programming
Introduction to Aspect Oriented Programming
 
Impact of Coding Style Checker on Code Review -A case study on the OpenStack ...
Impact of Coding Style Checker on Code Review -A case study on the OpenStack ...Impact of Coding Style Checker on Code Review -A case study on the OpenStack ...
Impact of Coding Style Checker on Code Review -A case study on the OpenStack ...
 
Compiler Construction | Lecture 14 | Interpreters
Compiler Construction | Lecture 14 | InterpretersCompiler Construction | Lecture 14 | Interpreters
Compiler Construction | Lecture 14 | Interpreters
 
Symbexecsearch
SymbexecsearchSymbexecsearch
Symbexecsearch
 
SQLGitHub - Access GitHub API with SQL-like syntaxes
SQLGitHub - Access GitHub API with SQL-like syntaxesSQLGitHub - Access GitHub API with SQL-like syntaxes
SQLGitHub - Access GitHub API with SQL-like syntaxes
 
C++ Windows Forms L01 - Intro
C++ Windows Forms L01 - IntroC++ Windows Forms L01 - Intro
C++ Windows Forms L01 - Intro
 
Kyo - Functional Scala 2023.pdf
Kyo - Functional Scala 2023.pdfKyo - Functional Scala 2023.pdf
Kyo - Functional Scala 2023.pdf
 
Reverse engineering and theory building v3
Reverse engineering and theory building v3Reverse engineering and theory building v3
Reverse engineering and theory building v3
 
Eclipse Con 2015: Codan - a C/C++ Code Analysis Framework for CDT
Eclipse Con 2015: Codan - a C/C++ Code Analysis Framework for CDTEclipse Con 2015: Codan - a C/C++ Code Analysis Framework for CDT
Eclipse Con 2015: Codan - a C/C++ Code Analysis Framework for CDT
 
BIRTE-13-Kawashima
BIRTE-13-KawashimaBIRTE-13-Kawashima
BIRTE-13-Kawashima
 
EKON 23 Code_review_checklist
EKON 23 Code_review_checklistEKON 23 Code_review_checklist
EKON 23 Code_review_checklist
 

More from 奈良先端大 情報科学研究科

More from 奈良先端大 情報科学研究科 (20)

テレコミュニケーションを支援してみよう
テレコミュニケーションを支援してみようテレコミュニケーションを支援してみよう
テレコミュニケーションを支援してみよう
 
マイコンと機械学習を使って行動認識システムを作ろう
マイコンと機械学習を使って行動認識システムを作ろうマイコンと機械学習を使って行動認識システムを作ろう
マイコンと機械学習を使って行動認識システムを作ろう
 
5G時代を支えるNFVによるネットワーク最適設計
5G時代を支えるNFVによるネットワーク最適設計5G時代を支えるNFVによるネットワーク最適設計
5G時代を支えるNFVによるネットワーク最適設計
 
21.Raspberry Piを用いたIoTアプリの開発
21.Raspberry Piを用いたIoTアプリの開発21.Raspberry Piを用いたIoTアプリの開発
21.Raspberry Piを用いたIoTアプリの開発
 
20. 地理ビッグデータ利活用: リスク予測型自動避難誘導,地理的リスク分析
20. 地理ビッグデータ利活用: リスク予測型自動避難誘導,地理的リスク分析20. 地理ビッグデータ利活用: リスク予測型自動避難誘導,地理的リスク分析
20. 地理ビッグデータ利活用: リスク予測型自動避難誘導,地理的リスク分析
 
11.実装の脆弱性を利用して強力な暗号を解読してみよう!
11.実装の脆弱性を利用して強力な暗号を解読してみよう!11.実装の脆弱性を利用して強力な暗号を解読してみよう!
11.実装の脆弱性を利用して強力な暗号を解読してみよう!
 
8. ミニ・スーパコンピュータを自作しよう!
8. ミニ・スーパコンピュータを自作しよう!8. ミニ・スーパコンピュータを自作しよう!
8. ミニ・スーパコンピュータを自作しよう!
 
16. マイコンと機械学習を使って行動認識システムを作ろう
16. マイコンと機械学習を使って行動認識システムを作ろう16. マイコンと機械学習を使って行動認識システムを作ろう
16. マイコンと機械学習を使って行動認識システムを作ろう
 
15. テレイグジスタンスシステムを制作してみよう
15. テレイグジスタンスシステムを制作してみよう15. テレイグジスタンスシステムを制作してみよう
15. テレイグジスタンスシステムを制作してみよう
 
14. ビデオシースルーHMDで視覚拡張の世界を体感しよう
14. ビデオシースルーHMDで視覚拡張の世界を体感しよう14. ビデオシースルーHMDで視覚拡張の世界を体感しよう
14. ビデオシースルーHMDで視覚拡張の世界を体感しよう
 
19. 生物に学ぶ人工知能とロボット制御
19. 生物に学ぶ人工知能とロボット制御19. 生物に学ぶ人工知能とロボット制御
19. 生物に学ぶ人工知能とロボット制御
 
13. SDRで学ぶ無線通信
13. SDRで学ぶ無線通信13. SDRで学ぶ無線通信
13. SDRで学ぶ無線通信
 
18. 計測に基づいた写実的なコンピュータグラフィクスの生成法
18. 計測に基づいた写実的なコンピュータグラフィクスの生成法18. 計測に基づいた写実的なコンピュータグラフィクスの生成法
18. 計測に基づいた写実的なコンピュータグラフィクスの生成法
 
21. 人の動作・行動センシングに基づく拡張現実感システムの開発
21. 人の動作・行動センシングに基づく拡張現実感システムの開発21. 人の動作・行動センシングに基づく拡張現実感システムの開発
21. 人の動作・行動センシングに基づく拡張現実感システムの開発
 
20. 友好的関係を構築する人と対話ロボットのコミュニケーション技術開発
20. 友好的関係を構築する人と対話ロボットのコミュニケーション技術開発20. 友好的関係を構築する人と対話ロボットのコミュニケーション技術開発
20. 友好的関係を構築する人と対話ロボットのコミュニケーション技術開発
 
9. マイコンと機械学習を使って行動認識システムを作ろう
9. マイコンと機械学習を使って行動認識システムを作ろう9. マイコンと機械学習を使って行動認識システムを作ろう
9. マイコンと機械学習を使って行動認識システムを作ろう
 
6. 生物に学ぶ人工知能とロボット制御
6. 生物に学ぶ人工知能とロボット制御6. 生物に学ぶ人工知能とロボット制御
6. 生物に学ぶ人工知能とロボット制御
 
14. モバイルエージェントによる並列分散学習システムの構築
14. モバイルエージェントによる並列分散学習システムの構築14. モバイルエージェントによる並列分散学習システムの構築
14. モバイルエージェントによる並列分散学習システムの構築
 
17. 100台の小型ロボットを協調させよう
17. 100台の小型ロボットを協調させよう17. 100台の小型ロボットを協調させよう
17. 100台の小型ロボットを協調させよう
 
5. ミニ・スーパコンピュータを自作しよう!
5. ミニ・スーパコンピュータを自作しよう!5. ミニ・スーパコンピュータを自作しよう!
5. ミニ・スーパコンピュータを自作しよう!
 

Recently uploaded

Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Lokesh Kothari
 
DIFFERENCE IN BACK CROSS AND TEST CROSS
DIFFERENCE IN  BACK CROSS AND TEST CROSSDIFFERENCE IN  BACK CROSS AND TEST CROSS
DIFFERENCE IN BACK CROSS AND TEST CROSS
LeenakshiTyagi
 
The Philosophy of Science
The Philosophy of ScienceThe Philosophy of Science
The Philosophy of Science
University of Hertfordshire
 
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service 🪡
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service  🪡CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service  🪡
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service 🪡
anilsa9823
 
Biopesticide (2).pptx .This slides helps to know the different types of biop...
Biopesticide (2).pptx  .This slides helps to know the different types of biop...Biopesticide (2).pptx  .This slides helps to know the different types of biop...
Biopesticide (2).pptx .This slides helps to know the different types of biop...
RohitNehra6
 
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
PirithiRaju
 

Recently uploaded (20)

Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
 
VIRUSES structure and classification ppt by Dr.Prince C P
VIRUSES structure and classification ppt by Dr.Prince C PVIRUSES structure and classification ppt by Dr.Prince C P
VIRUSES structure and classification ppt by Dr.Prince C P
 
DIFFERENCE IN BACK CROSS AND TEST CROSS
DIFFERENCE IN  BACK CROSS AND TEST CROSSDIFFERENCE IN  BACK CROSS AND TEST CROSS
DIFFERENCE IN BACK CROSS AND TEST CROSS
 
Botany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdfBotany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdf
 
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
 
The Philosophy of Science
The Philosophy of ScienceThe Philosophy of Science
The Philosophy of Science
 
Isotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on IoIsotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on Io
 
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptxUnlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptx
 
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service 🪡
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service  🪡CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service  🪡
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service 🪡
 
Biopesticide (2).pptx .This slides helps to know the different types of biop...
Biopesticide (2).pptx  .This slides helps to know the different types of biop...Biopesticide (2).pptx  .This slides helps to know the different types of biop...
Biopesticide (2).pptx .This slides helps to know the different types of biop...
 
Forensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdfForensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdf
 
Zoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdfZoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdf
 
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
 
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
 
Animal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptxAnimal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptx
 
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bNightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
 
Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )
 
GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)
 
Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)
 
Nanoparticles synthesis and characterization​ ​
Nanoparticles synthesis and characterization​  ​Nanoparticles synthesis and characterization​  ​
Nanoparticles synthesis and characterization​ ​
 

Mining Source Code Improvement Patterns from Similar Code Review Works

  • 1. Mining Source Code Improvement Patterns from Similar Code Review Yuki Ueda1, Takashi Ishio1, Akinori Ihara2, Kenichi Matsumoto1 1Nara Institute of Science and Technology 2Wakayama University 13th International Workshop on Software Clones (IWSC’19)
  • 2. Background Approach Result Summary Contents • Goal:Reduce Code Review Cost • Approach:Code Improvement Pattern Detection That Appeared Review • Evaluation: Measure Patterns’ Frequency and Accuracy 2
  • 3. Background Approach Result Summary Code review process: Reviewers suggest code fix Patch Author Reviewer Project 3 - i=key + i=dic[“key”] Patch Background (1) Submit
  • 4. Background Approach Result Summary Code review process: Reviewers suggest code fix Patch Author Reviewer Project 4 - i=key + i=dic[“key”] Patch You should fix (1) Submit (2) Review, Fix suggestion Background
  • 5. Background Approach Result Summary Code review process: Reviewers suggest code fix 5 - i=key + i=dic[“key”] - i=key + i_=_dic[“KEY”] (3) Integrate Patch Author Reviewer Project(1) Submit (2) Review, Fix suggestion Reviewed Patch (Integrated Patch) Pre-Review Patch (Initial Patch) Background
  • 6. Background Approach Result Summary Problem: Reviewers need to check several times 6 - i=key + i=dic[“key”] - i=key + i_=_dic[“KEY”] (2)〜(n) Review,Fix suggestion (n) Integrate Patch Author Reviewer Project(1) Submit Reviewed Patch (Integrated Patch) Pre-Review Patch (Initial Patch) String should be lower Waste space Background
  • 7. Background Approach Result Summary Goal: Reduce Similar Review Automatically 7 Auto Review System (2) Review,Fix suggestion (3) Review request Patch Author Reviewer(1) Submit Similar patch is fixed in the past like.. Background
  • 8. Background Approach Result Summary Approach: Detect Pattern from Reviewed Patch Diff 8 ”key” , it will be “KEY” Pattern: i=dic[“key”] i=dic[“KEY”] Dataset: i=dic[“key”] i=dic[“KEY”]i=dic[“key”] Pre-Review Patch i=dic[“KEY”] Reviewed Patch Approach If patch has Detect
  • 9. Background Approach Result Summary Approach: Detect Pattern from Reviewed Patch Diff 9 Patch Author Auto Review System print(“key”) print(“KEY”) ”key” , it will be “KEY” Pattern: If patch has Use Dataset: i=dic[“key”] i=dic[“KEY”]i=dic[“key”] i=dic[“KEY”]i=dic[“key”] Pre-Review Patch i=dic[“KEY”] Reviewed Patch Approach
  • 10. Background Approach Result Summary Detect Code Improved Pattern (1/2): Divide Patch Diff to Chunk 10 - if i␣==␣0: + if i==0: break - i=dic[“key”] + i=dic.get(“key”) - i=dic[“key”] + i=dic.get(“key”) - if i␣==␣0: + if i==0: Approach
  • 11. Background Approach Result Summary Detect Code Improved Pattern (1/2): Get Pattern by Sequential Pattern Mining 11 - i=dic[“key”] + i=dic.get(“key”) - [i=dic - [ + .get(i=dic - [i=dic i=dic - [ + .get(i=dic “key” - ] + ) Length 2 Length3 Length4 Approach
  • 12. Background Approach Result Summary Detect Code Improved Pattern (1/2): Get Pattern by Sequential Pattern Mining 12 - i=dic[“key”] + i=dic.get(“key”) - [i=dic - [ + .get(i=dic - [i=dic - ] i=dic + ) - [ + .get(i=dic “key” Length 2 Length3 Length4 Keep Frequently Appeared and Longer Patterns Approach
  • 13. Background Approach Result Summary Pattern Evaluation 13 i=dic + .get( - ] Appeared Time: + )(e.g. Pattern i=dic .get( ] ) Pre-Reviewed Patches that have Reviewed Patches that have ) Number of Patch Pairs Approach
  • 14. Background Approach Result Summary Pattern Evaluation 14 Appeared Time: .get( ) Pre-Reviewed Patches that have Reviewed Patches that have Number of Patch Pairs i=dic[“key”] i=dic.get(“key”) i=dic[”KEY”] Count NOT Count e.g.): Pre-Reviewed Patch i=dic + .get( - ] + )(e.g. Pattern ) i=dic ] Approach
  • 15. Background Approach Result Summary 15 Appeared Time: .get( ) Pre-Reviewed Patches that have Reviewed Patches that have Number of Patch Pairs Accuracy: .get( ) Pre-Reviewed Patches that have Reviewed Patches that have Ratio of Patch Pairs Pattern Evaluation i=dic ] i=dic ] i=dic + .get( - ] + )(e.g. Pattern ) Approach
  • 16. Background Approach Result Summary Target 16 Project OpenStack Language Python3 Time Period 2011-2016 # Patches 173,749 # Chunks for Detect Pattern 555,050 # Chunks for Evaluate Pattern 61,673 Result
  • 17. Background Approach Result Summary 8 Frequently Appeared Pattern 17 self.stbout() self.stubs.Set() Why?: Support for OpenStacks‘ library dependency changes Result
  • 18. Background Approach Result Summary 8 Frequently Appeared Pattern 18 assertEquals() assertEqual() Why?: Support for Python 2 to 3 changes self.stbout() self.stubs.Set() Why?: Support for OpenStacks‘ library dependency changes xrange() range() Result
  • 19. Background Approach Result Summary 8 Frequently Appeared Pattern 19 assertEquals() assertEqual() Why?: Support for Python 2 to 3 changes assertTrue(x in array) Why?: Improve readability assertIn(x, array) xrange() range() self.stbout() self.stubs.Set() Why?: Support for OpenStacks‘ library dependency changes Result
  • 20. Background Approach Result Summary 8 Frequently Appeared Pattern 20 assertEquals() assertEqual() Why?: Support for Python2 to 3 changes assertTrue(x in array) Why?: Improve readability assertIn(x, array) - xrange() + range() self.stbout() self.stubs.Set() Why?: Support for OpenStacks‘ library dependency changes Thresholds: Appeared time > 300 Accuracy > 10% Total 8 patterns Cover: 32.3% (19,940/ 61,673) similar patches Accuracy: 45.9% Result
  • 21. Background Approach Result Summary Patterns are discussed on StackOverflow 21 - assertEquals() + assertEqual() Why?: Support for Python2 to 3 changes - assertTrue(x in array) Why?: Improve readability + assertIn(x, array) - xrange() + range() - self.stbout() + self.stubs.Set() Why?: Support for OpenStacks‘ library dependency changes Result
  • 22. Background Approach Result Summary 修正しました For Automatically Code Review: Work as GitHub Bot 22 Patch authorBot I fixed Reviewer OK Sample URL: https://github.com/Ikuyadeu/ExtentionTest/pull/9 Result
  • 23. Background Approach Result Summary vs Other Tool (1 / 2) Static Analysis Tool FOO=0 foo_=_0 23 Bad name Waste space Static Analysis Tool (pylint) Fix based on Language Other tools: ESlint, Pmd, checkstyle Result
  • 24. Background Approach Result Summary vs Other Tool (1 / 2) Static Analysis Tool FOO=0 foo_=_0 24 Static Analysis Tool (pylint) Fix based on Language This research: Project-specific changes self.stbout() xrange() self.stubs.Set() range() Old library dependency Language definition Result
  • 25. Background Approach Result Summary vs Other Tool (2 / 2) 25 Choose best rule set from large rule set • Invalid-name • Bad-continuation • Wrong-import-order • Invalid-name • Bad-continuation • Wrong-import-order IntelliCode Result
  • 26. Background Approach Result Summary vs Other Tool (2 / 2) 26 Choose best rule set from large rule set Find NEW pattern set from history • Invalid-name • Bad-continuation • Wrong-import-order • Invalid-name • Bad-continuation • Wrong-import-order • disk2disk_api • stubs.Set2stub_out • assert-equals2equal IntelliCode This Study Result
  • 27. Background Approach Result Summary vs Other Tool (2 / 2) 27 Choose best rule set from large rule set Find NEW pattern set from history • Invalid-name • Bad-continuation • Wrong-import-order • Invalid-name • Bad-continuation • Wrong-import-order • disk2disk_api • stubs.Set2stub_out • assert-equals2equal IntelliCode This Study Support project-specific problem Support change of environment Result
  • 28. Background Approach Result Summary Future Work • Which pattern should bot choose? Most appeared pattern, High accuracy pattern • Compare with Other Projects and Languages’ Patterns • Evaluate by submitting pull request, and get ratio of Accepted / Submitted pull request 28 Summary