PyDriller: Python Framework for Mining Software Repositories

FSE 2018

Davide Spadini

Published in: Engineering

  1. PyDriller: Python Framework for Mining Software Repositories
     Davide Spadini, Mauricio Aniche, Alberto Bacchelli
  2. PyDriller: Python Framework for Mining Software Repositories
     Davide Spadini, Mauricio Aniche, Alberto Bacchelli
     ishepard, @DavideSpadini
  3. What?
  4. Framework to analyse Git (and soon Mercurial) repositories
  5. Why?
  6. • There are already many frameworks for Git
     • Generally, one for each programming language
       • Java -> JGit
       • Python -> GitPython
       • JavaScript -> nodegit
       • etc.
  7. So, why?
  8. How many commands does Git have?
     • > 20?
     • > 50?
     • > 100?
     • > 150?
     154!!
  9. PyDriller
     • Aim: to ease the extraction of information from Git repositories
     • What is supported:
       • analysing the history of a project
       • retrieving commit information (date, message, authors, etc.)
       • retrieving file information (diff, source code)
     • What is not supported:
       • writing to the repo (git pull, git push, git add, git commit, etc.)
  10. Demo
  11. Statistics
     • Everything is lazily evaluated, so you only “pay” for what you ask for.
       1. only commit information: immediate (as fast as git log)
       2. commit and file information: about 60 commits/s (1240 commits in 22 seconds)
       3. commit, file and metrics information: 4 commits/s (1240 commits in ~5 min)
  12. Thank you for your support!
     • Some numbers:
       1. Downloaded approximately 4000 times
       2. 100 times in the last 2 weeks alone
     • Community driven
     • University of Zurich, TU Delft and University of Catania teach PyDriller in their MSR courses
     • SIG uses PyDriller in its quality assessments
  13. What’s next?
     • A company asked me to implement RepositoryMining().traverse_files()
     • Mercurial support
     • Ideas? Talk to me or submit a PR :)
  14. PyDriller
     • Source code: https://github.com/ishepard/pydriller
     • Doc: https://pydriller.readthedocs.io/en/latest/
     • Feel free to leave a star! :)
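
The capabilities listed on slide 9 (commit metadata first, per-file details second) and the cost tiers on slide 11 could be exercised roughly as follows. This is a minimal sketch, not the demo from the talk. The helper names `summarize`, `changed_files` and `mine` are hypothetical, and `fake_commit` is a stand-in so the helpers can run without a repository. The attribute names follow PyDriller's documented API as far as I know it: the entry class is `RepositoryMining` in 1.x (as on slide 13) and `Repository` from 2.0 on, and the per-file list is `modifications` in 1.x and `modified_files` in 2.x.

```python
from types import SimpleNamespace


def summarize(commits):
    """Yield one small dict per commit, reading only cheap metadata."""
    for commit in commits:
        yield {
            "hash": commit.hash,
            "author": commit.author.name,
            "date": commit.author_date,
            "message": commit.msg,
        }


def changed_files(commit):
    """Return the names of the files a commit touched.

    Assumption: 2.x exposes `modified_files`, 1.x exposes `modifications`.
    Reading this attribute is the slower "commit and file information" tier.
    """
    files = getattr(commit, "modified_files", None)
    if files is None:
        files = commit.modifications
    return [f.filename for f in files]


def mine(path):
    """Walk a real repository; requires PyDriller to be installed."""
    try:
        from pydriller import Repository  # PyDriller >= 2.0
    except ImportError:
        from pydriller import RepositoryMining as Repository  # PyDriller 1.x
    for commit in Repository(path).traverse_commits():
        print(commit.hash[:8], commit.author.name, changed_files(commit))


# Stand-in commit (hypothetical data) mimicking the attributes used above.
fake_commit = SimpleNamespace(
    hash="abc123",
    msg="Fix bug",
    author=SimpleNamespace(name="Ada"),
    author_date="2018-11-05",
    modifications=[SimpleNamespace(filename="a.py")],
)
```

Calling `mine("path/to/repo")` on a local clone prints one line per commit. Because `summarize` is a generator and never reads the file list, it stays in the cheapest tier; only `changed_files` asks for diffs.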
