Gitana: a SQL-based Git Repository Inspector
ER’15 - Stockholm
Jordi Cabot
jordi.cabot@icrea.cat
Javier L. Cánovas Izquierdo
jcanovasi@uoc.edu
Valerio Cosentino
valerio.cosentino@inria.fr
Outline
• Motivation
• Gitana
  - Git Conceptual Schema
  - Database Implementation
  - Database Operations
  - Tool Support
  - Evaluation
• Application Scenarios
  - Integration
  - Query Functionalities
• Conclusion
Motivation
Software development projects are complex due to the extensive collaboration and
creative thinking involved.
Several tools exist to support different development activities: issue trackers,
Source Control Management systems, code review tools, …
• They provide just a partial view of the software project
• They come with insufficient means to perform non-trivial query operations
• This is especially true for Git repositories
Gitana
• Conceptual model for Git / relational database implementation
• Import and incremental update processes
• JSON exporter to facilitate the analysis of Git repositories in other technologies
• Easy integration with other tools (issue trackers, code review tools, etc.) that rely on a database
• Easy inspection of any Git repository
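To make the idea of a relational implementation of a Git conceptual model concrete, here is a minimal sketch in Python. The table and column names are illustrative assumptions, not Gitana's actual schema, and SQLite is used only because it ships with Python.

```python
import sqlite3

# Hypothetical, simplified schema: every table and column name below is an
# illustrative assumption, not Gitana's actual database layout.
SCHEMA = """
CREATE TABLE developer (
    id    INTEGER PRIMARY KEY,
    name  TEXT NOT NULL,
    email TEXT NOT NULL UNIQUE
);
CREATE TABLE commits (
    sha          TEXT PRIMARY KEY,
    author_id    INTEGER NOT NULL REFERENCES developer(id),
    message      TEXT,
    committed_at TEXT NOT NULL
);
CREATE TABLE file (
    id   INTEGER PRIMARY KEY,
    path TEXT NOT NULL UNIQUE
);
CREATE TABLE file_modification (
    commit_sha TEXT    NOT NULL REFERENCES commits(sha),
    file_id    INTEGER NOT NULL REFERENCES file(id),
    additions  INTEGER NOT NULL DEFAULT 0,
    deletions  INTEGER NOT NULL DEFAULT 0,
    PRIMARY KEY (commit_sha, file_id)
);
"""


def create_database(path=":memory:"):
    """Create the sketch schema and return the open connection."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


conn = create_database()
# List the tables that were created, in alphabetical order.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
print(tables)
```

Once repository data lives in tables like these, any tool that also stores its data in a database can be joined against it with plain SQL.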
Conceptual Schema
Database Implementation
Database Operations
Initial Import Process
Incremental Update
JSON Exporter
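The three operations can be sketched as follows. This is a toy illustration under stated assumptions: the commit log is a hand-written list of dictionaries instead of a real repository, and the function names (`import_commits`, `export_json`) and the single `commits` table are hypothetical, not the tool's API.

```python
import json
import sqlite3


def open_db():
    """Open an in-memory database with a single, simplified commits table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE commits (sha TEXT PRIMARY KEY, author TEXT, message TEXT)")
    return conn


def import_commits(conn, log):
    """Store only the commits not yet in the database; return how many were added.

    The first call acts as the initial import; later calls act as
    incremental updates because already-known SHAs are skipped.
    """
    known = {row[0] for row in conn.execute("SELECT sha FROM commits")}
    new = [c for c in log if c["sha"] not in known]
    conn.executemany(
        "INSERT INTO commits (sha, author, message) "
        "VALUES (:sha, :author, :message)", new)
    conn.commit()
    return len(new)


def export_json(conn):
    """Serialize the stored commits to a JSON string, in insertion order."""
    rows = conn.execute(
        "SELECT sha, author, message FROM commits ORDER BY rowid")
    keys = ("sha", "author", "message")
    return json.dumps([dict(zip(keys, row)) for row in rows], indent=2)


conn = open_db()
log = [
    {"sha": "a1", "author": "alice", "message": "Initial commit"},
    {"sha": "b2", "author": "bob", "message": "Add README"},
]
first = import_commits(conn, log)    # initial import
log.append({"sha": "c3", "author": "alice", "message": "Fix typo"})
second = import_commits(conn, log)   # incremental update: only the new commit
exported = export_json(conn)
print(first, second)
print(exported)
```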
Tool Support
github.com/SOM-Research/Gitana
Evaluation
The extraction time only refers to the initial import. Once this phase is complete, the
incremental mechanism takes over and minimizes the time for future imports.
Executed on a 2.6 GHz Intel Core i7 processor with 8 GB of RAM.
Integration
Query Functionalities
Comparison between command line and SQL
Advanced queries
• Number of modifications on a given file
• Number of files commented per developer
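The two advanced queries could look like the sketch below. The flattened `file_modification` table and its sample rows are assumptions made for illustration, and the second query counts the distinct files each developer has touched, as a stand-in for the per-developer file count named on the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical flattened table with sample data; not the tool's real schema.
conn.executescript("""
CREATE TABLE file_modification (commit_sha TEXT, author TEXT, path TEXT);
INSERT INTO file_modification VALUES
  ('a1', 'alice', 'README.md'),
  ('a1', 'alice', 'src/main.py'),
  ('b2', 'bob',   'README.md'),
  ('c3', 'alice', 'README.md');
""")


def modifications_of(conn, path):
    """Number of modifications recorded for one file."""
    row = conn.execute(
        "SELECT COUNT(*) FROM file_modification WHERE path = ?",
        (path,)).fetchone()
    return row[0]


def files_per_developer(conn):
    """Number of distinct files touched by each developer."""
    return conn.execute(
        "SELECT author, COUNT(DISTINCT path) "
        "FROM file_modification GROUP BY author ORDER BY author").fetchall()


readme_changes = modifications_of(conn, "README.md")
per_developer = files_per_developer(conn)
print(readme_changes)
print(per_developer)
```

On the command line, the first figure can be obtained by counting the lines printed by `git log --oneline -- README.md`, while the per-developer aggregate typically needs extra scripting around `git log`.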
Conclusion
The bad
• The import process is slow. It should be parallelized.
• The JSON export process binds the user to the predefined output structure. The exporter should be more tunable.
• The materialized views in the database are recalculated each time the update process is triggered (not good for large repositories). Incremental maintenance on the materialized views could be applied.
The good
• Genericity. Gitana stores all the information contained in a Git repository.
• Flexibility. Users can perform any kind of query on the repository using SQL.
• Incrementality. Gitana includes an incremental propagation mechanism.
• Exportability. The JSON exporter makes the database information available in other technologies.
• Extensibility. Gitana can be easily integrated with other DB-based tools.
• Availability. Gitana is freely available on GitHub.
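The difference between recalculating a materialized view and maintaining it incrementally can be illustrated with a small sketch. It assumes a hypothetical summary table `commits_per_author` playing the role of the view; the names and the approach are illustrative only, not how the tool is implemented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# commits_per_author stands in for a materialized view (hypothetical name).
conn.executescript("""
CREATE TABLE commits (sha TEXT PRIMARY KEY, author TEXT NOT NULL);
CREATE TABLE commits_per_author (author TEXT PRIMARY KEY, total INTEGER NOT NULL);
""")


def full_refresh(conn):
    """Recalculate the whole summary from scratch: cost grows with history size."""
    conn.execute("DELETE FROM commits_per_author")
    conn.execute(
        "INSERT INTO commits_per_author "
        "SELECT author, COUNT(*) FROM commits GROUP BY author")
    conn.commit()


def add_commits_incrementally(conn, new_commits):
    """Insert new commits and adjust only the affected summary rows."""
    for sha, author in new_commits:
        conn.execute("INSERT INTO commits VALUES (?, ?)", (sha, author))
        # Make sure a counter row exists, then bump it by one.
        conn.execute(
            "INSERT OR IGNORE INTO commits_per_author VALUES (?, 0)", (author,))
        conn.execute(
            "UPDATE commits_per_author SET total = total + 1 WHERE author = ?",
            (author,))
    conn.commit()


def summary(conn):
    return conn.execute(
        "SELECT author, total FROM commits_per_author ORDER BY author").fetchall()


add_commits_incrementally(conn, [("a1", "alice"), ("b2", "bob")])
add_commits_incrementally(conn, [("c3", "alice")])
incremental = summary(conn)
full_refresh(conn)
recomputed = summary(conn)
print(incremental == recomputed)
```

Both paths produce the same summary; the incremental one touches a number of rows proportional to the new commits rather than to the full history.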
What’s next?
Coding platform
Issue trackers
Communication channels
Code review tools
• Deeper integration of all kinds of project information
• One single central (database-oriented) shared access point for all the project
information, enabling lots of interesting cross-cutting queries.
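As an illustration of such a cross-cutting query, the sketch below joins commit data with issue-tracker data held in the same database. The `issue` table, the `issue_id` link column and the sample rows are all assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables: commits come from the Git import, issues from an
# issue tracker; the issue_id link between them is an assumption.
conn.executescript("""
CREATE TABLE issue (id INTEGER PRIMARY KEY, title TEXT, status TEXT);
CREATE TABLE commits (sha TEXT PRIMARY KEY, author TEXT, issue_id INTEGER);
INSERT INTO issue VALUES
  (1, 'Crash on startup', 'closed'),
  (2, 'Slow import',      'open');
INSERT INTO commits VALUES
  ('a1', 'alice', 1),
  ('b2', 'bob',   1),
  ('c3', 'alice', 2),
  ('d4', 'alice', NULL);
""")

# For each issue: how many commits addressed it and how many developers
# were involved. Commits without an issue (d4) are ignored by the join.
rows = conn.execute("""
SELECT i.title, i.status, COUNT(c.sha), COUNT(DISTINCT c.author)
FROM issue AS i
LEFT JOIN commits AS c ON c.issue_id = i.id
GROUP BY i.id
ORDER BY i.id
""").fetchall()

for title, status, n_commits, n_authors in rows:
    print(title, status, n_commits, n_authors)
```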
