Implications Of Dual Participation Of Floss Developer
1. Are FLOSS Developers Committing to CVS/SVN as much
as they are Talking in Mailing Lists?
Challenges for Integrating Data from Multiple Repositories
Sulayman K. Sowe, I. Samoladas, I. Stamelos, L. Angelis
Dept. of Informatics, Aristotle University, Greece.
sksowe@csd.auth.gr
3rd International Workshop on Public Data about Software Development (WoPDaSD)
10th September 2008, Milan, Italy.
This research is partially sponsored by the FLOSSMetrics Project (Ref. No. FP6-IST5-033547), http://flossmetrics.org/
and SQO-OSS project (Ref. No. FP6-IST-5-033331), http://www.sqo-oss.eu/
WoPDaSD ~.1
2. In this presentation...
➲ Nomadic life of FLOSS developers
Motivation for this research:
Research hypothesis
➲ Methodology in brief
Data & Source
Identification of developers from SVN & Lists
➲ Results & Discussion
➲ Summary & conclusion
Ongoing research
3. Nomadic life of FLOSS developers
➲ Like the Fulani nomads of the West African plains
FLOSS developers are not bound to a single territory
and are free to:
participate in other projects or communities,
use and reuse software/bits of code from other projects,
suggest, argue for or against requirements, specs, etc. in
projects where they have little or no commit rights,
use different identities (usernames, email), etc.
4. Motivation for this research
➲ Why research FLOSS developers or nomads?
Understand the collaborative nature of developing FLOSS in
terms of developer participation (code commits and email postings)
in multiple repositories: SVN and mailing lists.
➲ Research Hypothesis:
If mailing lists are the main communication veins in most projects,
then CVS/SVN is a collection of arteries. Thus,
FLOSS developers both code and participate in list discussions:
H0: "FLOSS developers contribute equally to code
repository and mailing lists"; alternative
H1: "FLOSS developers contribute more to code repository
than mailing lists".
5. Methodology…Data & Source
➲ Retrieve data for 14 projects from the FLOSSMetrics
retrieval system
Mailing lists data dumps (.sql file format)
SVN data dumps (.sql file format)
6. Initial (Raw) Data
➲ How many SVN committers and mailing list posters in each project?
[Figure: SVN commits and ML posts per project]
7. Methodology…Identification of developers
➲ The main problem in studying developers' activities in multiple
repositories is identification:
➲ Is committer A in SVN of project X the same person (Poster A)
in the mailing lists of project X?
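The slides do not say how identities were matched, so the following is only a minimal sketch of one possible heuristic: reduce each username or e-mail address to a normalized key and pair SVN committers with mailing list posters that share it. The function names `normalize_identity` and `match_developers` and the sample identities are hypothetical, and a real study would need manual checks for false matches.

```python
import re


def normalize_identity(raw: str) -> str:
    """Reduce a username or e-mail address to a comparable key (assumed heuristic)."""
    # Keep only the part before "@", lower-cased and trimmed.
    local_part = raw.strip().lower().split("@", 1)[0]
    # Drop separators so "j.smith", "j_smith" and "jsmith" collapse together.
    return re.sub(r"[._\-+\s]", "", local_part)


def match_developers(svn_committers, ml_posters):
    """Map each SVN identity to the mailing list identities sharing its key."""
    posters_by_key = {}
    for poster in ml_posters:
        posters_by_key.setdefault(normalize_identity(poster), []).append(poster)

    matches = {}
    for committer in svn_committers:
        key = normalize_identity(committer)
        if key in posters_by_key:
            matches[committer] = posters_by_key[key]
    return matches


if __name__ == "__main__":
    # Hypothetical identities, for illustration only.
    print(match_developers(["jsmith", "adoe"],
                           ["J.Smith@example.org", "bob@example.org"]))
```

Matching on the local part alone is deliberately naive: it misses developers who use unrelated usernames and addresses, and it can merge two different people who happen to share a handle.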
8. Results & Discussion…1
➲ The query result for each project gave us the developers who occur in both SVN
and the mailing lists
➲ N=486 for all 14 projects.
Percentage of developers in both repositories
In 8 projects = 57.14%
In 4 projects = 90.11%
In 2 projects = 80.21%
➲ What is going on in ibatis and turbine?
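A co-occurrence query of this kind could be sketched as follows. The schema is an assumption: the tables `commits` and `posts`, each with a `developer` column holding an already-normalized identity, are hypothetical and do not reflect the actual layout of the FLOSSMetrics dumps. The choice of denominator (distinct SVN committers) is also an assumption, since the slides do not state it.

```python
import sqlite3


def co_occurring_developers(conn):
    """Return developers that appear in both the commits and the posts tables."""
    rows = conn.execute(
        "SELECT DISTINCT c.developer "
        "FROM commits AS c JOIN posts AS p ON p.developer = c.developer "
        "ORDER BY c.developer"
    ).fetchall()
    return [row[0] for row in rows]


def share_in_both(conn):
    """Percentage of distinct SVN committers who also posted to the lists."""
    total = conn.execute(
        "SELECT COUNT(DISTINCT developer) FROM commits"
    ).fetchone()[0]
    both = len(co_occurring_developers(conn))
    return 100.0 * both / total if total else 0.0


def demo_connection():
    """Build a tiny in-memory database with made-up rows (illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE commits (developer TEXT);
        CREATE TABLE posts (developer TEXT);
        INSERT INTO commits VALUES ('alice'), ('alice'), ('bob'), ('carol');
        INSERT INTO posts VALUES ('alice'), ('bob'), ('bob'), ('dave');
        """
    )
    return conn


if __name__ == "__main__":
    conn = demo_connection()
    print(co_occurring_developers(conn))
    print(round(share_in_both(conn), 2))
```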
9. Results & Discussion...2
➲ Distribution of Commits & Posts
Domination of commits over posts
Mean commits per developer > mean posts per developer
Developers are committing more to SVN than they are posting to mailing lists,
EXCEPT in ibatis and turbine.
10. Results & Discussion...3
➲ Relationship between Commits and Posts
➲ The overall correlation between commits and posts is statistically significant
(marked with *, p < 0.05).
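The slides do not name the correlation coefficient used. As an illustration only, the sketch below computes a Spearman rank correlation between per-developer commit and post counts, a common choice for skewed count data. All function names and the sample counts are hypothetical, and no p-value is computed here; a statistics package such as SciPy (`scipy.stats.spearmanr`) would normally supply one.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation; raises ZeroDivisionError if either input is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


if __name__ == "__main__":
    commits = [120, 45, 300, 8, 60]   # made-up per-developer commit counts
    posts = [80, 30, 150, 12, 20]     # made-up per-developer post counts
    print(spearman(commits, posts))
```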
11. Results & Discussion...4
➲ Developers contribution in terms of commits and posts
A Wilcoxon signed-rank test applied to the mean values shows an almost 50-50 split
between projects where commits = posts (green) and commits > posts (yellow),
with only the turbine project showing otherwise.
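For readers unfamiliar with the test, here is a minimal sketch of a Wilcoxon signed-rank computation on paired commit and post counts. It uses the normal approximation without tie or continuity corrections, which is an assumption of this sketch and is only rough for small samples; the slides do not say which variant was applied. The function name and sample numbers are hypothetical.

```python
import math


def wilcoxon_signed_rank(xs, ys):
    """Return (w_plus, w_minus, z, p) for paired samples xs and ys.

    p is a two-sided p-value from the normal approximation.
    """
    # Zero differences carry no sign information and are dropped.
    diffs = [x - y for x, y in zip(xs, ys) if x != y]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    ordered = sorted(diffs, key=abs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Tied absolute differences share the mean of their ranks.
        while j + 1 < n and abs(ordered[j + 1]) == abs(ordered[i]):
            j += 1
        for k in range(i, j + 1):
            ranks[k] = (i + j) / 2 + 1
        i = j + 1

    w_plus = sum(r for d, r in zip(ordered, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(ordered, ranks) if d < 0)

    # Under H0 (no systematic difference), W+ has this mean and spread.
    mean_w = n * (n + 1) / 4
    sd_w = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mean_w) / sd_w
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return w_plus, w_minus, z, p


if __name__ == "__main__":
    commits = [10, 12, 9, 15, 20]  # made-up paired values
    posts = [8, 13, 5, 11, 14]
    print(wilcoxon_signed_rank(commits, posts))
```

A large p-value is consistent with "commits = posts" for a project, while a small p-value with W+ well above W- points to "commits > posts".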
12. Summary & conclusion
➲ FLOSS developers are coding as much as they are
talking. They contribute equally to code repositories
and mailing lists, H0 supported.
➲ However, in almost all the projects, developers made
more commits than posts, H1 supported.
➲ Why are turbine and ibatis outliers?
Maybe a highly prolific developer is making more posts than commits, at
a ratio of 4:1.
Something peculiar about the composition of Apache-related projects
➲ Ongoing aspects of this research
Automate data collection and identification process
Analyze a total of 60 or more projects from the FM retrieval system.
Add a quality dimension to committers variable:
Categorize commits: modifications, deletions, additions, code related,
documentation (reports, readme, etc)
Time scale/Sliding frames: the evolution of commits and posts over a
given period.
13. Thank you for your attention
Questions?
Comments
Suggestions for improvement