This document analyzes the community dynamics of three open source bioinformatics projects (BioJava, Biopython, and BioPerl) over their lifetimes of 10+ years based on mailing list and version control data. It finds that most users are only active for their first 3 years before leaving, with survival rates declining over time, but those who survive longer than 3 years tend to remain engaged indefinitely. It also analyzes the social networks of the communities and finds indicators of "core generations" that evolve every 5 years and help define the projects' organization structures. Threats to the validity of the findings are also discussed.
Community Dynamics in Open Source Software Projects: Aging and Social Reshaping
1. TeLLNet
Lehrstuhl Informatik 5
(Information Systems)
Prof. Dr. M. Jarke
I5-FL-MMYY-1 This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.
Anna Hannemann and Ralf Klamma
RWTH Aachen University
Advanced Community Information Systems (ACIS)
hannemann@dbis.rwth-aachen.de
Motivation for Study Settings
Address interdisciplinary projects (Bioinformatics)
– Biology meets Computer Science
– High disparities in level of development experience
– Better approximation for end-user integration in community information systems (Lead User [1], Open Innovation [2], etc.)
Analysis of long-tail: based on mailing lists
Dynamic analysis: community evolution
– Demographic perspective
– Social structure perspective
[1] von Hippel, E. “Lead users: a source of novel product concepts”, 1986
[2] Chesbrough, H. “Open Innovation: The new imperative for creating and profiting from technology”, 2003
Open Bio*
BioJava, Biopython, BioPerl
Similar problems, infrastructure, organizational issues
Open Bioinformatics Foundation
Long-term: over 13 years
Project     Messages   Users in ML   Commits   Developers      LOC
BioJava        11951          2208      8267           94   290608
Biopython      16108          1138     16868          143   249566
BioPerl        31755          2824     12848          139   383351
(Data as of May 20, 2013; ML = mailing list)
External Factors
High attention to Bioinformatics due to
sequencing of human genome
Cross-project influence: rich get richer
Personal aspects:
– doing PhD for 3 years
– being in a project with room for OSS
Population Ecology
Year of birth t_0^i: date of the first message from user i to the project mailing list
Age group (x; x+1): all currently active project participants who have participated in the project for at least x and at most x+1 years
Currently active: at least one posting to the mailing lists in the current year
Survival rate (x; x+1) → (x+1; x+2): percentage of the users active in age group (x; x+1) in the previous year who are still active in the current year
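The definitions above can be sketched in a few lines of Python. This is only an illustration, not the authors' tooling: the input format (a list of `(user, year)` posting records), the function names, and the sample users are all assumptions, and age is approximated as the difference in calendar years.

```python
from collections import defaultdict


def birth_years(postings):
    """Map each user to the year of their first posting (the 'year of birth')."""
    births = {}
    for user, year in postings:
        if user not in births or year < births[user]:
            births[user] = year
    return births


def age_groups(postings, current_year):
    """Group users who posted in current_year by age x, i.e. age group (x; x+1)."""
    births = birth_years(postings)
    # "Currently active" = at least one posting in the current year.
    active = {user for user, year in postings if year == current_year}
    groups = defaultdict(set)
    for user in active:
        groups[current_year - births[user]].add(user)
    return dict(groups)


# Hypothetical posting records: (user, year of posting).
postings = [
    ("ann", 2008), ("ann", 2009), ("ann", 2010),
    ("bob", 2009), ("bob", 2010),
    ("cat", 2010),
    ("dan", 2008),  # not active in 2010, so in no 2010 age group
]
groups_2010 = age_groups(postings, 2010)
```

Here `groups_2010[0]` holds the newcomers of 2010, `groups_2010[1]` the users who started in 2009 and are still active, and so on.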
Population Ecology, Example 2010
Age groups
– (0,1) people started in 2010
– (1,2) people started in 2009, still active in 2010
– (2,3) people started in 2008, still active in 2010
– …
Survival rates
– |(1,2)|_2010 / |(0,1)|_2009
– |(2,3)|_2010 / |(1,2)|_2009
– |(3,4)|_2010 / |(2,3)|_2009
– ...
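As a rough illustration of the survival-rate ratios listed above, the following sketch divides the size of each age group in one year by the size of the next-younger group in the year before. The group sizes are invented for the example and are not taken from the study; the function name and the dictionary layout are likewise assumptions.

```python
def survival_rates(group_sizes, year):
    """Return {x: |(x+1; x+2)|_year / |(x; x+1)|_(year-1)} for every age x."""
    previous = group_sizes[year - 1]
    current = group_sizes[year]
    rates = {}
    for age, size in previous.items():
        if size > 0:  # skip empty groups to avoid division by zero
            rates[age] = current.get(age + 1, 0) / size
    return rates


# Hypothetical sizes of the age groups (x; x+1), keyed by year and then by x.
group_sizes = {
    2009: {0: 100, 1: 20, 2: 10},
    2010: {0: 120, 1: 25, 2: 8, 3: 9},
}
rates_2010 = survival_rates(group_sizes, 2010)
```

With these made-up counts, `rates_2010[0]` is 25/100, the share of the 2009 newcomers who are still active in 2010.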
Conclusions and Discussion (1)
Survival pattern:
– Prediction of minimal number of newcomers required to
support the same level of participation
– Only 7.2% survive longer than three years
– Those who survive beyond three years stay “forever”
No maximal participation duration
– Number of “oldies” increases continuously
– Possible seclusion against newcomers
newcomer_{t+1} [relation symbol illegible in source] (0;1)_t * 0.2 + (1;2)_t * 0.4 + … + (x;x+1)_t * 0.9, ∀x > 1
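The rates 0.2, 0.4 and 0.9 in the expression above can be explored numerically. The sketch below uses them as per-age survival rates and shows that their product, 0.2 × 0.4 × 0.9 = 0.072, appears to match the 7.2% figure quoted for surviving longer than three years. The last function is an assumption of this sketch rather than the formula from the slide: it estimates the newcomers needed to keep the number of active participants constant as the current population minus the expected survivors. The group sizes are hypothetical.

```python
import math

# Survival rates from the slide: 0.2 for (0;1), 0.4 for (1;2), 0.9 for all x > 1.
YOUNG_RATES = {0: 0.2, 1: 0.4}
OLDER_RATE = 0.9


def rate(age):
    """Survival rate of age group (age; age+1)."""
    return YOUNG_RATES.get(age, OLDER_RATE)


def cumulative_survival(years):
    """Share of newcomers expected to still be active after `years` transitions."""
    return math.prod(rate(age) for age in range(years))


def expected_survivors(group_sizes):
    """Expected number of currently active users who remain active next year."""
    return sum(size * rate(age) for age, size in group_sizes.items())


def newcomers_to_hold_level(group_sizes):
    """Assumption of this sketch: newcomers needed to keep the total constant."""
    return sum(group_sizes.values()) - expected_survivors(group_sizes)


# Hypothetical sizes of the age groups (x; x+1) in year t, keyed by x.
groups_t = {0: 100, 1: 20, 2: 10}
share_after_three_years = cumulative_survival(3)
survivors = expected_survivors(groups_t)
needed = newcomers_to_hold_level(groups_t)
```

For the made-up groups, roughly 37 of the 130 active users are expected to stay, so about 93 newcomers would be needed under the constant-population assumption.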
Social Network Analysis
Social Network (1 for each year)
– Nodes: Email Participants
– Relations: Same thread
Shortest path
Diameter
Node betweenness
Largest connected component
Density
Transitivity
Edge betweenness clustering
[Figure: Biopython network for 2012]
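A minimal sketch of the network construction and three of the listed metrics (density, largest connected component, diameter) follows. It uses only the standard library; the thread data and participant names are hypothetical, and the diameter is computed on the largest component only, which is an assumption of this sketch. A graph library would be the usual choice for betweenness and clustering.

```python
from collections import deque
from itertools import combinations


def build_network(threads):
    """Nodes are email participants; an edge links two who posted in the same thread."""
    graph = {}
    for participants in threads:
        for person in participants:
            graph.setdefault(person, set())
        for a, b in combinations(sorted(participants), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def density(graph):
    """Existing edges divided by the number of possible edges."""
    n = len(graph)
    if n < 2:
        return 0.0
    edges = sum(len(neighbours) for neighbours in graph.values()) / 2
    return edges / (n * (n - 1) / 2)


def bfs_distances(graph, source):
    """Shortest path lengths from source to every reachable node."""
    distances = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in distances:
                distances[neighbour] = distances[node] + 1
                queue.append(neighbour)
    return distances


def largest_component(graph):
    """Node set of the largest connected component."""
    seen, best = set(), set()
    for node in graph:
        if node not in seen:
            component = set(bfs_distances(graph, node))
            seen |= component
            if len(component) > len(best):
                best = component
    return best


def diameter(graph, nodes):
    """Longest shortest path among the given nodes of one connected component."""
    longest = 0
    for source in nodes:
        longest = max(longest, max(bfs_distances(graph, source).values()))
    return longest


# Hypothetical threads of one year, each given as the set of its participants.
threads = [{"ann", "bob"}, {"bob", "cat"}, {"cat", "dan"}, {"eve", "fay"}]
network = build_network(threads)
core = largest_component(network)
core_diameter = diameter(network, core)
```

Building one such network per year, as the slide describes, would allow the metrics to be compared over time.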
Conclusion and Discussion (2)
Core evolution
– Evolves strongly
– Core generations (ca. 5 year periods)
– Dangerous for the whole project
– Defines organizational principles
– Can be predicted by a combination of diameter and maximum betweenness
Threats to Validity
– Evolution step size (year to year, release to release, etc.)
– Scientist-driven OSS
– Construct validity: quality of data; network construction
– Internal validity: observation – explanation