1. The Anatomy of Developer Social Networks
Qiaona HONG
Supervisor: Prof. Shing-Chi Cheung
2. Social Network
General Social Network (GSN)
• Study the Topological Structure of Social Network
– Y. Y. Ahn @WWW '07; A. Mislove @IMC '07
• Study the Community Structure of Social Network
– V. D. Blondel @Journal of Statistical Mechanics: Theory and Experiment; Y. R. Lin @WI '07
• Techniques to visualize the social network
– Jeffrey Heer @InfoVis '05
• Influential People & Information Diffusion
– Kimura, M. @InfoVis '07
• Friend Recommendation
– Nitai B. Silva @WCCI '10
4. Research Questions
• Q1: What are the similarities and differences
between DSNs and GSNs?
• Q2: How do DSNs evolve over time?
• Q3: How do communities evolve in DSNs?
• Q4: What are the similarities and differences
between DSNs extracted using different social
linkage indicators?
5. Research Questions
• Q1: What are the similarities and differences between DSNs and GSNs?
• Q2: How do DSNs evolve over time?
• Q3: How do communities evolve in DSNs?
• Q4: What are the similarities and differences between DSNs extracted using different social linkage indicators?
Qiaona HONG, Sunghun Kim, S.C. Cheung and Christian Bird, “Understanding a Developer Social Network and its Evolution”, in Proceedings of the 27th IEEE International Conference on Software Maintenance, 2011.
7. DSN Extraction Approach
[Figure: Bug Reports 1 to 4, each listing comments by some of David, Bob, Jack, and Bill; below them, the four developers drawn as the nodes of the DSN]
8. DSN Extraction Approach
[Figure: the same four bug reports; the DSN nodes David, Bill, Bob, and Jack are now connected by weighted edges: David-Bill 1, Bob-Jack 4, and 2 on each of the other four edges]
9. DSN Extraction Approach
[Figure: the same four bug reports; the DSN nodes David, Bill, Bob, and Jack with only the weight-4 edge between Bob and Jack labeled]
10. DSN Extraction Approach
[Figure: the same four bug reports; only the nodes Bob and Jack remain in the DSN]
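The extraction figures can be read as a small graph-building procedure. The following is a minimal sketch under stated assumptions, not the authors' implementation: it assumes two developers are linked when they comment on the same bug report, that the edge weight counts the reports they share, and that weak edges are dropped by a threshold. The function name `extract_dsn`, the `min_weight` parameter, and the sample data are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def extract_dsn(bug_reports, min_weight=1):
    """Build a weighted developer graph from bug-report comment authors.

    bug_reports maps a report id to the list of comment authors
    (hypothetical input shape). The weight of an edge is the number of
    reports on which both developers commented (assumed linking rule).
    """
    weights = Counter()
    for commenters in bug_reports.values():
        # Each unordered pair of distinct commenters counts once per report.
        for a, b in combinations(sorted(set(commenters)), 2):
            weights[(a, b)] += 1
    # Keep only the edges whose weight reaches the threshold.
    return {edge: w for edge, w in weights.items() if w >= min_weight}


# Hypothetical data, not the data behind the slide's figure.
reports = {
    "bug-1": ["David", "Bob", "Jack"],
    "bug-2": ["David", "Bob", "Jack", "Bob"],
    "bug-3": ["Bob", "Jack", "Bill"],
    "bug-4": ["Bob", "Jack", "Bill"],
}
full = extract_dsn(reports)                  # all co-comment edges
strong = extract_dsn(reports, min_weight=4)  # only the strongest links
```

Raising `min_weight` trades coverage for confidence that a link reflects repeated interaction rather than a single shared report.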
11. Metrics
• Degree Distribution
– The distribution of node degrees, where a node's degree is the number of edges connected to it
• Degree of Separation
– The length of the shortest path between two nodes
• Modularity
– A measure of the quality of a division of nodes into communities
• Community Size
– The number of nodes within a community
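The first two metrics can be illustrated on a toy network. This is a sketch assuming an unweighted, undirected graph stored as an adjacency dictionary; the helper names and the sample graph are hypothetical.

```python
from collections import Counter, deque


def degree_distribution(adj):
    """Map each degree k to the number of nodes that have degree k."""
    return Counter(len(neighbors) for neighbors in adj.values())


def degree_of_separation(adj, source, target):
    """Length of the shortest path between two nodes, or None if unreachable."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for nxt in adj[node]:
            if nxt == target:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


# Hypothetical toy network: a chain David - Bob - Jack - Bill.
adj = {
    "David": {"Bob"},
    "Bob": {"David", "Jack"},
    "Jack": {"Bob", "Bill"},
    "Bill": {"Jack"},
}
distribution = degree_distribution(adj)
separation = degree_of_separation(adj, "David", "Bill")
```

Breadth-first search is used because, on an unweighted graph, the first time it reaches the target is along a shortest path.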
12. Modularity
[Figure: two example networks, A with modularity 0.51 and B with modularity 0.176]
• According to A. Clauset’s work, a modularity of 0.3 is a good indicator of significant community structure in a network
• When the modularity is 0, the community structure is no stronger than that of a randomly generated network
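To make the two thresholds concrete, here is a minimal sketch of the usual modularity formula for an unweighted, undirected graph: for each community, the fraction of edges inside it minus the squared fraction of edge endpoints attached to it. The function name and the sample graph are hypothetical, and this is not claimed to be the exact code used for the slides.

```python
def modularity(edges, community_of):
    """Modularity of a partition of an unweighted, undirected graph.

    edges is a list of (u, v) pairs; community_of maps a node to its
    community label.
    """
    m = len(edges)
    internal = {}    # number of edges with both ends inside a community
    degree_sum = {}  # total degree of the nodes in a community
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(
        internal.get(c, 0) / m - (d / (2 * m)) ** 2
        for c, d in degree_sum.items()
    )


# Hypothetical example: two triangles joined by a single edge.
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]
two_groups = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_group = {node: 0 for node in two_groups}
q_split = modularity(edges, two_groups)  # 5/14, above the 0.3 guideline
q_single = modularity(edges, one_group)  # 0: no structure beyond chance
```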
13. Communities in DSN
• Identified Communities in DSN
– Louvain Algorithm (by optimizing modularity)
– 50 different input orderings of nodes
14. Q1: What are the similarities and differences between DSNs and GSNs?
Degree Distribution Degree of Separation
Modularity Community Size
15. Q1: What are the similarities and differences between DSNs and GSNs?
Degree Distribution
[Figure: degree distribution plots in four panels: (1) MozillaDSN-BR, (2) MozillaDSN-CL, (3) EclipseDSN-BR, (4) EclipseDSN-CL]
17. Q1: What are the similarities and differences between DSNs and GSNs?
Degree Distribution
• Quantitative power law fit test
– An approach to analyzing power-law distributed data introduced by A. Clauset et al.
• P-value: the likelihood that the degree distribution does actually follow a power law
– If the p-value is less than 0.1, the power law is rejected.
(1) MozillaDSN-BR (2) MozillaDSN-CL
(3) EclipseDSN-BR (4) EclipseDSN-CL
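The fit test starts from an estimate of the power-law exponent. The sketch below shows only that first step, using the standard continuous maximum-likelihood estimator as an approximation for degree data; it is an assumption that this matches the slide's procedure. The function name and `x_min` parameter are hypothetical, and the p-value itself additionally requires a goodness-of-fit comparison against synthetic data, which is not shown.

```python
import math


def power_law_alpha(values, x_min):
    """Continuous maximum-likelihood estimate of a power-law exponent.

    Only values >= x_min are treated as part of the power-law tail:
    alpha = 1 + n / sum(ln(x_i / x_min)).
    """
    tail = [x for x in values if x >= x_min]
    log_sum = sum(math.log(x / x_min) for x in tail)
    if log_sum == 0:
        raise ValueError("need at least one value strictly above x_min")
    return 1.0 + len(tail) / log_sum


# Hypothetical data chosen so each log term equals 1.
alpha = power_law_alpha([math.e] * 5, x_min=1.0)
```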
18. Q1: What are the similarities and differences between DSNs and GSNs?
Degree Distribution
P-value < 0.1; some < 0.1, others > 0.1
(1) MozillaDSN-BR (2) MozillaDSN-CL
(3) EclipseDSN-BR (4) EclipseDSN-CL
Different from GSNs, DSNs do not follow power-law
22. Q1: What are the similarities and differences between DSNs and GSNs?
Modularity
[Figure: four panels (MozillaDSN-CL, MozillaDSN-BR, EclipseDSN-CL, EclipseDSN-BR). Y-axis: Modularity, 0.3 to 0.7. X-axis: Network (Facebook, Cyworld, 1-month DSN, 3-month DSN, 6-month DSN, 1-year DSN, 2-year DSN, 4-year DSN)]
Similar to GSNs, all DSNs have significant community structure
23. Q1: What are the similarities and differences between DSNs and GSNs?
Community Size
(1) MozillaDSN-BR (2) MozillaDSN-CL
(3) EclipseDSN-BR (4) EclipseDSN-CL
24. Q1: What are the similarities and differences between DSNs and GSNs?
Community Size
28%
(1) MozillaDSN-BR (2) MozillaDSN-CL
(3) EclipseDSN-BR (4) EclipseDSN-CL
25. Q1: What are the similarities and differences between DSNs and GSNs?
Community Size
(1) MozillaDSN-BR: 21%-36% (2) MozillaDSN-CL: 23%-43%
(3) EclipseDSN-BR: 15%-30% (4) EclipseDSN-CL: 23%-33%
26. Q4: What are the similarities and differences between DSNs extracted using different social linkage indicators?
Q2: How do DSNs evolve over time?
Degree Distribution Degree of Separation
Modularity Community Size
27. Q2: How do DSNs evolve over time?
Change of Developer Size
DSNs-BR always have more developers than DSNs-CL
28. Q2: How do DSNs evolve over time?
Change of Percentage of Newcomers
DSNs-BR always have a higher percentage of newcomers than DSNs-CL
Editor's Notes
Metrics to analyze the social network
Techniques to visualize the social network
Finding influential people
Finding community
Information diffusion
Recommendation
Study the Topological Structure of Social Network
[1] Y. Y. Ahn, S. Han, H. Kwak, S. Moon, and H. Jeong, "Analysis of topological characteristics of huge online social networking services," in WWW '07: Proceedings of the 16th international conference on World Wide Web. New York, NY, USA: ACM, 2007, pp. 835-844.
[2] A. Mislove, M. Marcon, K. P. Gummadi, P. Druschel, and B. Bhattacharjee, "Measurement and analysis of online social networks," in Proceedings of the 7th ACM SIGCOMM conference on Internet measurement, ser. IMC '07. New York, NY, USA: ACM, 2007, pp. 29-42.
Study the Community Structure of Social Network
[1] V. D. Blondel, J.-L. Guillaume, R. Lambiotte, and E. Lefebvre, "Fast unfolding of communities in large networks," Journal of Statistical Mechanics: Theory and Experiment, vol. 2008, no. 10, p. P10008, Jul. 2008.
[2] Y. R. Lin, H. Sundaram, Y. Chi, J. Tatemura, and B. L. Tseng, "Blog community discovery and evolution based on mutual awareness expansion," in WI '07: Proceedings of the IEEE/WIC/ACM International Conference on Web Intelligence. Washington, DC, USA: IEEE Computer Society, 2007, pp. 48-56.
Study the Topological Structure of Social Network
Degree distribution [Y. Y. Ahn @WWW '07]
Power-law, small-world [A. Mislove @IMC '07]
Study the Community Structure of Social Network
Community structure extraction method [V. D. Blondel @Journal of Statistical Mechanics: Theory and Experiment]
Evolution of community, community evolution patterns [Y. R. Lin @WI '07]
Techniques to visualize the social network
Community structure extraction method [V. D. Blondel @Journal of Statistical Mechanics: Theory and Experiment]
Finding Influential People
Community structure extraction method [V. D. Blondel @Journal of Statistical Mechanics: Theory and Experiment]
Information Diffusion
Community structure extraction method [V. D. Blondel @Journal of Statistical Mechanics: Theory and Experiment]
A natural question to ask here is ..
Apart from Q1, in this thesis we also study other research questions. Cite my paper here [very important].
The subjects used for this study are the Mozilla bug reports, Mozilla CVS logs, Eclipse bug reports, and Eclipse CVS logs. Both Mozilla and Eclipse are very successful open source projects. To compare with GSNs, we extract DSNs from these two projects.
Why did I use these metrics? I need to polish this slide by using more formal sentences.
[8] A. Clauset, M. E. J. Newman, and C. Moore, "Finding community structure in very large networks," Aug. 2004.
BOF meetings: developers are free to join BOF meetings, so we consider that BOF meetings reflect real communities. One identified community may contain more than one BOF meeting; however, each BOF meeting is contained in only one identified community. This means that BOF meetings represent a finer division of developers and that our identified communities reflect real communities.
Why this question? There are many possibilities. Please list some here.
To compare with GSNs, we extract DSNs over different lengths of time: 1 month, 3 months, 6 months, 1 year, 2 years, and 4 years. Possible result; my effort is not trivial. How to interpret the graph.
I need more text on the slides
28%
28%
This is a GREAT slide. Be sure to explain Extinct and Emerge well, since both have “empty” on one side of the arrow.
In the paper, we examine the community evolution from 2000 to 2009; here we use the period from 2005 to 2009 to illustrate our findings. This is also a very good slide. I like the tracking of different paths of communities over time.
[1] Xin Yang, Raula Gaikovina Kula, Camargo Cruz Ana Erika, Norihiro Yoshida, Kazuki Hamasaki, Kenji Fujiwara, and Hajimu Iida, "Understanding OSS Peer Review Roles in Peer Review Social Network (PeRSoN)," in Proceedings of the 19th Asia-Pacific Software Engineering Conference (APSEC 2012), (to appear).
In their work, Xin Yang et al. applied our approach to a peer review system to generate a peer review social network. Based on this network, they investigate the importance of OSS peer review contributor roles and their review activities.
Jifeng Xuan, He Jiang, Zhilei Ren, Weiqin Zou, “Developer Prioritization in Bug Repositories”, in Proceedings of the 34th International Conference on Software Engineering (ICSE 2012), pp. 25-35, 2012.
Y. Tian, P. Achananuparp, I. Lubis, D. Lo, and E.-P. Lim, “What does software engineering community microblog about?”, in MSR, 2012.
files are likely to be vulnerable when changed by many developers who have made many changes to other files. Practitioners can use these observations to prioritize securi