Who is going to Mentor Newcomers
    in Open Source Projects?



   Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto, Sebastiano Panichella
Context and Motivations
         Software Development



How?
Training via Mentoring



         Case Study
         Exploratory analysis
         Recommendation system evaluation
Training Project Newcomers


   Newcomer:
   Technical Skills
   Organizational Aspects
   Competencies
Previous Work
Zhou and Mockus, ICSE 2011
   "Does the Initial Environment Impact the Future of Developers?"
   Minghui Zhou (Peking University), Audris Mockus (Avaya Labs Research)
   Finding: the probability that a new developer becomes a long-term, productive developer is highest when project sociality is low.
   Slide callouts: low sociability, better training
Previous Work
        Dagenais et al., ICSE 2010
   "Moving into a New Software Project Landscape"
   Barthélémy Dagenais, Harold Ossher, Rachel K. E. Bellamy, Martin P. Robillard, Jacqueline P. de Vries
   (McGill University; IBM T.J. Watson Research Center)
   Grounded theory study with 18 newcomers across 18 projects. Three primary factors impact the integration of newcomers: early experimentation, internalizing structures and cultures, and progress validation.
   Slide callout: mentoring project newcomers is highly desirable
Characteristics of a Good Mentor
 enough expertise about the topic of interest for the newcomer…
 enough ability to help other people…
Sources of Information
   Expertise: SVN, GIT, CVS
   Ability to help others
Our Contribution
                YODA
(Young and newcOmer Developer Assistant)




Approach for Mentors Identification
     in Open Source Projects
YODA: Two phases
1) Identify mentors in past project history (SVN, GIT, CVS)
   What factors can be used to identify mentors?
2) Recommend mentors
RQ1: Identifying mentors in past project history
What factors can be used to identify mentors?

Similar problem: identifying advisors in academic collaborations

ArnetMiner (http://arnetminer.org): popular search engine for academic researchers in computer science;
identifies relations between students and advisors
How does ArnetMiner work?
Ranks pairs of researchers according to four factors:
  f1: they published many papers together
  f2: the advisor published more than the student
  f3: the advisor is older than the student
  f4: the student published her first paper(s) with the advisor
Heuristics to identify mentors
Jim is the mentor of Alice IF:
  F1: Exchanged emails (when Alice joins the project)
  F2: Overall amount of emails (Jim > Alice)
  F3: Age in the project
  F4: Newcomer "early" emails (first emails by Alice in the project)
  F5: Commits (when Alice joins the project)
Aggregating the factors


What factors can be
 used to identify
    mentors?
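One way to picture the aggregation of the heuristics is the minimal Python sketch below. It is an illustration under stated assumptions, not the authors' implementation: the `PairEvidence` fields, the threshold tests, and the equal weights are all hypothetical, and the slides do not define F3 or F5 beyond their names.

```python
from dataclasses import dataclass


@dataclass
class PairEvidence:
    """Hypothetical evidence for one (candidate mentor, newcomer) pair."""
    emails_exchanged_at_join: int   # F1: emails between the two when the newcomer joins
    mentor_total_emails: int        # F2: overall emails sent by the candidate mentor
    newcomer_total_emails: int      # F2: overall emails sent by the newcomer
    mentor_days_in_project: int     # F3: candidate mentor's age in the project
    newcomer_days_in_project: int   # F3: newcomer's age in the project
    early_emails_with_mentor: int   # F4: newcomer's first emails involving the mentor
    commit_links_at_join: int       # F5: commit-based links at join time (assumed meaning)


def factor_flags(e):
    """Evaluate each heuristic as a boolean (the tests themselves are assumptions)."""
    return {
        "F1": e.emails_exchanged_at_join > 0,
        "F2": e.mentor_total_emails > e.newcomer_total_emails,
        "F3": e.mentor_days_in_project > e.newcomer_days_in_project,
        "F4": e.early_emails_with_mentor > 0,
        "F5": e.commit_links_at_join > 0,
    }


def aggregate(flags, weights):
    """Sum the weights of the factors that hold; `weights` selects the configuration."""
    return sum(w for name, w in weights.items() if flags.get(name, False))


# Example: a configuration that uses only F1, F2 and F3 with equal (assumed) weights.
evidence = PairEvidence(3, 120, 15, 900, 30, 0, 0)
score = aggregate(factor_flags(evidence), {"F1": 1.0, "F2": 1.0, "F3": 1.0})
```

Passing a different `weights` dictionary switches between factor configurations without changing how the factors are evaluated.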
Recommending Mentors
   Project developers; past mentors
   Time t0: Alice (newcomer) joins
   Mentor with adequate skills
   Inspired by the work on bug triaging by J. Anvik et al., TOSEM 2011
   Dice similarity
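The Dice similarity named on the slide is the standard coefficient 2|A ∩ B| / (|A| + |B|) between two sets. The sketch below shows how it could rank past mentors for a newcomer. What is actually compared is not stated on the slide, so the term sets (for example, words from the newcomer's first message and from each mentor's earlier messages), the `recommend` helper, and the sample names are assumptions.

```python
def dice_similarity(a, b):
    """Dice coefficient between two collections, treated as sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0  # convention chosen here for two empty sets
    return 2 * len(a & b) / (len(a) + len(b))


def recommend(newcomer_terms, mentor_profiles, top_k=3):
    """Return the names of the top_k past mentors most similar to the newcomer.

    mentor_profiles maps a mentor name to a set of terms (hypothetical input).
    """
    ranked = sorted(
        mentor_profiles,
        key=lambda name: dice_similarity(newcomer_terms, mentor_profiles[name]),
        reverse=True,
    )
    return ranked[:top_k]


# Hypothetical data: terms from Alice's first message vs. past mentors' messages.
alice = {"ssl", "config", "module"}
profiles = {
    "jim": {"ssl", "module", "proxy"},
    "bob": {"docs", "build"},
}
best = recommend(alice, profiles, top_k=1)
```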
Empirical Study
Goal: analyze data from mailing lists and versioning
systems
Purpose: investigating which factors can be used to
identify mentors
Quality focus: recommend mentors in software
 projects
Context: mailing lists and versioning systems of five
software projects:
• Apache, FreeBSD, PostgreSQL, Python and Samba
Context
Split into a training set and a test set

                               Apache           FreeBSD          PostgreSQL       Python           Samba
Period (training set)          08/2001-03/2002  11/1998-02/2000  10/1998-05/2001  05/2000-05/2001  04/1998-09/2000
Period (test set)              04/2002-12/2008  03/2000-10/2008  06/2001-03/2008  06/2001-12/2008  10/2000-12/2008
# of mentors (training set)    19               65               10               28               17
# of newcomers (training set)  13               33               8                32               33
# of newcomers (test set)      13               33               7                31               33
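A time-based split like the one in the table can be sketched as follows. The event representation (a list of `(date, author)` tuples) and the `split_by_cutoff` helper are assumptions for illustration. The only value taken from the slide is the Apache boundary: training ends 03/2002 and the test period starts 04/2002.

```python
from datetime import date


def split_by_cutoff(events, cutoff):
    """Split (date, author) events into training (before cutoff) and test (from cutoff on)."""
    training = [e for e in events if e[0] < cutoff]
    test = [e for e in events if e[0] >= cutoff]
    return training, test


# Hypothetical mailing-list events; authors are placeholders.
events = [
    (date(2001, 9, 15), "jim"),
    (date(2002, 3, 31), "alice"),
    (date(2002, 4, 1), "carol"),
]

# Apache: the test period starts in 04/2002 according to the table.
training, test = split_by_cutoff(events, date(2002, 4, 1))
```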
Research Questions
RQ1: How can we identify mentors from the past history of a software project?
RQ2: To what extent would it be possible to recommend mentors to newcomers joining a software project?
RQ1: How can we identify mentors from the
  past history of a software project?
Candidate pairs are scored using factors F1-F5 and then manually validated.
Pair scores shown: 2.5, 2.5, 1.5, 1.5, 1.0, 1.0
RQ1: How can we identify mentors from
      the past history of a software project?
[Plots: precision (0%-100%) vs. number of newcomer-mentor pairs (18-24), one plot per possible configuration of factors: f1; f1 + f2 + f3; f1 + f2 + f4; f5 (baseline)]
RQ1: How can we identify mentors from
 the past history of a software project?
[Figure: precision vs. number of newcomer-mentor pairs for Apache (18 to 24 pairs) and PostgreSQL (12 to 22 pairs); configurations f1, f1 + f2 + f3, f1 + f2 + f4, and f5 (Baseline)]
RQ1: How can we identify mentors from
             the past history of a software project?
[Figure: precision vs. number of newcomer-mentor pairs for Python (24 to 48 pairs), FreeBSD (23 to 41 pairs), and Samba (30 to 42 pairs)]
RQ1: How can we identify mentors from
             the past history of a software project?
[Figure: precision vs. number of newcomer-mentor pairs for Python (24 to 48 pairs), FreeBSD (23 to 41 pairs), and Samba (30 to 42 pairs); factors F1 to F5 listed alongside]

Useful factors for mentor identification:
    f1
    0.5*f1 + 0.25*f2 + 0.25*f3
    0.5*f1 + 0.25*f2 + 0.25*f4
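The weighted configurations named on this slide (for example 0.5*f1 + 0.25*f2 + 0.25*f3) can be read as a linear score used to rank candidate mentors. The sketch below is only an illustration of that idea: the function names, the candidate names, and the factor values are hypothetical, and it assumes each factor has already been normalized to the range 0 to 1, which the slides do not state.

```python
from typing import Dict, List

# Weights copied from the slide's "0.5*f1 + 0.25*f2 + 0.25*f3" configuration.
WEIGHTS: Dict[str, float] = {"f1": 0.5, "f2": 0.25, "f3": 0.25}


def score(factors: Dict[str, float], weights: Dict[str, float] = WEIGHTS) -> float:
    """Weighted sum of factor values; a missing factor counts as 0."""
    return sum(w * factors.get(name, 0.0) for name, w in weights.items())


def rank_candidates(
    candidates: Dict[str, Dict[str, float]],
    weights: Dict[str, float] = WEIGHTS,
) -> List[str]:
    """Return candidate names ordered from highest to lowest score."""
    return sorted(candidates, key=lambda name: score(candidates[name], weights), reverse=True)


# Hypothetical factor values (assumed normalized to [0, 1]).
candidates = {
    "alice": {"f1": 0.8, "f2": 0.2, "f3": 0.4},
    "bob": {"f1": 0.4, "f2": 0.9, "f3": 0.9},
    "carol": {"f1": 0.1, "f2": 0.3, "f3": 0.2},
}

ranking = rank_candidates(candidates)
top_2 = ranking[:2]  # the two highest-scoring candidates
```

Swapping f3 for f4 in the weights dictionary gives the other weighted configuration listed on the slide; using only f1 with weight 1.0 gives the single-factor configuration.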
RQ2: To what extent would it be possible to
recommend mentors to newcomers joining a
software project?
Precision of the recommendations (Top 1 vs. Top 2):

  Project       Top 1    Top 2
  Apache         85%      81%
  FreeBSD        30%      24%
  PostgreSQL    100%     100%
  Python         64%      77%
  Samba          94%      82%
✔ YODA makes it possible to recommend mentors.
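The Top 1 and Top 2 figures reported for RQ2 are precision values. The slide does not spell out how precision is computed, so the sketch below assumes one common reading: the fraction of recommended mentors, taken over all newcomers, that match a known mentor. The function name and the sample data are hypothetical.

```python
from typing import Dict, List, Set


def precision_at_k(
    recommendations: Dict[str, List[str]],
    actual_mentors: Dict[str, Set[str]],
    k: int,
) -> float:
    """Share of the top-k recommended mentors that are actual mentors.

    Assumed definition: correct recommendations / total recommendations,
    pooled over all newcomers.
    """
    recommended = 0
    correct = 0
    for newcomer, ranked in recommendations.items():
        top_k = ranked[:k]
        recommended += len(top_k)
        known = actual_mentors.get(newcomer, set())
        correct += sum(1 for mentor in top_k if mentor in known)
    return correct / recommended if recommended else 0.0


# Hypothetical ranked recommendations and known mentors per newcomer.
recommendations = {"n1": ["a", "b"], "n2": ["c", "d"]}
actual_mentors = {"n1": {"a"}, "n2": {"c", "d"}}

p_top1 = precision_at_k(recommendations, actual_mentors, k=1)
p_top2 = precision_at_k(recommendations, actual_mentors, k=2)
```

Under this reading, Top 2 precision can be lower than Top 1 precision, because the second recommendation for a newcomer may be wrong even when the first is right.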
Why not just use Top Committers?

Precision when recommending the top committers (Top 1 vs. Top 2):

  Project       Top 1    Top 2
  Apache          0%       8%
  FreeBSD         6%       3%
  PostgreSQL     50%      25%
  Python          0%       7%
  Samba          35%      35%
Not all committers are good mentors!
 	
  	
Surveying Project Developers

  Questions asked:

  Done/received mentoring (Mentor, Newcomer)
  Perceived importance of mentoring
  What makes a good mentor
Sent to 114 Subjects...

  Samba        37
  FreeBSD      37
  PostgreSQL   15
  Python       23
  Apache       23
Obtained Answers
[Figure: answers obtained per project (Samba, FreeBSD, PostgreSQL, Python, Apache)]
Done/received mentoring?

                    YES     NO
  Had a mentor?     58%    42%
  Did mentoring?    92%     8%

  "Yes, I did mentoring…"
  "Yes, I received mentoring. My mentor was…"
Perceived importance of mentoring
(two bars per answer; legend: Effect of mentor, Effect on newcomer)

  Answer            Upper bar   Lower bar
  Useless at all        0%          0%
  Not important         0%          0%
  Neutral              11%         45%
  Important            56%         36%
  Very important       33%         18%

  "Is very important that a mentor shares knowledge with a mentee…"
What makes a good mentor

  Others                 0%
  Project knowledge     38%
  Communication skills  42%
  Experience            19%

  "My first mentor had a very strong and technical background"
Conclusions
Future Work...

   Considering factors able to better capture the technical skills of mentors.

   Replicating the study with different projects.


FSE 2012 talk: finding mentors in software projects

  • 1. Who is going to Mentor Newcomers in Open Source Projects? Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto, Sebastiano Panichella
  • 2. Context and Motivations. Software development: how? Training via mentoring. Case study: exploratory analysis and recommendation system evaluation.
  • 3. Training Project Newcomers. A newcomer needs training in technical skills, organizational aspects, and competencies.
  • 4. Previous Work: Zhou and Mockus, ICSE 2011. "Does the Initial Environment Impact the Future of Developers?" Minghui Zhou (School of Electronics Engineering and Computer Science, Peking University; Key Laboratory of High Confidence Software Technologies, Ministry of Education, Beijing 100871, China; zhmh@pku.edu.cn) and Audris Mockus (Avaya Labs Research, 233 Mt Airy Rd, Basking Ridge, NJ; audris@avaya.com). ICSE'11, May 21-28, 2011, Waikiki, Honolulu, HI, USA. Copyright 2011 ACM 978-1-4503-0445-0/11/05. Slide callouts: "Low sociability", "Better training".
    Abstract (from the paper page shown on the slide): Software developers need to develop technical and social skills to be successful in large projects. We model the relative sociality of a developer as a ratio between the size of her communication network and the number of tasks she participates in. We obtain both measures from the problem tracking systems. We use her workflow peer network to represent her social learning, and the issues she has worked on to represent her technical learning. Using three open source and three traditional projects we investigate how the project environment reflected by the sociality measure at the time a developer joins, affects her future participation. We find: a) the probability that a new developer will become one of long-term and productive developers is highest when the project sociality is low; b) times of high sociality are associated with a higher intensity of new contributors joining the project; c) there are significant differences between the social learning trajectories of the developers who join in low and in high sociality environments; d) the open source and commercial projects exhibit different nature in the relationship between developer's tenure and the project's environment at the time she joins. These findings point out the importance of the initial environment in determining the future of the developers and may lead to better training and learning strategies in software organizations.
  • 5. Previous Work: Dagenais et al., ICSE 2010. "Moving into a New Software Project Landscape". Barthélémy Dagenais and Martin P. Robillard (School of Computer Science, McGill University, Montréal, QC, Canada; {bart,martin}@cs.mcgill.ca); Harold Ossher, Rachel K. E. Bellamy, and Jacqueline P. de Vries (IBM T.J. Watson Research Center, P.O. Box 704, Yorktown Heights, NY 10598; {ossher,rachel,devries}@us.ibm.com). ICSE '10, May 2-8 2010, Cape Town, South Africa. Copyright 2010 ACM 978-1-60558-719-6/10/05. Slide callout: "Mentoring project newcomers: highly desirable".
    Abstract (from the paper page shown on the slide): When developers join a software development project, they find themselves in a project landscape, and they must become familiar with the various landscape features. To better understand the nature of project landscapes and the integration process, with a view to improving the experience of both newcomers and the people responsible for orienting them, we performed a grounded theory study with 18 newcomers across 18 projects. We identified the main features that characterize a project landscape, together with key orientation aids and obstacles, and we theorize that there are three primary factors that impact the integration experience of newcomers: early experimentation, internalizing structures and cultures, and progress validation.
  • 6. Characteristics of a Good Mentor: enough expertise about the topic of interest for the newcomer; enough ability to help other people.
  • 7. Sources of Information: expertise (SVN, Git, CVS); ability to help others.
  • 8. Our Contribution: YODA (Young and newcOmer Developer Assistant), an approach for mentor identification in open source projects.
  • 9. YODA: Two phases. 1) Identify mentors in past project history. 2) Recommend mentors. What factors can be used to identify mentors?
  • 10. RQ1: Identifying mentors in past project history. What factors can be used to identify mentors? Similar problem: identifying advisors in academic collaborations. ArnetMiner (http://arnetminer.org), a popular search engine for academic researchers in computer science, identifies relations between students and advisors.
  • 11. How does ArnetMiner work? It ranks pairs of researchers according to four factors. f1: they published many papers together. f2: the advisor published more than the student. f3: the advisor is older than the student. f4: the student published her first paper(s) with the advisor.
  • 12-13. Heuristics to identify mentors. F1, exchanged emails: Jim is the mentor of Alice if Jim and Alice exchanged emails when Alice joins the project.
  • 14-16. Heuristics to identify mentors. F2, overall amount of emails: Jim's overall amount of emails is greater than Alice's.
  • 17-18. Heuristics to identify mentors. F3, age in the project: Jim has been in the project longer than Alice.
  • 19-20. Heuristics to identify mentors. F4, newcomer "early" emails: the first emails by Alice in the project are exchanged with Jim.
  • 21-22. Heuristics to identify mentors. F5, commits: considered at the time when Alice joins the project.
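The email-based heuristics F1 to F4 can be illustrated with a small sketch. It is not the YODA implementation: the `Email` record, the 90-day window, and the `early` cutoff are hypothetical choices made only for this example, and F5 is left out because the slides do not spell out how commits are counted.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Email:
    """Hypothetical mailing-list record (not a format given in the talk)."""
    sender: str
    recipient: str
    sent: datetime


def factors(emails, mentor, newcomer, window=timedelta(days=90), early=5):
    """Return (f1, f2, f3, f4) for a candidate mentor/newcomer pair.

    window and early are illustrative assumptions, not values from the talk.
    """
    def involves(e, who):
        return who in (e.sender, e.recipient)

    newcomer_mail = sorted((e for e in emails if involves(e, newcomer)),
                           key=lambda e: e.sent)
    mentor_mail = [e for e in emails if involves(e, mentor)]
    if not newcomer_mail or not mentor_mail:
        return (0, False, False, 0)

    joined = newcomer_mail[0].sent  # the newcomer's first email marks the join date
    # F1: emails exchanged by the pair shortly after the newcomer joins
    f1 = sum(1 for e in newcomer_mail
             if involves(e, mentor) and e.sent <= joined + window)
    # F2: the mentor's overall amount of emails exceeds the newcomer's
    f2 = len(mentor_mail) > len(newcomer_mail)
    # F3: the mentor was active in the project before the newcomer joined
    f3 = min(e.sent for e in mentor_mail) < joined
    # F4: how many of the newcomer's first emails involve the mentor
    f4 = sum(1 for e in newcomer_mail[:early] if involves(e, mentor))
    return (f1, f2, f3, f4)
```

A pair scoring high on all four factors would then be a candidate newcomer-mentor pair to validate manually.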
  • 23. Aggregating the factors What factors can be used to identify mentors?
  • 24-33. Recommending Mentors. Among the project developers, the past mentors are identified along the project timeline. When a newcomer (Alice) joins at time t0, YODA recommends a past mentor with adequate skills, matched using Dice similarity. Inspired by the work on bug triaging by J. Anvik et al., TOSEM 2011.
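Dice similarity between two sets A and B is 2|A ∩ B| / (|A| + |B|). The sketch below ranks past mentors by that measure; what exactly is compared is an assumption here (sets of terms describing the newcomer's topic and each mentor's past activity), and the function names are hypothetical.

```python
def dice(a, b):
    """Dice similarity of two sets: 2 * |A & B| / (|A| + |B|)."""
    if not a and not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))


def recommend(newcomer_terms, mentor_terms, k=2):
    """Return the k past mentors whose term sets are most similar to the newcomer's.

    mentor_terms maps a mentor name to a set of terms (an assumed representation).
    """
    ranked = sorted(mentor_terms,
                    key=lambda m: dice(newcomer_terms, mentor_terms[m]),
                    reverse=True)
    return ranked[:k]


past_mentors = {
    "jim": {"ssl", "module", "proxy"},
    "bob": {"docs", "build"},
    "eve": {"config", "build", "docs", "test", "ssl"},
}
top2 = recommend({"ssl", "config", "module"}, past_mentors, k=2)
```

Because the measure normalizes by the combined set sizes, a mentor with a very broad vocabulary does not automatically outrank a focused one.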
  • 34. Empirical Study. Goal: analyze data from mailing lists and versioning systems. Purpose: investigate which factors can be used to identify mentors. Quality focus: recommend mentors in software projects. Context: mailing lists and versioning systems of five software projects (Apache, FreeBSD, PostgreSQL, Python, and Samba).
  • 35. Context. Each project history is split into a training set and a test set.
      Project      Period (training set)   Period (test set)   # Mentors (training)   # Newcomers (training)   # Newcomers (test)
      Apache       08/2001-03/2002         04/2002-12/2008     19                     13                       13
      FreeBSD      11/1998-02/2000         03/2000-10/2008     65                     33                       33
      PostgreSQL   10/1998-05/2001         06/2001-03/2008     10                     8                        7
      Python       05/2000-05/2001         06/2001-12/2008     28                     32                       31
      Samba        04/1998-09/2000         10/2000-12/2008     17                     33                       33
  • 36. Research Questions. RQ1: How can we identify mentors from the past history of a software project? RQ2: To what extent would it be possible to recommend mentors to newcomers joining a software project?
  • 37-38. RQ1: How can we identify mentors from the past history of a software project? Candidate newcomer-mentor pairs are ranked by a score computed from the factors F1 to F5 (example scores on the slide: 2.5, 2.5, 1.5, 1.5, 1.0, 1.0); the pairs are then manually validated.
  • 39-46. RQ1: How can we identify mentors from the past history of a software project? [Plots: precision (0% to 100%) versus number of newcomer-mentor pairs, one panel each for Apache, PostgreSQL, Python, FreeBSD, and Samba, comparing the possible configurations f1, f1 + f2 + f3, f1 + f2 + f4, and f5 (baseline).] Useful factors for mentor identification: f1; 0.5*f1 + 0.25*f2 + 0.25*f3; 0.5*f1 + 0.25*f2 + 0.25*f4.
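The weighted configuration 0.5*f1 + 0.25*f2 + 0.25*f3 can be sketched as a scoring function over candidate pairs. The weights come from the slide; the assumption that each factor has already been normalized to the range 0 to 1, and the function names, are illustrative only.

```python
def pair_score(f1, f2, f3, weights=(0.5, 0.25, 0.25)):
    """Weighted sum of three factors; weights default to the slide's 0.5/0.25/0.25."""
    w1, w2, w3 = weights
    return w1 * f1 + w2 * f2 + w3 * f3


def rank_pairs(pair_factors):
    """Sort (newcomer, mentor) pairs by descending score.

    pair_factors maps a pair to its (f1, f2, f3) tuple, assumed normalized to [0, 1].
    """
    scored = {pair: pair_score(*fs) for pair, fs in pair_factors.items()}
    return sorted(scored.items(), key=lambda item: item[1], reverse=True)


ranking = rank_pairs({
    ("alice", "jim"): (1.0, 1.0, 1.0),
    ("alice", "bob"): (0.5, 1.0, 0.0),
    ("carl", "jim"): (0.0, 0.0, 1.0),
})
```

Swapping f3 for f4 in the tuple gives the other configuration listed on the slide without changing the code.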
  • 47-49. RQ2: To what extent would it be possible to recommend mentors to newcomers joining a software project? [Bar chart: precision of Top-1 and Top-2 recommendations for Apache, FreeBSD, PostgreSQL, Python, and Samba; bar labels as extracted: 100%, 100%, 94%, 85%, 81%, 82%, 77%, 64%, 30%, 24%.] YODA makes it possible to recommend mentors.
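Top-1 and Top-2 precision can be computed along the following lines. The slides do not define the measure, so this sketch assumes that a recommendation list counts as correct when at least one of its first k entries is a validated mentor of that newcomer; the names and data are hypothetical.

```python
def precision_at_k(recommended, actual, k):
    """Fraction of newcomers with at least one true mentor in their top-k list.

    recommended: newcomer -> ranked list of suggested mentors
    actual: newcomer -> set of validated mentors
    The hit criterion is an assumption, not a definition from the talk.
    """
    if not recommended:
        return 0.0
    hits = sum(1 for newcomer, ranked in recommended.items()
               if set(ranked[:k]) & actual.get(newcomer, set()))
    return hits / len(recommended)


suggestions = {
    "alice": ["jim", "eve"],
    "carl": ["bob", "jim"],
    "dana": ["eve", "bob"],
}
validated = {"alice": {"jim"}, "carl": {"jim"}, "dana": {"zoe"}}
top1 = precision_at_k(suggestions, validated, k=1)
top2 = precision_at_k(suggestions, validated, k=2)
```

Under this criterion Top-2 precision can never be lower than Top-1 precision, since a longer list only adds chances of a hit.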
  • 50-53. Why not just use top committers? [Bar chart: precision of Top-1 and Top-2 recommendations based on top committers for Apache, FreeBSD, PostgreSQL, Python, and Samba; bar labels as extracted: 50%, 35%, 35%, 25%, 8%, 7%, 6%, 3%, 0%, 0%.] Not all committers are good mentors!
  • 54. Surveying Project Developers. Questions asked: done/received mentoring (as mentor or as newcomer); perceived importance of mentoring; what makes a good mentor.
  • 55. Sent to 114 subjects. Samba: 37; FreeBSD: 37; PostgreSQL: 15; Python: 23; Apache: 23.
  • 56. Obtained Answers: Samba, FreeBSD, PostgreSQL, Python, Apache.
  • 57-58. Done/received mentoring? Had a mentor: 58% yes, 42% no. Did mentoring: 92% yes, 8% no. Sample answers: "Yes, I received mentoring. My mentor was…" and "Yes, I did mentoring…"
  • 59-60. Perceived importance of mentoring (two series: effect of mentor, effect on newcomer; values in the order they appear on the chart). Useless at all: 0%, 0%. Not important: 0%, 0%. Neutral: 11%, 45%. Important: 56%, 36%. Very important: 33%, 18%. Sample answer: "It is very important that a mentor shares knowledge with a mentee…"
  • 61-62. What makes a good mentor: communication skills 42%, project knowledge 38%, experience 19%, others 0%. Sample answer: "My first mentor had a very strong and technical background."
  • 68. Future Work: considering factors able to better capture the technical skills of mentors; replicating the study with different projects.