SlideShare ist ein Scribd-Unternehmen logo
1 von 58
Downloaden Sie, um offline zu lesen
Gender Gap in Collaborative Platforms:
Language and emotions in Wikipedia Discussions
David Laniado, Daniela Iosub, Carlos Castillo,
Mayo Fuster Morell and Andreas Kaltenbrunner
david.laniado@eurecat.org
Universitat Pompeu Fabra, January 17, 2017
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 1 / 58
Outline
1 Introduction
Wikipedia content biases
Wikipedia gender gap
Wikipedia discussion spaces
Goal and research questions
2 Framework of analysis
Data acquisition and pre-processing
User gender labelling
Language and sentiment analysis
3 Results
Emotions and status
Emotions and gender
Networked emotions
4 Conclusions
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 2 / 58
Outline
1 Introduction
Wikipedia content biases
Wikipedia gender gap
Wikipedia discussion spaces
Goal and research questions
2 Framework of analysis
Data acquisition and pre-processing
User gender labelling
Language and sentiment analysis
3 Results
Emotions and status
Emotions and gender
Networked emotions
4 Conclusions
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 3 / 58
Wikipedia is a teenager
Happy birthday
Wikipedia!
English Wikipedia is
now 16 years old
Catalan Wikipedia will
be 16 in March (the
second oldest one)
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 4 / 58
The largest human knowledge repository
Fifth most visited web site
Among top results for search queries about almost any topic
Largest collaborative project
Conditions and reflects public opinion... with some bias
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 5 / 58
Wikipedia’s social experiment
"The problem with Wikipedia is that it only works in practice. In theory,
it can never work."
(Wikipedian popular joke)
A crazy idea: anyone can edit
All relevant points of view should be represented
Policies of Notability and Neutral Point of view
Quality assured by editors’ negotiations over content
The more people with different points of view contributing, the
better the quality
→ Biases in the editor community may cause biases in the content
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 6 / 58
Outline
1 Introduction
Wikipedia content biases
Wikipedia gender gap
Wikipedia discussion spaces
Goal and research questions
2 Framework of analysis
Data acquisition and pre-processing
User gender labelling
Language and sentiment analysis
3 Results
Emotions and status
Emotions and gender
Networked emotions
4 Conclusions
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 7 / 58
Bias in the content?
Top global biographies by birth country (Young-Ho Eom et al, 2015)
top central biographies from each of the 24 major Wikipedias
the 100 most central (PageRank) in each version’s hyperlink network
→ striking geographic bias
http://www.quantware.ups-tlse.fr/QWLIB/topwikipeople/geofigs/pagerank24x100.html
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 8 / 58
Top Women Biographies
Rank NA PageRank female figures CC Century LC
1 24 Elizabeth II UK 20 EN
2 17 Mary (mother of Jesus) IL -1 HE
3 12 Queen Victoria UK 19 EN
4 6 Elizabeth I of England UK 16 EN
5 2 Maria Theresa AT 18 DE
6 1 Benazir Bhutto PK 20 HI
7 1 Catherine the Great PL 18 PL
8 1 Anne Frank DE 20 DE
9 1 Indira Gandhi IN 20 HI
10 1 Margrethe II of Denmark DK 20 DA
Top 10 global female historical figures by PageRank for the 24 major
Wikipedia editions (Young-Ho Eom et al, 2015)
NA → number of language editions in which a biography appears in the
top 100 rank
CC → birth country code
LC → language code corresponding to the birth country
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 9 / 58
Top biographies by gender
Number of women among the top global biographies by birth century
Number of women among the 100 most central biographies for each
language edition
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 10 / 58
Outline
1 Introduction
Wikipedia content biases
Wikipedia gender gap
Wikipedia discussion spaces
Goal and research questions
2 Framework of analysis
Data acquisition and pre-processing
User gender labelling
Language and sentiment analysis
3 Results
Emotions and status
Emotions and gender
Networked emotions
4 Conclusions
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 11 / 58
Wikipedia editor gender gap
Estimated women participation
2011 editor survey: 9%
2013 editor survey: 13%
corrected with propensity score estimation: 16% (Mako Hill and
Shaw, 2013)
while in most online social networks women are more active!
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 12 / 58
Why women do not edit Wikipedia?
1 A lack of user-friendliness in the editing interface
2 Not having enough free time
3 A lack of self-confidence
4 Aversion to conflict and an unwillingness to participate in lengthy
edit wars
5 Belief that their contributions are too likely to be reverted or
deleted
6 Some find its overall atmosphere misogynistic
7 Wikipedia culture is sexual in ways they find off-putting
8 Being addressed as male is off-putting to women whose primary
language has grammatical gender
9 Fewer opportunities than other sites for social relationships and a
welcoming tone
(Sue Gardener, 2011)
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 13 / 58
Outline
1 Introduction
Wikipedia content biases
Wikipedia gender gap
Wikipedia discussion spaces
Goal and research questions
2 Framework of analysis
Data acquisition and pre-processing
User gender labelling
Language and sentiment analysis
3 Results
Emotions and status
Emotions and gender
Networked emotions
4 Conclusions
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 14 / 58
Emotional factors and discussions
Importance of
emotional factors
Discussion spaces are
fundamental to the
collaborative process
Discussion triggers
emotions and breeds
particular emotional
environments
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 15 / 58
Wikipedia’s most visible side
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 16 / 58
Article talk pages
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 17 / 58
Discussions in article talk pages
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 18 / 58
Outline
1 Introduction
Wikipedia content biases
Wikipedia gender gap
Wikipedia discussion spaces
Goal and research questions
2 Framework of analysis
Data acquisition and pre-processing
User gender labelling
Language and sentiment analysis
3 Results
Emotions and status
Emotions and gender
Networked emotions
4 Conclusions
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 19 / 58
Studying emotions and language in talk pages
Interactions in Wikipedia
Implicit → editing
Explicit → communication
Article talk pages → discussions about how to improve articles
User talk pages → a kind of public in-boxes
Goal: Shed light on the emotional dimension of the interactions
extensive analysis of emotions in explicit communication
sentiment analysis of comments in article talk and personal talk
pages
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 20 / 58
Research questions
1 How are the emotional and communication styles of editors
affected by their status?
2 How are the emotional and communication styles of editors
affected by their gender?
3 How are the emotional expressions affected by interacting with
others in comment threads (emotional congruence)?
4 How are the emotional styles of editors related to those of the
editors they interact more frequently with (emotional
homophily)?
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 21 / 58
Publications
Results published in:
Laniado, D., Castillo, C., Kaltenbrunner, A., and Fuster Morell, M. F. (2012)
Emotions and dialogue in a peer-production community: the case of Wikipedia.
8th International Symposium on Wikis and Open Collaboration, WikiSym’12
Iosub, D., Laniado, D., Castillo, C., Fuster Morell, M. F., and Kaltenbrunner, A. (2014)
Emotions under Discussion: Gender, Status and Communication in Online Collaboration.
Plos One, 9(8)
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 22 / 58
Outline
1 Introduction
Wikipedia content biases
Wikipedia gender gap
Wikipedia discussion spaces
Goal and research questions
2 Framework of analysis
Data acquisition and pre-processing
User gender labelling
Language and sentiment analysis
3 Results
Emotions and status
Emotions and gender
Networked emotions
4 Conclusions
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 23 / 58
Outline
1 Introduction
Wikipedia content biases
Wikipedia gender gap
Wikipedia discussion spaces
Goal and research questions
2 Framework of analysis
Data acquisition and pre-processing
User gender labelling
Language and sentiment analysis
3 Results
Emotions and status
Emotions and gender
Networked emotions
4 Conclusions
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 24 / 58
Dataset: conversations
Extracting conversations among editors from the English Wikipedia
Articles 3 210 039
Articles with talk page (ATP) 871 485 (27.1%)
Editors who comment articles 350 958
Editors with ≥ 100 comments on ATP 12 231 (3.5%)
Total comments in ATP 11 041 246
Comments containing ANEW words 7 414 411 (67.2%)
Comments by editors with ≥ 100 comments on ATP 5 480 544 (49.6%)
Comments by these editors and with ANEW words 3 649 297 (33.3%)
Table: Data extracted from a complete dump of the English Wikipedia, dated
March 12th, 2010
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 25 / 58
Outline
1 Introduction
Wikipedia content biases
Wikipedia gender gap
Wikipedia discussion spaces
Goal and research questions
2 Framework of analysis
Data acquisition and pre-processing
User gender labelling
Language and sentiment analysis
3 Results
Emotions and status
Emotions and gender
Networked emotions
4 Conclusions
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 26 / 58
User gender labelling
≈ 12 000 users wrote ≥ 100 comments in articles talk pages
Gender identified through Wikipedia API for ≈ 2 000 of them
A sample of 1 385 users for manual labelling through
crowdsourcing (Crowdflower)
Non-admins Admins Total
Men 1 087 1 526 2 613
Women 68 97 165
Unknown 6 850 2 603 9 453
Total 8 005 4 226 12 231
Table: Users with ≥ 100 comments by gender and administrator status.
Gender could be identified only for ≈ 50% of users:
real name or username (50% of those identified)
implicitly stated gender (27% of women, 20% of men)
pronoun (15% of women, 10% of men)
other indicators: userboxes, pictures, links to personal blogs...
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 27 / 58
Outline
1 Introduction
Wikipedia content biases
Wikipedia gender gap
Wikipedia discussion spaces
Goal and research questions
2 Framework of analysis
Data acquisition and pre-processing
User gender labelling
Language and sentiment analysis
3 Results
Emotions and status
Emotions and gender
Networked emotions
4 Conclusions
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 28 / 58
Measuring the Emotional Content of Discussions
Lexicon-based methods
relying on three different instruments:
Affective norms for English words (ANEW)
Linguistic Inquiry and Word Count (LIWC)
SentiStrength
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 29 / 58
Measuring the Emotional Content of Discussions
Method 1: Affective norms for English words (ANEW)
Rates a list of 1060 frequent words on a 9 point scale in three
dimensions:
Valence
Arousal
Dominance
assign emotion scores to each word from the lexicon
Bradley and Lang. (1999).
Affective norms for English words (ANEW) Technical report C-1.
The Center for Research in Psychophysiology, University of Florida, FL.
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 30 / 58
Measuring the Emotional Content of Discussions
Method 2: Linguistic Inquiry and Word Count (LIWC)
Two scores for basic emotion (compared with ANEW valence)
positive emotion
negative emotion
Discrete measures of emotions (anger, anxiety, sadness, affect)
Other classes of words to characterize language (i.e. personal
pronouns, tentative words, fillers...)
→ Count the proportion of words belonging to each class
Pennebaker J, Chung C, Ireland M, Gonzales A, Booth R (2010).
The development and psychometric properties of LIWC2007. Austin, TX.
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 31 / 58
Measuring the Emotional Content of Discussions
Method 2: Linguistic Inquiry and Word Count (LIWC)
Dictionary size Examples
Anger 91 hate, kill, annoyed
Anxiety 84 worried, fearful, nervous
Sadness 101 crying, grief, sad
Tentative 155 maybe, perhaps, guess
Certainty 83 always, never
Fillers 9 blah, you know
Past 155 went, ran, had
Present 169 is, does, hear
Future 48 will, gonna
Social words 455 mate, talk, child
Table: Description of LIWC measures (as per http://www.liwc.net).
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 32 / 58
Measuring Relationship-Orientation with LIWC
Definition
Communication that promotes social
affiliation and emotional connection:
preoccupation with others (use of
personal pronouns, e.g., I, you)
preoccupation with the larger social
domain (e.g., references to friends and
family)
expression of positive emotion
Examples
We are glad to have you. If I can help at all let
me know :)
A-giau has smiled at you. Smiles promote
WikiLove and hopefully this one has made
your day better...Happy editing
Congrats! Thank you for your dedication.
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 33 / 58
Measuring the Emotional Content of Discussions
Method 3: SentiStrength
SentiStrength
Based on LIWC and developed for short web texts
Accounts for modes of textual expression specific to the online
environment, e.g. emoticons and abbreviations
Provides a positive and a negative score for emotional valence
Emotion score is the strongest positive and negative emotion
expressed in a comment
Final scores are averages over comments in a given category
Thelwall M, Buckley K, Paltoglou G, Cai D, Kappas A (2010)
Sentiment strength detection in short informal text.
Journal of the American Society for Information Science and Technology 61: 2544 – 2558.
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 34 / 58
Example: results with three different emotional lexica
Table: Example messages with their corresponding Valence(ANEW) or
positive & negative scores (LIWC, SentiStrength)
ANEW LIWC SentiSt.
Valence + - + -
Sounds like a good challenge - to be proven or disproven. I’m
happy if it can be shown to go further using closed cubic poly-
nomial solutions. The nice thing about these are that they are
pretty easy to test numerically . . .
7.4 12.5 0 3 -2
–in “Exact trigonometric constants”
Seems you have not yet seen female lover after having sex
who do not wish to have sex with the same lover any more :)
Once you’ve seen it, you understand very well what war of Venus
means compared to war of Mars.
5.5 6.8 4.5 4 -3
–in “House (astrology)”
What about the whirlie hazing, the alcohol abuse, the emotional
poverty, the suicide in 1995/6, the biotech plans which were
stopped by pitzer protests . . .
1.6 4 8 1 -4
–in “Harvey Mudd College”
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 35 / 58
Sentiment analysis
Statistical tests
Compute average values with the three lexica for each user
Compare distribution of values for two groups of users (e.g.:
admins vs regulars, women vs men)
Most variables are not normally distributed
⇓
Mann-Whitney U-test
Compare distributions of rankings
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 36 / 58
Outline
1 Introduction
Wikipedia content biases
Wikipedia gender gap
Wikipedia discussion spaces
Goal and research questions
2 Framework of analysis
Data acquisition and pre-processing
User gender labelling
Language and sentiment analysis
3 Results
Emotions and status
Emotions and gender
Networked emotions
4 Conclusions
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 37 / 58
Outline
1 Introduction
Wikipedia content biases
Wikipedia gender gap
Wikipedia discussion spaces
Goal and research questions
2 Framework of analysis
Data acquisition and pre-processing
User gender labelling
Language and sentiment analysis
3 Results
Emotions and status
Emotions and gender
Networked emotions
4 Conclusions
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 38 / 58
Emotions and Status
Table: Emotions and Status: Administrators promote a generally neutral tone
on article talk pages. Regular editors express more negative emotion, and
are more emotional.
(Article Talk) Regular Admin Mann-Whitney U-Test p-value
LIWC
Positive 2.369 2.409 -4.308 p < 0.001
Negative 1.368 1.120 -18.578 p < 0.001
Affect 3.784 3.661 -8.466 p < 0.001
Anxiety 0.180 0.166 -5.834 p < 0.001
Anger 0.554 0.446 -19.217 p < 0.001
Sadness 0.175 0.166 -4.450 p < 0.001
SentiStrength
Positive 1.805 1.774 -14.603 p < 0.001
Negative -2.005 -1.912 -23.046 p < 0.001
When difference is statistically significant (p-value in bold) the larger absolute value is underlined
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 39 / 58
Emotions and Status
Admins:
more positive emotion
(ANEW and LIWC)
generally, emotionally
reserved compared to
regular users (LIWC)
Regular users:
more emotional
more affect, and more
anxiety, anger and
sadness (LIWC)
stronger positive and
negative words than
admins (SentiStrength)
Personal talk pages
In personal talk pages, admins are more emotional compared to
the article talk pages
more positive emotion compared to regular editors, but also more
anxiety and sadness
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 40 / 58
Dialogue and Status
Table: Dialogue and Status: Administrators are more impersonal in article talk
pages. Regular editors are more concerned with others.
(Article Talk) Regular Admin Mann-Whitney U-test p-value
Relationship-orientation
Personal pronouns 5.135 4.815 -13.561 p < 0.001
Use of “I” 2.456 2.429 -1.733 p=0.083
Use of “You” 1.043 0.892 -12.573 p < 0.001
Use of “Shehe” 0.609 0.526 -8.657 p < 0.001
Social words 6.320 5.810 -19.013 p < 0.001
Certainty
Certainty 1.426 1.317 -16.824 p < 0.001
Tentativeness 3.199 3.169 -2.210 p < 0.001
Filler words 0.168 0.155 -6.687 p < 0.001
Temporal Orientation
Past 2.376 2.305 -5.696 p < 0.001
Present 8.011 7.841 -8.060 p < 0.001
Future 1.114 1.166 -9.887 p < 0.001
When difference is statistically significant (p-value in bold) the larger absolute value is underlined
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 41 / 58
Dialogue and Status
Admins
more neutral and
impersonal tone
less relationship oriented
more concerned with the
future
tend to "rule with reason"
Regular users
more relationship-oriented
more personal pronouns
and more social words
more concerned with past
more insecure, but not in
personal spaces
more certainty, tentative
and filler words in article
talk pages
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 42 / 58
Outline
1 Introduction
Wikipedia content biases
Wikipedia gender gap
Wikipedia discussion spaces
Goal and research questions
2 Framework of analysis
Data acquisition and pre-processing
User gender labelling
Language and sentiment analysis
3 Results
Emotions and status
Emotions and gender
Networked emotions
4 Conclusions
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 43 / 58
Emotions and gender
ANEW Words more used by women and men
Size accounts for difference in frequency
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 44 / 58
Emotions and gender
Women use words associated to more positive emotions
Result consistent and significant with the three lexicons
ANEW: Difference is not significant when normalising by article
→ difference might be due to topic selection: women choose to
participate in topics which have more positive discussions
No significant difference in expression of negative emotions
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 45 / 58
Topics, emotions and gender
N≥1 ANEW words; corr=−0.64 (p=0.002)
prop. of male comments
meanvalence
0.82 0.84 0.86 0.88 0.9 0.92 0.94 0.96
4.7
4.8
4.9
5
5.1
5.2
5.3
5.4
5.5
5.6
Computing
Arts
Philosophy
Language
Health
Mathematics
Belief
Sports
Agriculture
Environment
Techn. & app. sci.
Law
Society
Business
Education
Culture
People
Science
Politics
Geography and places
History and events
Figure: Mean valence (ANEW) for discussions of articles in different topic
categories, vs the proportion of comments written by men
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 46 / 58
Dialogue and gender
Table: Dialogue and Gender: Women use a more relationship-oriented
speech style.
(Article Talk) Men Women Mann-Whitney U-test p-value
Relationship-orientation
Personal pronouns 4.964 5.420 -4.375 p < 0.001
Use of “I” 2.488 2.764 -3.945 p < 0.001
Use of “You” 0.936 0.957 -0.926 p=0.355
Use of “Shehe” pronouns 0.541 0.713 -4.657 p < 0.001
Social words 5.960 6.353 -3.487 p < 0.001
Certainty
Certainty 1.346 (1397) 1.300 (1263) -2.078 p = 0.038*
Tentativeness 3.150 3.215 -1.162 p=0.245
Filler words 0.161 0.160 -0.137 p=0.891
Temporal Orientation
Past 2.325 2.543 -4.305 p < 0.001
Present 7.897 8.180 -3.086 p = 0.002
Future 1.168 1.147 -1.008 p=0.314
When difference is statistically significant (p-value in bold) the larger absolute value is
underlined. Cases where the averages are not informative are marked with an asterisk * and
include the mean ranks Mann-Whitney U-test next to the averages in parentheses.
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 47 / 58
Dialogue and Gender
Women write longer messages
Women are more relationship-oriented
more personal pronouns, in particular “I”, more social words
Women are not more insecure
Less certainty words, no significant difference for tentativeness and
filler words
Women admins are more relationship oriented than men admins
Different leadership style
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 48 / 58
Qualitative analysis: Relationship orientation
Manual classification of 100 comments
Three main types of comments high in relationship-orientation:
inviting comments that explain the edit in a friendly tone, and call for
further intervention and collaboration
common perspective-building comments that are focused on
understanding others and solving debates in a constructive manner
appreciative comments that contain positive emotions and
celebrate others’ actions
⇒ This suggests that relationship-orientation may be conducive to
successful collaboration
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 49 / 58
Outline
1 Introduction
Wikipedia content biases
Wikipedia gender gap
Wikipedia discussion spaces
Goal and research questions
2 Framework of analysis
Data acquisition and pre-processing
User gender labelling
Language and sentiment analysis
3 Results
Emotions and status
Emotions and gender
Networked emotions
4 Conclusions
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 50 / 58
Emotional congruence
Comparison of each comment with the comment it replies to
not based only on our set of users, but on all comments (from all
users)
Emotions: editors tend to reply with:
more positive emotion
less negative emotion
less anger, anxiety and sadness
stronger words, both positive and negative (SentiStrength)
Dialogue: editors tend to reply with:
more relationship oriented speech
less tentative and certainty words
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 51 / 58
Emotional homophily
Mixing patterns: do users interact preferentially with similar users?
Disassortativity by activity
users who write more comments tend to reply preferentially to less
active users, and viceversa
Assortativity by gender
Men interact more with other men, and women with other women
Assortativity by emotion and language
Users interact more with others similar in emotional expression and
communication style
also in the network of communication on personal talk pages
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 52 / 58
Emotional homophily
Example: homophily by expression of anger
edges connect users who
have exchanged at least 10
replies
node color represents the
level of anger expressed by a
user, from low to high
node size → proportional to
the number of connections of
a user
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 53 / 58
Outline
1 Introduction
Wikipedia content biases
Wikipedia gender gap
Wikipedia discussion spaces
Goal and research questions
2 Framework of analysis
Data acquisition and pre-processing
User gender labelling
Language and sentiment analysis
3 Results
Emotions and status
Emotions and gender
Networked emotions
4 Conclusions
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 54 / 58
Conclusions
Administrators and experienced users play a pivotal role
they tend to interact especially with less experienced users
they promote a positive but impersonal environment
Men and women have a different communication style
women participate in discussions that have a more positive tone
men interact more with men, and women with women
women use a more emotional and relationship-oriented language
women admins have a relationship-oriented leadership style
⇒ promoting relationship-orientated leadership could lead to a more
positive environment
⇒ giving women more space in the community could result in a more
welcoming envoronment, for both women and men
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 55 / 58
Future work
Longitudinal analysis
how do emotional styles of editors change over time and with
increasing experience?
how do emotions in the discussions affect participation?
Qualitative analysis and human annotation
include non-textual emotional aspects such as emoticons, barn
stars and virtual gifts
deal with sarcasm, measure the extent of condescending or
paternalistic language in comments addressed at women editors
Examine other online spaces
Similar conclusions might hold for other online spaces
especially in discussions involving conflict, decision making and
power dynamics
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 56 / 58
Some references
M. M. Bradley and P. J .Lang.
Affective norms for English words (ANEW) Technical report C-1.
The Center for Research in Psychophysiology, University of Florida, FL, 2012.
B. Collier and J. Bear.
Conflict, criticism, or confidence: an empirical examination of the gender gap in wikipedia contributions.
In Proc. of CSCW, 2012.
Eom, Y.H., Aragón, P., Laniado, D., Kaltenbrunner, A., Vigna, S., Shepelyansky, D.L.: Interactions of cultures and top
people of wikipedia from ranking of 24 language editions.
PLoS ONE 10(3), e0114,825 (2015).
Iosub, D., Laniado, D., Castillo, C., Fuster Morell, M. F., and Kaltenbrunner, A. (2014)
Emotions under Discussion: Gender, Status and Communication in Online Collaboration.
Plos One, 9(8)
O. Kucuktunc, B. B. Cambazoglu, I. Weber, and H. Ferhatosmanoglu.
A large-scale sentiment analysis for Yahoo! answers.
In Proc. of WSDM, 2012.
D. Laniado, R. Tasso, Y. Volkovich, and A. Kaltenbrunner.
When the Wikipedians talk: Network and tree structure of Wikipedia discussion pages.
In Proc. of ICWSM, 2011.
Laniado, D., Castillo, C., Kaltenbrunner, A., and Fuster Morell, M. F. (2012)
Emotions and dialogue in a peer-production community: the case of Wikipedia.
8th International Symposium on Wikis and Open Collaboration, WikiSym’12
H. Zhu, R. Kraut, A. Kittur
Effectiveness of shared leadership in online communities.
In Proc. of CSCW, 2012.
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 57 / 58
Questions?
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 58 / 58

Weitere ähnliche Inhalte

Ähnlich wie Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions

Emotions under Discussion: Gender, Status and Communication in Wikipedia
Emotions under Discussion: Gender, Status and Communication in WikipediaEmotions under Discussion: Gender, Status and Communication in Wikipedia
Emotions under Discussion: Gender, Status and Communication in WikipediaDavid Laniado
 
Role of Language in Maintaining Ethnic Identity
Role of Language in Maintaining Ethnic IdentityRole of Language in Maintaining Ethnic Identity
Role of Language in Maintaining Ethnic Identitypaperpublications3
 
Visualizing social interactions in Wikipedia - WikiCorp 2018
Visualizing social interactions in Wikipedia - WikiCorp 2018Visualizing social interactions in Wikipedia - WikiCorp 2018
Visualizing social interactions in Wikipedia - WikiCorp 2018David Laniado
 
NATIONAL FORUM JOURNALS - www.nationalforum.com
NATIONAL FORUM JOURNALS - www.nationalforum.comNATIONAL FORUM JOURNALS - www.nationalforum.com
NATIONAL FORUM JOURNALS - www.nationalforum.comWilliam Kritsonis
 
Texas A&M University-Commerce - Authors Published by NATIONAL FORUM JOURNALS ...
Texas A&M University-Commerce - Authors Published by NATIONAL FORUM JOURNALS ...Texas A&M University-Commerce - Authors Published by NATIONAL FORUM JOURNALS ...
Texas A&M University-Commerce - Authors Published by NATIONAL FORUM JOURNALS ...William Kritsonis
 
There Goes Everybody: Social Media and Civic Engagement
There Goes Everybody: Social Media and Civic EngagementThere Goes Everybody: Social Media and Civic Engagement
There Goes Everybody: Social Media and Civic EngagementMinnesota Campus Comapct
 
Mona Diab: Computational Modeling of Sociopragmatic Language Use in Arabic an...
Mona Diab: Computational Modeling of Sociopragmatic Language Use in Arabic an...Mona Diab: Computational Modeling of Sociopragmatic Language Use in Arabic an...
Mona Diab: Computational Modeling of Sociopragmatic Language Use in Arabic an...Sina Institute
 
Teaching NonFiction in a Digital Space
Teaching NonFiction in a Digital SpaceTeaching NonFiction in a Digital Space
Teaching NonFiction in a Digital SpaceAngela Maiers
 
Open and Participatory Environments in Language Learning
Open and Participatory Environments in Language LearningOpen and Participatory Environments in Language Learning
Open and Participatory Environments in Language LearningBarbara Dieu
 
Look Sharp 2011 - Monday am
Look Sharp 2011 - Monday amLook Sharp 2011 - Monday am
Look Sharp 2011 - Monday amRoger Sevilla
 
Going Social: What You Need to Know to Launch a Social Media Strategy
Going Social: What You Need to Know to Launch a Social Media StrategyGoing Social: What You Need to Know to Launch a Social Media Strategy
Going Social: What You Need to Know to Launch a Social Media StrategyJim Rattray
 
Language Symposium 2012: Developing Task-Based Social Networking Projects: Th...
Language Symposium 2012: Developing Task-Based Social Networking Projects: Th...Language Symposium 2012: Developing Task-Based Social Networking Projects: Th...
Language Symposium 2012: Developing Task-Based Social Networking Projects: Th...Language and Culture Learning Center
 
What increases (social) media attention: Research impact, author prominence o...
What increases (social) media attention: Research impact, author prominence o...What increases (social) media attention: Research impact, author prominence o...
What increases (social) media attention: Research impact, author prominence o...Olga Zagovora
 
21st Century Skills: What do Adult Educators Need to Know?
21st Century Skills: What do Adult Educators Need to Know?21st Century Skills: What do Adult Educators Need to Know?
21st Century Skills: What do Adult Educators Need to Know?Marian Thacher
 

Ähnlich wie Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions (20)

Emotions under Discussion: Gender, Status and Communication in Wikipedia
Emotions under Discussion: Gender, Status and Communication in WikipediaEmotions under Discussion: Gender, Status and Communication in Wikipedia
Emotions under Discussion: Gender, Status and Communication in Wikipedia
 
Role of Language in Maintaining Ethnic Identity
Role of Language in Maintaining Ethnic IdentityRole of Language in Maintaining Ethnic Identity
Role of Language in Maintaining Ethnic Identity
 
Livemocha Vs[1][1][1]
Livemocha Vs[1][1][1]Livemocha Vs[1][1][1]
Livemocha Vs[1][1][1]
 
Visualizing social interactions in Wikipedia - WikiCorp 2018
Visualizing social interactions in Wikipedia - WikiCorp 2018Visualizing social interactions in Wikipedia - WikiCorp 2018
Visualizing social interactions in Wikipedia - WikiCorp 2018
 
The Networked Librarian: Libraries as social networks
The Networked Librarian: Libraries as social networksThe Networked Librarian: Libraries as social networks
The Networked Librarian: Libraries as social networks
 
NATIONAL FORUM JOURNALS - www.nationalforum.com
NATIONAL FORUM JOURNALS - www.nationalforum.comNATIONAL FORUM JOURNALS - www.nationalforum.com
NATIONAL FORUM JOURNALS - www.nationalforum.com
 
Texas A&M University-Commerce - Authors Published by NATIONAL FORUM JOURNALS ...
Texas A&M University-Commerce - Authors Published by NATIONAL FORUM JOURNALS ...Texas A&M University-Commerce - Authors Published by NATIONAL FORUM JOURNALS ...
Texas A&M University-Commerce - Authors Published by NATIONAL FORUM JOURNALS ...
 
Scholars imagined audiences
Scholars imagined audiencesScholars imagined audiences
Scholars imagined audiences
 
There Goes Everybody: Social Media and Civic Engagement
There Goes Everybody: Social Media and Civic EngagementThere Goes Everybody: Social Media and Civic Engagement
There Goes Everybody: Social Media and Civic Engagement
 
Mona Diab: Computational Modeling of Sociopragmatic Language Use in Arabic an...
Mona Diab: Computational Modeling of Sociopragmatic Language Use in Arabic an...Mona Diab: Computational Modeling of Sociopragmatic Language Use in Arabic an...
Mona Diab: Computational Modeling of Sociopragmatic Language Use in Arabic an...
 
Web 2.0 An Introduction
Web 2.0   An IntroductionWeb 2.0   An Introduction
Web 2.0 An Introduction
 
Teaching NonFiction in a Digital Space
Teaching NonFiction in a Digital SpaceTeaching NonFiction in a Digital Space
Teaching NonFiction in a Digital Space
 
Open and Participatory Environments in Language Learning
Open and Participatory Environments in Language LearningOpen and Participatory Environments in Language Learning
Open and Participatory Environments in Language Learning
 
Look Sharp 2011 - Monday am
Look Sharp 2011 - Monday amLook Sharp 2011 - Monday am
Look Sharp 2011 - Monday am
 
Dean R Berry Close Reading: Social Networking Issues
Dean R Berry Close Reading:  Social Networking IssuesDean R Berry Close Reading:  Social Networking Issues
Dean R Berry Close Reading: Social Networking Issues
 
Reading 2.0
Reading 2.0Reading 2.0
Reading 2.0
 
Going Social: What You Need to Know to Launch a Social Media Strategy
Going Social: What You Need to Know to Launch a Social Media StrategyGoing Social: What You Need to Know to Launch a Social Media Strategy
Going Social: What You Need to Know to Launch a Social Media Strategy
 
Language Symposium 2012: Developing Task-Based Social Networking Projects: Th...
Language Symposium 2012: Developing Task-Based Social Networking Projects: Th...Language Symposium 2012: Developing Task-Based Social Networking Projects: Th...
Language Symposium 2012: Developing Task-Based Social Networking Projects: Th...
 
What increases (social) media attention: Research impact, author prominence o...
What increases (social) media attention: Research impact, author prominence o...What increases (social) media attention: Research impact, author prominence o...
What increases (social) media attention: Research impact, author prominence o...
 
21st Century Skills: What do Adult Educators Need to Know?
21st Century Skills: What do Adult Educators Need to Know?21st Century Skills: What do Adult Educators Need to Know?
21st Century Skills: What do Adult Educators Need to Know?
 

Mehr von David Laniado

Wikipedia Cultural Diversity Dataset - ICWSM 2019
Wikipedia Cultural Diversity Dataset  - ICWSM 2019Wikipedia Cultural Diversity Dataset  - ICWSM 2019
Wikipedia Cultural Diversity Dataset - ICWSM 2019David Laniado
 
BarcelonaNow dashboard showcase
BarcelonaNow dashboard showcaseBarcelonaNow dashboard showcase
BarcelonaNow dashboard showcaseDavid Laniado
 
Contropedia: Critical learning through Wikipedia's edit history
Contropedia: Critical learning through Wikipedia's edit historyContropedia: Critical learning through Wikipedia's edit history
Contropedia: Critical learning through Wikipedia's edit historyDavid Laniado
 
Gender patterns on a large social network (SocInfo 2014)
Gender patterns on a large social network (SocInfo 2014)Gender patterns on a large social network (SocInfo 2014)
Gender patterns on a large social network (SocInfo 2014)David Laniado
 
Dinámicas de Discusión en Red: Conflicto, Deliberación, Consenso y Roles
Dinámicas de Discusión en Red: Conflicto, Deliberación, Consenso y RolesDinámicas de Discusión en Red: Conflicto, Deliberación, Consenso y Roles
Dinámicas de Discusión en Red: Conflicto, Deliberación, Consenso y RolesDavid Laniado
 
Emotions and dialogue in a peer-production community: the case of Wikipedia
Emotions and dialogue in a peer-production community: the case of WikipediaEmotions and dialogue in a peer-production community: the case of Wikipedia
Emotions and dialogue in a peer-production community: the case of WikipediaDavid Laniado
 
When the Wikipedians talk: network and tree structure of Wikipedia discussion...
When the Wikipedians talk: network and tree structure of Wikipedia discussion...When the Wikipedians talk: network and tree structure of Wikipedia discussion...
When the Wikipedians talk: network and tree structure of Wikipedia discussion...David Laniado
 

Mehr von David Laniado (7)

Wikipedia Cultural Diversity Dataset - ICWSM 2019
Wikipedia Cultural Diversity Dataset  - ICWSM 2019Wikipedia Cultural Diversity Dataset  - ICWSM 2019
Wikipedia Cultural Diversity Dataset - ICWSM 2019
 
BarcelonaNow dashboard showcase
BarcelonaNow dashboard showcaseBarcelonaNow dashboard showcase
BarcelonaNow dashboard showcase
 
Contropedia: Critical learning through Wikipedia's edit history
Contropedia: Critical learning through Wikipedia's edit historyContropedia: Critical learning through Wikipedia's edit history
Contropedia: Critical learning through Wikipedia's edit history
 
Gender patterns on a large social network (SocInfo 2014)
Gender patterns on a large social network (SocInfo 2014)Gender patterns on a large social network (SocInfo 2014)
Gender patterns on a large social network (SocInfo 2014)
 
Dinámicas de Discusión en Red: Conflicto, Deliberación, Consenso y Roles
Dinámicas de Discusión en Red: Conflicto, Deliberación, Consenso y RolesDinámicas de Discusión en Red: Conflicto, Deliberación, Consenso y Roles
Dinámicas de Discusión en Red: Conflicto, Deliberación, Consenso y Roles
 
Emotions and dialogue in a peer-production community: the case of Wikipedia
Emotions and dialogue in a peer-production community: the case of WikipediaEmotions and dialogue in a peer-production community: the case of Wikipedia
Emotions and dialogue in a peer-production community: the case of Wikipedia
 
When the Wikipedians talk: network and tree structure of Wikipedia discussion...
When the Wikipedians talk: network and tree structure of Wikipedia discussion...When the Wikipedians talk: network and tree structure of Wikipedia discussion...
When the Wikipedians talk: network and tree structure of Wikipedia discussion...
 

Kürzlich hochgeladen

Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxJohnnyPlasten
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysismanisha194592
 
Dubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls DubaiDubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls Dubaihf8803863
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxolyaivanovalion
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxolyaivanovalion
 
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Serviceranjana rawat
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfLars Albertsson
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxolyaivanovalion
 
Brighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingBrighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingNeil Barnes
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxolyaivanovalion
 
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /WhatsappsBeautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsappssapnasaifi408
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130Suhani Kapoor
 
Customer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxCustomer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxEmmanuel Dauda
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptxAnupama Kate
 
Call Girls In Mahipalpur O9654467111 Escorts Service
Call Girls In Mahipalpur O9654467111  Escorts ServiceCall Girls In Mahipalpur O9654467111  Escorts Service
Call Girls In Mahipalpur O9654467111 Escorts ServiceSapana Sha
 
Unveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystUnveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystSamantha Rae Coolbeth
 
B2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxB2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxStephen266013
 

Kürzlich hochgeladen (20)

VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptx
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysis
 
Dubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls DubaiDubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls Dubai
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptx
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptx
 
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
 
E-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptxE-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptx
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdf
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptx
 
Brighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingBrighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data Storytelling
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFx
 
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /WhatsappsBeautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
 
Customer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxCustomer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptx
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx
 
Call Girls In Mahipalpur O9654467111 Escorts Service
Call Girls In Mahipalpur O9654467111  Escorts ServiceCall Girls In Mahipalpur O9654467111  Escorts Service
Call Girls In Mahipalpur O9654467111 Escorts Service
 
Unveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystUnveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data Analyst
 
B2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxB2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docx
 

Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions

  • 1. Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions David Laniado, Daniela Iosub, Carlos Castillo, Mayo Fuster Morell and Andreas Kaltenbrunner david.laniado@eurecat.org Universitat Pompeu Fabra, January 17, 2017 David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 1 / 58
  • 2. Outline 1 Introduction Wikipedia content biases Wikipedia gender gap Wikipedia discussion spaces Goal and research questions 2 Framework of analysis Data acquisition and pre-processing User gender labelling Language and sentiment analysis 3 Results Emotions and status Emotions and gender Networked emotions 4 Conclusions David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 2 / 58
  • 3. Outline 1 Introduction Wikipedia content biases Wikipedia gender gap Wikipedia discussion spaces Goal and research questions 2 Framework of analysis Data acquisition and pre-processing User gender labelling Language and sentiment analysis 3 Results Emotions and status Emotions and gender Networked emotions 4 Conclusions David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 3 / 58
  • 4. Wikipedia is a teenager Happy birthday Wikipedia! English Wikipedia is now 16 years old Catalan Wikipedia will be 16 in March (the second oldest one) David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 4 / 58
  • 5. The largest human knowledge repository Fifth most visited web site Among top results for search queries about almost any topic Largest collaborative project Conditions and reflects public opinion... with some bias David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 5 / 58
  • 6. Wikipedia’s social experiment "The problem with Wikipedia is that it only works in practice. In theory, it can never work." (Wikipedian popular joke) A crazy idea: anyone can edit All relevant points of view should be represented Policies of Notability and Neutral Point of view Quality assured by editors’ negotiations over content The more people with different points of view contributing, the better the quality → Biases in the editor community may cause biases in the content David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 6 / 58
  • 7. Outline 1 Introduction Wikipedia content biases Wikipedia gender gap Wikipedia discussion spaces Goal and research questions 2 Framework of analysis Data acquisition and pre-processing User gender labelling Language and sentiment analysis 3 Results Emotions and status Emotions and gender Networked emotions 4 Conclusions David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 7 / 58
  • 8. Bias in the content? Top global biographies by birth country (Young-Ho Eom et al, 2015) top central biographies from each of the 24 major Wikipedias the 100 most central (PageRank) in each version’s hyperlink network → striking geographic bias http://www.quantware.ups-tlse.fr/QWLIB/topwikipeople/geofigs/pagerank24x100.html David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 8 / 58
  • 9. Top Women Biographies Rank NA PageRank female figures CC Century LC 1 24 Elizabeth II UK 20 EN 2 17 Mary (mother of Jesus) IL -1 HE 3 12 Queen Victoria UK 19 EN 4 6 Elizabeth I of England UK 16 EN 5 2 Maria Theresa AT 18 DE 6 1 Benazir Bhutto PK 20 HI 7 1 Catherine the Great PL 18 PL 8 1 Anne Frank DE 20 DE 9 1 Indira Gandhi IN 20 HI 10 1 Margrethe II of Denmark DK 20 DA Top 10 global female historical figures by PageRank for the 24 major Wikipedia editions (Young-Ho Eom et al, 2015) NA → number of language editions in which a biography appears in the top 100 rank CC → birth country code LC → language code corresponding to the birth country David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 9 / 58
  • 10. Top biographies by gender Number of women among the top global biographies by birth century Number of women among the 100 most central biographies for each language edition David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 10 / 58
  • 11. Outline 1 Introduction Wikipedia content biases Wikipedia gender gap Wikipedia discussion spaces Goal and research questions 2 Framework of analysis Data acquisition and pre-processing User gender labelling Language and sentiment analysis 3 Results Emotions and status Emotions and gender Networked emotions 4 Conclusions David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 11 / 58
  • 12. Wikipedia editor gender gap Estimated women participation 2011 editor survey: 9% 2013 editor survey: 13% corrected with propensity score estimation: 16% (Mako Hill and Shaw, 2013) while in most online social networks women are more active! David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 12 / 58
  • 13. Why women do not edit Wikipedia? 1 A lack of user-friendliness in the editing interface 2 Not having enough free time 3 A lack of self-confidence 4 Aversion to conflict and an unwillingness to participate in lengthy edit wars 5 Belief that their contributions are too likely to be reverted or deleted 6 Some find its overall atmosphere misogynistic 7 Wikipedia culture is sexual in ways they find off-putting 8 Being addressed as male is off-putting to women whose primary language has grammatical gender 9 Fewer opportunities than other sites for social relationships and a welcoming tone (Sue Gardener, 2011) David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 13 / 58
  • 14. Outline 1 Introduction Wikipedia content biases Wikipedia gender gap Wikipedia discussion spaces Goal and research questions 2 Framework of analysis Data acquisition and pre-processing User gender labelling Language and sentiment analysis 3 Results Emotions and status Emotions and gender Networked emotions 4 Conclusions David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 14 / 58
  • 15. Emotional factors and discussions Importance of emotional factors Discussion spaces are fundamental to the collaborative process Discussion triggers emotions and breeds particular emotional environments David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 15 / 58
  • 16. Wikipedia’s most visible side David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 16 / 58
  • 17. Article talk pages David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 17 / 58
  • 18. Discussions in article talk pages David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 18 / 58
  • 19. Outline 1 Introduction Wikipedia content biases Wikipedia gender gap Wikipedia discussion spaces Goal and research questions 2 Framework of analysis Data acquisition and pre-processing User gender labelling Language and sentiment analysis 3 Results Emotions and status Emotions and gender Networked emotions 4 Conclusions David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 19 / 58
  • 20. Studying emotions and language in talk pages Interactions in Wikipedia Implicit → editing Explicit → communication Article talk pages → discussions about how to improve articles User talk pages → a kind of public in-boxes Goal: Shed light on the emotional dimension of the interactions extensive analysis of emotions in explicit communication sentiment analysis of comments in article talk and personal talk pages David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 20 / 58
  • 21. Research questions 1 How are the emotional and communication styles of editors affected by their status? 2 How are the emotional and communication styles of editors affected by their gender? 3 How are the emotional expressions affected by interacting with others in comment threads (emotional congruence)? 4 How are the emotional styles of editors related to those of the editors they interact more frequently with (emotional homophily)? David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 21 / 58
  • 22. Publications Results published in: Laniado, D., Castillo, C., Kaltenbrunner, A., and Fuster Morell, M. F. (2012) Emotions and dialogue in a peer-production community: the case of Wikipedia. 8th International Symposium on Wikis and Open Collaboration, WikiSym’12 Iosub, D., Laniado, D., Castillo, C., Fuster Morell, M. F., and Kaltenbrunner, A. (2014) Emotions under Discussion: Gender, Status and Communication in Online Collaboration. Plos One, 9(8) David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 22 / 58
  • 23. Outline 1 Introduction Wikipedia content biases Wikipedia gender gap Wikipedia discussion spaces Goal and research questions 2 Framework of analysis Data acquisition and pre-processing User gender labelling Language and sentiment analysis 3 Results Emotions and status Emotions and gender Networked emotions 4 Conclusions David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 23 / 58
  • 24. Outline 1 Introduction Wikipedia content biases Wikipedia gender gap Wikipedia discussion spaces Goal and research questions 2 Framework of analysis Data acquisition and pre-processing User gender labelling Language and sentiment analysis 3 Results Emotions and status Emotions and gender Networked emotions 4 Conclusions David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 24 / 58
  • 25. Dataset: conversations Extracting conversations among editors from the English Wikipedia Articles 3 210 039 Articles with talk page (ATP) 871 485 (27.1%) Editors who comment articles 350 958 Editors with ≥ 100 comments on ATP 12 231 (3.5%) Total comments in ATP 11 041 246 Comments containing ANEW words 7 414 411 (67.2%) Comments by editors with ≥ 100 comments on ATP 5 480 544 (49.6%) Comments by these editors and with ANEW words 3 649 297 (33.3%) Table: Data extracted from a complete dump of the English Wikipedia, dated March 12th, 2010 David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 25 / 58
  • 26. Outline 1 Introduction Wikipedia content biases Wikipedia gender gap Wikipedia discussion spaces Goal and research questions 2 Framework of analysis Data acquisition and pre-processing User gender labelling Language and sentiment analysis 3 Results Emotions and status Emotions and gender Networked emotions 4 Conclusions David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 26 / 58
  • 27. User gender labelling ≈ 12 000 users wrote ≥ 100 comments in articles talk pages Gender identified through Wikipedia API for ≈ 2 000 of them A sample of 1 385 users for manual labelling through crowdsourcing (Crowdflower) Non-admins Admins Total Men 1 087 1 526 2 613 Women 68 97 165 Unknown 6 850 2 603 9 453 Total 8 005 4 226 12 231 Table: Users with ≥ 100 comments by gender and administrator status. Gender could be identified only for ≈ 50% of users: real name or username (50% of those identified) implicitly stated gender (27% of women, 20% of men) pronoun (15% of women, 10% of men) other indicators: userboxes, pictures, links to personal blogs... David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 27 / 58
  • 28. Outline 1 Introduction Wikipedia content biases Wikipedia gender gap Wikipedia discussion spaces Goal and research questions 2 Framework of analysis Data acquisition and pre-processing User gender labelling Language and sentiment analysis 3 Results Emotions and status Emotions and gender Networked emotions 4 Conclusions David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 28 / 58
  • 29. Measuring the Emotional Content of Discussions Lexicon-based methods relying on three different instruments: Affective norms for English words (ANEW) Linguistic Inquiry and Word Count (LIWC) SentiStrength David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 29 / 58
  • 30. Measuring the Emotional Content of Discussions Method 1: Affective norms for English words (ANEW) Rates a list of 1060 frequent words on a 9 point scale in three dimensions: Valence Arousal Dominance assign emotion scores to each word from the lexicon Bradley and Lang. (1999). Affective norms for English words (ANEW) Technical report C-1. The Center for Research in Psychophysiology, University of Florida, FL. David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 30 / 58
  • 31. Measuring the Emotional Content of Discussions Method 2: Linguistic Inquiry and Word Count (LIWC) Two scores for basic emotion (compared with ANEW valence) positive emotion negative emotion Discrete measures of emotions (anger, anxiety, sadness, affect) Other classes of words to characterize language (i.e. personal pronouns, tentative words, fillers...) → Count the proportion of words belonging to each class Pennebaker J, Chung C, Ireland M, Gonzales A, Booth R (2010). The development and psychometric properties of LIWC2007. Austin, TX. David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 31 / 58
  • 32. Measuring the Emotional Content of Discussions Method 2: Linguistic Inquiry and Word Count (LIWC) Dictionary size Examples Anger 91 hate, kill, annoyed Anxiety 84 worried, fearful, nervous Sadness 101 crying, grief, sad Tentative 155 maybe, perhaps, guess Certainty 83 always, never Fillers 9 blah, you know Past 155 went, ran, had Present 169 is, does, hear Future 48 will, gonna Social words 455 mate, talk, child Table: Description of LIWC measures (as per http://www.liwc.net). David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 32 / 58
  • 33. Measuring Relationship-Orientation with LIWC Definition Communication that promotes social affiliation and emotional connection: preoccupation with others (use of personal pronouns, e.g., I, you) preoccupation with the larger social domain (e.g., references to friends and family) expression of positive emotion Examples We are glad to have you. If I can help at all let me know :) A-giau has smiled at you. Smiles promote WikiLove and hopefully this one has made your day better...Happy editing Congrats! Thank you for your dedication. David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 33 / 58
  • 34. Measuring the Emotional Content of Discussions Method 3: SentiStrength SentiStrength Based on LIWC and developed for short web texts Accounts for modes of textual expression specific to the online environment, e.g. emoticons and abbreviations Provides a positive and a negative score for emotional valence Emotion score is the strongest positive and negative emotion expressed in a comment Final scores are averages over comments in a given category Thelwall M, Buckley K, Paltoglou G, Cai D, Kappas A (2010) Sentiment strength detection in short informal text. Journal of the American Society for Information Science and Technology 61: 2544 – 2558. David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 34 / 58
  • 35. Example: results with three different emotional lexica Table: Example messages with their corresponding Valence(ANEW) or positive & negative scores (LIWC, SentiStrength) ANEW LIWC SentiSt. Valence + - + - Sounds like a good challenge - to be proven or disproven. I’m happy if it can be shown to go further using closed cubic poly- nomial solutions. The nice thing about these are that they are pretty easy to test numerically . . . 7.4 12.5 0 3 -2 –in “Exact trigonometric constants” Seems you have not yet seen female lover after having sex who do not wish to have sex with the same lover any more :) Once you’ve seen it, you understand very well what war of Venus means compared to war of Mars. 5.5 6.8 4.5 4 -3 –in “House (astrology)” What about the whirlie hazing, the alcohol abuse, the emotional poverty, the suicide in 1995/6, the biotech plans which were stopped by pitzer protests . . . 1.6 4 8 1 -4 –in “Harvey Mudd College” David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 35 / 58
  • 36. Sentiment analysis Statistical tests Compute average values with the three lexica for each user Compare distribution of values for two groups of users (e.g.: admins vs regulars, women vs men) Most variables are not normally distributed ⇓ Mann-Whitney U-test Compare distributions of rankings David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 36 / 58
  • 37. Outline 1 Introduction Wikipedia content biases Wikipedia gender gap Wikipedia discussion spaces Goal and research questions 2 Framework of analysis Data acquisition and pre-processing User gender labelling Language and sentiment analysis 3 Results Emotions and status Emotions and gender Networked emotions 4 Conclusions David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 37 / 58
  • 38. Outline 1 Introduction Wikipedia content biases Wikipedia gender gap Wikipedia discussion spaces Goal and research questions 2 Framework of analysis Data acquisition and pre-processing User gender labelling Language and sentiment analysis 3 Results Emotions and status Emotions and gender Networked emotions 4 Conclusions David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 38 / 58
  • 39. Emotions and Status Table: Emotions and Status: Administrators promote a generally neutral tone on article talk pages. Regular editors express more negative emotion, and are more emotional. (Article Talk) Regular Admin Mann-Whitney U-Test p-value LIWC Positive 2.369 2.409 -4.308 p < 0.001 Negative 1.368 1.120 -18.578 p < 0.001 Affect 3.784 3.661 -8.466 p < 0.001 Anxiety 0.180 0.166 -5.834 p < 0.001 Anger 0.554 0.446 -19.217 p < 0.001 Sadness 0.175 0.166 -4.450 p < 0.001 SentiStrength Positive 1.805 1.774 -14.603 p < 0.001 Negative -2.005 -1.912 -23.046 p < 0.001 When difference is statistically significant (p-value in bold) the larger absolute value is underlined David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 39 / 58
  • 40. Emotions and Status Admins: more positive emotion (ANEW and LIWC) generally, emotionally reserved compared to regular users (LIWC) Regular users: more emotional more affect, and more anxiety, anger and sadness (LIWC) stronger positive and negative words than admins (SentiStrength) Personal talk pages In personal talk pages, admins are more emotional compared to the article talk pages more positive emotion compared to regular editors, but also more anxiety and sadness David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 40 / 58
  • 41. Dialogue and Status Table: Dialogue and Status: Administrators are more impersonal in article talk pages. Regular editors are more concerned with others. (Article Talk) Regular Admin Mann-Whitney U-test p-value Relationship-orientation Personal pronouns 5.135 4.815 -13.561 p < 0.001 Use of “I” 2.456 2.429 -1.733 p=0.083 Use of “You” 1.043 0.892 -12.573 p < 0.001 Use of “Shehe” 0.609 0.526 -8.657 p < 0.001 Social words 6.320 5.810 -19.013 p < 0.001 Certainty Certainty 1.426 1.317 -16.824 p < 0.001 Tentativeness 3.199 3.169 -2.210 p < 0.001 Filler words 0.168 0.155 -6.687 p < 0.001 Temporal Orientation Past 2.376 2.305 -5.696 p < 0.001 Present 8.011 7.841 -8.060 p < 0.001 Future 1.114 1.166 -9.887 p < 0.001 When difference is statistically significant (p-value in bold) the larger absolute value is underlined David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 41 / 58
  • 42. Dialogue and Status Admins more neutral and impersonal tone less relationship oriented more concerned with the future tend to "rule with reason" Regular users more relationship-oriented more personal pronouns and more social words more concerned with past more insecure, but not in personal spaces more certainty, tentative and filler words in article talk pages David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 42 / 58
  • 43. Outline 1 Introduction Wikipedia content biases Wikipedia gender gap Wikipedia discussion spaces Goal and research questions 2 Framework of analysis Data acquisition and pre-processing User gender labelling Language and sentiment analysis 3 Results Emotions and status Emotions and gender Networked emotions 4 Conclusions David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 43 / 58
  • 44. Emotions and gender ANEW Words more used by women and men Size accounts for difference in frequency David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 44 / 58
  • 45. Emotions and gender Women use words associated to more positive emotions Result consistent and significant with the three lexicons ANEW: Difference is not significant when normalising by article → difference might be due to topic selection: women choose to participate in topics which have more positive discussions No significant difference in expression of negative emotions David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 45 / 58
  • 46. Topics, emotions and gender N≥1 ANEW words; corr=−0.64 (p=0.002) prop. of male comments meanvalence 0.82 0.84 0.86 0.88 0.9 0.92 0.94 0.96 4.7 4.8 4.9 5 5.1 5.2 5.3 5.4 5.5 5.6 Computing Arts Philosophy Language Health Mathematics Belief Sports Agriculture Environment Techn. & app. sci. Law Society Business Education Culture People Science Politics Geography and places History and events Figure: Mean valence (ANEW) for discussions of articles in different topic categories, vs the proportion of comments written by men David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 46 / 58
  • 47. Dialogue and gender Table: Dialogue and Gender: Women use a more relationship-oriented speech style. (Article Talk) Men Women Mann-Whitney U-test p-value Relationship-orientation Personal pronouns 4.964 5.420 -4.375 p < 0.001 Use of “I” 2.488 2.764 -3.945 p < 0.001 Use of “You” 0.936 0.957 -0.926 p=0.355 Use of “Shehe” pronouns 0.541 0.713 -4.657 p < 0.001 Social words 5.960 6.353 -3.487 p < 0.001 Certainty Certainty 1.346 (1397) 1.300 (1263) -2.078 p = 0.038* Tentativeness 3.150 3.215 -1.162 p=0.245 Filler words 0.161 0.160 -0.137 p=0.891 Temporal Orientation Past 2.325 2.543 -4.305 p < 0.001 Present 7.897 8.180 -3.086 p = 0.002 Future 1.168 1.147 -1.008 p=0.314 When difference is statistically significant (p-value in bold) the larger absolute value is underlined. Cases where the averages are not informative are marked with an asterisk * and include the mean ranks Mann-Whitney U-test next to the averages in parentheses. David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 47 / 58
  • 48. Dialogue and Gender Women write longer messages Women are more relationship-oriented more personal pronouns, in particular “I”, more social words Women are not more insecure Less certainty words, no significant difference for tentativeness and filler words Women admins are more relationship oriented than men admins Different leadership style David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 48 / 58
  • 49. Qualitative analysis: Relationship orientation Manual classification of 100 comments Three main types of comments high in relationship-orientation: inviting comments that explain the edit in a friendly tone, and call for further intervention and collaboration common perspective-building comments that are focused on understanding others and solving debates in a constructive manner appreciative comments that contain positive emotions and celebrate others’ actions ⇒ This suggests that relationship-orientation may be conducive to successful collaboration David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 49 / 58
  • 50. Outline 1 Introduction Wikipedia content biases Wikipedia gender gap Wikipedia discussion spaces Goal and research questions 2 Framework of analysis Data acquisition and pre-processing User gender labelling Language and sentiment analysis 3 Results Emotions and status Emotions and gender Networked emotions 4 Conclusions David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 50 / 58
  • 51. Emotional congruence Comparison of each comment with the comment it replies to not based only on our set of users, but on all comments (from all users) Emotions: editors tend to reply with: more positive emotion less negative emotion less anger, anxiety and sadness stronger words, both positive and negative (SentiStrength) Dialogue: editors tend to reply with: more relationship oriented speech less tentative and certainty words David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 51 / 58
  • 52. Emotional homophily Mixing patterns: do users interact preferentially with similar users? Disassortativity by activity users who write more comments tend to reply preferentially to less active users, and viceversa Assortativity by gender Men interact more with other men, and women with other women Assortativity by emotion and language Users interact more with others similar in emotional expression and communication style also in the network of communication on personal talk pages David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 52 / 58
  • 53. Emotional homophily Example: homophily by expression of anger edges connect users who have exchanged at least 10 replies node color represents the level of anger expressed by a user, from low to high node size → proportional to the number of connections of a user David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 53 / 58
  • 54. Outline 1 Introduction Wikipedia content biases Wikipedia gender gap Wikipedia discussion spaces Goal and research questions 2 Framework of analysis Data acquisition and pre-processing User gender labelling Language and sentiment analysis 3 Results Emotions and status Emotions and gender Networked emotions 4 Conclusions David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 54 / 58
  • 55. Conclusions Administrators and experienced users play a pivotal role they tend to interact especially with less experienced users they promote a positive but impersonal environment Men and women have a different communication style women participate in discussions that have a more positive tone men interact more with men, and women with women women use a more emotional and relationship-oriented language women admins have a relationship-oriented leadership style ⇒ promoting relationship-orientated leadership could lead to a more positive environment ⇒ giving women more space in the community could result in a more welcoming envoronment, for both women and men David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 55 / 58
  • 56. Future work Longitudinal analysis how do emotional styles of editors change over time and with increasing experience? how do emotions in the discussions affect participation? Qualitative analysis and human annotation include non-textual emotional aspects such as emoticons, barn stars and virtual gifts deal with sarcasm, measure the extent of condescending or paternalistic language in comments addressed at women editors Examine other online spaces Similar conclusions might hold for other online spaces especially in discussions involving conflict, decision making and power dynamics David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 56 / 58
  • 57. Some references M. M. Bradley and P. J .Lang. Affective norms for English words (ANEW) Technical report C-1. The Center for Research in Psychophysiology, University of Florida, FL, 2012. B. Collier and J. Bear. Conflict, criticism, or confidence: an empirical examination of the gender gap in wikipedia contributions. In Proc. of CSCW, 2012. Eom, Y.H., Aragón, P., Laniado, D., Kaltenbrunner, A., Vigna, S., Shepelyansky, D.L.: Interactions of cultures and top people of wikipedia from ranking of 24 language editions. PLoS ONE 10(3), e0114,825 (2015). Iosub, D., Laniado, D., Castillo, C., Fuster Morell, M. F., and Kaltenbrunner, A. (2014) Emotions under Discussion: Gender, Status and Communication in Online Collaboration. Plos One, 9(8) O. Kucuktunc, B. B. Cambazoglu, I. Weber, and H. Ferhatosmanoglu. A large-scale sentiment analysis for Yahoo! answers. In Proc. of WSDM, 2012. D. Laniado, R. Tasso, Y. Volkovich, and A. Kaltenbrunner. When the Wikipedians talk: Network and tree structure of Wikipedia discussion pages. In Proc. of ICWSM, 2011. Laniado, D., Castillo, C., Kaltenbrunner, A., and Fuster Morell, M. F. (2012) Emotions and dialogue in a peer-production community: the case of Wikipedia. 8th International Symposium on Wikis and Open Collaboration, WikiSym’12 H. Zhu, R. Kraut, A. Kittur Effectiveness of shared leadership in online communities. In Proc. of CSCW, 2012. David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 57 / 58
  • 58. Questions? David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 58 / 58