Positive words carry less information than negative words

D. Garcia, A. Garas and F. Schweitzer
EPJ Data Science, 2012
JClub 2014.4.23
by Kazutoshi Sasahara

Introduction
n  Is human language biased towards positive
emotion or neutral?
n  Statistical properties of word freq. and length
n  Word freq. (word rank)-1 (Zipf 1949)
n  Word freq. predicts word length as a result of a principle
of least eﬀort
n  Word length increases with information content for
eﬃcient communication (Piantadosi et al. 2011).
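The rank-frequency relation above can be checked on any list of word counts by fitting a straight line in log-log space. The following is only a minimal sketch: the function name `zipf_exponent` and the toy counts are made up for illustration and are not taken from the paper.

```python
import math

def zipf_exponent(frequencies):
    """Least-squares slope of log(frequency) against log(rank)."""
    ranked = sorted(frequencies, reverse=True)
    xs = [math.log(rank) for rank in range(1, len(ranked) + 1)]
    ys = [math.log(freq) for freq in ranked]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var

# Hypothetical counts that follow frequency = 1200 / rank exactly.
toy_counts = [1200 / rank for rank in range(1, 11)]
slope = zipf_exponent(toy_counts)  # close to -1 for Zipf-like data
```

On real corpora the fitted slope is only approximately -1 and depends on which ranks are included in the fit.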

Introduction (cont.)
n  Pollyanna hypothesis (Boucher & Osgood 1969)
A universal human tendency to use evaluatively positive
words more frequently and than evaluatively negative words
in communicating.
n  Previous researches reported emotional bias but
with the lack of control
n  Problems in the use of Amazon Mechanical Turk
n  Possible biases
n  Acquiescent bias
n  Social desirability bias
n  Framing eﬀects

Data Analysis
n  This paper examined emotional bias in three major
languages on the Internet.
n  English (56.6%), German (6.5%), Spanish (4.6%)
n  Data
n  Established lexica of aﬀective word usage:
English:1,034, German: 2,902, Spanish: 1,034
n  Google N-gram dataset: 1012 tokens
n  Valence (v)
The degree of pleasure induced by the aﬀective word
usage, rescaled between -1 and 1.
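As a small illustration of the rescaling step, a linear map from a rating scale onto [-1, 1] could look like the sketch below. The 1 to 9 rating scale is an assumption made here for the example (the slides do not state the original scale), and `rescale_valence` is a hypothetical helper name.

```python
def rescale_valence(rating, low=1.0, high=9.0):
    """Map a rating on [low, high] linearly onto [-1, 1].

    The default 1-9 scale is an assumption for this sketch.
    """
    if not low <= rating <= high:
        raise ValueError("rating outside the assumed scale")
    midpoint = (low + high) / 2
    half_range = (high - low) / 2
    return (rating - midpoint) / half_range

# The scale midpoint maps to neutral valence, the extremes to -1 and +1.
neutral = rescale_valence(5.0)
most_negative = rescale_valence(1.0)
most_positive = rescale_valence(9.0)
```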

Results: Frequency of emotional words
• Positive words predominate on the Internet.
Figure 1 Emotion word clouds with frequencies calculated from Google’s crawl. In each word cloud for English (left), German (middle), and Spanish (right), the size of a word is proportional to its frequency of appearance in the trillion-token Google N-gram dataset [26]. Word colors are chosen from red (negative) to green (positive) in the valence range from psychology studies [7–9]. For the three languages, positive words predominate on the Internet. (Source: Garcia et al., EPJ Data Science 2012, 1:3, www.epjdatascience.com/content/1/1/3)
Slide annotations on the figure: v = -1: red, v = +1: green; "Exception".

Results: Distribution of emotional words
• The median shifts significantly towards positive values (about 0.3).
• 95% confidence intervals (Wilcoxon tests):
    • English: 0.257 ± 0.032
    • German: 0.167 ± 0.017
    • Spanish: 0.287 ± 0.035
• Empirical evidence of positive bias
[Figure panels: No control, Control]

Results: Relation between information and valence (1)
• Information content is measured by self-information, I(w) = −log2 P(w), which provides more knowledge about valence than frequency does.
• Negative correlation between v and I:
    • Positive words carry less information than negative words.
• The correlation coefficient becomes smaller for larger contexts (N).
Table 1 Correlations between word valence and information measurements.
             English     German      Spanish
ρ(v,f)       0.222**     0.144**     0.236**
ρ(v,I)       –0.368**    –0.325**    –0.402**
ρ(v,I′)      –0.294**    –0.222**    –0.311**
ρ(v,I2)      –0.332**    –0.301**    –0.359**
ρ(v,I3)      –0.313**    –0.201**    –0.340**
ρ(v,I4)      –0.254**    –0.049*     –0.162**
Correlation coefficients of the valence (v), frequency f, self-information I, and information content measured for 2-grams I2, 3-grams I3, and 4-grams I4, and with self-information I′ measured from the frequencies reported in [42–44]. Significance levels: *p < 0.01, **p < 0.001.
Slide annotation on the table: ← Control analysis
Excerpt from the paper: [...] ones, but this nonlinear mapping between frequency and self-information makes the latter more closely related to word valence than the former. The first two lines of Table 1 show Pearson’s correlation coefficient of word valence and frequency, ρ(v,f), followed by the correlation coefficient between word valence and self-information, ρ(v,I). For all three languages, the absolute value of the correlation coefficient with I is larger than with f, showing that self-information provides more knowledge about word valence than plain [...]
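To make the definition I(w) = −log2 P(w) concrete, here is a minimal sketch that computes self-information from word counts and then a Pearson correlation with valence. The counts and valence values are invented toy numbers, and P(w) is estimated inside this tiny word list only, which is an assumption of the sketch and not how the paper estimates it from the full corpus.

```python
import math

def self_information(counts):
    """I(w) = -log2 P(w), with P(w) estimated as count / total."""
    total = sum(counts.values())
    return {word: -math.log2(count / total) for word, count in counts.items()}

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)

# Hypothetical toy data: frequent positive words, rare negative words.
toy_counts = {"good": 8, "love": 4, "bad": 2, "war": 1, "grief": 1}
toy_valence = {"good": 0.8, "love": 0.9, "bad": -0.7, "war": -0.9, "grief": -0.8}

info = self_information(toy_counts)
words = sorted(toy_counts)
rho = pearson([toy_valence[w] for w in words], [info[w] for w in words])
```

In this toy setup the frequent positive words get low self-information, so `rho` comes out negative, which is the direction of the effect reported on the slide.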

Results: Relation between information and valence (2)
• For all languages and context sizes, valence decreases with information content.
(Left) Color: valence (v); size: self-information (I)
(Right) Average self-information:
    −(1/N) Σ_{i=1}^{N} log2 P(W = w | C = c_i)
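The context-based measure can be sketched for the 2-gram case as follows. This is an illustrative reading of the formula and not the authors' code: it assumes that the sum runs over the N occurrences of the word w, that the context c_i is the single preceding word, and that probabilities are estimated from raw bigram counts. The token list is a made-up toy corpus.

```python
import math
from collections import Counter, defaultdict

def information_content_bigrams(tokens):
    """Average of -log2 P(w | previous word) over all occurrences of w."""
    pair_counts = Counter(zip(tokens, tokens[1:]))
    context_counts = Counter(tokens[:-1])
    surprisal_sum = defaultdict(float)
    occurrences = defaultdict(int)
    for (context, word), count in pair_counts.items():
        prob = count / context_counts[context]
        # Each of the `count` occurrences contributes -log2 P(word | context).
        surprisal_sum[word] += -count * math.log2(prob)
        occurrences[word] += count
    return {word: surprisal_sum[word] / occurrences[word] for word in occurrences}

# Hypothetical toy corpus: "good" follows "a" twice, "bad" follows "a" once.
toy_tokens = "a good day a good day a bad day".split()
ic = information_content_bigrams(toy_tokens)
```

Here "bad" is the less predictable continuation of "a", so it receives a higher information content than "good", while "day" is fully predictable from its context and gets zero.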

Results: Additional analysis of valence, length, and self-info (1)
• The sign of valence matters
• Word length ∝ I(w)
• Valence ∝ (word length)^-1? (ρ(v,l) is very low or not significant)
• Combined influence of valence and length on I(w)
    • Additional dimension in the communication process, related to emotional content (v) rather than communication efficiency (l)
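The combined influence of valence and length is assessed with partial correlations ρ(v,I|l) and ρ(l,I|v). A minimal sketch of the standard first-order partial correlation formula is given below. Whether the paper used exactly this formula is an assumption. The inputs are the rounded English coefficients shown on these slides (ρ(v,I) from Table 1, ρ(l,I) and ρ(v,l) from Table 2).

```python
import math

def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation of x and y after controlling for z (first-order formula)."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Rounded English values as they appear on the slides.
r_vI = -0.368   # valence vs. self-information
r_lI = 0.378    # length vs. self-information
r_vl = -0.044   # valence vs. length

rho_vI_given_l = partial_correlation(r_vI, r_vl, r_lI)  # controls for length
rho_lI_given_v = partial_correlation(r_lI, r_vl, r_vI)  # controls for valence
```

With these rounded inputs the two results land close to the reported English values of –0.379 and 0.389. Because ρ(v,l) is so small, controlling for length barely changes the valence correlation, which is the point made on the slide.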
Table 2 Additional correlations between valence, self-information and length.
               English      German       Spanish
ρ(abs(v),I)    0.032°       0.109***     0.135***
ρ(l,I)         0.378***     0.143***     0.361***
ρ(v,l)         –0.044°      –0.071***    –0.112***
ρ(v,I|l)       –0.379***    –0.319***    –0.399***
ρ(l,I|v)       0.389***     0.126***     0.357***
Correlation coefficients of the valence (v), absolute value of the valence (abs(v)), and word length (l) versus self-information (I). Partial correlations are calculated for both variables (ρ(v,I|l), ρ(l,I|v)), and correlation between valence and length (ρ(v,l)). Significance levels: °p < 0.3, *p < 0.1, **p < 0.01, ***p < 0.001.
Excerpt from the paper: [...] but this trend is not so clear for German. These trends are properly quantified by Pearson’s correlation coefficients between valence and information content for [...] size (Table 1). Each correlation coefficient becomes smaller for larger sizes of [...] as the information content estimation includes a larger context but becomes [...]
Excerpt from the paper, section "Additional analysis of valence, length and self-information": In order to provide additional support for our results, we tested different hypotheses impacting the relation between word usage and valence. First, we calculated Pearson’s and Spearman’s correlation coefficients between the absolute value of the valence and the self-information of a word, ρ(abs(v),I) (see Table 2). We found both correlation coefficients to be around [...] for German and Spanish, while they are not significant for English. The dependence between valence and self-information disappears if we ignore the sign of the valence, which means, indeed, that the usage frequency of a word is not just related to the overall emotional intensity, but to the positive or negative emotion expressed by the word. Subsequently, we found that the correlation coefficient between word length and self-information (ρ(l,I)) is positive, showing that word length increases with self-information. These values of ρ(l,I) are consistent with previous results [...]. Pearson’s and Spearman’s correlation coefficients between valence and length ρ(v,l) are very low or not significant. In order to test the combined influence of valence and length to self-information, we calculated the partial correlation coefficients ρ(v,I|l) and ρ(l,I|v). The results are shown in Table 2, and are within the [...]% confidence intervals of the original correlation coefficients ρ(v,I) and ρ(l,I). This provides support for the existence of an additional dimension in the communication process closely related to emotional content rather than communication efficiency. This is consistent with the known result that word lengths adapt to information [...] independent semantic feature of valence. Valence is also [...] but not to the symbolic representation of the word through [...]

Results: Additional analysis of valence, length, and self-info (2)
Table 3 Partial correlation coefficients between valence and information content.
               English      German       Spanish
ρ(v,I2|I)      –0.034°      –0.100***    –0.058*
ρ(v,I3|I)      –0.101**     –0.070***    –0.149***
ρ(v,I4|I)      –0.134***    –0.020*      –0.084**
Correlation coefficients of the valence (v) and information content measured on different context sizes (I2, I3, I4) controlling for self-information (I). Significance levels: °p < 0.3, *p < 0.1, **p < 0.01, ***p < 0.001.
Excerpt from the paper (continued): [...] influence of context by controlling for word frequency. In Table 3 [...] correlation coefficients of valence with information content for [...] controlling for self-information. We find that most of the [...] of negative sign, with the exception of I2 for English. The [...] is probably related to two word constructions such [...]
n  Most of the correlations keep significant and negative sign,
except I2 for English.
n  Knowing the possible contexts of a word (N=2,3,4) provides
further information about word valence than sole self-
information.

Summary
n  Empirical evidence for a positive bias in language
n  Positive words are more frequently used.
n  Pollyanna hypothesis
n  Facilitation of social links
n  Negative words convey more information content than
positive words.
n  Word frequency is determined by
n  Not only word length and information content
n  But also emotional content
