An empirical examination of the structure of scholarship in the Society for the Psychological Study of Social Issues (SPSSI) grounded in network analyses of shared citations (bibliographic couplings)
4. Network science can inform the study of
inequality
I: Structure
Preferential attachment or
the ‘Matthew Effect’
A common feature of
diverse complex social systems
Not just citations
A source of inequality
5. Network science can inform
the study of inequality
II: Content
A pilot analysis
32 ASAP papers on “inequality”
linked by 1573 references
6 research communities
6. Explorations of the SPSSI citation network
Network parameters have meaning
at five levels of analysis
Level of analysis | Concept / parameter | Relevance / interpretation
Network (dynamic) | Preferential attachment | Developmental trajectories of topics, scholars
Network (static) | Giant component | Connectedness of a research community
Community | Modularity | Topics, subdisciplines, cliques, categories
Path | Diameter, path length | Distance and proximity of papers, scholars…
Author/paper | (In)degree, centrality | Mechanisms of influence, impact, eminence
7. Bigger data
• All papers published in JSI, ASAP from 2001-2013.
• First author, journal, year, cited papers
• N sources = 855
• 38,854 references (45.4 per source)
• - 2,042 self-references (5.3%)
• - 3,198 (8.2%) unusable: references to news articles, government institutes, or
without a date
____________________________
• 33,615 usable citations (86.5%)
• 24,263 unique papers
• 14,702 unique first authors
8. SPSSI citation network: Connectedness
• Of the 24,263 papers, 24,075 (99.2%) are linked in a single giant component
• Papers are separated by an average of 4.2 links
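A minimal sketch of how the two figures above (share of nodes in the giant component, average separation) could be computed from an undirected edge list. The toy edges and the function names are illustrative assumptions, not the SPSSI data or the software used for the talk.

```python
from collections import deque

def build_adjacency(edges):
    """Turn an undirected edge list into a dict: node -> set of neighbours."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj

def components(adj):
    """Connected components found by breadth-first search, largest first."""
    seen, comps = set(), []
    for start in adj:
        if start in seen:
            continue
        comp, queue = {start}, deque([start])
        seen.add(start)
        while queue:
            node = queue.popleft()
            for nb in adj[node]:
                if nb not in seen:
                    seen.add(nb)
                    comp.add(nb)
                    queue.append(nb)
        comps.append(comp)
    return sorted(comps, key=len, reverse=True)

def mean_path_length(adj, nodes):
    """Average shortest-path length over all reachable ordered pairs in `nodes`."""
    total = pairs = 0
    for src in nodes:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            node = queue.popleft()
            for nb in adj[node]:
                if nb not in dist:
                    dist[nb] = dist[node] + 1
                    queue.append(nb)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs

# Hypothetical toy network: a chain of four papers plus one isolated pair.
toy_edges = [("a", "b"), ("b", "c"), ("c", "d"), ("e", "f")]
adj = build_adjacency(toy_edges)
giant = components(adj)[0]
giant_share = len(giant) / len(adj)
avg_links = mean_path_length(adj, giant)
print(f"giant component: {len(giant)} of {len(adj)} nodes ({giant_share:.1%})")
print(f"average separation within it: {avg_links:.2f} links")
```

Running one breadth-first search per node is fine for a sketch; a network of some 24,000 papers would probably call for a dedicated graph library.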
9. Eminence and network
centrality: 3 interpretations
ID: Citation counts from different sources (in-degree),
or total cites (weighted in-degree)
PR (Page Rank, Eigenvector Centrality):
Recursive measures in which the importance
of a paper is dependent upon the importance
of papers which refer to it
BC (Betweenness Centrality): Extent to which a
node bridges different areas of scholarship,
introduces work to a new audience, etc.
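To illustrate the ID distinction above, here is a small sketch that counts citing sources (in-degree) and total cites (weighted in-degree) from a list of citation pairs. It assumes that a source can cite the same author more than once; the data and names are hypothetical.

```python
from collections import Counter

def in_degrees(citations):
    """citations: list of (source, cited) pairs, one per citation instance.

    Returns (in_degree, weighted_in_degree):
      in_degree          counts the distinct sources citing each target
      weighted_in_degree counts every citation instance
    """
    weighted = Counter(cited for _, cited in citations)
    unweighted = Counter(cited for _, cited in set(citations))
    return unweighted, weighted

# Hypothetical data: source_1 cites two different works by author_X.
toy_citations = [
    ("source_1", "author_X"),
    ("source_1", "author_X"),
    ("source_2", "author_X"),
    ("source_2", "author_Y"),
]
in_degree, weighted_in_degree = in_degrees(toy_citations)
print(in_degree["author_X"], weighted_in_degree["author_X"])
```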
10. PageRank is high for papers
with commentary
• King (2011)
• Second highest PR in database
• Explanation
• Papers which are cited by papers with few
references (such as commentaries) can have
a disproportionate impact in a sparse network
• Two solutions
• Omit commentaries and book reviews
• Treat authors rather than papers as the unit of analysis
• Limitations of citation networks: sparseness, time-constraint
11. The SPSSI author network: (almost) no
one is an island
• 14,703 unique authors
• All but 6 are linked to the main component
• Average path between nodes = 5.1
• 32-38 communities*
• Average author is linked 1.9 times
Whole network
12. The SPSSI
author
network:
Most cited
Includes 68 authors
with 20 or more
citations. Nodes ranked
by eigenvector
centrality
13. The SPSSI author network:
Centrality
• Content of rankings
• Betweenness
(bridging centrality)
vs. other measures
14. Gender effects in citation
networks?
• King (2014): Self-citations
• Here, a modest but
possibly consequential
effect
Statistic | Directed | Undirected
r (gender, BC-EC) | 0.17 | 0.22
t | 1.70 | 2.18
p (one-tailed) | 0.05 | 0.02
JSI/ASAP network; analysis includes
only top, bottom 50 in BC-EC
(not effect sizes)
15. The SPSSI author
network: Allport and
Lewin communities
compared
Lewin community includes authors with 5 or
more cites; Allport includes authors with 13+
cites. Nodes ranked by eigenvector centrality
16. (How) has ASAP changed SPSSI?
Measure | Total | JSI | ASAP | ASAP only
unique authors | 696 | 491 | 233 | 205
unique cited | 14568 | 11704 | 4848 | 2864
unique scholars (nodes) | 14702 | 11778 | 4942 | 2924
17. Summary, concluding thoughts
• Eminence: Great persons and beyond
• Centrality: Different measures have distinct interpretations
• Connectedness: To see small worlds, you need big data
• Communities: Discrete clusters are artificial
• Distance: Is more interpretable than proximity
• Obsolescence: This work is primitive
• Bigger data and much more sophisticated methods lie ahead
…Safe home
Editor's notes
Historical: Lewin’s topology, notions of distance, forces;
Heider’s balance theory and the study of triads
Milgram’s study of small worlds
Structurally, a language which takes us beyond individuals and beyond dyads.
Constructs such as cooperation, for example, can be studied not just in terms of individual characteristics such as agreeableness, or dyadic relations using simple games, but in terms of communities
A recent study by Apicella in Science describes desired campmates and gift-giving networks among the Hadza, a population of hunter-gatherers in Tanzania; the graphs show a number of properties shared by other social networks.
The Hadza form one model of a community; here is another. Because a successful kidney transplant requires a close match in the kidney proteins of donor and recipient, if you have kidney failure, the likelihood that I can directly help you is low. But it is likely that I can help someone else, who can help someone else, who can ultimately help you. 30 kidney donors and 30 recipients.
The best way to study this network is to look at it backwards: Cienfuegos needed a kidney, and his son was willing to give him his own, but they didn’t match sufficiently closely. Mary Jane Wilson, too, needed a kidney, and her son, too, was willing to give her one of his; but they did not match either. But Bowen matched Cienfuegos; and Wilson, in turn, found a donor in Tremayne Wilkins, the wife of Michael… and so on.
So we, like the Hadza, live in communities characterized by complex sharing. And advances in network science can help bring us together.
A mechanism which contributes to skewed distributions (more formally, a power function) in many domains and in many complex social systems, ranging from websites to social reputation.
For unto every one that hath shall be given, and he shall have abundance: but from him that hath not shall be taken even that which he hath.
Or as we would have sung at the bar last night,
Them that’s got shall get, them that’s not shall lose…
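A rough simulation can make the mechanism concrete. The sketch below assumes a deliberately simple rule that is not taken from the talk: each new paper cites earlier papers with probability proportional to the citations they already have, plus one. The parameters and names are illustrative.

```python
import random
from collections import Counter

def simulate_citations(n_papers=500, refs_per_paper=5, seed=0):
    """Grow a citation network in which the already-cited attract more cites.

    Assumption: the chance that a new paper cites an earlier paper is
    proportional to (citations already received + 1).
    """
    rng = random.Random(seed)
    cites = Counter()
    urn = [0]  # one ticket per paper, plus one ticket per citation received
    for paper in range(1, n_papers):
        k = min(refs_per_paper, paper)  # cannot cite more papers than exist
        chosen = set()
        while len(chosen) < k:
            chosen.add(rng.choice(urn))
        for target in chosen:
            cites[target] += 1
            urn.append(target)  # each new cite makes the target easier to draw
        urn.append(paper)       # the new paper enters the urn with one ticket
    return cites

cites = simulate_citations()
total = sum(cites.values())
top_1_percent = sorted(cites.values(), reverse=True)[: 500 // 100]
top_share = sum(top_1_percent) / total
print(f"top 1% of papers receive {top_share:.0%} of all citations")
```

Under this rule the earliest and luckiest papers accumulate a disproportionate share of citations, which is the skew described above.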
These are 32 recent papers which appeared in ASAP and included the term “inequality.” I then explored the network formed by these papers and the 1573 references which they collectively cited. 19 source papers, together with 38 linking references, form the giant component illustrated here.
This is pilot data, but nonetheless illuminates, I think, the central role of Social Dominance in the ASAP literature on inequality.
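For readers unfamiliar with bibliographic coupling, the linking step can be sketched as follows: two source papers are coupled when their reference lists overlap. The paper and reference labels below are placeholders, not items from the ASAP data.

```python
from itertools import combinations

def coupling_strengths(references):
    """references: dict mapping each source paper to the set of works it cites.

    Returns {(paper_a, paper_b): number of shared references} for every
    pair of source papers with at least one reference in common.
    """
    strengths = {}
    for a, b in combinations(sorted(references), 2):
        shared = len(references[a] & references[b])
        if shared:
            strengths[(a, b)] = shared
    return strengths

# Placeholder data: paper_A and paper_B share one reference; paper_C shares none.
toy_references = {
    "paper_A": {"ref_1", "ref_2", "ref_3"},
    "paper_B": {"ref_3", "ref_4"},
    "paper_C": {"ref_5"},
}
couplings = coupling_strengths(toy_references)
print(couplings)
```

Source papers that share no reference with any other, such as paper_C here, fall outside the giant component.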
We have already considered how network structures can, when examined over time, reveal preferential attachment as a factor which contributes to the development of inequality.
Simpler features of networks, too, have meaning. We can examine the overall connectedness of a network in order to determine if it is indeed true that no man is an island.
We can look, or try to look, for communities within the overall network, and ask whether a non-arbitrary, robust set of mutually exclusive and all-inclusive communities can be articulated.
We can, also, study proximity and distance between papers, ideas, and scholars.
Finally, we can think about scholarly eminence in a richer way than simple citation counts, one which recognizes diverse mechanisms of social and scholarly influence.
Not yet big data: other analyses of citation networks examine as many as 1.6 million citations. But it is as much as I can handle.
A sense of this network can be obtained in a low-budget animation of zooming in on the most important authors. As you can see, the most cited paper – or rather book – is Allport’s The nature of prejudice.
These diagrams aren’t pictures of molecules, but the same small network depicted 3 times, with font size ordered by three different parameters or conceptions of centrality. These illustrate some of the different forms of scholarly impact.
The simplest of these is citation counts. In the top diagram, the node labeled ID has the highest in-degree or number of cites.
The more interesting measures are recursive. Imagine (it is perhaps too easy to imagine) that you pick up or click on an article at random on your desk, then move to another cited within it. Then the likelihood that you’ll come across a given article is its eigenvector centrality, or the closely related PageRank. One feature of these approaches is that if you are reading a particular paper (ID in the graph here) that refers only to 1 other paper, then you will certainly land upon the second paper (PR).
Still another approach is betweenness, in which the measure of centrality is the extent to which one ties together different parts of the network.
Betweenness (the citing of different groups of papers) is of little interest in this network, as we have only a small number of source papers (33) for which betweenness may be computed. It will be of interest in the second dataset, however.
In looking at the different measures of centrality here, there were some surprises. In particular, King has the second highest PageRank, despite being cited only 7 times.
In order to understand this, first consider that citation networks are very sparse, full of zeros – each of us cites only the tiniest fraction of the literature. Consider, too, that these recursive measures can be thought of as random walks.
If you land on a ‘comment’, there’s a high likelihood that you’ll next land on the target paper. So papers which have comments associated tend to have high PageRank scores.
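The effect can be reproduced in a toy network. The sketch below is a plain power-iteration PageRank; the damping factor of 0.85 is a conventional default assumed here, not a value from the talk, and all node names are hypothetical.

```python
from collections import Counter

def pagerank(cites, damping=0.85, iters=100):
    """cites: dict mapping each paper to the list of papers it cites."""
    nodes = set(cites) | {t for refs in cites.values() for t in refs}
    n = len(nodes)
    rank = {v: 1 / n for v in nodes}
    for _ in range(iters):
        new = {v: (1 - damping) / n for v in nodes}  # random jump
        for v in nodes:
            refs = cites.get(v, [])
            if refs:
                share = damping * rank[v] / len(refs)  # follow one reference
                for t in refs:
                    new[t] += share
            else:
                share = damping * rank[v] / n  # no references: jump anywhere
                for t in nodes:
                    new[t] += share
        rank = new
    return rank

# Three regular papers each cite "popular" among ten references;
# one commentary cites only "target".
regular_refs = ["popular"] + [f"filler_{i}" for i in range(9)]
toy = {
    "regular_1": regular_refs,
    "regular_2": regular_refs,
    "regular_3": regular_refs,
    "commentary": ["target"],
}
ranks = pagerank(toy)
in_degree = Counter(t for refs in toy.values() for t in refs)
print("cites:   ", in_degree["target"], "vs", in_degree["popular"])
print("PageRank:", round(ranks["target"], 4), "vs", round(ranks["popular"], 4))
```

The commentary passes all of its weight to its single reference, while each regular paper splits its weight ten ways, so "target" outranks "popular" despite having one cite to its three.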
Another problem with citation networks is that they can only move backwards through scholarly time.
The SPSSI author network is the third we will consider. It is an impressive continent, linking virtually all authors and targets into a single whole. The 6 who aren’t linked are 3 authors of book reviews, each with 1 unconnected target.
68 authors were cited 20 or more times; some statistics are presented on the next slide.
Content of rankings
Ascendance of Tajfel
Continued importance of Allport, Pettigrew
The most common ways of thinking about network influence are closely inter-related, but not so closely that an individual scholar cannot have high scores on one measure but not another.
The most striking effect here is that the scholars who are highest in bridging centrality appear to be disproportionately women.
Where is Lewin?
One omission on the list of most cited is Lewin, who has been called the father of social psychology.
Ranked 23rd in EC, 35th in PR, 21st in weighted cites
Betweenness and gender role
King (2014) recently reported a large effect of gender in citation networks: male authors cite each other about 50% more often than do females. There may be a subtle gender effect for betweenness as well.
Here, all 6 of the authors who are in the top 10 on betweenness, but not the other measures, are women.
Among top 20 in EC-BC, 15 are males; among top 20 in BC-EC, 7 are males (p = .02).
On closer examination, however, it became apparent that at least part of the effect was due to the fact that my measure of betweenness centrality was directed. In a directed citation network, BC scores will be non-zero only for those who are both authors and have been cited. My data included a number of individuals (disproportionately male) who had been cited, but had not been first authors of source documents, which include only papers published since 2001. Secular changes in the field (a greater role for women among active authors than among possibly inactive, even deceased, cited scholars) could account for the effect.
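The directed versus undirected difference can be seen in a four-node example. The sketch below uses Brandes' algorithm for unweighted graphs; whether the original analysis used the same algorithm is not stated, and the scholars here are hypothetical.

```python
from collections import deque

def betweenness(adj):
    """Unnormalized betweenness via Brandes' algorithm (unweighted graph).

    adj: dict mapping each node to the list of nodes it cites.
    Scores count ordered (source, target) pairs.
    """
    nodes = set(adj) | {t for ts in adj.values() for t in ts}
    bc = dict.fromkeys(nodes, 0.0)
    for s in nodes:
        stack, queue = [], deque([s])
        preds = {v: [] for v in nodes}
        sigma = dict.fromkeys(nodes, 0)  # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj.get(v, []):
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(nodes, 0.0)
        while stack:  # accumulate dependencies, farthest nodes first
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return bc

# Hypothetical scholars: B is both an author and cited; C is cited only.
cites = {"A": ["B"], "B": ["C"], "D": ["C"]}
directed = betweenness(cites)

# Undirected version: add every edge in both directions.
symmetric = {}
for src, targets in cites.items():
    for t in targets:
        symmetric.setdefault(src, []).append(t)
        symmetric.setdefault(t, []).append(src)
# Each unordered pair is counted twice in the symmetric graph, hence the / 2.
undirected = {v: score / 2 for v, score in betweenness(symmetric).items()}

print("directed:  ", directed)
print("undirected:", undirected)
```

The cited-only scholar C scores zero in the directed network but is a bridge once direction is ignored, which is the pattern described above.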
Going back to the content of the SPSSI network, one name that is not on any of these top 10 lists is Kurt Lewin.
Where is Lewin? – ranked 23rd in EC, 35th in PR, 21st in weighted cites
Lewin is revered, but his influence appears increasingly indirect
In comparison to the Allport community, the Lewin community appears to include a disproportionate number of historians of psychology.
Content:
Allport, Pettigrew, Tajfel. But we should resist the temptation to focus only on Great Men, on persons without situations, on the fallacy of independence. As Heather Bullock reminded us in her talk, none of us has built our work alone; as Stephanie Fryberg noted, we need to consider interdependence as well as independent sources of scholarly achievements.
Centrality
Small worlds:
The law of large numbers applies in how we get to the truth of our connectedness. Giant components and small worlds are more apparent as our data become more complete.
Citation networks in personality and social psychology are small worlds in which virtually all of us can be connected
On articulating the space
Clustering: “Communities” are fuzzy, artificial, and lack robustness
Distance and proximity
Obsolescence
Primitive: small big data. King studied 1.6 million cites. Others have looked at similar questions in a much more sophisticated way.
First authors as opposed to all authors
Authors as compared with full citation
Boyack - more coherent networks can be obtained if one also assesses how far apart they are cited in the source...for example, in the beginning of the introduction or in the methods section.