Haustein, S. & Costas, R. (2015). Identifying Twitter audiences: Who is tweeting about scientific papers?
Presentation at METRICS2015 ASIS&T SIG/MET Workshop
https://www.asist.org/SIG/SIGMET/
1. Stefanie Haustein and Rodrigo Costas
@stefhaustein @RodrigoCostas1
Identifying Twitter audiences
Who is tweeting about scientific papers?
2. Background
• ~20% of recent journal papers shared on Twitter (e.g., Haustein, Costas, & Larivière, 2015)
• ~10-15% of researchers use Twitter for work (e.g., Rowlands et al., 2011; van Noorden, 2014)
• <3% of researchers’ tweets contain links to papers (Priem & Costello, 2010)
• Who tweets scientific papers?
• Altmetric.com classification*:
• Among a random sample of 2,000 accounts tweeting papers, 34% of individuals identified as having a PhD (Tsou, Bowman, Ghazinejad, & Sugimoto, 2010)
• Of 286 users linking to SciELO articles, 24% were employed at a university, 23% were students, and 36% were not university affiliated (Alperin, 2015)
*based on Altmetric.com data 06/2015
3. Research motivation and objective
• Identifying Twitter user types and engagement related to
scientific papers
• Distinguishing user groups based on:
• Twitter account descriptions
• Number of followers
• Level of engagement with paper
4. Methods
• 1.3 million WoS papers published in 2012
• 663,547 original tweets (no RTs) as captured by
Altmetric.com until July 2014 linked to papers via DOI
• Twitter profile information for 115,053 handles via Twitter
API in April 2015
• Reduction to 89,768 users with English account settings
• Account description
• Number of followers = exposure
• Dissimilarity with paper title = engagement
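The slides do not state how dissimilarity between a tweet and the paper title was measured. As a minimal sketch, assuming a simple token-based Jaccard dissimilarity (the functions `tokens` and `title_dissimilarity` and the example strings are hypothetical, not taken from the study), the idea of "engagement" could be illustrated as follows: a tweet that only repeats the title scores 0, a tweet in the user's own words scores close to 1.

```python
import re

def tokens(text):
    """Return the set of lowercase word tokens, ignoring URLs and @mentions."""
    text = re.sub(r"https?://\S+", " ", text.lower())  # drop links
    text = re.sub(r"@\w+", " ", text)                  # drop user mentions
    return set(re.findall(r"[a-z0-9]+", text))

def title_dissimilarity(tweet, title):
    """Jaccard dissimilarity (assumed measure): 0 = same words, 1 = no overlap."""
    a, b = tokens(tweet), tokens(title)
    if not a and not b:
        return 0.0
    return 1.0 - len(a & b) / len(a | b)

title = "Identifying Twitter audiences"
copied = "Identifying Twitter audiences http://example.org/paper"
own_words = "Great read for anyone working on altmetrics!"

low_engagement = title_dissimilarity(copied, title)      # tweet repeats the title
high_engagement = title_dissimilarity(own_words, title)  # tweet shares no title words
```

Under this assumption, accounts that mostly post bare titles plus a link would receive low engagement scores, while accounts that comment in their own words would receive high ones.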
6. Methods
• Noun phrases extraction with VOSviewer part-of-speech
tagger based on 80,939 account descriptions
• 185,824 unique terms extracted from 78,991 accounts
• Visualization and clustering of co-occurrence network of
325 most frequent terms (≥100)
• Identification of 3 clusters
Clustering resolution = 0.9, minimum cluster size = 5
• Calculation per term:
Number of Twitter accounts associated with term
Average exposure of accounts associated with term
Average engagement of accounts associated with term
Identification of predominant quadrant of term
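The per-term calculation above can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the toy records, the use of the overall median as the cut-off between "low" and "high", and the reading of "predominant quadrant" as the quadrant most of a term's accounts fall into are all hypothetical choices.

```python
from collections import Counter, defaultdict
from statistics import mean, median

# Hypothetical toy records: (terms in account description, followers, engagement)
accounts = [
    ({"journal", "news"}, 50000, 0.1),
    ({"journal", "health"}, 20000, 0.2),
    ({"phd", "student"}, 300, 0.9),
    ({"phd", "geek"}, 150, 0.8),
]

def quadrant(followers, engagement, f_cut, e_cut):
    """Place one account in an exposure/engagement quadrant (cut-offs assumed)."""
    exposure = "high exposure" if followers >= f_cut else "low exposure"
    engaged = "high engagement" if engagement >= e_cut else "low engagement"
    return f"{exposure} / {engaged}"

def term_statistics(accounts):
    """Per term: account count, average exposure, average engagement, quadrant."""
    f_cut = median(f for _, f, _ in accounts)  # assumed threshold for exposure
    e_cut = median(e for _, _, e in accounts)  # assumed threshold for engagement
    by_term = defaultdict(list)
    for terms, followers, engagement in accounts:
        for term in terms:
            by_term[term].append((followers, engagement))
    stats = {}
    for term, rows in by_term.items():
        quadrants = Counter(quadrant(f, e, f_cut, e_cut) for f, e in rows)
        stats[term] = {
            "accounts": len(rows),
            "avg_exposure": mean(f for f, _ in rows),
            "avg_engagement": mean(e for _, e in rows),
            "quadrant": quadrants.most_common(1)[0][0],  # most frequent quadrant
        }
    return stats

stats = term_statistics(accounts)
```

In this toy data, `stats["journal"]` lands in the high exposure / low engagement quadrant and `stats["phd"]` in the low exposure / high engagement quadrant.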
7. Results
[Figure: Network of 325 most frequent terms. Node size: number of accounts associated with term. Node color: cluster affiliation (1 = personal, 2 = topics and collectives, 3 = academic).]
8. Results
[Figure: Co-occurrence network of frequent terms. Node color: average engagement of accounts associated with term (scale from low to high). Node size: average exposure of accounts associated with term. Clusters: 1 = personal, 2 = topics and collectives, 3 = academic.]
10. Results
Users
• High exposure
• Low engagement
Terms
• Science and research
• Organizational focus
• News
11. Results
Users
• Low exposure
• High engagement
Terms
• Scientists and students
• Personal preferences
12. Conclusions
• Scientific papers are tweeted by
• Individuals who identify professionally, personally or both
• Organizations or interest groups
• Accounts with organizational descriptions seem to
have a disseminative role
• Accounts with academic or personal terms exhibit higher
engagement
13. Limitations and Outlook
• VOSviewer noun phrase extraction
• limited to English language
• not optimized for Twitter account descriptions
• Uncontrolled, uncleaned vocabulary
• Reduction to top terms
• No systematic analysis of terms
Qualitative coding of accounts
Systematic identification of keywords associated with
account types
Testing of four-quadrant hypothesis (engagement↔exposure)
Testing of other user characteristics
14. Stefanie Haustein and Rodrigo Costas
@stefhaustein @RodrigoCostas1
Thank you for your attention!
Tuesday, 11/10
1:30pm
Grand D
Editor's Notes
Alperin, 2015:
University affiliated: 64%
unspecified: 16%
employed: 24%
student: 23%
Not university affiliated: 36%
8.9% of Twitter handles no longer existed
Cluster 3 can be clearly identified as ‘academic’ based on terms such as university, science, professor and PhD, which reflect that academics often identify themselves professionally on Twitter.
Cluster 1 consists of terms that describe users with a focus on ‘personal’ attributes such as life, lover, father, husband, fan or geek as well as non-academic professional terms such as consultant, advocate, work, founder, co-founder or entrepreneur.
Overlapping with cluster 3, cluster 1 also contains terms such as scientist, student and biologist. As the network structure is based on the co-occurrence of terms, it can be inferred that Twitter descriptions are often used to identify professionally but also reveal private interests.
The terms in cluster 2 seem to focus on ‘topics and collectives’, suggesting descriptions for interest groups, organizations or journals, particularly on health-related topics.
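As a rough illustration of the co-occurrence counts that underlie such a network, the sketch below counts how often two terms appear in the same account description. The term sets and the function `cooccurrence_counts` are hypothetical; the clustering and layout done in VOSviewer are not reproduced here.

```python
from collections import Counter
from itertools import combinations

# Hypothetical term sets, one per account description
descriptions = [
    {"phd", "student", "science"},
    {"phd", "science", "father"},
    {"journal", "health"},
]

def cooccurrence_counts(term_sets):
    """Count, for each unordered term pair, the descriptions containing both."""
    pairs = Counter()
    for terms in term_sets:
        # sorted() gives each pair a stable (alphabetical) key order
        for a, b in combinations(sorted(terms), 2):
            pairs[(a, b)] += 1
    return pairs

counts = cooccurrence_counts(descriptions)
```

Pairs with high counts, such as `("phd", "science")` in this toy data, would be drawn as strongly linked nodes, which is why terms used together in descriptions tend to end up in the same cluster.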
There is a clear tendency for the academic and personal clusters to show more engagement (red) than the organizational cluster (blue).
This might be explained by accounts that frequently distribute only paper titles, for example, those of scientific journals, publishers, or even automated accounts.
Accounts with high exposure but low engagement (broadcasters) concentrate on cluster 2 (organizational).
Accounts classified as B (orators, discussing) mostly use terms from the personal and academic clusters.