This document summarizes popular Twitter research in the social sciences. It discusses exemplary research areas like brand communication, crises/disasters, elections/politics, and popular culture. Popular research visions see Twitter as a new data source with insights into human behavior and thoughts. However, there are limitations and challenges. The document reviews the top cited Twitter publications, which applied methods like interviews, experiments, quantitative analysis, and network/linguistic analysis on datasets ranging from thousands to millions of tweets. Challenges include technical limitations, sampling biases, and different approaches to problems like sentiment analysis.
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
What do we get from Twitter - and what not?
1. What Do We Get From Twitter
– And What Not?
An Introduction to (Popular) Twitter Research in
the Social Sciences
World Social Science Forum 2013
Montréal, Canada
Dr. Katrin Weller
katrin.weller@gesis.org, @kwelle, http://kwelle.wordpress.org
4. Exemplary Twitter research areas
Brand Communication &
Marketing
Crises & Natural Disasters
Elections & Politics
Popular Culture
Education & Scholarly
Communication
Health Care & Diseases
4
5. Visions
•
•
•
•
•
New source of research data
Availability, Popularity, Metadata
Insights into what people do and think
Big data
Predictions
But: there are limitations and challenges.
6. Growing interest
450
Number of Publications
(Scopus total)
number of Publications
(social sciences only)
400
350
300
250
200
150
100
50
0
2007
2008
2009
2010
2011
2012
2013
SCOPUS search: TITLE(twitter) AND PUBYEAR > 2005, 22.08.2013.
6
8. Disciplines
Immunology and Microbiology
Dentistry
Veterinary
Chemical Engineering
Multidisciplinary
Energy
Neuroscience
Materials Science
Chemistry
Pharmacology, Toxicology and Pharmaceutics
Physics and Astronomy
Environmental Science
Earth and Planetary Sciences
Health Professions
Biochemistry, Genetics and Molecular Biology
Nursing
Economics, Econometrics and Finance
Agricultural and Biological Sciences
Psychology
Decision Sciences
Arts and Humanities
Business, Management and Accounting
Mathematics
Medicine
Engineering
Social Sciences
Computer Science
298 publications
from social sciences
0
100
200
300
400
500
600
700
9. Methods? Objectives? Data?
• Current approach: close look at the most
popular (= highly cited) publications.
• Next steps:
– Bibliometric analyses including more sources
– Qualitative interviews with Twitter researchers
10. Top 17 (more than 50% of all
citations), plus top 3 from 2012.
11. No.
[1]
[2]
[3]
[4]
[5]
[6]
[7]
[8]
[9]
[10]
[11]
[12]
[13]
[14]
[15]
[16]
Publication
Huberman, B. A., Romero, D. M., & Wu, F. (2009). Social networks that matter: Twitter under the microscope. First Monday, 14(1).
Retrieved from http://firstmonday.org/ojs/index.php/fm/article/view/2317/2063
Marwick, A. E., & boyd, d. (2011). I tweet honestly, I tweet passionately: Twitter users, context collapse, and the imagined audience. New
Media & Society, 13(1), 114–133. doi:10.1177/1461444810365313
Junco, R., Heiberger, G., & Loken, E. (2011). The effect of Twitter on college student engagement and grades. Journal of Computer Assisted
Learning, 27(2), 119–132. doi:10.1111/j.1365-2729.2010.00387.x
Yardi, S., Romero, D., Schoenebeck, G., & boyd, d. (2010). Detecting spam in a Twitter network. First Monday, 15(1). Retrieved from
http://firstmonday.org/ojs/index.php/fm/article/view/2793/2431
Ritter, A., Cherry, C., & Dolan, B. (2010). Unsupervised modeling of Twitter conversations. In HTL'10 Human Language Technologies. The
2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics (pp. 172–180). Stroudsburg, Pa:
Association for Computational Linguistics (ACL). Retrieved from http://dl.acm.org/citation.cfm?id=1858019
Petrovic, S., Osborne, M., & Lavrenko, V. (2010). Streaming first story detection with application to Twitter. In HTL'10 Human Language
Technologies. The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics (pp. 181–189).
Stroudsburg, Pa: Association for Computational Linguistics (ACL). Retrieved from http://dl.acm.org/citation.cfm?id=1858020
Jiang, L., Yu, M., Zhou, M., Liu, X., & Zhao, T. (2011). Target-dependent Twitter sentiment classification. In HLT '11 Proceedings of the 49th
Annual Meeting of the Association for Computational Linguistics: Human Language Technologies:. Short papers - Volume 2 (pp. 151–160).
Retrieved from http://dl.acm.org/citation.cfm?id=2002492
Han, B., & Baldwin, T. (2011). Lexical normalisation of short text messages: makn sens a #twitter. In HLT '11 Proceedings of the 49th Annual
Meeting of the Association for Computational Linguistics: Human Language Technologies. Short papers - Volume 2 (pp. 368–378). Retrieved
from http://dl.acm.org/citation.cfm?id=2002520
Gimpel, K., Schneider, N., O'Connor, B., Das, D., Mills, D., Eisenstein, J., Heilmann, M., … (2011). Part-of-speech tagging for Twitter:
Annotation, features, and experiments. In HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics:
Human Language Technologies. Short papers - Volume 2 (pp. 42–47). Retrieved from http://dl.acm.org/citation.cfm?id=2002747
Schultz, F., Utz, S., & Göritz, A. (2011). Is the medium the message? Perceptions of and reactions to crisis communication via twitter, blogs
and traditional media. Public Relations Review, 37(1), 20–27. doi:10.1016/j.pubrev.2010.12.001
Barbosa, L., & Feng, J. (2010). Robust sentiment detection on twitter from biased and noisy data. In COLING '10 Proceedings of the 23rd
International Conference on Computational Linguistics (pp. 36–44).
Davidov, D., Tsur, O., & Rappoport, A. (2010). Enhanced sentiment lerarning using Twitter hashtags and smileys. In COLING '10 Proceedings
of the 23rd International Conference on Computational Linguistics (pp. 241–249). Retrieved from
http://dl.acm.org/citation.cfm?id=1944566.1944594
Hargittai, E., & Litt, E. (2011). The tweet smell of celebrity success: Explaining variation in Twitter adoption among a diverse group of young
adults. New Media & Society, 13(5), 824–842. doi:10.1177/1461444811405805
Zhou, X., Lee, W.-C., Peng, W.-C., Xie, X., Lee, R., & Sumiya, K. Measuring geographical regularities of crowd behaviors for Twitter-based geosocial event detection, 1. doi:10.1145/1867699.1867701
Gruzd, A., Wellman, B., & Takhteyev, Y. (2011). Imagining Twitter as an Imagined Community. American Behavioral Scientist, 55(10), 1294–
1318. doi:10.1177/0002764211409378
Johnson, K. A. (2011). The effect of Twitter posts on students’ perceptions of instructor credibility. Learning, Media and Technology, 36(1),
Results
Citations
155
77
55
28
27
26
22
22
21
19
19
19
18
18
17
16
12. Results I
Applied methods include:
• interviews with Twitter users,
• experimental settings of using Twitter in certain
environments,
• quantitative analysis of tweets and their
characteristics,
• network analysis of (following) users,
• linguistic analyses, e.g. word clustering, event
detection, sentiment analysis,
• analysis of tweets.
Almost no combination of methods
13. Results II
Data
• collections of tweets (retrieved randomly or
based on specific search criteria)
• user profiles / user networks
• data from experiments, surveys, interviews.
Different datasets
Different sizes
14. Results III
Datasets - Examples
•
•
•
•
•
•
•
•
•
•
•
•
309740 Twitter users (with followers and tweets)
Experiment with 125 students.
17,803 tweets from 8,616 users + 1st degree network (3,048,360 directed
edges, 631,416 unique followers, and 715,198 unique friends)
1.3 million Twitter conversations, with each conversation containing
between 2 and 243 posts
20,000 tweets
1,827 annotated tweets
Experiment with 1677 participants
Survey with 505 young American adults
21,623,947 geo-tagged tweets
One person’s Twitter network (652 followers, 114 followings).
none
99,832 tweets
15. Results IV
Challanges and Limitations
- Technical challenges (API limitations, data quality)
- Ethical issues (?)
- Co-existing approaches for the same problem
(e.g. sentiment analysis)
- Sampling (size of dataset, random sample,
subsample, biases, representativeness)
- User information, geo information often
unavailable.
17. Thank you for your attention!
Coming soon:
Weller, K., Bruns, A., Burgess, J., Mahrt , M. &
Puschmann, C. (Ed.) (2013). Twitter and Society.
New York: Peter Lang.