Challenges in Archiving Social Media
Data for Research:
The Case of Twitter
Dr. Katrin Weller
GESIS – Leibniz Institute for the Social Sciences
Data Archive for the Social Sciences / Computational Social Science
Cologne, Germany
●
Digital Studies Fellow at John W. Kluge Center
Library of Congress
Washington D.C.
E-Mail: katrin.weller@gesis.org ● Twitter: @kwelle ● Web: www.katrinweller.net
Slides are available at: http://de.slideshare.net/katrinweller
SERIOUSLY? DO THEY NOT REALIZE THAT 99%
OF TWEETS ARE WORTHLESS BABBLE THAT
READ SOMETHING LIKE ‘JUST WOKE UP. GOING
TO STARBUCKS NOW. GETTING LATTE.’
Reader's comment found in the comment section for Gross, D. (2010, April 14). Library of Congress to archive your tweets. CNN. Retrieved November 19 from http://edition.cnn.com/2010/TECH/04/14/library.congress.twitter/
Photos: https://www.flickr.com/search/?text=coffee&license=4%2C5%2C6%2C9%2C10
Background
[Figure: Number of publications per year (2001-2013) that mention the respective social media platform's name in their title, based on a Scopus title search. Platforms shown: Twitter, Facebook, YouTube, Blogs, Wikis, Foursquare, LinkedIn, MySpace. Vertical axis: 0 to 600 publications.]
For details: http://kwelle.wordpress.com/2014/04/07/bibliometric-analysis-of-social-media-research/
Twitter research until 2013 by discipline
Opportunities in Social Media Research
• Researchers value social media as a new type of data
• Previously "ephemeral" data become visible
• Immediate: quick reaction to events
• Structured
• "Natural" data
“What I find really interesting is that structure becomes manifest in
internet communication. So it’s the first time in history actually that
we can, that social structures between people become manifest
within a technology. (...) They become visible, they become
crawlable, they become analyzable.”
Kinder-Kurlanda, Katharina E., and Katrin Weller. 2014. "'I always feel it must be great to be a hacker!': The role of interdisciplinary work
in social media research." In Proceedings of the 2014 ACM conference on Web Science, 91-98. New York: ACM.
One of the Challenges: Data Sharing
“But you can’t make your data available for others to look at, which means both
your study can’t really be replicated and it can’t be tested for review. But also it
just means your data can’t be made available for other people to say, Ah you
have done this with it, I’ll see what I can do with it, (…) There is no open data.”
Weller, Katrin, and Katharina E. Kinder-Kurlanda. 2015. "Uncovering the Challenges in Collection, Sharing and Documentation: The Hidden
Data of Social Media Research?." In Standards and Practices in Large-Scale Social Media Research: Papers from the 2015 ICWSM
Workshop. Proceedings Ninth International AAAI Conference on Web and Social Media Oxford University, May 26, 2015 – May 29, 2015,
28-37. Ann Arbor, MI: AAAI Press.
What is Twitter data?
“I actually only use [other researcher’s datasets] where I’m very sure about
where it comes from and how it was processed and analyzed. There is too
much uncertainty in it.”
Weller, Katrin, and Katharina E. Kinder-Kurlanda. 2015. "Uncovering the Challenges in Collection, Sharing and Documentation: The Hidden
Data of Social Media Research?." In Standards and Practices in Large-Scale Social Media Research: Papers from the 2015 ICWSM
Workshop. Proceedings Ninth International AAAI Conference on Web and Social Media Oxford University, May 26, 2015 – May 29, 2015,
28-37. Ann Arbor, MI: AAAI Press.
Different methods and types of datasets, examples from popular social science papers
Weller, K. (2014). What do we get from Twitter – and what not? A close look at Twitter research in the social sciences.
Knowledge Organization. 41(3), 238-248
Example 2008-2013 papers on Twitter and elections:
data sources
Weller, K. (2014). Twitter und Wahlen: Zwischen 140 Zeichen und Milliarden von Tweets. In: R. Reichert (Ed.), Big
Data: Analysen zum digitalen Wandel von Wissen, Macht und Ökonomie (pp. 239-257). Bielefeld: transcript.
Data source                                                          Number
No information                                                       11
Collected manually from Twitter website (Copy-Paste / Screenshot)    6
Twitter API (no further information)                                 8
Twitter Search API                                                   3
Twitter Streaming API                                                1
Twitter REST API                                                     1
Twitter API user timeline                                            1
Own program for accessing Twitter APIs                               4
Twitter Gardenhose                                                   1
Official Reseller (Gnip, DataSift)                                   3
YourTwapperKeeper                                                    3
Other tools (e.g. Topsy)                                             6
Received from colleagues                                             1
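The counts listed above can be tallied to see how large the share of undocumented data sources is. The following is a minimal sketch in Python, not part of the original slides: the dictionary simply restates the numbers from the slide, and the function name `summarize` is made up for illustration. The slide does not say whether a paper can appear under more than one source, so the total is treated here as a number of mentions rather than a number of papers.

```python
# Counts as listed on the slide (data source -> number of mentions).
data_sources = {
    "No information": 11,
    "Collected manually from Twitter website (Copy-Paste / Screenshot)": 6,
    "Twitter API (no further information)": 8,
    "Twitter Search API": 3,
    "Twitter Streaming API": 1,
    "Twitter REST API": 1,
    "Twitter API user timeline": 1,
    "Own program for accessing Twitter APIs": 4,
    "Twitter Gardenhose": 1,
    "Official Reseller (Gnip, DataSift)": 3,
    "YourTwapperKeeper": 3,
    "Other tools (e.g. Topsy)": 6,
    "Received from colleagues": 1,
}


def summarize(counts):
    """Return (total, rows); rows are (source, count, share in percent), largest first."""
    total = sum(counts.values())
    rows = sorted(
        ((name, n, round(100 * n / total, 1)) for name, n in counts.items()),
        key=lambda row: row[1],
        reverse=True,
    )
    return total, rows


total, rows = summarize(data_sources)
print(f"Total mentions: {total}")
for name, n, share in rows:
    print(f"{n:>3}  {share:>5.1f}%  {name}")
```

Sorted this way, "No information" comes out as the largest single category, which matches the point of the slide: a sizeable share of the listed data sources is not documented at all.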
Archiving Twitter Datasets?
Current approaches
Format supported by Twitter Terms of Service
Available datasets
• From individual researchers/groups (sometimes "black market").
• From conferences: e.g. ICWSM
• Archival institutions? GESIS working on first release.
Challenges in Archiving Twitter Data
Sources for Challenges
(1) the Twitter Terms of Service
(2) ethical challenges
(3) lack of standard metadata
(4) the ever-changing nature of Twitter and of Twitter users
The changing nature of Twitter
in 5 examples
#1 Deleted content
#2 Lost context: interfaces, look and feel
#3 Lost context: stories, meanings
#4 Lost context: user names
#5 URLs and images
Supplement: some useful references
Tools / Methods for collecting tweets:
• Borra, E., & Rieder, B. (2014). Programmed method: developing a toolset
for capturing and analyzing tweets. Aslib Journal of Information
Management, 66(3), 262-278. DOI: http://dx.doi.org/10.1108/AJIM-09-2013-0094
• Bruns, A., & Liang, Y. E. (2012). Tools and methods for capturing Twitter
data during natural disasters. First Monday, 17(4).
doi:10.5210/fm.v17i4.3937
• Gaffney, D., & Puschmann, C. (2014). Data collection on Twitter. In K. Weller,
A. Bruns, J. Burgess, M. Mahrt and C. Puschmann (Eds.), Twitter and
Society (pp. 55–68). New York: Peter Lang.
There are many more tools, though. See, e.g., the collection at:
https://docs.google.com/document/d/1UaERzROI986HqcwrBDLaqGG8X_lYwctj6ek6ryqDOiQ/edit (curated by D. Freelon).
Challenges in collecting tweets / data quality:
• Bruns, A. (2011, June 21). Switching from Twapperkeeper to yourTwapperkeeper. Retrieved
January 31, 2015 from http://www.mappingonlinepublics.net/2011/06/21/switching-from-twapperkeeper-to-yourtwapperkeeper/.
• Bruns, A. and Stieglitz, S. (2014), “Twitter data: what do they represent?” it - Information
Technology, Vol. 56 No. 5, pp. 240-5, [online], available at:
http://www.degruyter.com/view/j/itit.2014.56.issue-5/itit-2014-1049/itit-2014-1049.xml
(accessed 28 February 2015), DOI: 10.1515/itit-2014-1049.
• Jungherr, A., Jurgens, P. and Schoen, H. (2012), “Why the Pirate Party won the German
Election of 2009 or The trouble with predictions: a response to Tumasjan, A., Sprenger, T. O.,
Sander, P. G. and Welpe, I. M. Predicting elections with Twitter: what 140 characters reveal
about political sentiment”, Social Science Computer Review, Vol. 30 No. 2, pp. 229-34,
[online], available at: http://ssc.sagepub.com/content/30/2/229 (accessed 28 February
2015), DOI: 10.1177/0894439311404119.
• Morstatter, Fred, Jürgen Pfeffer, Huan Liu, and Kathleen M. Carley. 2013. “Is the Sample Good
Enough? Comparing Data from Twitter’s Streaming API with Twitter’s Firehose.”
http://arxiv.org/abs/1306.5204.
• Summers, E. (2015). Tweets and Deletes. Retrieved June 9, 2015 from:
https://medium.com/on-archivy/tweets-and-deletes-727ed74f84ed (see also:
https://github.com/edsu/twarc)
Bibliometric studies of Twitter research:
• Williams, S. A., Terras, M. M., & Warwick, C. (2013a). What do people study when
they study Twitter? Classifying Twitter related academic papers. Journal of
Documentation, 69(3): 384-410.
• Williams, S. A., Terras, M. M., & Warwick, C. (2013b). How Twitter Is Studied in the
Medical Professions: A Classification of Twitter Papers Indexed in PubMed. In Med
2.0 2013. doi: 10.2196/med20.2269.
• Weller, K. (2014b). What do we get from Twitter – and what not? A close look at
Twitter research in the social sciences. Knowledge Organization 41(3), 238-248.
• Zimmer, M., & Proferes, J.N. (2014). A topology of Twitter research: disciplines,
methods, and ethics. Aslib Journal of Information Management, 66(3), 250–261.
doi:10.1108/AJIM-09-2013-0083
Critical perspectives on data access and inequalities:
• Boyd, D. and Crawford, K. (2012), “Critical questions for Big Data: provocations for
a cultural, technological, and scholarly phenomenon”, Information,
Communication & Society, Vol. 15 No. 5, pp. 662-79, [online], available at:
http://www.tandfonline.com/doi/full/10.1080/1369118X.2012.678878#abstract
(accessed 28 February 2015), DOI: 10.1080/1369118X.2012.678878.
Legal and ethical challenges:
• Beurskens, M. (2014). Legal Questions of Twitter Research. In K. Weller, A. Bruns, J. Burgess,
M. Mahrt and C. Puschmann (Eds.), Twitter and Society (pp. 123-133). New York: Peter Lang.
• Markham, A. and Buchanan, E. (2012), “Ethical decision-making and internet research 2.0:
recommendations from the AoIR Ethics Working Committee”, available at:
http://www.aoir.org/reports/ethics2.pdf (accessed 28 February 2015).
• Weller, Katrin, and Katharina E. Kinder-Kurlanda. 2014. "I love thinking about ethics:
Perspectives on ethics in social media research." In Selected Papers of Internet Research
(SPIR). Proceedings of ir15 - Boundaries and Intersections,
http://spir.aoir.org/index.php/spir/article/view/997.
• Zimmer, M. & Proferes, J.N. (2014). Privacy on Twitter, Twitter on privacy. In K. Weller, A. Bruns,
J. Burgess, M. Mahrt and C. Puschmann (Eds.), Twitter and Society (pp. 169-182). New York:
Peter Lang.
• Zimmer, M. (2010), “But the data is already public: on the ethics of research in Facebook”,
Ethics and Information Technology, Vol. 12 No. 4, pp. 313-25, DOI: 10.1007/s10676-010-9227-5.
Twitter's activities:
• Krikorian, R. (2014a), “Introducing Twitter Data Grants”, [online], available at:
https://blog.twitter.com/2014/introducing-twitter-data-grants (accessed 28
February 2015).
• Krikorian, R. (2014b), “Twitter #DataGrants selections”, available at:
https://blog.twitter.com/2014/twitter-datagrants-selections (accessed 28
February 2015).
• Stone, B. (2010). Tweet Preservation. Blog post, April 14, 2010. Retrieved from
https://blog.twitter.com/2010/tweet-preservation
• Twitter (2014). Developer Agreement & Policy. Twitter Developer Agreement,
retrieved January 31, 2015 from
https://dev.twitter.com/overview/terms/agreement-and-policy.
• Twitter (no date). Guidelines for using Tweets in broadcast, retrieved January 31,
2015, from https://support.twitter.com/articles/114233.
Library of Congress's activities:
• Allen, E. (2013, January 4). Update on the Twitter Archive at the Library of
Congress. Retrieved January 31, 2015 from
http://blogs.loc.gov/loc/2013/01/update-on-the-twitter-archive-at-the-library-of-congress/
• McLemee, S. (2015). The Archive is closed. Inside Higher Ed.
Retrieved June 9, 2015 from:
https://www.insidehighered.com/views/2015/06/03/article-difficulties-social-media-research
• Raymond, M. (2010). How Tweet It Is! Library Acquires Entire Twitter
Archive. Retrieved January 31, from
http://blogs.loc.gov/loc/2010/04/how-tweet-it-is-library-acquires-entire-twitter-archive/
Examples of Twitter datasets shared publicly:
• CrisisLex on GitHub: https://github.com/sajao/CrisisLex/tree/master/data/CrisisLexT26/
• Hadgu & Jäschke 2014 dataset on GitHub: https://github.com/L3S/twitter-researcher
• ICWSM 2012 datasets: http://www.icwsm.org/2012/submitting/datasets/
• ICWSM 2014 datasets: http://www.icwsm.org/2014/datasets/datasets/
• MPI-SWS (no date). The Twitter Project Page at MPI-SWS. Retrieved January 26, 2015
from http://twitter.mpi-sws.org/ (Archived by WebCite® at
http://www.webcitation.org/6VsuuxQlU)
• TREC 2011: http://trec.nist.gov/data/tweets/
• sananalytics (2011). Public domain twitter sentiment corpus. Post in Twitter Developers
Forums. Retrieved Jan 31, 2015 from https://twittercommunity.com/t/public-domain-twitter-sentiment-corpus/13290
GREETINGS FROM COLOGNE
QUESTIONS AND FEEDBACK
katrin.weller@gesis.org
@kwelle
http://katrinweller.net

Content strategy : Content empire and cash in
 
Enhancing Consumer Trust Through Strategic Content Marketing
Enhancing Consumer Trust Through Strategic Content MarketingEnhancing Consumer Trust Through Strategic Content Marketing
Enhancing Consumer Trust Through Strategic Content Marketing
 
Ignite Your Online Influence: Sociocosmos - Where Social Media Magic Happens
Ignite Your Online Influence: Sociocosmos - Where Social Media Magic HappensIgnite Your Online Influence: Sociocosmos - Where Social Media Magic Happens
Ignite Your Online Influence: Sociocosmos - Where Social Media Magic Happens
 

Challenges in Archiving Twitter

  • 1. Challenges in Archiving Social Media Data for Research: The Case of Twitter. Dr. Katrin Weller, GESIS – Leibniz-Institute for the Social Sciences, Data Archive for the Social Sciences / Computational Social Science, Cologne, Germany ● Digital Studies Fellow at John W. Kluge Center, Library of Congress, Washington D.C. E-Mail: katrin.weller@gesis.org ● Twitter: @kwelle ● Web: www.katrinweller.net ● Slides are available at: http://de.slideshare.net/katrinweller
  • 2. "Seriously? Do they not realize that 99% of tweets are worthless babble that read something like 'Just woke up. Going to Starbucks now. Getting latte.'" Reader's comment found in the comment section for Gross, D. (2010, April 14). Library of Congress to archive your tweets. CNN. Retrieved from HTTP://EDITION.CNN.COM/2010/TECH/04/14/LIBRARY.CONGRESS.TWITTER/, retrieved November 19. Photos: HTTPS://WWW.FLICKR.COM/SEARCH/?TEXT=COFFEE&LICENSE=4%2C5%2C6%2C9%2C10
  • 3. Background. [Figure: chart covering 2001 to 2013, vertical axis 0 to 600, with one series each for Twitter, Facebook, YouTube, Blogs, Wikis, Foursquare, LinkedIn and MySpace.] Number of publications per year that mention the respective social media platform's name in their title. Scopus title search. For details: http://kwelle.wordpress.com/2014/04/07/bibliometric-analysis-of-social-media-research/
  • 4. Twitter research until 2013 by discipline
  • 5. Chances in Social Media Research
      • Researchers value social media as a new type of data
      • Previously "ephemeral data" become visible
      • Immediate: quick reaction to events
      • Structured
      • "Natural" data
      “What I find really interesting is that structure becomes manifest in internet communication. So it’s the first time in history actually that we can, that social structures between people become manifest within a technology. (...) They become visible, they become crawlable, they become analyzable.” Kinder-Kurlanda, Katharina E., and Katrin Weller. 2014. "'I always feel it must be great to be a hacker!': The role of interdisciplinary work in social media research." In Proceedings of the 2014 ACM conference on Web Science, 91-98. New York: ACM.
  • 6. One of the Challenges: Data Sharing. “But you can’t make your data available for others to look at, which means both your study can’t really be replicated and it can’t be tested for review. But also it just means your data can’t be made available for other people to say, Ah you have done this with it, I’ll see what I can do with it, (…) There is no open data.” Weller, Katrin, and Katharina E. Kinder-Kurlanda. 2015. "Uncovering the Challenges in Collection, Sharing and Documentation: The Hidden Data of Social Media Research?" In Standards and Practices in Large-Scale Social Media Research: Papers from the 2015 ICWSM Workshop. Proceedings Ninth International AAAI Conference on Web and Social Media, Oxford University, May 26, 2015 – May 29, 2015, 28-37. Ann Arbor, MI: AAAI Press.
  • 7. What is Twitter data? “I actually only use [other researchers' datasets] where I’m very sure about where it comes from and how it was processed and analyzed. There is too much uncertainty in it.” Weller, Katrin, and Katharina E. Kinder-Kurlanda. 2015. "Uncovering the Challenges in Collection, Sharing and Documentation: The Hidden Data of Social Media Research?" In Standards and Practices in Large-Scale Social Media Research: Papers from the 2015 ICWSM Workshop. Proceedings Ninth International AAAI Conference on Web and Social Media, Oxford University, May 26, 2015 – May 29, 2015, 28-37. Ann Arbor, MI: AAAI Press.
  • 8. Different methods and types of datasets: examples from popular social science papers. Weller, K. (2014). What do we get from Twitter – and what not? A close look at Twitter research in the social sciences. Knowledge Organization, 41(3), 238-248.
  • 9. Example: 2008-2013 papers on Twitter and elections, data sources. Weller, K. (2014). Twitter und Wahlen: Zwischen 140 Zeichen und Milliarden von Tweets. In: R. Reichert (Ed.), Big Data: Analysen zum digitalen Wandel von Wissen, Macht und Ökonomie (pp. 239-257). Bielefeld: transcript.
      Data source: number
      • No information: 11
      • Collected manually from Twitter website (copy-paste / screenshot): 6
      • Twitter API (no further information): 8
      • Twitter Search API: 3
      • Twitter Streaming API: 1
      • Twitter REST API: 1
      • Twitter API user timeline: 1
      • Own program for accessing Twitter APIs: 4
      • Twitter Gardenhose: 1
      • Official reseller (Gnip, DataSift): 3
      • YourTwapperKeeper: 3
      • Other tools (e.g. Topsy): 6
      • Received from colleagues: 1
  • 13. Format supported by Twitter Terms of Service
  • 14. Available datasets
      • From individual researchers/groups (sometimes "black market").
      • From conferences, e.g. ICWSM.
      • Archival institutions? GESIS is working on a first release.
  • 15. Challenges in Archiving Twitter Data
  • 16. Sources for Challenges: (1) the Twitter Terms of Service; (2) ethical challenges; (3) lack of standard metadata; (4) the ever-changing nature of Twitter and of Twitter users
  • 18. The changing nature of Twitter in 5 examples
  • 20. #2 Lost context: interfaces, look and feel
  • 24. Supplement: some useful references. Tools / methods for collecting tweets:
      • Borra, E., & Rieder, B. (2014). Programmed method: developing a toolset for capturing and analyzing tweets. Aslib Journal of Information Management, 66(3), 262-278. DOI: http://dx.doi.org/10.1108/AJIM-09-2013-0094
      • Bruns, A., & Liang, Y. E. (2012). Tools and methods for capturing Twitter data during natural disasters. First Monday, 17(4). doi:10.5210/fm.v17i4.3937
      • Gaffney, D., & Puschmann, C. (2014). Data collection on Twitter. In K. Weller, A. Bruns, J. Burgess, M. Mahrt and C. Puschmann (Eds.), Twitter and Society (pp. 55-68). New York: Peter Lang.
      [There are many more tools, though. See, e.g., the collection at: https://docs.google.com/document/d/1UaERzROI986HqcwrBDLaqGG8X_lYwctj6ek6ryqDOiQ/edit (curated by D. Freelon).]
  • 25. Supplement: some useful references. Challenges in collecting tweets / data quality:
      • Bruns, A. (2011, June 21). Switching from Twapperkeeper to yourTwapperkeeper. Retrieved January 31, 2015 from http://www.mappingonlinepublics.net/2011/06/21/switching-from-twapperkeeper-to-yourtwapperkeeper/.
      • Bruns, A. and Stieglitz, S. (2014), “Twitter data: what do they represent?” IT Information Technology, Vol. 56 No. 5, pp. 240-5, [online], available at: http://www.degruyter.com/view/j/itit.2014.56.issue-5/itit-2014-1049/itit-2014-1049.xml (accessed 28 February 2015), DOI: 10.1515/itit-2014-1049.
      • Jungherr, A., Jurgens, P. and Schoen, H. (2012), “Why the Pirate Party won the German Election of 2009 or The trouble with predictions: a response to Tumasjan, A., Sprenger, T. O., Sander, P. G. and Welpe, I. M. Predicting elections with Twitter: what 140 characters reveal about political sentiment”, Social Science Computer Review, Vol. 30 No. 2, pp. 229-34, [online], available at: http://ssc.sagepub.com/content/30/2/229 (accessed 28 February 2015), DOI: 10.1177/0894439311404119.
      • Morstatter, Fred, Jürgen Pfeffer, Huan Liu, and Kathleen M. Carley. 2013. “Is the Sample Good Enough? Comparing Data from Twitter’s Streaming API with Twitter’s Firehose.” http://arxiv.org/abs/1306.5204.
      • Summers, E. (2015). Tweets and Deletes. Retrieved June 9, 2015 from: https://medium.com/on-archivy/tweets-and-deletes-727ed74f84ed (see also: https://github.com/edsu/twarc)
  • 26. Supplement: some useful references. Bibliometric studies of Twitter researchers:
      • Williams, S. A., Terras, M. M., & Warwick, C. (2013a). What do people study when they study Twitter? Classifying Twitter related academic papers. Journal of Documentation, 69(3), 384-410.
      • Williams, S. A., Terras, M. M., & Warwick, C. (2013b). How Twitter Is Studied in the Medical Professions: A Classification of Twitter Papers Indexed in PubMed. In Med 2.0 2013. doi: 10.2196/med20.2269.
      • Weller, K. (2014b). What do we get from Twitter – and what not? A close look at Twitter research in the social sciences. Knowledge Organization, 41(3), 238-248.
      • Zimmer, M., & Proferes, J.N. (2014). A topology of Twitter research: disciplines, methods, and ethics. Aslib Journal of Information Management, 66(3), 250-261. doi:10.1108/AJIM-09-2013-0083
  • 27. Supplement: some useful references. Critical perspectives on data access and inequalities:
      • Boyd, D. and Crawford, K. (2012), “Critical questions for Big Data: provocations for a cultural, technological, and scholarly phenomenon”, Information, Communication & Society, Vol. 15 No. 5, pp. 662-79, [online], available at: http://www.tandfonline.com/doi/full/10.1080/1369118X.2012.678878#abstract (accessed 28 February 2015), DOI: 10.1080/1369118X.2012.678878.
  • 28. Supplement: some useful references. Legal and ethical challenges:
      • Beurskens, M. (2014). Legal Questions of Twitter Research. In K. Weller, A. Bruns, J. Burgess, M. Mahrt and C. Puschmann (Eds.), Twitter and Society (pp. 123-133). New York: Peter Lang.
      • Markham, A. and Buchanan, E. (2012), “Ethical decision-making and internet research 2.0: recommendations from the AoIR Ethics Working Committee”, available at: http://www.aoir.org/reports/ethics2.pdf (accessed 28 February 2015).
      • Weller, Katrin, and Katharina E. Kinder-Kurlanda. 2014. "I love thinking about ethics: Perspectives on ethics in social media research." In Selected Papers of Internet Research (SPIR). Proceedings of ir15 - Boundaries and Intersections, http://spir.aoir.org/index.php/spir/article/view/997.
      • Zimmer, M. & Proferes, J.N. (2014). Privacy on Twitter, Twitter on privacy. In K. Weller, A. Bruns, J. Burgess, M. Mahrt and C. Puschmann (Eds.), Twitter and Society (pp. 169-182). New York: Peter Lang.
      • Zimmer, M. (2010), “But the data is already public: on the ethics of research in Facebook”, Ethics and Information Technology, Vol. 12 No. 4, pp. 313-25, DOI: 10.1007/s10676-010-9227-5.
  • 29. Supplement: some useful references. Twitter's activities:
      • Krikorian, R. (2014a), “Introducing Twitter Data Grants”, [online], available at: https://blog.twitter.com/2014/introducing-twitter-data-grants (accessed 28 February 2015).
      • Krikorian, R. (2014b), “Twitter #DataGrants selections”, available at: https://blog.twitter.com/2014/twitter-datagrants-selections (accessed 28 February 2015).
      • Stone, B. (2010). Tweet Preservation. Blog post, April 14, 2010. Retrieved from https://blog.twitter.com/2010/tweet-preservation
      • Twitter (2014). Developer Agreement & Policy. Twitter Developer Agreement, retrieved January 31, 2015 from https://dev.twitter.com/overview/terms/agreement-and-policy.
      • Twitter (no date). Guidelines for using Tweets in broadcast, retrieved January 31, 2015, from https://support.twitter.com/articles/114233.
  • 30. Supplement: some useful references. Library of Congress' activities:
      • Allen, E. (2013, January 4). Update on the Twitter Archive at the Library of Congress. Retrieved January 31, 2015 from http://blogs.loc.gov/loc/2013/01/update-on-the-twitter-archive-at-the-library-of-congress/
      • McLemee, S. (2015). The Archive is closed. Inside Higher Education. Retrieved June 9, 2015 from: https://www.insidehighered.com/views/2015/06/03/article-difficulties-social-media-research
      • Raymond, M. (2010). How Tweet It Is! Library Acquires Entire Twitter Archive. Retrieved January 31, from http://blogs.loc.gov/loc/2010/04/how-tweet-it-is-library-acquires-entire-twitter-archive/
  • 31. Supplement: some useful references. Examples of Twitter datasets shared publicly:
      • CrisisLex on Github: https://github.com/sajao/CrisisLex/tree/master/data/CrisisLexT26/
      • Hadgu & Jäschke 2014 dataset on Github: https://github.com/L3S/twitter-researcher
      • ICWSM 2012 datasets: http://www.icwsm.org/2012/submitting/datasets/
      • ICWSM 2014 datasets: http://www.icwsm.org/2014/datasets/datasets/
      • MPI-SWS (no date). The Twitter Project Page at MPI-SWS. Retrieved January 26, 2015 from http://twitter.mpi-sws.org/ (Archived by WebCite® at http://www.webcitation.org/6VsuuxQlU)
      • TREC 2011: http://trec.nist.gov/data/tweets/
      • sananalytics (2011). Public domain twitter sentiment corpus. Post in Twitter Developers Forums. Retrieved Jan 31, 2015 from https://twittercommunity.com/t/public-domain-twitter-sentiment-corpus/13290