Chances and Challenges of
Studying Social Media Data
Dr. Katrin Weller
GESIS – Leibniz-Institute for the Social Sciences
Data Archive for the Social Sciences / Computational Social Science
Cologne, Germany
●
Digital Studies Fellow at John W. Kluge Center
Library of Congress
Washington D.C.
E-Mail: katrin.weller@gesis.org ● Twitter: @kwelle ● Web: www.katrinweller.net
My Background
• PhD in Information Science (until 2012 University of
Düsseldorf)
• Interests: Web Science, Social Media (focus on Twitter),
Knowledge representation + Semantic Web, informetrics +
altmetrics, scholarly communication
• Since 2013: GESIS, Social Web Data: New data types for social
science research; research methods and data archiving.
• Jan-May 2015: Digital Studies Fellowship at the Library of
Congress
Recent and Current Work
• Co-editor of "Twitter & Society" (Peter Lang, 2014).
• With Katharina Kinder-Kurlanda: "The hidden data of social
media research"
• #FAIL! Things that didn't work out in social media research –
and what we can learn from them (#fail2015a at Web Science
Conference, Oxford, #fail2015b at Internet Research 16,
Phoenix). https://failworkshops.wordpress.com
• Pilot project for archiving social media datasets in an election
study. (http://arxiv.org/abs/1312.4476v2)
Social media research –
some insights from bibliometrics
What is social media research?
457,000
4,991
19,440
Social media research 2000-2013
[Figure: No. of publications per year (Scopus), 2000-2013; y-axis 0 to 5000]
Scopus search, conducted March 2014: (TITLE-ABS-KEY("social media") OR TITLE-ABS-KEY("social web") OR TITLE-ABS-KEY("social software") OR TITLE-ABS-KEY("web 2.0")) AND PUBYEAR > 1999
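The Scopus search string quoted on this slide can be assembled from a list of search terms, which makes it easier to document and rerun the query. The following is a minimal sketch only: the function name `build_scopus_query` and its parameters are hypothetical, and it only builds the query text; it does not contact Scopus.

```python
def build_scopus_query(terms, field="TITLE-ABS-KEY", after_year=1999):
    """Build a Scopus advanced-search string (hypothetical helper).

    Each term is wrapped in the given field code and the clauses are
    joined with OR; the result is restricted to years after `after_year`.
    """
    clauses = " OR ".join(f'{field}("{term}")' for term in terms)
    return f"({clauses}) AND PUBYEAR > {after_year}"


# The four search terms named on the slide.
TERMS = ["social media", "social web", "social software", "web 2.0"]

query = build_scopus_query(TERMS)
print(query)
```

Keeping the term list separate from the query syntax means a later rerun (for example with an additional term) can be documented as a one-line change.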
Scopus: 2000-2013 by country
[Figure: No. of publications by country (Scopus); x-axis 0 to 7000. Countries shown in the chart:]
United States
United Kingdom
Germany
Australia
China
Spain
Canada
Italy
France
Taiwan
Netherlands
South Korea
Finland
Austria
Japan
Greece
India
Singapore
Switzerland
Hong Kong
Ireland
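Scopus: 2000-2013 by subject area
[Figure: publications by subject area (Scopus). Data labels shown in the chart: 10650 (36%); 5542 (19%); 2384; 2288; 2151; 1535; 773; 772; 65. Subject areas in the legend:]
Computer Science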
Social Sciences
Engineering
Medicine
Business, Management and Accounting
Mathematics
Arts and Humanities
Decision Sciences
Psychology
Nursing
Economics, Econometrics and Finance
Biochemistry, Genetics and Molecular Biology
Health Professions
Environmental Science
Earth and Planetary Sciences
Agricultural and Biological Sciences
Pharmacology, Toxicology and Pharmaceutics
Physics and Astronomy
Materials Science
Multidisciplinary
Neuroscience
Immunology and Microbiology
Chemical Engineering
Veterinary
Dentistry
Chemistry
Energy
Twitter research by discipline
Challenge
• Interdisciplinarity!
• „Social media research“ is not a coherent research field.
• Influences from lots of different disciplines.
• Some disciplines still isolated, not all equally advanced in
technical tasks.
• Challenge of keeping track of what is going on – across
disciplines.
Example: Twitter research in social sciences
Weller, K. (2014). What do we get from Twitter – and What Not? A Close Look at Twitter Research in the
Social Sciences. Knowledge Organization 41(3), 238-248.
Challenge vs. Chance
• Lots of room for exploration and innovation
• Few or no standards
Comparability
• Data?
• Collection Period?
• Method?
• Tools?
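The four comparability questions above can be read as a minimal documentation record for a study's dataset. The sketch below is an illustration under assumptions, not a proposed standard: the class `DatasetDescription`, its field names and the helper `undocumented` are hypothetical and simply mirror the four bullets.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class DatasetDescription:
    """Hypothetical record with one field per comparability question."""
    data: Optional[str] = None               # what was collected
    collection_period: Optional[str] = None  # when it was collected
    method: Optional[str] = None             # how it was analyzed
    tools: Optional[str] = None              # which tools or APIs were used


def undocumented(desc: DatasetDescription) -> List[str]:
    """Return the names of fields that are missing or empty."""
    return [f.name for f in fields(desc) if not getattr(desc, f.name)]


# Example with made-up values: two of the four questions are answered.
example = DatasetDescription(
    data="tweets containing a hypothetical election hashtag",
    tools="Twitter Streaming API",
)
print(undocumented(example))  # ['collection_period', 'method']
```

A check like this could be run over a set of coded publications to see which of the four questions are most often left unanswered.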
Different methods even in social science Twitter research
Weller, K. (2014). What do we get from Twitter – and what not? A close look at Twitter research in the social sciences.
Knowledge Organization. 41(3), 238-248
Example
[Figure: Publications on "Twitter and elections" per year, 2008-2013 (Scopus and Web of Science); y-axis 0 to 60]
Weller, K. (2014). Twitter und Wahlen: Zwischen 140 Zeichen und Milliarden von Tweets. In: R. Reichert (Ed.), Big Data:
Analysen zum digitalen Wandel von Wissen, Macht und Ökonomie (pp. 239-257). Bielefeld: transcript.
Year of election | Name of election | Country/region | No. of papers (2013) | Date of election (DD.MM.YYYY)
2008 | 40th Canadian General Election | Canada | 1 | 14.10.2008
2009 | European Parliament election, 2009 | Europe | 1 | 07.06.2009
2009 | German federal election, 2009 | Germany | 2 | 27.09.2009
2010 | 2010 UK general election | United Kingdom | 4 | 06.05.2010
2010 | South Korean local elections, 2010 | South Korea | 1 | 02.06.2010
2010 | Dutch general election, 2010 | Netherlands | 2 | 09.06.2010
2010 | Australian federal election, 2010 | Australia | 1 | 21.08.2010
2010 | Swedish general election, 2010 | Sweden | 1 | 19.09.2010
2010 | Midterm elections / United States House of Representatives elections, 2010 | USA | 4 | 02.11.2010
2010 | Gubernatorial elections: Georgia | USA | 1 | 02.11.2010
2010 | Gubernatorial elections: Ohio | USA | 1 | 02.11.2010
2010 | Gubernatorial elections: Rhode Island | USA | 1 | 02.11.2010
2010 | Gubernatorial elections: Vermont | USA | 1 | 02.11.2010
2010 | 2010 superintendent elections | South Korea | 1 | 17.12.2010
2011 | Baden-Württemberg state election, 2011 | Germany | 1 | 27.03.2011
2011 | Rhineland-Palatinate state election, 2011 | Germany | 1 | 27.03.2011
2011 | Scottish parliament election 2011 | Scotland | 1 | 05.05.2011
2011 | Singapore’s 16th parliamentary General Election | Singapore | 1 | 07.05.2011
2011 | Norwegian local elections, 2011 | Norway | 2 | 12.09.2011
2011 | 2011 Danish parliamentary election | Denmark | 2 | 15.09.2011
2011 | Berlin state election, 2011 | Germany | 2 | 18.09.2011
2011 | Gubernatorial elections: West Virginia | USA | 1 | 04.10.2011
2011 | Gubernatorial elections: Louisiana | USA | 1 | 22.10.2011
2011 | Swiss federal election, 2011 | Switzerland | 1 | 23.10.2011
2011 | 2011 Seoul mayoral elections | South Korea | 1 | 26.10.2011
2011 | Gubernatorial elections: Kentucky | USA | 1 | 08.11.2011
2011 | Gubernatorial elections: Mississippi | USA | 1 | 08.11.2011
2011 | Spanish national election 2011 | Spain | 1 | 20.11.2011
2012 | Queensland State election | Australia | 1 | 24.03.2012
2012 | South Korean legislative election, 2012 | South Korea | 1 | 11.04.2012
2012 | French presidential election, 2012 | France | 2 | 22.04.2012
2012 | Mexican general election, 2012 | Mexico | 1 | 01.07.2012
2012 | United States presidential election, 2012 / United States House of Representatives elections, 2012 | USA | 17 | 06.11.2012
2012 | South Korean presidential election, 2012 | South Korea | 2 | 19.12.2012
2013 | Ecuadorian general election, 2013 | Ecuador | 1 | 17.02.2013
2013 | Venezuelan presidential election, 2013 | Venezuela | 1 | 14.04.2013
2013 | Paraguayan general election, 2013 | Paraguay | 1 | 21.04.2013
Big DATA?
2013: Twitter and elections
No. of tweets | No. of publications (2013)
0-500 | 3
501-1,000 | 4
1,001-5,000 | 1
5,001-10,000 | 1
10,001-50,000 | 7
50,001-100,000 | 4
100,001-500,000 | 5
500,001-1,000,000 | 3
1,000,001-5,000,000 | 3
More than 5,000,000 | 3
More than 100,000,000 | 1
More than 1,000,000,000 | 1
No/insufficient information | 13
Weller, K. (2014). Twitter und Wahlen: Zwischen 140 Zeichen und Milliarden von Tweets. In: R. Reichert (Ed.), Big
Data: Analysen zum digitalen Wandel von Wissen, Macht und Ökonomie (pp. 239-257). Bielefeld: transcript.
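The size classes on the "Big DATA?" slide can be applied to reported tweet counts with a simple lookup. This is a sketch under assumptions: the function `size_class` is hypothetical, and it treats the three "More than ..." classes as non-overlapping (so "More than 5,000,000" ends at 100,000,000), which the slide does not state explicitly.

```python
import bisect
from collections import Counter

# Inclusive upper bounds of all classes except the last, open-ended one.
UPPER_BOUNDS = [500, 1_000, 5_000, 10_000, 50_000, 100_000, 500_000,
                1_000_000, 5_000_000, 100_000_000, 1_000_000_000]
LABELS = ["0-500", "501-1,000", "1,001-5,000", "5,001-10,000",
          "10,001-50,000", "50,001-100,000", "100,001-500,000",
          "500,001-1,000,000", "1,000,001-5,000,000",
          "More than 5,000,000", "More than 100,000,000",
          "More than 1,000,000,000"]


def size_class(n_tweets):
    """Map a reported tweet count to its size class (None = not reported)."""
    if n_tweets is None:
        return "No/insufficient information"
    if n_tweets < 0:
        raise ValueError("tweet count cannot be negative")
    # bisect_left finds the first class whose upper bound is >= n_tweets.
    return LABELS[bisect.bisect_left(UPPER_BOUNDS, n_tweets)]


# Made-up counts for illustration only, not taken from the reviewed papers.
reported = [320, 48_000, None, 2_500_000, 48_500]
print(Counter(size_class(n) for n in reported))
```

Papers that do not report a dataset size are passed as `None`, so the "No/insufficient information" class is counted alongside the numeric classes.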
Comparability
Twitter and elections: data collection methods
Weller, K. (2014). Twitter und Wahlen: Zwischen 140 Zeichen und Milliarden von Tweets. In: R. Reichert (Ed.), Big
Data: Analysen zum digitalen Wandel von Wissen, Macht und Ökonomie (pp. 239-257). Bielefeld: transcript.
Data source | Number of publications
No information | 11
Collected manually from Twitter website (copy-paste / screenshot) | 6
Twitter API (no further information) | 8
Twitter Search API | 3
Twitter Streaming API | 1
Twitter REST API | 1
Twitter API user timeline | 1
Own program for accessing Twitter APIs | 4
Twitter Gardenhose | 1
Official reseller (Gnip, DataSift) | 3
YourTwapperKeeper | 3
Other tools (e.g. Topsy) | 6
Received from colleagues | 1
Comparability
Twitter and elections: data collection periods
Weller, K. (2014). Twitter und Wahlen: Zwischen 140 Zeichen und Milliarden von Tweets. In: R. Reichert (Ed.), Big
Data: Analysen zum digitalen Wandel von Wissen, Macht und Ökonomie (pp. 239-257). Bielefeld: transcript.
Period | Number of publications (2013)
0-10 hours | 1
1-2 days | 6
3-7 days | 3
8-14 days | 5
2-4 weeks | 7
1-2 months | 13
2-6 months | 5
7-12 months | 3
More than 12 months | 0
No/insufficient information | 6
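The three tables (dataset size, data source, collection period) each add up to 49 publications, so the share of papers that leave a given aspect undocumented can be worked out directly. The sketch below uses only the counts from the slides; the helper `share_undocumented` is hypothetical, and it assumes each publication was coded in exactly one row per table.

```python
# Counts per table row as shown on the slides; the dict keys are
# (all counts in table order, count of the "no information" row).
TABLES = {
    "dataset size": ([3, 4, 1, 1, 7, 4, 5, 3, 3, 3, 1, 1, 13], 13),
    "data source": ([11, 6, 8, 3, 1, 1, 1, 4, 1, 3, 3, 6, 1], 11),
    "collection period": ([1, 6, 3, 5, 7, 13, 5, 3, 0, 6], 6),
}


def share_undocumented(counts, missing):
    """Return (total publications, percentage without usable information)."""
    total = sum(counts)
    return total, round(100 * missing / total, 1)


results = {name: share_undocumented(counts, missing)
           for name, (counts, missing) in TABLES.items()}

for name, (total, pct) in results.items():
    print(f"{name}: {pct}% of {total} publications undocumented")
# dataset size: 26.5% of 49 publications undocumented
# data source: 22.4% of 49 publications undocumented
# collection period: 12.2% of 49 publications undocumented
```

Read this way, roughly a quarter of the 2013 publications do not state how many tweets they analyzed, which is the comparability problem the slides point to.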
What is being studied?
[Figure: Number of publications per year, 2001-2013, that mention the respective social media platform's name in their title (Scopus title search); y-axis 0 to 600. Platforms: Twitter, Facebook, YouTube, Blogs, Wikis, Foursquare, LinkedIn, MySpace]
For details: http://kwelle.wordpress.com/2014/04/07/bibliometric-analysis-of-social-media-research/
Challenges
• Quickly changing landscape of social media
platforms
• Twitter as a model organism of social media
research?
Scopus: 2000-2013 popular keywords
Social networks (897), Social network (657)
User interfaces (1,007)
Social networking (online) (2,291)
Facebook (847)
Knowledge management (860)
Web services (869)
Information systems (810)
Twitter (667)
Semantics (765), Semantic Web (669)
Communication (650)
Information technology (639)
E-learning (623)
Students (579)
Education (520)
Teaching (504)
Scopus search, conducted March 2014: (TITLE-ABS-KEY("social media") OR TITLE-ABS-KEY("social web") OR TITLE-ABS-KEY("social
software") OR TITLE-ABS-KEY("web 2.0")) AND PUBYEAR > 1999
Social Media Research: Topics
• Political communication / elections
• Activism
• Popular culture, memes
• Brand communication, marketing
• Journalism (incl. agenda setting, citizen journalism, TV
backchannel)
• Crisis communication, disaster response
• Scholarly communication
• Language
• And many more
pointless babble?
Social media research –
some insights from expert interviews
• Weller, K., & Kinder-Kurlanda, K. E. (2014). "I love thinking about
ethics!" Perspectives on ethics in social media research. Internet Research (IR15),
Daegu, South Korea, 22-24 October 2014. Paper to be published in Selected Papers of
Internet Research.
• Kinder-Kurlanda, K. E., & Weller, K. (2014). "I always feel it must be great to be a
Hacker!": The role of interdisciplinary work in social media research. In
Proceedings of the 2014 ACM Web Science Conference WebSci'14, June 23-26,
2014, Bloomington, IN, USA, pp. 91-98.
doi:http://dx.doi.org/10.1145/2615569.2615685
• Weller, K., & Kinder-Kurlanda, K. E. (in press). Uncovering the Challenges in Collection,
Sharing and Documentation: The Hidden Data of Social Media Research? To
appear in Workshop on Standards and Practices in Large-Scale Social Media
Research. ICWSM, Oxford, May 2015.
http://www.aaai.org/ocs/index.php/ICWSM/ICWSM15/paper/viewFile/10657/10552
CHANCES
• Researchers value social media as a new type of data.
• Previously „ephemeral data“ become visible
• Immediate – quick reaction to events
• Structured
• „natural“ data
What I find really interesting is that structure becomes manifest in
internet communication. So it’s the first time in history actually that
we can, that social structures between people become manifest
within a technology. (...) They become visible, they become
crawlable, they become analyzable.
Some of the CHALLENGES
Preliminary results, more detailed analysis to follow.
- Interdisciplinarity
- Ethics
- Standards
- Data access & infrastructure
- …
Unregulated and developing field
• Researchers showed a high awareness of the unregulated
and developing character of social media research
methods.
But, I think that (…) in like a couple of years, maybe
five – it depends a lot, because the subject of the
research is changing every day, (…) but I think that
we’re going to have, (…) more or less shared
qualitative approaches with a lot of good practices.
Data Sharing
But you can’t make your data available for others to look
at, which means both your study can’t really be replicated
and it can’t be tested for review. But also it just means your
data can’t be made available for other people to say, Ah
you have done this with it, I’ll see what I can do with it, (…)
There is no open data.
Data Sharing
“I think probably a couple times we’ve asked around if anyone
else happened to have a particular dataset. […] but not so much,
because they probably have tracked in a different data format,
and then merging the two together actually becomes quite
difficult as well.”
Ethics / privacy
“I will not quote tweets.”
“if somebody plays a really
important role in a particular event
then maybe they deserve the credit
of being accredited as well.”
Social media research –
some more insights from critical literature
Representativeness
Blank, G. (2014). Who uses Twitter? Representativeness of Twitter Users. Presentation at General Online Research GOR 14.
Retrieved from: http://conftool.gor.de/conftool14/index.php?page=downloadPaper&filename=Blank-Who_uses_Twitter_Representativeness-119.pptx&form_id=119&form_version=final
[Figure 6: Political Activities of Twitter Users. Twitter users vs. non-users; y-axis 0 to 100; OxIS current users: 2013, N=1,613. Activities shown: interest in politics, send political message, contact MP online, re-post political news, political comment on SNS, find political facts, sign online petition.]
Data Quality
• E.g. comparison of Twitter API and Reseller
data.
Morstatter, F., Pfeffer, J., Liu, H., & Carley, K. M. (2013). Is the Sample Good Enough? Comparing Data from Twitter’s
Streaming API with Twitter’s Firehose. Retrieved from http://arxiv.org/abs/1306.5204
Inequality in data access possibilities
boyd, d., and Crawford, K. 2012. Critical questions for Big Data: Provocations for a cultural, technological, and scholarly
phenomenon. Information, Communication & Society 15(5):662–679. DOI: 10.1080/1369118X.2012.678878
• Data haves and data have nots
– Financial reasons
– Connections to companies
– Different skills
– …
Top 5 Challenges in Twitter research
• Representativeness and validity
• Cross-platform studies
• Comparisons (e.g. different countries, points in time)
• Multi-method approaches
• Context and meaning
Bruns, Axel, and Katrin Weller. 2014. "Twitter data analytics – or: the pleasures and perils of studying Twitter (guest editorial
for special issue)". Aslib Journal of Information Management 66 (3): 246-249.
Summary
Three sources of challenges in social media research:
• The variety of user interactions that count as social media
and their ever-changing nature, which makes social media a
moving target.
• The diversity of the research community, which challenges
knowledge transfer and development of standards.
• The dependency on commercial companies to open up
access to their data.
Researchers themselves have only limited means to change
these sources of challenges.
Weller, K. (2015). Accepting the challenges of social media research. Online Information Review 39(3).
Summary
Currently addressed challenges
• Research infrastructure, including data collection and
sharing facilities, and training in new methods and
technologies.
• The call for more thoughtfulness in research ethics.
• Critical considerations on big data and data quality,
including reflection on the power of algorithms and
misrepresentations through big data approaches.
• Requests for broader scopes by facilitating multi-method and
multi-platform studies, as well as longitudinal studies.
Outlook
• Long-term preservation of social media, i.e. archiving
of data as well as of social media platforms’ look
and feel.
• Documentation of applied research methods, which
should enable comparative studies across the single
use cases, quality control and verification of results.
• Assessing social media users’ expectations on privacy
in order to respond to them through ethical standards.
GREETINGS FROM COLOGNE AND DC!
QUESTIONS AND FEEDBACK
katrin.weller@gesis.org
@kwelle
http://katrinweller.net

Weitere ähnliche Inhalte

Mehr von Katrin Weller

Weller pleasures+perils social media
Weller pleasures+perils social mediaWeller pleasures+perils social media
Weller pleasures+perils social mediaKatrin Weller
 
Weller social media as research data_psm15
Weller social media as research data_psm15Weller social media as research data_psm15
Weller social media as research data_psm15Katrin Weller
 
Fail! workshop introduction at Web Science Conference
Fail! workshop introduction at Web Science ConferenceFail! workshop introduction at Web Science Conference
Fail! workshop introduction at Web Science ConferenceKatrin Weller
 
Challenges in-archiving-twitter
Challenges in-archiving-twitterChallenges in-archiving-twitter
Challenges in-archiving-twitterKatrin Weller
 
The digital traces of user generated content
The digital traces of user generated contentThe digital traces of user generated content
The digital traces of user generated contentKatrin Weller
 
The Hidden Data of Social Media Rearch_CSS-winter-symposium
The Hidden Data of Social Media Rearch_CSS-winter-symposiumThe Hidden Data of Social Media Rearch_CSS-winter-symposium
The Hidden Data of Social Media Rearch_CSS-winter-symposiumKatrin Weller
 
Twitter-Daten in der sozialwissenschaftlichen Forschung – Möglichkeiten und H...
Twitter-Daten in der sozialwissenschaftlichen Forschung – Möglichkeiten und H...Twitter-Daten in der sozialwissenschaftlichen Forschung – Möglichkeiten und H...
Twitter-Daten in der sozialwissenschaftlichen Forschung – Möglichkeiten und H...Katrin Weller
 
Publishing with impact
Publishing with impactPublishing with impact
Publishing with impactKatrin Weller
 
"I always feel it must be great to be a hacker"
"I always feel it must be great to be a hacker" "I always feel it must be great to be a hacker"
"I always feel it must be great to be a hacker" Katrin Weller
 
Social-Media-Forschung
Social-Media-ForschungSocial-Media-Forschung
Social-Media-ForschungKatrin Weller
 
Hidden Data of Social Media Research
Hidden Data of Social Media ResearchHidden Data of Social Media Research
Hidden Data of Social Media ResearchKatrin Weller
 
Big data - Gewinnung, Auswertung und Darstellung großer Mengen onlinegenerier...
Big data - Gewinnung, Auswertung und Darstellung großer Mengen onlinegenerier...Big data - Gewinnung, Auswertung und Darstellung großer Mengen onlinegenerier...
Big data - Gewinnung, Auswertung und Darstellung großer Mengen onlinegenerier...Katrin Weller
 
What’s new in social media research?
What’s new in social media research?What’s new in social media research?
What’s new in social media research?Katrin Weller
 
Twitter-Daten in der sozialwissenschaftlichen Forschung
Twitter-Daten in der sozialwissenschaftlichen ForschungTwitter-Daten in der sozialwissenschaftlichen Forschung
Twitter-Daten in der sozialwissenschaftlichen ForschungKatrin Weller
 
Social Media Research Methods
Social Media Research MethodsSocial Media Research Methods
Social Media Research MethodsKatrin Weller
 
Quantität vor Qualität? Big Data im Kontext von Social Media Daten
Quantität vor Qualität? Big Data im Kontext von Social Media DatenQuantität vor Qualität? Big Data im Kontext von Social Media Daten
Quantität vor Qualität? Big Data im Kontext von Social Media DatenKatrin Weller
 
The pleasures and perils of studying Twitter
The pleasures and perils of studying TwitterThe pleasures and perils of studying Twitter
The pleasures and perils of studying TwitterKatrin Weller
 
Friends or Followers. German Soccer Clubs and Their Fans on Twitter
Friends or Followers. German Soccer Clubs and Their Fans on TwitterFriends or Followers. German Soccer Clubs and Their Fans on Twitter
Friends or Followers. German Soccer Clubs and Their Fans on TwitterKatrin Weller
 
What do we get from Twitter - and what not?
What do we get from Twitter - and what not?What do we get from Twitter - and what not?
What do we get from Twitter - and what not?Katrin Weller
 

Mehr von Katrin Weller (20)

Weller pleasures+perils social media
Weller pleasures+perils social mediaWeller pleasures+perils social media
Weller pleasures+perils social media
 
Weller social media as research data_psm15
Weller social media as research data_psm15Weller social media as research data_psm15
Weller social media as research data_psm15
 
Fail ir16 intro
Fail ir16 introFail ir16 intro
Fail ir16 intro
 
Fail! workshop introduction at Web Science Conference
Fail! workshop introduction at Web Science ConferenceFail! workshop introduction at Web Science Conference
Fail! workshop introduction at Web Science Conference
 
Challenges in-archiving-twitter
Challenges in-archiving-twitterChallenges in-archiving-twitter
Challenges in-archiving-twitter
 
The digital traces of user generated content
The digital traces of user generated contentThe digital traces of user generated content
The digital traces of user generated content
 
The Hidden Data of Social Media Rearch_CSS-winter-symposium
The Hidden Data of Social Media Rearch_CSS-winter-symposiumThe Hidden Data of Social Media Rearch_CSS-winter-symposium
The Hidden Data of Social Media Rearch_CSS-winter-symposium
 
Twitter-Daten in der sozialwissenschaftlichen Forschung – Möglichkeiten und H...
Twitter-Daten in der sozialwissenschaftlichen Forschung – Möglichkeiten und H...Twitter-Daten in der sozialwissenschaftlichen Forschung – Möglichkeiten und H...
Twitter-Daten in der sozialwissenschaftlichen Forschung – Möglichkeiten und H...
 
Publishing with impact
Publishing with impactPublishing with impact
Publishing with impact
 
"I always feel it must be great to be a hacker"
"I always feel it must be great to be a hacker" "I always feel it must be great to be a hacker"
"I always feel it must be great to be a hacker"
 
Social-Media-Forschung
Social-Media-ForschungSocial-Media-Forschung
Social-Media-Forschung
 
Hidden Data of Social Media Research
Hidden Data of Social Media ResearchHidden Data of Social Media Research
Hidden Data of Social Media Research
 
Big data - Gewinnung, Auswertung und Darstellung großer Mengen onlinegenerier...
Big data - Gewinnung, Auswertung und Darstellung großer Mengen onlinegenerier...Big data - Gewinnung, Auswertung und Darstellung großer Mengen onlinegenerier...
Big data - Gewinnung, Auswertung und Darstellung großer Mengen onlinegenerier...
 
What’s new in social media research?
What’s new in social media research?What’s new in social media research?
What’s new in social media research?
 
Twitter-Daten in der sozialwissenschaftlichen Forschung
Twitter-Daten in der sozialwissenschaftlichen ForschungTwitter-Daten in der sozialwissenschaftlichen Forschung
Twitter-Daten in der sozialwissenschaftlichen Forschung
 
Social Media Research Methods
Social Media Research MethodsSocial Media Research Methods
Social Media Research Methods
 
Quantität vor Qualität? Big Data im Kontext von Social Media Daten
Quantität vor Qualität? Big Data im Kontext von Social Media DatenQuantität vor Qualität? Big Data im Kontext von Social Media Daten
Quantität vor Qualität? Big Data im Kontext von Social Media Daten
 
The pleasures and perils of studying Twitter
The pleasures and perils of studying TwitterThe pleasures and perils of studying Twitter
The pleasures and perils of studying Twitter
 
Friends or Followers. German Soccer Clubs and Their Fans on Twitter
Friends or Followers. German Soccer Clubs and Their Fans on TwitterFriends or Followers. German Soccer Clubs and Their Fans on Twitter
Friends or Followers. German Soccer Clubs and Their Fans on Twitter
 
What do we get from Twitter - and what not?
What do we get from Twitter - and what not?What do we get from Twitter - and what not?
What do we get from Twitter - and what not?
 

Kürzlich hochgeladen

Top 5 Ways To Use Reddit for SEO SEO Expert in USA - Macaw Digital
Top 5 Ways To Use Reddit for SEO  SEO Expert in USA - Macaw DigitalTop 5 Ways To Use Reddit for SEO  SEO Expert in USA - Macaw Digital
Top 5 Ways To Use Reddit for SEO SEO Expert in USA - Macaw Digitalmacawdigitalseo2023
 
Dubai Calls Girls Busty Babes O525547819 Call Girls In Dubai
Dubai Calls Girls Busty Babes O525547819 Call Girls In DubaiDubai Calls Girls Busty Babes O525547819 Call Girls In Dubai
Dubai Calls Girls Busty Babes O525547819 Call Girls In Dubaikojalkojal131
 
Unveiling SOCIO COSMOS: Where Socializing Meets the Stars
Unveiling SOCIO COSMOS: Where Socializing Meets the StarsUnveiling SOCIO COSMOS: Where Socializing Meets the Stars
Unveiling SOCIO COSMOS: Where Socializing Meets the StarsSocioCosmos
 
Values Newsletter teamwork section 2023.pdf
Values Newsletter teamwork section 2023.pdfValues Newsletter teamwork section 2023.pdf
Values Newsletter teamwork section 2023.pdfSoftServe HRM
 
THE FRAUD NETFLIX ORIGINAL MEDIA PITCH PROJECT
THE FRAUD NETFLIX ORIGINAL MEDIA PITCH PROJECTTHE FRAUD NETFLIX ORIGINAL MEDIA PITCH PROJECT
THE FRAUD NETFLIX ORIGINAL MEDIA PITCH PROJECT17mos052
 
Amplify Your Brand with Our Tailored Social Media Marketing Services
Amplify Your Brand with Our Tailored Social Media Marketing ServicesAmplify Your Brand with Our Tailored Social Media Marketing Services
Amplify Your Brand with Our Tailored Social Media Marketing ServicesNetqom Solutions
 
The--Fraud: Netflix Original Media Pitch
The--Fraud: Netflix Original Media PitchThe--Fraud: Netflix Original Media Pitch
The--Fraud: Netflix Original Media Pitch17mos052
 

Kürzlich hochgeladen (7)

Top 5 Ways To Use Reddit for SEO SEO Expert in USA - Macaw Digital
Top 5 Ways To Use Reddit for SEO  SEO Expert in USA - Macaw DigitalTop 5 Ways To Use Reddit for SEO  SEO Expert in USA - Macaw Digital
Top 5 Ways To Use Reddit for SEO SEO Expert in USA - Macaw Digital
 
Dubai Calls Girls Busty Babes O525547819 Call Girls In Dubai
Dubai Calls Girls Busty Babes O525547819 Call Girls In DubaiDubai Calls Girls Busty Babes O525547819 Call Girls In Dubai
Dubai Calls Girls Busty Babes O525547819 Call Girls In Dubai
 
Unveiling SOCIO COSMOS: Where Socializing Meets the Stars
Unveiling SOCIO COSMOS: Where Socializing Meets the StarsUnveiling SOCIO COSMOS: Where Socializing Meets the Stars
Unveiling SOCIO COSMOS: Where Socializing Meets the Stars
 
Values Newsletter teamwork section 2023.pdf
Values Newsletter teamwork section 2023.pdfValues Newsletter teamwork section 2023.pdf
Values Newsletter teamwork section 2023.pdf
 
THE FRAUD NETFLIX ORIGINAL MEDIA PITCH PROJECT
THE FRAUD NETFLIX ORIGINAL MEDIA PITCH PROJECTTHE FRAUD NETFLIX ORIGINAL MEDIA PITCH PROJECT
THE FRAUD NETFLIX ORIGINAL MEDIA PITCH PROJECT
 
Amplify Your Brand with Our Tailored Social Media Marketing Services
Amplify Your Brand with Our Tailored Social Media Marketing ServicesAmplify Your Brand with Our Tailored Social Media Marketing Services
Amplify Your Brand with Our Tailored Social Media Marketing Services
 
The--Fraud: Netflix Original Media Pitch
The--Fraud: Netflix Original Media PitchThe--Fraud: Netflix Original Media Pitch
The--Fraud: Netflix Original Media Pitch
 

Chances and Challenges of Studying Social Media Data

  • 1. Chances and Challenges of Studying Social Media Data Dr. Katrin Weller GESIS – Leibniz-Institute for the Social Sciences Data Archive for the Social Sciences / Computational Social Science Cologne, Germany ● Digital Studies Fellow at John W. Kluge Center Library of Congress Washington D.C. E-Mail: katrin.weller@gesis.org ●Twitter: @kwelle ● Web: www.katrinweller.net
  • 2. My Background • PhD in Information Science (until 2012 University of Düsseldorf) • Interests: Web Science, Social Media (focus on Twitter), Knoweledge representation + Semantic Web, informetrics + altmetrics, scholarly communication • Since 2013: GESIS, Social Web Data: New data types for social science research; research methods and data archiving. • Jan-May 2015: Digital Studies Fellowship at the Library of Congress 2
  • 3. Recent and Current Work • Co-editor of „Twitter & Society“ (Peter Lang, 2014). • With Katharina Kinder-Kurlanda: „The hidden data of social media research“ • #FAIL! Things that didn‘t work out in social media research – and what we can learn from them (#fail2015a at Web Science Conference, Oxford, #fail2015b at Internet Research 16, Phoenix). https://failworkshops.wordpress.com • Pilotproject for archiving social media datasets in an election study. (http://arxiv.org/abs/1312.4476v2) 3
  • 4. Social media research – some insights from bibliometrics
  • 5. 5 What is social media research?
  • 7. Social media research 2000-2013 0 1000 2000 3000 4000 5000 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 No. of publications (Scopus) Scopus search, conducted March 2014: (TITLE-ABS-KEY("social media") OR TITLE-ABS-KEY("social web") OR TITLE-ABS-KEY("social software") OR TITLE-ABS-KEY("web 2.0")) AND PUBYEAR > 1999
  • 8. Scopus: 2000-2013 by country 0 1000 2000 3000 4000 5000 6000 7000 United States United Kingdom Germany Australia China Spain Canada Italy France Taiwan Netherlands South Korea Finland Austria Japan Greece India Singapore Switzerland Hong Kong Ireland
  • 9. Scopus: 2000-2013 by subject area 10650; 36% 5542; 19% 2384 2288 2151 1535 773 772 65 Computer Science Social Sciences Engineering Medicine Business, Management and Accounting Mathematics Arts and Humanities Decision Sciences Psychology Nursing Economics, Econometrics and Finance Biochemistry, Genetics and Molecular Biology Health Professions Environmental Science Earth and Planetary Sciences Agricultural and Biological Sciences Pharmacology, Toxicology and Pharmaceutics Physics and Astronomy Materials Science Multidisciplinary Neuroscience Immunology and Microbiology Chemical Engineering Veterinary Dentistry Chemistry Energy
  • 10. Twitter research by discipline 10
  • 11. Challenge • Interdisciplinarity! • „Social media research“ is not a coherent research field. • Influences from lots of different disciplines. • Some disciplines still isolated, not all equally advanced in technical tasks. • Challenge of keeping track of what is going on – across disciplines. 11
  • 12. Example: Twitter research in social sciences 12 Weller, K. (2014). What do we get from Twitter – and What Not? A Close Look at Twitter Research in the Social Sciences. Knowledge Organization 41(3), 238-248.
  • 13. Challenge vs. Chance • Lots of room for exploration and innovation • Few or no standards 13
  • 14. Comparability • Data? • Collection Period? • Method? • Tools? 14
  • 15. 15 Different methods even in social science Twitter research Weller, K. (2014). What do we get from Twitter – and what not? A close look at Twitter research in the social sciences. Knowledge Organization. 41(3), 238-248
  • 16. Example 0 10 20 30 40 50 60 2008 2009 2010 2011 2012 2013 Publications on „Twitter and elections“ (Scopus and Web of Science) Weller, K. (2014). Twitter und Wahlen: Zwischen 140 Zeichen und Milliarden von Tweets. In: R. Reichert (Ed.), Big Data: Analysen zum digitalen Wandel von Wissen, Macht und Ökonomie (pp. 239-257). Bielefeld: transcript. 16
  • 17. Year of election Name of election Country/region No. of papers (2013) Date of election 2008 40th Canadian General Election Canada 1 14.10.200 8 2009 European Parliament election, 2009 Europe 1 07.06.200 9 2009 German federal election, 2009 Germany 2 27.09.200 9 2010 2010 UK general election United Kingdom 4 06.05.201 0 2010 South Korean local elections, 2010 South Korea 1 02.06.201 0 2010 Dutch general election, 2010 Netherlands 2 09.06.201 0 2010 Australian federal election, 2010 Australia 1 21.08.201 0 2010 Swedish general election, 2010 Sweden 1 19.09.201 0 2010 Midterm elections / United States House of Representatives elections, 2010 USA 4 02.11.201 0 2010 Gubernational elections: Georgia USA 1 02.11.201 0 2010 Gubernational elections: Ohio USA 1 02.11.201 0 2010 Gubernational elections: Rhode Island USA 1 02.11.201 0 2010 Gubernational elections: Vermont USA 1 02.11.201 0 2010 2010 superintendent elections South Korea 1 17.12.201 0 2011 Baden-Württemberg state election, 2011 Germany 1 27.03.201 1 2011 Rhineland-Palatinate state election, 2011 Germany 1 27.03.201 1 2011 Scottish parliament election 2011 Scotland 1 05.05.201 1 2011 Singapore’s 16th parliamentary General Election Singapore 1 07.05.201 1 2011 Norwegian local elections, 2011 Norway 2 12.09.201 1 2011 2011 Danish parliamentary election Denmark 2 15.09.201 1
  • 18. Table continued (same columns: Year of election | Name of election | Country/region | No. of papers (2013) | Date of election)
      2011 | Berlin state election, 2011 | Germany | 2 | 18.09.2011
      2011 | Gubernatorial elections: West Virginia | USA | 1 | 04.10.2011
      2011 | Gubernatorial elections: Louisiana | USA | 1 | 22.10.2011
      2011 | Swiss federal election, 2011 | Switzerland | 1 | 23.10.2011
      2011 | 2011 Seoul mayoral elections | South Korea | 1 | 26.10.2011
      2011 | Gubernatorial elections: Kentucky | USA | 1 | 08.11.2011
      2011 | Gubernatorial elections: Mississippi | USA | 1 | 08.11.2011
      2011 | Spanish national election 2011 | Spain | 1 | 20.11.2011
      2012 | Queensland State election | Australia | 1 | 24.03.2012
      2012 | South Korean legislative election, 2012 | South Korea | 1 | 11.04.2012
      2012 | French presidential election, 2012 | France | 2 | 22.04.2012
      2012 | Mexican general election, 2012 | Mexico | 1 | 01.07.2012
      2012 | United States presidential election, 2012 / United States House of Representatives elections, 2012 | USA | 17 | 06.11.2012
      2012 | South Korean presidential election, 2012 | South Korea | 2 | 19.12.2012
      2013 | Ecuadorian general election, 2013 | Ecuador | 1 | 17.02.2013
      2013 | Venezuelan presidential election, 2013 | Venezuela | 1 | 14.04.2013
      2013 | Paraguayan general election, 2013 | Paraguay | 1 | 21.04.2013
  • 19. Big DATA? 2013: Twitter and election. Table columns: No. of tweets | No. of publications (2013)
      0-500 | 3
      501-1,000 | 4
      1,001-5,000 | 1
      5,001-10,000 | 1
      10,001-50,000 | 7
      50,001-100,000 | 4
      100,001-500,000 | 5
      500,001-1,000,000 | 3
      1,000,001-5,000,000 | 3
      More than 5,000,000 | 3
      More than 100,000,000 | 1
      More than 1,000,000,000 | 1
      No/insufficient information | 13
      Source: Weller, K. (2014). Twitter und Wahlen: Zwischen 140 Zeichen und Milliarden von Tweets. In: R. Reichert (Ed.), Big Data: Analysen zum digitalen Wandel von Wissen, Macht und Ökonomie (pp. 239-257). Bielefeld: transcript.
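A tally like the one on slide 19 can be produced by sorting each study's reported dataset size into size ranges. The following is a minimal Python sketch under stated assumptions, not the procedure used in the cited chapter: the names `size_bin` and `tally`, the label strings, and the sample sizes are all illustrative, and the slide's three "more than" rows are simplified into one open-ended range.

```python
from bisect import bisect_left
from collections import Counter

# Inclusive upper bounds of the size ranges shown on the slide.
UPPER_BOUNDS = [500, 1_000, 5_000, 10_000, 50_000, 100_000,
                500_000, 1_000_000, 5_000_000]


def size_bin(n_tweets):
    """Return a label for the size range a dataset falls into.

    None stands for a paper that does not report its dataset size.
    """
    if n_tweets is None:
        return "no/insufficient information"
    if n_tweets < 0:
        raise ValueError("tweet count must be non-negative")
    # Index of the first upper bound that is >= n_tweets.
    i = bisect_left(UPPER_BOUNDS, n_tweets)
    if i == len(UPPER_BOUNDS):
        return "more than 5,000,000"  # simplified open-ended range
    lower = 0 if i == 0 else UPPER_BOUNDS[i - 1] + 1
    return f"{lower:,}-{UPPER_BOUNDS[i]:,}"


def tally(sizes):
    """Count how many datasets fall into each size range."""
    return Counter(size_bin(n) for n in sizes)


# Hypothetical sizes for four papers, one of which reports no size.
counts = tally([320, 750, 2_400_000, None])
```

Treating an unreported size as its own category, rather than dropping it, keeps the largest row of the slide's table visible in the result.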
  • 20. Comparability: Twitter and election. Data collection methods. Table columns: Data source | Number
      No information | 11
      Collected manually from Twitter website (copy-paste / screenshot) | 6
      Twitter API (no further information) | 8
      Twitter Search API | 3
      Twitter Streaming API | 1
      Twitter REST API | 1
      Twitter API user timeline | 1
      Own program for accessing Twitter APIs | 4
      Twitter Gardenhose | 1
      Official reseller (Gnip, DataSift) | 3
      YourTwapperKeeper | 3
      Other tools (e.g. Topsy) | 6
      Received from colleagues | 1
      Source: Weller, K. (2014). Twitter und Wahlen: Zwischen 140 Zeichen und Milliarden von Tweets. In: R. Reichert (Ed.), Big Data: Analysen zum digitalen Wandel von Wissen, Macht und Ökonomie (pp. 239-257). Bielefeld: transcript.
  • 21. Comparability: Twitter and election. Data collection periods. Table columns: Period | Number of publications (2013)
      0-10 hours | 1
      1-2 days | 6
      3-7 days | 3
      8-14 days | 5
      2-4 weeks | 7
      1-2 months | 13
      2-6 months | 5
      7-12 months | 3
      More than 12 months | 0
      No/insufficient information | 6
      Source: Weller, K. (2014). Twitter und Wahlen: Zwischen 140 Zeichen und Milliarden von Tweets. In: R. Reichert (Ed.), Big Data: Analysen zum digitalen Wandel von Wissen, Macht und Ökonomie (pp. 239-257). Bielefeld: transcript.
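The comparability questions raised on the slides (data, collection period, method, tools) can be captured as a small structured record that travels with each dataset. The sketch below is one possible way to do that in Python. The class `DatasetDescription`, its field names, and the helper `missing_information` are assumptions made for illustration and do not represent an established documentation standard.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class DatasetDescription:
    """Minimal documentation of a collected social media dataset.

    All field names are hypothetical; None means "not reported".
    """
    platform: str
    data_source: Optional[str] = None       # e.g. "Twitter Streaming API"
    collection_tool: Optional[str] = None   # e.g. "YourTwapperKeeper"
    collection_start: Optional[str] = None  # ISO date, e.g. "2013-09-01"
    collection_end: Optional[str] = None    # ISO date
    n_items: Optional[int] = None           # number of tweets or posts


def missing_information(desc):
    """Return the names of fields that were left unreported."""
    return [f.name for f in fields(desc) if getattr(desc, f.name) is None]


# Hypothetical example: source and size are documented, period and tool are not.
desc = DatasetDescription(platform="Twitter",
                          data_source="Twitter Search API",
                          n_items=12_000)
gaps = missing_information(desc)
# gaps == ["collection_tool", "collection_start", "collection_end"]
```

A check like `missing_information` makes the "no information" category from the tables above explicit at the time of collection, when the details are still easy to recover.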
  • 22. What is being studied? [Figure: number of publications per year, 2001-2013, that mention the respective social media platform’s name in their title; platforms shown: Twitter, Facebook, YouTube, Blogs, Wikis, Foursquare, LinkedIn, MySpace; vertical axis from 0 to 600. Scopus title search.] For details: http://kwelle.wordpress.com/2014/04/07/bibliometric-analysis-of-social-media-research/
  • 23. Challenges • Quickly changing landscape of social media platforms • Twitter as a model organism of social media research?
  • 24. Scopus: 2000-2013 popular keywords (number of publications in parentheses): Social networks (897); Social network (657); User interfaces (1,007); Social networking (online) (2,291); Facebook (847); Knowledge management (860); Web services (869); Information systems (810); Twitter (667); Semantics (765); Semantic Web (669); Communication (650); Information technology (639); E-learning (623); Students (579); Education (520); Teaching (504). Scopus search, conducted March 2014: (TITLE-ABS-KEY("social media") OR TITLE-ABS-KEY("social web") OR TITLE-ABS-KEY("social software") OR TITLE-ABS-KEY("web 2.0")) AND PUBYEAR > 1999
  • 25. Social Media Research: Topics • Political communication / elections • Activism • Popular culture, memes • Brand communication, marketing • Journalism (incl. agenda setting, citizen journalism, TV backchannel) • Crisis communication, disaster response • Scholarly communication • Language • And many more
  • 27. Social media research – some insights from expert interviews
  • 28. • Weller, K., & Kinder-Kurlanda, K. E. (2014). “I love thinking about ethics!” Perspectives on ethics in social media research. Internet Research (IR15), Daegu, South Korea, 22.-24.10.2014. Paper to be published in Selected Papers of Internet Research. • Kinder-Kurlanda, K. E., & Weller, K. (2014). “I always feel it must be great to be a Hacker!”: The role of interdisciplinary work in social media research. In Proceedings of the 2014 ACM Web Science Conference WebSci’14, June 23–26, 2014, Bloomington, IN, USA, pp. 91-98. doi: http://dx.doi.org/10.1145/2615569.2615685. • Weller, K., & Kinder-Kurlanda, K. (in press). Uncovering the Challenges in Collection, Sharing and Documentation: The Hidden Data of Social Media Research? To appear in Workshop on Standards and Practices in Large-Scale Social Media Research. ICWSM, Oxford, May 2015. http://www.aaai.org/ocs/index.php/ICWSM/ICWSM15/paper/viewFile/10657/10552
  • 29. CHANCES • Researchers value social media as a new type of data. • Previously “ephemeral data” become visible • Immediate – quick reaction to events • Structured • “Natural” data • Interview quote: “What I find really interesting is that structure becomes manifest in internet communication. So it’s the first time in history actually that we can, that social structures between people become manifest within a technology. (...) They become visible, they become crawlable, they become analyzable.”
  • 30. Some of the CHALLENGES (preliminary results, more detailed analysis to follow) • Interdisciplinarity • Ethics • Standards • Data access & infrastructure • …
  • 31. Unregulated and developing field • Researchers showed a high awareness of the unregulated and developing character of social media research methods. • Interview quote: “But, I think that (…) in like a couple of years, maybe five – it depends a lot, because the subject of the research is changing every day, (…) but I think that we’re going to have, (…) more or less shared qualitative approaches with a lot of good practices.”
  • 32. Data Sharing “But you can’t make your data available for others to look at, which means both your study can’t really be replicated and it can’t be tested for review. But also it just means your data can’t be made available for other people to say, Ah you have done this with it, I’ll see what I can do with it, (…) There is no open data.”
  • 33. Data Sharing “I think probably a couple times we’ve asked around if anyone else happened to have a particular dataset. […] but not so much, because they probably have tracked in a different data format, and then merging the two together actually becomes quite difficult as well.”
  • 34. Ethics / privacy “I will not quote tweets.” “if somebody plays a really important role in a particular event then maybe they deserve the credit of being accredited as well.”
  • 35. Social media research – some more insights from critical literature
  • 36. Representativeness. [Figure 6: Political Activities of Twitter Users. Bar chart comparing Twitter users and non-users (axis from 0 to 100) on: interest in politics, send political message, contact MP online, re-post political news, political comment on SNS, find political facts, sign online petition. OxIS current users: 2013, N=1,613.] Blank, G. (2014). Who uses Twitter? Representativeness of Twitter Users. Presentation at General Online Research GOR 14. Retrieved from: http://conftool.gor.de/conftool14/index.php?page=downloadPaper&filename=Blank-Who_uses_Twitter_Representativeness-119.pptx&form_id=119&form_version=final
  • 37. Data Quality • E.g. comparison of Twitter API and reseller data. Morstatter, F., Pfeffer, J., Liu, H., & Carley, K. M. (2013). Is the Sample Good Enough? Comparing Data from Twitter’s Streaming API with Twitter’s Firehose. Retrieved from http://arxiv.org/abs/1306.5204
  • 38. Inequality in data access possibilities • Data haves and data have-nots – Financial reasons – Connections to companies – Different skills – … boyd, d., and Crawford, K. 2012. Critical questions for Big Data: Provocations for a cultural, technological, and scholarly phenomenon. Information, Communication & Society 15(5):662–679. DOI: 10.1080/1369118X.2012.678878
  • 39. Top 5 Challenges in Twitter research • Representativeness and validity • Cross-platform studies • Comparisons (e.g. different countries, points in time) • Multi-method approaches • Context and meaning. Bruns, Axel, and Katrin Weller. 2014. "Twitter data analytics – or: the pleasures and perils of studying Twitter (guest editorial for special issue)". Aslib Journal of Information Management 66 (3): 246-249.
  • 40. Summary. Three sources of challenges in social media research: • The variety of user interactions that count as social media, and their ever-changing nature, which makes social media a moving target. • The diversity of the research community, which challenges knowledge transfer and the development of standards. • The dependency on commercial companies to open up access to their data. Researchers themselves have only limited means to change these sources of challenges. Weller, K. (2015). Accepting the challenges of social media research. Online Information Review 39(3).
  • 41. Summary. Currently addressed challenges: • Research infrastructure, including data collection and sharing facilities, and training in new methods and technologies. • The call for more thoughtfulness in research ethics. • Critical considerations on big data and data quality, including reflection on the power of algorithms and on misrepresentations through big data approaches. • Requests for broader scopes by facilitating multi-method and multi-platform studies, as well as longitudinal studies.
  • 42. Outlook • Long-term preservation of social media, i.e. archiving of data as well as of social media platforms’ look and feel. • Documentation of applied research methods, which should enable comparative studies across single use cases, quality control and verification of results. • Assessing social media users’ expectations on privacy in order to respond to them through ethical standards.