Wikipedia on Twitter: Analyzing Tweets about Wikipedia

Wikipedia has long been a standard source of information on the web and as such is widely referenced on the web and in social media. This paper analyzes the usage of Wikipedia on Twitter by looking into languages used on both platforms, content features of posted articles and recent edits of those articles. The analysis is based on a set of four million tweets and links these tweets to Wikipedia articles and their features to identify interesting relations. We find that within English and Japanese tweets containing a link to Wikipedia, 97% of the links lead to the English and Japanese Wikipedia, respectively, whereas for other languages 20% of the tweets contain a link to a Wikipedia of a different language. Our results also indicate that the number of tweets about a certain topic is not correlated to the number of recent edits on the particular page at the time of sending the tweet.
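
Linking a tweet to a Wikipedia edition, as described above, can be done by reading the language subdomain of the linked URL. The following is a minimal sketch of that idea, not the authors' implementation. The function names `wikipedia_edition` and `links_other_language` are hypothetical, and the sketch assumes that mobile URLs differ from desktop URLs only by an extra `m` label in the host name (for example `en.m.wikipedia.org`).

```python
from urllib.parse import urlparse


def wikipedia_edition(url):
    """Return the language code of a Wikipedia URL, or None if the URL
    does not point to a language edition of Wikipedia.

    Assumption of this sketch: mobile URLs only add an "m" label to the
    host name, so dropping it normalizes them to the desktop form.
    """
    host = (urlparse(url).hostname or "").lower()
    # Normalize mobile hosts: en.m.wikipedia.org -> en.wikipedia.org
    labels = [label for label in host.split(".") if label != "m"]
    if len(labels) == 3 and labels[1:] == ["wikipedia", "org"] and labels[0] != "www":
        return labels[0]
    return None


def links_other_language(tweet_lang, url):
    """True if the URL leads to a Wikipedia edition whose language code
    differs from the language code detected for the tweet."""
    edition = wikipedia_edition(url)
    return edition is not None and edition != tweet_lang
```

For instance, a Spanish tweet (`tweet_lang="es"`) linking to `https://en.wikipedia.org/wiki/Spain` would be counted as an inter-language link, while a link to `https://es.m.wikipedia.org/wiki/España` would not.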


  1. #Wikipedia on Twitter: Analyzing Tweets about Wikipedia
     Eva Zangerle, Georg Schmidhammer, Günther Specht
     The University of Innsbruck was founded in 1669 and is one of Austria’s oldest universities. Today, with over 28,000 students and 4,500 staff, it is western Austria’s largest institution of higher education and research. For further information visit: www.uibk.ac.at.
  3. Research Questions
     • RQ1: How popular are the various Wikipedias on Twitter and in which language contexts are these referenced?
     • RQ2: Which features do Wikipedia articles that are popular on Twitter exhibit/share?
     • RQ3: Does the number of tweets about a certain article correlate to a recent edit and hence, an update of the page?
  4. Dataset
     • Crawl of Twitter using keyword "wikipedia"
     • 2014/10/20 – 2015/03/10
     • Total of 4.5 million tweets
     • Cleaning of dataset
     • Tweets with Wikipedia URL
     • Normalization of URLs (also mobile URLs)
     • Retweets remain within the set
     22% of all Wikipedia article URLs are mobile URLs.
  5. Dataset
     Characteristic      Raw         Cleaned
     Tweets              4,530,967   2,468,055
     Retweets            1,440,122   659,641
     Distinct Users      1,730,984   844,975
     Mentions            3,334,848   1,880,687
     Distinct Hashtags   159,231     118,912
     Hashtag Usages      1,528,458   778,737
     Distinct URLs       1,447,124   1,121,825
     URL Usages          3,393,846   2,793,900
     63.24% of all tweets contain 1 URL (maximum: 6 URLs).
     77.72% of all URLs point to a Wikipedia page.
  6. Tweets per Day [figure]
  7. General Observations: Users
     • Long-tailed distribution
     • Average number of tweets per user: 2.92
     • However: maximum number of tweets per user: 64,521
     • 19 of 20 most popular users are bots (404 users in total; 264k tweets)
     E. Zangerle, G. Schmidhammer, G. Specht: Analysing the Usage of Wikipedia on Twitter: Understanding Inter-Language Links (accepted at HICSS 2016)
  8. RQ1: Language Analyses
  9. Language Distribution
     • Analysis of tweeted Wikipedia articles with regard to language
     • Extract Wikipedia edition (language) from URL
     Missing: context, underlying data.
     Language          Total       Share
     English (en)      1,349,623   52.81%
     Japanese (ja)     579,157     22.66%
     Spanish (es)      140,396     5.49%
     Turkish (tr)      78,235      3.06%
     French (fr)       64,139      2.51%
     German (de)       52,256      2.04%
     Russian (ru)      44,347      1.74%
     Arabic (ar)       38,757      1.52%
     Korean (ko)       27,261      1.07%
     Portuguese (pt)   26,442      1.03%
  10. Correlation of Language and Wikipedia Size Measures
      Measure                    Spearman's ρ
      Total number of articles   .76*
      Edits                      .65*
      Users                      .46*
      Admins                     .42*
      Active users               .39*
      Images                     .39*
      Depth¹                     .35*
      * Significant at the 0.001 level
      ¹ Depth = Edits/Articles × Non-Articles/Articles × [1 − Stub-ratio]
  11. Tweet Languages
      Tweets                    Wikipedias referenced
      Language     Share        Language        Share
      English      42.90%       English (en)    52.81%
      Japanese     21.92%       Japanese (ja)   22.66%
      Spanish      5.77%        Spanish (es)    5.49%
      Arabic       2.56%        Turkish (tr)    3.06%
      French       2.37%        French (fr)     2.51%
      Turkish      2.24%        German (de)     2.04%
      German       1.75%        Russian (ru)    1.74%
      Indonesian   1.56%        Arabic (ar)     1.52%
      Russian      1.35%        Korean (ko)     1.07%
  12. Inter-Language Links
      Rows: Twitter language; columns: Wikipedia language.
          en      ja      es      ar      fr      tr      de      id      ru      pt
      en  97.33%  0.19%   0.42%   0.03%   0.33%   0.05%   0.35%   0.12%   0.10%   0.05%
      ja  5.48%   93.56%  0.04%   0.01%   0.11%   0.03%   0.20%   0.01%   0.05%   0.01%
      es  19.65%  0.28%   77.48%  0.01%   0.62%   0.03%   0.32%   0.07%   0.03%   0.51%
      ar  26.58%  0.02%   0.12%   72.79%  0.17%   0.02%   0.02%   0.00%   0.00%   0.00%
      fr  20.21%  0.19%   1.11%   1.92%   74.73%  0.03%   0.73%   0.02%   0.05%   0.17%
      tr  20.78%  0.01%   0.17%   0.00%   0.18%   77.62%  0.83%   0.04%   0.10%   0.02%
      de  21.15%  0.59%   1.41%   0.06%   0.44%   0.13%   74.94%  0.04%   0.04%   0.06%
      id  49.83%  1.20%   1.77%   0.16%   0.60%   0.40%   0.91%   42.84%  0.06%   0.26%
      ru  17.74%  0.10%   0.05%   0.00%   0.14%   0.03%   0.32%   0.00%   78.38%  0.01%
      pt  28.90%  0.73%   6.91%   0.01%   0.75%   0.05%   0.46%   0.09%   0.03%   60.87%
      20% of all tweets link to another language.
      85% of all inter-language links do not have a counterpart in the original language.
  13. Inter-Language Links
      • 85% of all links leading to a Wikipedia of a language different from the tweet's language do not have a counterpart in the user's language
      • Remaining 15%: the Wikipedia article actually linked is significantly better in terms of quality than its counterpart in the tweet's language
      E. Zangerle, G. Schmidhammer, G. Specht: Analysing the Usage of Wikipedia on Twitter: Understanding Inter-Language Links (accepted at HICSS 2016)
  14. RQ2: Top Articles and Categories
  15. Methods
      • Tweets about English Wikipedia
      • 52.81% of all tweets
      • Total of 724,974 references to Wikipedia
      • Total of 336,605 distinct English Wikipedia articles
      • Extract article titles and categories from DBpedia
      • Resolve extended URLs (e.g., diff pages, access to old revisions, etc.)
  16. Distribution: Tweets per Article
      64% of all articles were tweeted only once.
  17. Top Articles
      Article                          No. of Tweets   Share
      diff                             54,432          7.51%
      cod_wars                         6,868           0.95%
      user:Giraffedata/comprised_of    4,541           0.63%
      matthew_ziff                     2,100           0.29%
      kidz_bop                         2,015           0.28%
      gamergate                        1,703           0.23%
      old_revision                     1,517           0.21%
      search                           1,383           0.19%
      the_little_mermaid_(1989_film)   1,370           0.19%
      No article stands out in particular.
  18. Top Categories
      Category                                              No. of Tweets   Share
      Living people                                         105,895         14.61%
      English-language films                                18,331          2.53%
      American films                                        9,605           1.32%
      Wars involving the United Kingdom                     7,487           1.03%
      American male television actors                       7,255           1.00%
      20th-century conflicts                                7,158           0.99%
      American male film actors                             6,981           0.96%
      20th-century military history of the United Kingdom   6,968           0.96%
      Law of the sea                                        6,953           0.96%
      Wars involving Iceland                                6,928           0.96%
  19. RQ3: Edits and Tweets
  20. Methods
      • Crawled via MediaWiki API
      • Tweets about English Wikipedia articles (724,974 references to 336,605 distinct articles)
      • Observation period: ±24 hours of a tweet
      • 543,788 edits in total
      • 91,577 edits marked as minor
      • 312,160 tweets link to an article edited within ±24 hours of the tweet
      • 233,962 tweets: edit occurred before tweet
      • 215,192 tweets: edit occurred after tweet
      • No correlation between number of edits and number of tweets: Pearson's r = 0.06 (at 0.001 significance level)
      • Exception: events
  21. Conclusion
      • RQ1: 20% of all tweets link to a Wikipedia of another language.
      • RQ2: No particular categories or articles are significantly more popular on Twitter. Long-tail distribution for articles (64% of all English articles only tweeted once).
      • RQ3: No correlation between number of edits and popularity of article on Twitter can be detected.
  22. Future Work
      • Look into inter-language links
      • Tweets as quality measure
      • Look into those tweets about Wikipedia without mentioning a particular article (qualitatively)
      • Interested in joining forces?
  23. #questions? http://en.wikipedia.org/wiki/Q&A #wikipedia
      • @eva_zangerle, eva.zangerle@uibk.ac.at, http://www.evazangerle.at
      • @dbisibk, http://dbis-informatik.uibk.ac.at, https://www.facebook.com/dbisibk
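
The RQ3 method pairs each tweet with the edits of the linked article that fall within 24 hours before or after the tweet. The sketch below shows one way such a check might look. It is an illustration under assumptions, not the authors' code: the function name `edits_near_tweet` is hypothetical, and edit timestamps are assumed to be already parsed into `datetime` objects after retrieval from the MediaWiki API.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(hours=24)  # observation period on each side of the tweet


def edits_near_tweet(tweet_time, edit_times, window=WINDOW):
    """Return a pair (edit_before, edit_after).

    edit_before: some edit happened at most `window` before the tweet
                 (an edit at exactly the tweet time counts as before).
    edit_after:  some edit happened at most `window` after the tweet.
    Both flags can be True for the same tweet.
    """
    zero = timedelta(0)
    edit_before = any(zero <= tweet_time - e <= window for e in edit_times)
    edit_after = any(zero < e - tweet_time <= window for e in edit_times)
    return edit_before, edit_after


tweet = datetime(2015, 1, 1, 12, 0)
edits = [datetime(2015, 1, 1, 2, 0), datetime(2015, 1, 3, 0, 0)]
flags = edits_near_tweet(tweet, edits)  # edit 10 hours before; next edit 36 hours after
```

Returning two independent flags rather than a single label reflects that the "before" and "after" counts reported on the slide sum to more than the number of tweets with any edit in the window, so a tweet can fall into both groups.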
