Data Science in the era of Fake News
1. DATA SCIENCE IN THE ERA OF FAKE NEWS
Pablo Aragón
Universitat Pompeu Fabra
Eurecat, Centre of Technology of Catalonia
Oxford Internet Institute, University of Oxford (until next week)
March 23, 2018
3. Data Science Fake News
Many data science projects have focused on fake news
4. What if fake news is about your data science projects?
5. Aragón, P., Kaltenbrunner, A., Laniado, D., & Volkovich, Y. (2012).
Biographical Social Networks on Wikipedia – A cross-cultural study of links that made history.
WikiSym ’12 – 8th International Symposium on Wikis and Open Collaboration
PROJECT #1
● Biographical articles are identified a priori in the English Wikipedia
● Networks of links between biographical articles in 15 language editions
7. For each Wikipedia edition, most central biographies by betweenness
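As an illustration of what ranking biographies by betweenness involves, here is a minimal sketch of Brandes' algorithm in plain Python. It assumes an unweighted, directed link graph stored as a dictionary; the slides do not say which implementation or normalisation the authors used, and the node names (`Person_A` to `Person_E`) are hypothetical placeholders, not data from the study.

```python
from collections import deque


def betweenness(links):
    """Unnormalised betweenness centrality (Brandes' algorithm).

    `links` maps each article to the list of articles it links to.
    The graph is treated as directed and unweighted (an assumption).
    """
    nodes = set(links) | {t for targets in links.values() for t in targets}
    score = {v: 0.0 for v in nodes}
    for s in nodes:
        # Breadth-first search from s: count shortest paths (sigma)
        # and remember each node's predecessors on those paths.
        stack = []
        preds = {v: [] for v in nodes}
        sigma = {v: 0 for v in nodes}
        dist = {v: -1 for v in nodes}
        sigma[s] = 1
        dist[s] = 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in links.get(v, []):
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies, farthest nodes first.
        delta = {v: 0.0 for v in nodes}
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                score[w] += delta[w]
    return score


# Hypothetical toy network: Person_B sits on every path between the others.
toy_links = {
    "Person_A": ["Person_B"],
    "Person_C": ["Person_B"],
    "Person_B": ["Person_D", "Person_E"],
}
scores = betweenness(toy_links)
most_central = max(scores, key=scores.get)
```

In this toy network `Person_B` lies on all four shortest paths that connect the two linking biographies to the two linked ones, so it comes out as the most central node.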
9. CONCLUSION
Finally, the gender gap among Wikipedia editors is a serious concern
for the community, and has been related to the topics covered in the encyclopedia.
Our results point out a very small presence of females also among the most central persons
in the encyclopedic content, suggesting the link between these two phenomena as an
intriguing subject for future investigation.
13. Eom, Y. H., Aragón, P., Laniado, D., Kaltenbrunner, A., Vigna, S., & Shepelyansky, D. L. (2015).
Interactions of cultures and top people of Wikipedia from ranking of 24 language editions.
PloS one, 10(3), e0114825.
PROJECT #2
● Networks of links between Wikipedia articles in 24 language editions
● Biographical articles are identified a posteriori
● Extraction of features of biographies with DBpedia (birth place, birth date, gender)
14. For each Wikipedia edition, birth date distribution of central biographies
Arabic & Persian: from 6th to 10th century AD
Greek: from 6th to 5th century BC
15. “The reason for a somewhat unexpected PageRank leader Carl Linnaeus is related to
the fact that he laid the foundations for the modern biological naming scheme
so that plenty of articles about animals, insects and plants point to the
Wikipedia article about him, which strongly increases the PageRank probability.”
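The effect described in the quote can be reproduced on a toy graph. The following is a minimal sketch of PageRank by power iteration, assuming a damping factor of 0.85 and uniform redistribution of rank from articles without outgoing links; these are common defaults, not parameters confirmed by the slides. The `Species_*` and `Person_*` articles are hypothetical placeholders.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Power-iteration PageRank on a dict {article: [linked articles]}."""
    nodes = set(links) | {t for targets in links.values() for t in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by articles with no outgoing links is spread uniformly.
        dangling = sum(rank[node] for node in nodes if not links.get(node))
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}
        for source, targets in links.items():
            if targets:
                share = damping * rank[source] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical toy graph: 20 species articles all point to one biography,
# while two other biographies only link to each other.
toy_links = {f"Species_{i}": ["Carl_Linnaeus"] for i in range(20)}
toy_links["Person_A"] = ["Person_B"]
toy_links["Person_B"] = ["Person_A"]

ranks = pagerank(toy_links)
top_article = max(ranks, key=ranks.get)
```

Even though no biography links to `Carl_Linnaeus` here, the many incoming links from species articles are enough to place it at the top of the ranking, which is the mechanism the quote describes.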