WikiSym2012 Deletion Discussions in Wikipedia: Decision Factors and Outcomes
1. Digital Enterprise Research Institute www.deri.ie
Deletion Discussions in Wikipedia*:
Decision Factors and Outcomes
Jodi Schneider, Alexandre Passant, & Stefan Decker
WikiSym 2012 Wednesday 29th August 2012
Linz, Austria
*English Wikipedia (enWP)
Copyright 2011 Digital Enterprise Research Institute. All rights reserved.
Enabling Networked Knowledge
2. Big questions about WP
Is crowdsourcing sustainable?
Is content bias manageable?
Does it matter who writes WP?
How can newcomers be welcomed and socialized?
3. … are related to Deletion
Is crowdsourcing sustainable?
How do we maintain content through deletion?
Is content bias manageable?
Are new articles needed? Are they welcomed?
Does it matter who writes WP?
… or who makes deletion decisions?
How can newcomers be welcomed and socialized?
Deletion threatens editor retention
– 1 in 3 editors begin by creating a new article
– 7 times as likely to stay if their article is kept
Source: [[User:Mr.Z-man/newusers]] via
[[Wikipedia:Wikipedia_Signpost/2011-04-04/Editor_retention]]
4. Overall Goals
Understand outcomes of deletion discussions
What are good outcomes for articles?
... for the community?
Provide support to various groups
Readers/New Editors
Debate Closers
People Reading Archived Debates
5. This Study’s Research Questions
1. What factors contribute to the decision about whether
to delete a given article?
2. When multiple factors are given, what is the relative
importance of those factors?
3. What are the outcomes of deletion discussions, both
for articles and for the community?
6. Overview
Outcomes (RQ3)
Data, Methods, Previous Research
Factors (RQs 1&2)
Future Work on Support (Demo)
11. Community: Good Outcomes
Learning to argue effectively
Becoming more detached from content
Introducing new editors to community values
Developing new editors’ editing skills
12. Example: Good Community Outcomes
William Vickers (fiddler)
1 main author – their first article
Nominated for deletion after 1 hour and 20 minutes
Shaped during the process
13. Changes During AfD
Article renamed to William Vickers manuscript
Discography added
26 edits from this author
14. Supporting the Editor
First article this editor created.
11 articles created by this editor overall.
Creator made many more edits to this article.
26 edits, compared to 3-9 edits to his later articles.
16. Overview
Outcomes (RQ3)
Data, Methods, Previous Research
Factors (RQs 1&2)
Future Work on Support (Demo)
17. Discussion-based Deletion
“Articles for Deletion” (AfD)
Most contentious
Articulated decision-making
500+ deletion discussions/week
~12% of deletions (Lam & Riedl, “Is Wikipedia growing a longer tail?”, GROUP ’09)
18. Dataset
Data Corpus: “Typical Day”
72 deletion discussions
January 29, 2011
English Wikipedia only
19. Methods
Deep analysis of a moderate-sized dataset
Representative sample
Intensive manual analysis
Annotation with multiple coders
Descriptive statistics
20. Previous Research
Shallow analysis of large datasets
Redacted content
– West & Lee, “What Wikipedia deletes” WikiSym 2011
Vote sequencing
– Taraborelli & Ciampaglia “Beyond notability” SASOW 2011
Decision quality
– Lam, Karim & Riedl “The effects of group composition on decision quality in a social production community”, GROUP 2010
Who participates, what & how much gets deleted
– Priedhorsky, Chen, Lam, Panciera, Terveen, & Riedl. “Creating, destroying, and restoring value in Wikipedia”, GROUP 2007
– Geiger & Ford “Participation in Wikipedia’s article deletion processes”, WikiSym 2011
21. From Reading to Editing
How can newcomers be welcomed and socialized?
Deletion threatens editor retention
– 1 in 3 editors begin by creating a new article
– 7 times as likely to stay if their article is kept
Source: [[User:Mr.Z-man/newusers]] via
[[Wikipedia:Wikipedia_Signpost/2011-04-04/Editor_retention]]
23. Notabili-what?
22% of all deletions are speedy deletions under criterion A7: No indication of importance (Geiger & Ford, WikiSym 2011)
24. Reader’s View of Deletion
25. Novices vs. Experts in deletion discussions
Worthwhile content that is poorly defended -> deleted
Need Wikipedia knowledge (procedural knowledge)
Need content knowledge
26. Articulate Values/Criteria
4 Factors in Deletion Discussions cover
91% of comments
70% of discussions
27. Articulate Values/Criteria
4 Factors in Deletion Discussions cover
91% of comments
70% of discussions
The best way to avoid deletion is for readers to
understand these criteria.
29. 4 Factors (RQ1)
Factor | Example (used to justify "keep")
Notability | Anyone covered by another encyclopedic reference is considered notable enough for inclusion in Wikipedia.
Sources | Basic information about this album at a minimum is certainly verifiable, it's a major label release, and a highly notable band.
Maintenance | …this article is savable but at its current state, needs a lot of improvement.
Bias | It is by no means spam (it does not promote the products).
Other | I'm advocating a blanket "hangon" for all articles on newly-drafted players
30. Articulate Values/Criteria
4 Factors in Deletion Discussions cover
91% of comments
70% of discussions
The best way to avoid deletion is for readers to
understand these 4 criteria:
Notability
Sources
Maintenance
Bias
31. Factors in Context
Factors | Decision Messages
Sources, Notability | Read likes an WP:OR book report, only two citations, no content about why book is notable
Bias | Agreed... Also somewhat biased in tone. Merge and Redirect
Other [Size], Maintenance | Keep. This article has been in existence since 2004. It is not some little stub article, either. If you don't like the way the article is written, then fix it. …
32. Relative importance (RQ2)
Notability trumped by other values
Comprehensiveness > Notability (given Sources)
Keeping a (non-notable) Velvet Underground album
we shouldn’t mechanically apply notability guidelines in this
instance, where it would “[punch a] hole in their otherwise
comprehensive discography.”
Maintenance > Notability
Deleting a notable topic due to maintenance
this is the rare case where notability is not the main argument in
favor of deletion. It has been demonstrated that the subject is
already covered in numerous other articles and that those articles
do a much better, more thorough job of covering the topic.
33. Issues
Discussions fail without comments
Interactions with article creators
Contentious
Learning opportunity
Conflicts around consensus values
Notability
– Why just because it is a small team and not major does it not
deserve it’s (sic) own page on here?
Reliable sources
Policy development is separated from case debates
Frankly, the basis of my disagreement with you here is that I
don’t agree with the guideline.
34. Future Work
Factor-based view of deletion
Please give me feedback!
38. Thanks!
jodi.schneider@deri.org
http://jodischneider.com/jodi.html
@jschneider
User:Jodi.a.schneider
43. Novices don’t understand notability
Notability vs. real-world importance
Emsworth Cricket Club is one of the oldest cricket clubs in the world, and this
really is worth a mention. Especially on a website, where pointless people …
gets a mention.
Why just because it is a small team and not major does it not deserve it’s (sic)
own page on here?
Editor's notes
Many readers are shocked to learn that Wikipedia deletes articles, and some new editors first learn about Wikipedia’s quality standards and the deletion process when an article they wrote is removed. Retaining these editors is more challenging, particularly for the large percentage (~33%) of novice editors who begin editing by creating new articles.
Wales: OK if there’s nothing more to say about a topic. Problematic if content gets deleted.
And an entire category “individual garments”
“Scaring away” editors who “don’t get it”
William Vickers (fiddler). Nominated for deletion 1 hour and 20 minutes after its creation, William Vickers (fiddler) has had few edits from anyone other than its main author; others made five of its 43 edits during the AfD process, mainly as a part of that process. Yet the AfD process shaped this page. The author’s contributions are certainly more voluminous due to the AfD. This was the first of eleven articles created by this author: only one has more than nine contributions from him (it has 26), and many have as few as three of his contributions.
Suggestions made in the AfD were implemented in the page. First, they led the author to rename the page, focusing on a more appropriate topic: the manuscript rather than the man, of whom little is known. Second, in response to a call for further sources, the primary author added a discography. Although similar discussions could have happened on the article discussion page (this article still has none, barring a link to the now-closed AfD), immediate feedback (which came not long after the article was created, in the first 3 hours after the deletion nomination) was probably helpful to the article development. The length of the debate period may also have been a factor: the no consensus decision is in part due to lack of comments when the debate was relisted, twice, for further discussion.
St. Andrew’s Episcopal School (Amarillo, Texas). While mentoring a new contributor was also a feature of the AfD for St. Andrew’s Episcopal School (Amarillo, Texas), there was far more negative emotion. Its importance, or notability, was the main issue of contention: Except in unusual circumstances, elementary schools are redirected to the corresponding high school. The primary question, then, was whether this school (which, as an independent school with no district or high school, lacked an obvious redirection target) was sufficiently notable on its own.
The contributor’s behavior, not just the article, came under discussion: s/he had marked other articles for possible deletion in the PROD process, garnering a cynical response: “Maybe it’s in bad taste but if my school does not meet WP standards then why should others??” This was followed up by a message indicating discouragement: “To be honest it’s been a real turn off adding articles to WP and I don’t think I will add articles again. So smile and enjoy.” Only the persistence of an advocate for the novice, who co-edited and argued strongly for the article, ameliorated the situation.
This was the third article the user created, all within a single week. Again, AfD helped increase contributions: as opposed to three or four contributions from this user, s/he has made 18 edits to the article, which has received 53 edits overall (38 during the AfD process itself).
Previous research has found that creators rarely (17.59%) discuss the deletion of their articles [16]; encouraging positive interactions with creators should be a design goal of future development. In our corpus, negative interactions were mainly due to conflicts around Wikipedia’s consensus values; article creators who do not understand these values express frustration with the process. In extreme cases, creators are banned from these negative interactions (this happened once in our corpus, with a novice editor whose autobiography had inherent sourcing problems). We next discuss conflicts around consensus values.
22% of all deletions are speedy deleted for A7: No indication of importance (Geiger & Ford, WikiSym 2011).
R. S. Geiger and H. Ford. Participation in Wikipedia’s article deletion processes. In WikiSym ’11, pages 201–202. http://www.wikisym.org/ws2011/_media/proceedings:p201-geiger.pdf
S. K. Lam, J. Karim, and J. Riedl. The effects of group composition on decision quality in a social production community. In GROUP ’10, pages 55–64.
R. Priedhorsky, J. Chen, S. K. Lam, K. Panciera, L. Terveen, and J. Riedl. Creating, destroying, and restoring value in Wikipedia. In GROUP ’07: Proceedings of the 2007 International ACM Conference on Supporting Group Work, pages 259–268.
D. Taraborelli and G. L. Ciampaglia. Beyond notability. Collective deliberation on content inclusion in Wikipedia. In Fourth IEEE International Conference on Self-Adaptive and Self-Organizing Systems Workshops, 2010.
G. West and I. Lee. What Wikipedia deletes: Characterizing dangerous collaborative content. In WikiSym ’11, pages 25–28.
What belongs in an article? Notability
Very few content standards need to be clearly communicated to readers to bring significant benefit: just four factors (Notability, Sources, Maintenance and Bias) represent 69.5% of discussions and 91% of comments well. The best way to avoid deletion is for readers to understand these criteria.
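As a rough illustration of how coverage figures of this kind could be computed from annotated comments, here is a minimal Python sketch. It rests on assumptions that are not from the talk: a comment counts as covered when all of its factor labels are among the four named factors, and a discussion counts as covered when every one of its comments is covered. The function name `factor_coverage`, the discussion identifiers, and the toy annotations are all hypothetical.

```python
from collections import defaultdict

# The four factors named in the talk; any other label (e.g. "Other") is outside them.
CORE_FACTORS = {"Notability", "Sources", "Maintenance", "Bias"}


def factor_coverage(comments):
    """Return (comment_share, discussion_share) for annotated comments.

    comments: iterable of (discussion_id, set_of_factor_labels).
    Assumption: a comment is covered if it has at least one label and all of
    its labels are core factors; a discussion is covered if all of its
    comments are covered.
    """
    flags_by_discussion = defaultdict(list)
    covered_comments = 0
    total_comments = 0
    for discussion_id, labels in comments:
        covered = bool(labels) and labels <= CORE_FACTORS
        flags_by_discussion[discussion_id].append(covered)
        covered_comments += covered
        total_comments += 1
    if total_comments == 0:
        raise ValueError("no comments to analyse")
    covered_discussions = sum(all(flags) for flags in flags_by_discussion.values())
    return (
        covered_comments / total_comments,
        covered_discussions / len(flags_by_discussion),
    )


# Hypothetical toy annotations, not data from the study.
toy = [
    ("afd-1", {"Notability", "Sources"}),
    ("afd-1", {"Bias"}),
    ("afd-2", {"Maintenance"}),
    ("afd-2", {"Other"}),
]
comment_share, discussion_share = factor_coverage(toy)
print(f"{comment_share:.0%} of comments, {discussion_share:.0%} of discussions")
```

Under this strict definition a single off-factor comment makes its whole discussion count as uncovered, which is one way the comment-level share can sit well above the discussion-level share.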
Even when a topic’s notability is not disputed, it may factor into the discussion, as this closing summary emphasizes: this is the rare case where notability is not the main argument in favor of deletion. It has been demonstrated that the subject is already covered in numerous other articles and that those articles do a much better, more thorough job of covering the topic.