Slides for the Muslims in ML workshop presentation at NeurIPS 2020 on December 8, 2020. This is a shorter, 25-minute version of the UMass Lowell talk of November 2020, so the slides are a subset of that talk's slides.
Muslims in Machine Learning workshop (NeurIPS 2020) - Automatically Identifying Islamophobia in Social Media
1. Automatically Identifying Islamophobia
in Social Media
Ted Pedersen
Department of Computer Science
University of Minnesota, Duluth
tpederse@umn.edu
@SeeTedTalk
http://www.d.umn.edu/~tpederse
2. Today's Agenda
Islamophobia in general and in Minnesota
Collecting & Annotating Twitter Data
What We've Learned from Annotating
The Way Forward
3. Islamophobia
A legacy of colonial histories,
particularly those that view the
Muslim world as exotic,
savage, dangerous - all
leading to a "Clash of
Civilizations."
Orientalism (Said, 1978)
4. Islamophobia
A recent term for an older phenomenon
Runnymede Trust (1997,
2017)
Unfounded hostility towards Islam
Practical consequences of such
hostility in unfair discrimination
against Muslim individuals and
communities
Exclusion of Muslims from mainstream
political and social affairs.
5. Common Anti-Muslim Tropes (Bridge Initiative)
Islam and Muslims are inherently
violent.
Islam and Muslims are oppressive
to women.
Islam and Muslims are intolerant
towards other religions.
Islam is a political ideology, not
a religion.
In the West, Muslims are using
non-violent stealth jihad to
implement Sharia Law.
Islam is foreign, medieval, and
at odds with Western modernity.
Islam is a monolith.
All Muslims are Arab or Brown.
11. Goal : Identify Islamophobia in Social Media Text
Why?
Relatively understudied in NLP.
Highly intersectional problem since Muslim identity is multi-faceted.
Significant influence on events in the World, the USA, and Minnesota.
How?
Use ideas from NLP, especially Hate Speech Detection.
Create annotated corpora in order to understand problem better, and then
apply Machine Learning or Deep Learning.
12. Guiding Principles
This is not Just Another Classification Task. Seek out domain expertise, build
relationships, don't reduce the problem to a data set.
Frey et al. (2018). Artificial Intelligence and Inclusion: Formerly Gang-Involved
Youth as Domain Experts for Analyzing Unstructured Twitter Data. Social
Science Computer Review.
Creating annotated data is likely necessary. Be careful to fully document the
decisions made along the way, paying special attention to annotator background.
Bender and Friedman (2018). Data Statements for NLP: Toward Mitigating
System Bias and Enabling Better Science. TACL.
13. How Can We Detect Islamophobia (with NLP)?
Carry out a Qualitative Analysis of text with input from domain experts.
Collect and annotate Tweets.
Seek out a diverse pool of annotators.
Develop annotation scheme / code book.
Be Iterative.
Carry out Quantitative Analysis using Machine Learning or Deep Learning.
14. Data Collection
Islamophobia is global, but has many local variations each with their own issues,
terminology, and ways of being expressed.
This suggests the need for the data to have a regional focus - Islamophobia in the
UK, France, India, the USA, Minnesota, etc.
While she is a national figure, Ilhan Omar is from Minnesota, and our data
collection starts with her.
She is Muslim, but also a black Somali woman who was an immigrant / refugee.
Highly intersectional identity.
15. Tweet Collection (using Twitter public API)
Collecting since April 2019, any tweet that includes one or more of:
"Ilhan omar", ilhan, omar, @ilhanmn, ilhanmn, #ilhanmn, #ilhanomar, #ilhan
Pilot Annotation based on April 2019 - April 2020, approx 5 million total tweets.
1020 Annotation based on Nov. 2019 - Oct. 2020, approx 10 million total tweets.
The Twitter public API does not return all tweets; it downsamples.
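The keyword criterion above can be illustrated with a small local filter. This is only a sketch: the function name `matches_keywords`, the case-insensitive matching, and the whole-token boundary rule are assumptions made for illustration, not the matching logic of the Twitter API or of the collection scripts used for this work.

```python
import re

# Keywords listed on the slide, lowercased for case-insensitive matching
# (case-insensitivity is an assumption).
KEYWORDS = ["ilhan omar", "ilhan", "omar", "@ilhanmn", "ilhanmn",
            "#ilhanmn", "#ilhanomar", "#ilhan"]

def matches_keywords(text, keywords=KEYWORDS):
    """Return True if the text contains at least one keyword as a whole token or phrase."""
    lowered = text.lower()
    for kw in keywords:
        # The lookbehind/lookahead keep "omar" from matching inside "omarosa"
        # and keep "ilhan" from matching inside "#ilhanomar".
        pattern = r"(?<![\w@#])" + re.escape(kw) + r"(?!\w)"
        if re.search(pattern, lowered):
            return True
    return False
```

A bare keyword like omar will also match tweets about other people named Omar, which is one reason a Neutral label for tweets about someone else is useful.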
16. 1020 Annotation (October 2020)
9.6 million tweets (incl. RT) collected Nov 2019 - Oct 2020.
1 million unique tweets.
Selected random samples of 384 tweets for annotation.
Agreement improved with more extensive set of labels.
Began to consider profile descriptions of âspeakersâ (Tweeters).
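Reducing the collection to unique tweets and drawing a fixed-size random sample could be sketched as follows. The helper names, the retweet-prefix pattern, and the seed are hypothetical. The slides do not say why 384 was chosen; the sketch only notes that 384 is also the value Cochran's formula gives for a 95% confidence level and a 5% margin of error, which may or may not have been the reasoning.

```python
import random
import re

def cochran_sample_size(z=1.96, p=0.5, margin=0.05):
    """Cochran's sample size for a large population, rounded to the nearest integer."""
    return round(z * z * p * (1 - p) / (margin * margin))

def unique_tweets(tweets):
    """Keep the first occurrence of each text, ignoring a leading 'RT @user:' prefix and case."""
    seen = set()
    unique = []
    for text in tweets:
        key = re.sub(r"^RT @\w+:\s*", "", text).strip().lower()
        if key not in seen:
            seen.add(key)
            unique.append(text)
    return unique

def draw_sample(tweets, k=384, seed=0):
    """Draw up to k unique tweets uniformly at random (the seed is fixed for repeatability)."""
    rng = random.Random(seed)
    pool = unique_tweets(tweets)
    return rng.sample(pool, min(k, len(pool)))
```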
17. 1020 Annotation Labels
Neutral - apolitical or about someone
other than Ilhan Omar
Support - expresses support for position
or person of Ilhan Omar
Political - expression of political
difference of opinion with Ilhan Omar
Insult - personal insult directed at Ilhan
Omar not related to other labels
Immigration - Ilhan Omar has committed
fraud to remain in USA
Terrorist - Ilhan Omar is a terrorist or
supports them
Loyalty - Ilhan Omar is unAmerican,
disloyal, or a traitor
Jail - Ilhan Omar should be prosecuted,
convicted, or incarcerated
Sharia - Ilhan Omar wants to replace US
law with Sharia Law
Adultery - Ilhan Omar is an adulterer or
married to her brother
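Agreement between two annotators using a label set like this one is often measured with Cohen's kappa. The sketch below assumes exactly one label per tweet and two annotators; the slides do not state which agreement measure was used, so this is an illustration rather than the procedure followed here.

```python
from collections import Counter

# The ten labels from the slide.
LABELS = ["Neutral", "Support", "Political", "Insult", "Immigration",
          "Terrorist", "Loyalty", "Jail", "Sharia", "Adultery"]

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who each assigned one label per item."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each annotator's own label distribution.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[l] * count_b[l] for l in set(count_a) | set(count_b)) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1 - expected)
```

With ten labels, chance agreement is lower than in a binary scheme, so kappa values are not directly comparable across annotation rounds that use different label sets.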
22. Most frequent 2 grams in (re)Tweeter profiles
#maga #kag (25416), trump supporter (18969), trump 2020 (14241), president
trump (13951), husband father (12562), pro life (11502), happily married (10383),
god family (9690), proud american (9281), god bless (9100), wife mother (8487),
lives matter (7699), love god (7609), wife mom (6833), #maga #trump2020 (6799),
maga kag (6195), jesus christ (6187), christian conservative (6103), #kag
#trump2020 (6096), family country (5749), business owner (5733), american
patriot (5055), bless america (4916), common sense (4672), #trump2020 #maga
(4478), black lives (4230), truth seeker (4138), conservative christian (4132),
father husband (3991), donald trump (3931), constitutional conservative (3908),
united states (3884), 2nd amendment (3841), mother grandmother (3811),
america great (3801), #maga #wwg1wga (3725), army veteran (3486), human
rights (3419), dog lover (3414), #wwg1wga #maga (3112), free speech (3044)
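Counts like these can be produced with a simple n-gram counter over profile descriptions. The tokenization below (lowercasing, keeping a leading # on hashtags, dropping punctuation) is an assumption chosen so that entries such as "#maga #kag" and "maga kag" stay distinct, as they do in the list above; the actual preprocessing is not described in the slides.

```python
import re
from collections import Counter

def tokenize(profile):
    """Lowercase and split into word tokens, keeping a leading '#' on hashtags."""
    return re.findall(r"#?\w+", profile.lower())

def ngram_counts(texts, n=2):
    """Count n-grams (joined by single spaces) across a collection of texts."""
    counts = Counter()
    for text in texts:
        toks = tokenize(text)
        counts.update(" ".join(toks[i:i + n]) for i in range(len(toks) - n + 1))
    return counts
```

Calling `ngram_counts(profiles).most_common(40)` on a list of profile strings would give a ranked list in the same shape as this slide.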
23. 1 grams (muslim, islam, quran) in all 1020 Tweets
muslim (14,791), muslims (4,849), islamic (3,827),
islam (3,302), islamist (1,650), islamophobia (607),
islamists (600), quran (580), islamophobic (553),
congressmuslim (435)
24. 2 grams
a muslim (2,446), muslim brotherhood (1,631), the
muslim (1,440), islamic terrorist (594), anti muslim
(591), radical muslim (512), muslims in (483), muslim
woman (477), radical islamic (427), islam is (412)
25. 3 grams
the muslim brotherhood (624), is a muslim (488),
congressmuslim ilhan omar (376), as a muslim
(285), a muslim american (217), muslim ilhan omar
(197), muslim american trump (195), of the muslim
(192), a radical muslim (181), muslim anti
immigrant (181)
26. 4 grams
as a muslim american (198), a muslim american trump
(195), muslim american trump admirer (191), ahmed as
a muslim (183), muslim anti immigrant anti (175), she is
a muslim (171), somali congressmuslim ilhan omar
(166), omar is a muslim (151), muslim brotherhood ilhan
omar (136), muslim refugee dalia al (119)
27. 5 grams
as a muslim american trump (193), a muslim american
trump admirer (191), muslim american trump admirer i
(187), ahmed as a muslim american (182), muslim anti
immigrant anti black (152), qanta ahmed as a muslim
(144), icg obama isis soros muslim (117), obama isis soros
muslim brotherhood (117), isis soros muslim brotherhood
ilhan (116), omar and the progressive islamist (115)
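The 1 gram through 5 gram lists all contain a token that includes muslim, islam, or quran (for example congressmuslim and islamophobia). One way to extract such n-grams is sketched below. The substring rule and the function names are assumptions inferred from the lists, not a description of the tools actually used.

```python
import re
from collections import Counter

# Substrings of interest, taken from the slide title (muslim, islam, quran).
STEMS = ("muslim", "islam", "quran")

def tokens(text):
    """Lowercase and split into word tokens, dropping punctuation."""
    return re.findall(r"\w+", text.lower())

def target_ngrams(tweets, n, stems=STEMS):
    """Count n-grams in which at least one token contains one of the stems."""
    counts = Counter()
    for tweet in tweets:
        toks = tokens(tweet)
        for i in range(len(toks) - n + 1):
            gram = toks[i:i + n]
            if any(stem in tok for tok in gram for stem in stems):
                counts[" ".join(gram)] += 1
    return counts
```

Because viral tweets are retweeted verbatim, long n-grams with nearly identical counts (such as the 5 grams above) often come from a single heavily repeated tweet.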
31. Lessons Learned
Impact of "lock her up" and "send her back" rhetoric clearly seen in annotation.
Annotation labels must be nuanced; we can't simply label a tweet as Islamophobic or not
since content may be based on gender, race, immigration or marital status,
political beliefs in addition to or instead of religion.
A highly visible or politicized personality attracts a lot of repetitive and viral content
based on the most recent accusation or conspiracy theory.
Profile descriptions are important clues.
32. Current Questions
Which public events are correlated with online Islamophobia?
What is the impact of Tweeter location and profile description?
How are less prominent public figures who are Muslim targeted?
Are political figures who are known to be Christian, Jewish, Hindu, or of other
religions targeted to greater or lesser extents?
Can crowdsourcing be effective for more nuanced annotation problems?