Content Moderation Across Multiple Platforms
with Capsule Networks and Co-Training
Vani Agarwal
linkedin/in/vani-agarwal-04a02bbb/
@VaniAgarwal9 fb.com/vani.agarwal30
Dr. Arun Balaji (Chair)
Dr. Ponnurangam Kumaraguru (Co-chair)
Thesis Committee
◆ Dr. Rajiv Ratn Shah, IIIT Delhi
◆ Dr. Niharika Sachdeva, InfoEdge
◆ Dr. Arun Balaji Buduru, IIIT Delhi
◆ Dr. Ponnurangam Kumaraguru, IIIT Delhi
Demo
What is content moderation
Different platforms have their own policies; if content does not meet the guidelines, then moderation action takes place.
Ref: 11
Why content moderation is necessary
Posts on different platforms
Challenges for content moderation
◆ Different needs of different platforms
◆ Huge amount of content
◆ The way in which content is displayed differs for each platform
Platforms struggling to moderate content
◆ Human moderators suffer PTSD
◆ High turnaround time
◆ High cost
Ref: 1
Research Aim
Given posts P = p_1, p_2, ..., p_k from domains D = D_1, D_2, ..., D_n, find a subset of posts which should be flagged for moderation.
INPUT: posts P → MODEL → OUTPUT: list of items to be moderated, P’ where P’ ⊆ P
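The formulation above can be read as a filter over scored posts. The following is a minimal sketch, assuming a hypothetical `score` callable that returns the model's probability that a post needs moderation and an illustrative threshold of 0.5; neither is specified on the slide.

```python
from typing import Callable, List


def flag_for_moderation(posts: List[str],
                        score: Callable[[str], float],
                        threshold: float = 0.5) -> List[str]:
    """Return P', the subset of posts whose score reaches the threshold."""
    return [p for p in posts if score(p) >= threshold]


# Toy stand-in for a trained model (hypothetical): it only looks for one word.
def toy_score(post: str) -> float:
    return 0.9 if "attack" in post.lower() else 0.1


posts = ["Lovely sunset today", "We should attack them", "Nice photo"]
flagged = flag_for_moderation(posts, toy_score)
print(flagged)
```

Any of the text or fusion models compared later could take the place of `toy_score`, as long as it maps a post to a score.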
Contributions
◆ Comparison of different methods across multiple platforms
◆ Capsule Networks for Content Moderation
◆ Co-training to understand domain adaptability
Outline
◆ Data Collection
◆ Comparison of Methods
◆ Capsule Networks
◆ Co-training for Domain Adaptation
◆ Conclusion
Data Collection
◆ Twitter, Quora, Wikipedia - public datasets
◆ Whisper - combination of public dataset and website scraping
◆ Reddit - collected data from subreddits
Data Collection of Reddit
◆ Subreddit r/creepy
▶ violent content
▶ Weak labels
◆ Subreddit r/pics
▶ normal content
▶ Manually checked 200 posts to see if they are problematic or not
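Weak labeling by subreddit, as described above, can be sketched as follows. The record layout (`subreddit`, `title`) and the helper names are assumptions made for illustration; the slide only states which subreddit maps to which kind of content and that 200 posts were checked by hand.

```python
import random
from typing import Dict, List

# Assumed mapping from subreddit to weak label (1 = problematic, 0 = normal).
WEAK_LABELS = {"r/creepy": 1, "r/pics": 0}


def weak_label(posts: List[Dict[str, str]]) -> List[Dict[str, object]]:
    """Attach a weak label to each post based only on its subreddit."""
    return [{**p, "label": WEAK_LABELS[p["subreddit"]]} for p in posts]


def sample_for_manual_check(posts, subreddit="r/pics", k=200, seed=0):
    """Draw up to k posts from one subreddit for manual inspection."""
    pool = [p for p in posts if p["subreddit"] == subreddit]
    rng = random.Random(seed)
    return rng.sample(pool, min(k, len(pool)))


posts = [
    {"subreddit": "r/creepy", "title": "example post A"},
    {"subreddit": "r/pics", "title": "example post B"},
    {"subreddit": "r/pics", "title": "example post C"},
]
labeled = weak_label(posts)
to_check = sample_for_manual_check(labeled, k=200)
print([p["label"] for p in labeled], len(to_check))
```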
Twitter
Reddit
Wikipedia
“hey punk dont be deleting my stuff,
you know nothing bout the harly drags
so stay out of my shit you stupid nerd,
punk fag female thats all u, bitch”
Quora
Whisper
Dataset Summary
Data collection strategy
◆ Twitter: (1) tweets related to protests and riots [7]; (2) tweets related to racism and sexism [1]
◆ Reddit: collected data from subreddits r/creepy and r/pics
◆ Wikipedia: (1) comments containing personal attacks [3]; (2) toxic comments on talk pages [12]
◆ Whisper: hate speech related posts [4] and web scraping
◆ Quora: insincere questions asked on Quora [2]
Dataset Summary
Dataset     Positive (1)  Negative (0)  Total  Positive class %  Text / Image
Quora       817           12244         13061  6%                Text
Whisper     760           1720          2480   30%               Text
Wikipedia1  647           5146          5793   11%               Text
Wikipedia2  783           7195          7978   10%               Text
Twitter2    1200          2000          3200   37%               Text
Twitter1    3619          1052          4671   77%               Text + Image
Reddit      2073          2598          4671   44%               Text + Image
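The totals and class balance in the table follow directly from the positive and negative counts. This short sketch recomputes them; the counts are copied from the table, and the one-decimal formatting is a choice made here, whereas the slide reports whole-number percentages.

```python
# (positive, negative) counts copied from the table above.
counts = {
    "Quora": (817, 12244),
    "Whisper": (760, 1720),
    "Wikipedia1": (647, 5146),
    "Wikipedia2": (783, 7195),
    "Twitter2": (1200, 2000),
    "Twitter1": (3619, 1052),
    "Reddit": (2073, 2598),
}


def positive_share(pos: int, neg: int) -> float:
    """Percentage of positive examples among all examples."""
    return 100.0 * pos / (pos + neg)


shares = {name: positive_share(p, n) for name, (p, n) in counts.items()}
for name, share in shares.items():
    total = sum(counts[name])
    print(f"{name:<11} total={total:<6} positive={share:.1f}%")
```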
Data Pre-processing
Tweet - "nice to see that the top trending post by suriya
#TamilNaduBandh #Saithan are located around TamilNadu"
Anonymized Tweet - "nice to see that the top trending post by
<NAME> are located around tamilnadu"
Pre-processing steps:
◆ Lower case
◆ Remove hashtags, emoticons, punctuation
◆ Named Entity Recognizer
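The pre-processing steps can be sketched in a few lines. This is an illustration under stated assumptions, not the thesis pipeline: the Named Entity Recognizer is replaced by a hypothetical fixed set of names (`KNOWN_NAMES`), and only simple ASCII emoticons are handled.

```python
import re
import string

# Hypothetical stand-in for a Named Entity Recognizer: a fixed set of
# lower-cased person names. A real pipeline would use an NER model here.
KNOWN_NAMES = {"suriya"}

HASHTAG = re.compile(r"#\w+")
# Simple ASCII emoticons such as :) ;-( =D, matched only as standalone tokens.
EMOTICON = re.compile(r"(?<!\S)[:;=][-']?[)(dp](?!\S)")
PUNCTUATION = str.maketrans("", "", string.punctuation)


def anonymize(text: str, names=KNOWN_NAMES) -> str:
    text = text.lower()                 # 1. lower case
    text = HASHTAG.sub(" ", text)       # 2. remove hashtags
    text = EMOTICON.sub(" ", text)      #    ... emoticons
    text = text.translate(PUNCTUATION)  #    ... and punctuation
    tokens = text.split()
    # 3. replace recognized names with a placeholder
    return " ".join("<NAME>" if tok in names else tok for tok in tokens)


tweet = ("nice to see that the top trending post by suriya "
         "#TamilNaduBandh #Saithan are located around TamilNadu")
anonymized = anonymize(tweet)
print(anonymized)
```

The name replacement runs last so that the angle brackets of the `<NAME>` placeholder are not stripped with the rest of the punctuation.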
Methods
Text Models
Logistic Regression [5]           LR_machina
Logistic Regression [6]           LR_Badjatiya
Multi Layer Perceptron [5]        MLP
Gated Recurrent Unit [7]          GRU
Long Short Term Memory [7]        LSTM
Convolutional Neural Network [6]  CNN
Capsule Network                   CapsNet
Fusion Models
LSTM + (Object + Scene recognition)     LstmFusion
CapsNet + (Object + Scene recognition)  CapsFusion
Capsule Network Intuition
Ref: 10
◆ It is not a human face.
◆ Capsule networks understand spatial orientation.
◆ Max pooling loses information.
Capsule Working
Ref: 10
Why CapsNet?
◆ Capsules output a vector.
◆ Each capsule decides which features to pass to the higher-level capsule.
◆ Prominent features are transformed from one capsule to another using a routing protocol.
◆ This helps to learn the semantic meaning of text data well.
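The slide does not name the routing protocol. As an illustration only, the sketch below assumes dynamic routing by agreement with the squash non-linearity from the original capsule network paper (Sabour et al., 2017); the routing used in the thesis may differ. It is written in plain Python for readability; `u_hat`, `route`, and the toy values are hypothetical.

```python
import math
from typing import List

Vector = List[float]


def squash(s: Vector) -> Vector:
    """Shrink a vector so its length lies in [0, 1) while keeping its direction."""
    sq_norm = sum(x * x for x in s)
    if sq_norm == 0.0:
        return [0.0 for _ in s]
    scale = sq_norm / (1.0 + sq_norm) / math.sqrt(sq_norm)
    return [scale * x for x in s]


def softmax(logits: Vector) -> Vector:
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def route(u_hat: List[List[Vector]], iterations: int = 3) -> List[Vector]:
    """Routing by agreement.

    u_hat[i][j] is lower capsule i's prediction vector for higher capsule j.
    Returns one output vector per higher capsule.
    """
    n_low, n_high = len(u_hat), len(u_hat[0])
    dim = len(u_hat[0][0])
    b = [[0.0] * n_high for _ in range(n_low)]  # routing logits
    v: List[Vector] = []
    for _ in range(iterations):
        c = [softmax(row) for row in b]  # coupling coefficients per lower capsule
        v = []
        for j in range(n_high):
            s_j = [sum(c[i][j] * u_hat[i][j][d] for i in range(n_low))
                   for d in range(dim)]
            v.append(squash(s_j))
        # Raise the logit wherever a prediction agrees with the current output.
        for i in range(n_low):
            for j in range(n_high):
                b[i][j] += sum(a * o for a, o in zip(u_hat[i][j], v[j]))
    return v


# Two lower capsules that agree strongly on higher capsule 0 and weakly on 1.
u_hat = [
    [[1.0, 0.0], [0.0, 0.1]],
    [[0.9, 0.1], [0.1, 0.0]],
]
outputs = route(u_hat)
lengths = [math.sqrt(sum(x * x for x in v)) for v in outputs]
print(lengths)
```

The length of each output vector acts as the capsule's confidence, which is why the capsule whose inputs agree ends up with the longer vector.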
Capsule Network Architecture
Experimental Design
- Evaluation metrics used
  - Average Precision (area under the PR curve)
  - Macro F1
- 5-fold cross-validation
- Grid search over various hyper-parameters
- Train set - 80%
- Test set - 20%
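The two evaluation metrics can be computed as follows. This is a dependency-free sketch for binary labels; the function names and toy data are illustrative, and tied scores are not given any special treatment in `average_precision`.

```python
from typing import Sequence


def f1_for_class(y_true: Sequence[int], y_pred: Sequence[int], cls: int) -> float:
    """F1 score treating `cls` as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p != cls)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Unweighted mean of the per-class F1 scores."""
    classes = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_class(y_true, y_pred, c) for c in classes) / len(classes)


def average_precision(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Mean of the precision values at the rank of each positive example."""
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


y_true = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]
print(round(macro_f1(y_true, y_pred), 4), round(average_precision(y_true, scores), 4))
```

Macro F1 weights both classes equally, which matters for the imbalanced datasets, while average precision summarizes the ranking quality without fixing a threshold.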
Text All Model Results
Method        Macro F1  Average Precision
GRU           0.6977    0.6983
LR_Badjatiya  0.7560    0.7260
LR_machina    0.6772    0.3805
CNN           0.6400    0.6960
LSTM          0.7076    0.7057
MLP           0.7300    0.6916
CapsNet       0.8254    0.7695
Performance on the Twitter2 dataset
Text Model Results on all Datasets
Dataset     Method   Macro F1  Average Precision
Quora       CapsNet  0.6959    0.9269
Quora       LSTM     0.6731    0.6560
Reddit      CapsNet  0.7967    0.7373
Reddit      LSTM     0.7306    0.7321
Twitter1    CapsNet  0.7953    0.7695
Twitter1    LSTM     0.8635    0.6748
Twitter2    CapsNet  0.8254    0.7695
Twitter2    LSTM     0.7076    0.7057
Whisper     CapsNet  0.9856    0.9783
Whisper     LSTM     0.9816    0.9816
Wikipedia1  CapsNet  0.8361    0.9195
Wikipedia1  LSTM     0.7775    0.7413
Wikipedia2  CapsNet  0.8361    0.9195
Wikipedia2  LSTM     0.8098    0.7698
Average     CapsNet  0.8244    0.8600
Average     LSTM     0.7919    0.7516
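The Average rows can be recomputed from the per-dataset rows. The sketch below copies the seven per-dataset values from the table and takes unweighted means; the results agree with the Average rows to within rounding of the last digit. The absolute gap it prints is one way to compare the two models and is not necessarily how any percentage quoted elsewhere in the deck was derived.

```python
# (Macro F1, Average Precision) per dataset, copied from the table above.
capsnet = {
    "Quora": (0.6959, 0.9269), "Reddit": (0.7967, 0.7373),
    "Twitter1": (0.7953, 0.7695), "Twitter2": (0.8254, 0.7695),
    "Whisper": (0.9856, 0.9783), "Wikipedia1": (0.8361, 0.9195),
    "Wikipedia2": (0.8361, 0.9195),
}
lstm = {
    "Quora": (0.6731, 0.6560), "Reddit": (0.7306, 0.7321),
    "Twitter1": (0.8635, 0.6748), "Twitter2": (0.7076, 0.7057),
    "Whisper": (0.9816, 0.9816), "Wikipedia1": (0.7775, 0.7413),
    "Wikipedia2": (0.8098, 0.7698),
}


def column_mean(results, column):
    """Unweighted mean of one metric column across datasets."""
    values = [row[column] for row in results.values()]
    return sum(values) / len(values)


caps_f1, caps_ap = column_mean(capsnet, 0), column_mean(capsnet, 1)
lstm_f1, lstm_ap = column_mean(lstm, 0), column_mean(lstm, 1)
print(f"CapsNet  F1={caps_f1:.4f}  AP={caps_ap:.4f}")
print(f"LSTM     F1={lstm_f1:.4f}  AP={lstm_ap:.4f}")
print(f"AP gap (absolute) = {caps_ap - lstm_ap:.4f}")
```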
Fusion Model Architecture
Fusion Model Results
Dataset   Method      Macro F1  Average Precision
Twitter1  LstmFusion  0.6711    0.8381
Twitter1  CapsFusion  0.6968    0.8613
Reddit    LstmFusion  0.7529    0.7566
Reddit    CapsFusion  0.8141    0.8149
Takeaways from Capsule Network Model
◆ Capsule networks perform better than LSTM by 10.54% in
average precision.
◆ CapsFusion model performs better than LstmFusion by
5.2% in average precision.
Error Analysis
◆ Manual analysis of 50 instances misclassified by LSTM but classified correctly by CapsNet
◆ Findings:
▶ False Positives by LSTM / correctly classified by CapsNet
“ :I think ``YOU RACIST CUNT`` qualifies as defamation. diff. I didn't edit your
post, that was someone else.”
▶ False Negatives by LSTM / correctly classified by CapsNet
“` I had no interest in getting ``under your skin``. It's you and your fellow admins
who got under my skin. So well done. It doesn't matter anymore. — `”
Qualitative Analysis
Text - “Where are the activists and foot soldiers when k'tak
bleeds in silence.”
Label - Positive
DeepSHAP[8] results
Qualitative Analysis
Text - “The proud hero of kashmir! The hero of freedom
struggle.”
Label - Negative
DeepSHAP results
Ref: [9]
Example of Domain Adaptation
Example of domain adaptation from the CODA paper (Chen et al.) on reviews from different domains
Our Co-training for Domain Adaptation Algorithm
Co-training for Domain Adaptation
Training: Twitter1 + Reddit (20%) → M1; Twitter1 + Twitter2 (20%) → M2; Twitter1 + Whisper (20%) → ...; up to M6
Testing: each model on the remaining 80% of its Domain2 dataset: Reddit (80%), Twitter2 (80%), Whisper (80%), ...
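The data split in the diagram can be sketched as below: train on all of Domain1 plus a fraction of Domain2, and test on the rest of Domain2. This covers only the split, not the full co-training algorithm, and the function name, the `(text, label)` tuple layout, and the unstratified random split are assumptions made for illustration.

```python
import random
from typing import List, Tuple

Example = Tuple[str, int]  # (text, label)


def augment_and_hold_out(domain1: List[Example],
                         domain2: List[Example],
                         fraction: float = 0.2,
                         seed: int = 0) -> Tuple[List[Example], List[Example]]:
    """Train on all of Domain1 plus a fraction of Domain2; test on the rest of Domain2."""
    rng = random.Random(seed)
    shuffled = domain2[:]
    rng.shuffle(shuffled)
    cut = int(round(fraction * len(shuffled)))
    train = domain1 + shuffled[:cut]
    test = shuffled[cut:]
    return train, test


# Toy data standing in for Twitter1 (Domain1) and Reddit (Domain2).
twitter1 = [(f"tweet {i}", i % 2) for i in range(10)]
reddit = [(f"reddit post {i}", i % 2) for i in range(10)]
train, test = augment_and_hold_out(twitter1, reddit, fraction=0.2)
print(len(train), len(test))
```

Calling the function with increasing `fraction` values gives the kind of sweep needed to study how performance changes as more Domain2 samples are added.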
Co-training for Domain Adaptation Results
Trained on (Domain1)  Co-trained on (Domain2)  Method   Macro F1  Average Precision
Twitter1              Quora                    CapsNet  0.6739    0.6723
Twitter1              Reddit                   CapsNet  0.6690    0.5581
Twitter1              Twitter2                 CapsNet  0.6321    0.6232
Twitter1              Whisper                  CapsNet  0.9167    0.9201
Twitter1              Wikipedia1               CapsNet  0.7496    0.7480
Twitter1              Wikipedia2               CapsNet  0.7341    0.7457
Tradeoff Analysis
Domain1-Twitter1 and Domain2-Reddit
Co-training Results
◆ Augmenting Domain1 with only 20% of the Domain2 samples reduces performance by 17% compared to a model trained on 100% of the samples.
◆ As the percentage of Domain2 samples added to Domain1 increases, the model's performance improves.
◆ Therefore, if only a small amount of labeled data is available, co-training for domain adaptation is a viable option.
Conclusion
◆ We perform a multi-platform comparison for content moderation.
◆ Capsule Networks outperformed existing methods for content moderation on average across datasets.
◆ Co-training for domain adaptation is a cost-effective solution for annotating data.
Challenges, Limitation, Future Work
◆ Some datasets use weak labels.
▶ Future work: see if stronger labels perform better than weak labels.
◆ Different platforms have different styles of expressing content and also have different moderation policies.
▶ Co-training may not work if the policies of platforms do not align.
◆ It was challenging to collect the Reddit dataset.
▶ Quarantined subreddits are no longer available.
◆ We plan to extend the work to video content on various platforms.
Acknowledgement
◆ Committee Members
◆ Indira Sen, GESIS
◆ Snehal Gupta, Asmit Kumar Singh, Shubham Singh
◆ Members of Precog
◆ Family and friends
References
[1] Twitter data - https://github.com/zeerakw/hatespeech
[2] Quora data - https://www.kaggle.com/c/quora-insincere-questions-classification/data
[3] Wikipedia data - https://figshare.com/articles/Wikipedia_Detox_Data/4054689
[4] Whisper data - https://github.com/Mainack/hatespeech-data-HT-2017
[5] ExMachina - https://arxiv.org/abs/1610.08914
[6] Badjatiya - https://arxiv.org/abs/1706.00188
[7] LSTM - http://precog.iiitd.edu.in/pubs/empowering-first-responders.pdf
[8] DeepSHAP - https://github.com/slundberg/shap
[9] Co-training for domain adaptation - https://papers.nips.cc/paper/4433-co-training-for-domain-adaptation.pdf
Thanks!
vani17068@iiitd.ac.in
@VaniAgarwal9
 
It is our choices, Harry, that show what we truly are, far more than our abil...
It is our choices, Harry, that show what we truly are, far more than our abil...It is our choices, Harry, that show what we truly are, far more than our abil...
It is our choices, Harry, that show what we truly are, far more than our abil...
 
Leveraging Social Media for Financial Advice
Leveraging Social Media for Financial AdviceLeveraging Social Media for Financial Advice
Leveraging Social Media for Financial Advice
 
Development of Stress Induction and Detection System to Study its Effect on B...
Development of Stress Induction and Detection System to Study its Effect on B...Development of Stress Induction and Detection System to Study its Effect on B...
Development of Stress Induction and Detection System to Study its Effect on B...
 
A Framework for Automatic Question Answering in Indian Languages
A Framework for Automatic Question Answering in Indian LanguagesA Framework for Automatic Question Answering in Indian Languages
A Framework for Automatic Question Answering in Indian Languages
 

Content Moderation Across Multiple Platforms with Capsule Networks and Co-Training

  • 1. Content Moderation Across Multiple Platforms with Capsule Networks and Co-Training Vani Agarwal linkedin/in/vani-agarwal-04a02bb b/ @VaniAgarwal9 fb.com/vani.agarwal30 Dr. Arun Balaji (Chair) Dr. Ponnurangam Kumaraguru (Co-chair)
  • 2. Thesis Committee ◆ Dr. Rajiv Ratn Shah, IIIT Delhi ◆ Dr. Niharika Sachdeva, InfoEdge ◆ Dr. Arun Balaji Buduru, IIIT Delhi ◆ Dr. Ponnurangam Kumaraguru, IIIT Delhi
  • 4. What is content moderation (Ref: 11) Different platforms have their own policies; if content does not meet the guidelines, then a moderation action takes place.
  • 5. Why content moderation is necessary
  • 6. Posts on different platforms
  • 7. Challenges for content moderation ◆ Different needs of different platforms ◆ Huge amount of content ◆ The way in which content is displayed differs for each platform
  • 8. Platforms struggling to moderate content ◆ Human moderators suffer PTSD ◆ High turnaround time ◆ High cost (Ref: 1)
  • 9. Research Aim: Given posts P = {p1, p2, ..., pk} from domains D = {D1, D2, ..., Dn}, find a subset of posts which should be flagged for moderation. INPUT → MODEL → OUTPUT: list of items to be moderated, P’ where P’ ⊆ P
  • 10. Contributions ◆ Comparison of different methods across multiple platforms ◆ Capsule Networks for Content Moderation ◆ Co-training to understand domain adaptability
  • 11. Outline ◆ Data Collection ◆ Comparison of Methods ◆ Capsule Networks ◆ Co-training for Domain Adaptation ◆ Conclusion
  • 12. Data Collection ◆ Twitter, Quora, Wikipedia - public datasets ◆ Whisper - combination of public dataset and website scraping ◆ Reddit - Collected data from subreddits
  • 13. Data Collection of Reddit ◆ Subreddit r/creepy ▶ violent content ▶ Weak labels ◆ Subreddit r/pics ▶ normal content ▶ Manually checked 200 posts to see if they are problematic or not
  • 16. Wikipedia “hey punk dont be deleting my stuff, you know nothing bout the harly drags so stay out of my shit you stupid nerd, punk fag female thats all u, bitch” 16
  • 19. Dataset Summary: data collection strategy
      Twitter:    1. Tweets related to protest, riots [7]; 2. Tweets related to racism, sexism [1]
      Reddit:     Collected data from subreddits r/creepy and r/pics
      Wikipedia:  1. Comments of personal attacks [3]; 2. Toxic comments on talk page [12]
      Whisper:    Hate speech related posts [4] and web scraping
      Quora:      Insincere questions asked on Quora [2]
  • 20. Dataset Summary
      Dataset      Positive (1)   Negative (0)   Total    Positive class %   Text / Image
      Quora        817            12244          13061    6%                 Text
      Whisper      760            1720           2480     30%                Text
      Wikipedia1   647            5146           5793     11%                Text
      Wikipedia2   783            7195           7978     10%                Text
      Twitter2     1200           2000           3200     37%                Text
      Twitter1     3619           1052           4671     77%                Text + Image
      Reddit       2073           2598           4671     44%                Text + Image
  • 21. Data Pre-processing
      Tweet - "nice to see that the top trending post by suriya #TamilNaduBandh #Saithan are located around TamilNadu"
      Anonymized Tweet - "nice to see that the top trending post by <NAME> are located around tamilnadu"
      Steps: ◆ Lower case ◆ Remove hashtags, emoticons, punctuation ◆ Named Entity Recognizer
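The three pre-processing steps on this slide can be sketched as below. This is only an illustration under stated assumptions: the slides do not say which Named Entity Recognizer was used, so the hypothetical `known_names` list stands in for it, and the function name `anonymize` is not from the thesis.

```python
import re

def anonymize(text, known_names):
    """Lower-case, strip hashtags/emoticons/punctuation, then mask names."""
    text = text.lower()
    text = re.sub(r"#\w+", " ", text)      # drop hashtags
    text = re.sub(r"[^\w\s]", " ", text)   # drop punctuation and emoticon symbols
    for name in known_names:               # hypothetical stand-in for a real NER model
        text = re.sub(rf"\b{re.escape(name.lower())}\b", "<NAME>", text)
    return " ".join(text.split())          # collapse runs of whitespace

tweet = ("nice to see that the top trending post by suriya "
         "#TamilNaduBandh #Saithan are located around TamilNadu")
print(anonymize(tweet, ["Suriya"]))
```

On the slide's example tweet this reproduces the anonymized form shown above; a real pipeline would replace the name list with an actual NER model.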
  • 23. Methods
      Text Models
        Logistic Regression [5]        LR_machina
        Logistic Regression [6]        LR_Badjatiya
        Multi Layer Perceptron [5]     MLP
        Gated Recurrent Unit [7]       GRU
        Long short term memory [7]     LSTM
        Convolutional Neural Network [6]   CNN
        Capsule Network                CapsNet
      Fusion Models
        LSTM + (Object + Scene recognition)      LstmFusion
        CapsNet + (Object + Scene recognition)   CapsFusion
  • 25. Capsule Network Intuition (Ref: 10) ◆ It is not a human face. ◆ Capsule networks understand spatial orientation. ◆ Max pooling loses information.
  • 27. Why CapsNet? ◆ Capsules output a vector. ◆ Each capsule decides which features to pass to the higher-level capsule. ◆ Prominent features are transformed from one capsule to another using a routing protocol. ◆ This helps to learn the semantic meaning of text data well.
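To make "capsules output a vector" concrete, here is a minimal sketch of the squash nonlinearity from the original capsule-network formulation (Sabour, Frosst and Hinton, 2017). The slides do not confirm that the thesis used this exact function, so treat it as an assumption; the helper names `squash` and `length` are illustrative.

```python
import math

def squash(s):
    """Keep the direction of s, but map its length into the range [0, 1)."""
    norm_sq = sum(x * x for x in s)
    if norm_sq == 0.0:
        return [0.0 for _ in s]
    # scale = (|s|^2 / (1 + |s|^2)) / |s|
    scale = norm_sq / (1.0 + norm_sq) / math.sqrt(norm_sq)
    return [scale * x for x in s]

def length(v):
    """Euclidean length of a vector."""
    return math.sqrt(sum(x * x for x in v))

v = squash([3.0, 4.0])
print(v, length(v))
```

In that formulation the length of the squashed vector is read as the probability that the feature is present, while its direction encodes the feature's properties; routing then decides how strongly each lower capsule's vector contributes to each higher capsule.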
  • 29. Experimental Design
      - Evaluation parameters used
          - Average Precision or Area under PR curve
          - Macro F1
      - 5-fold cross validation
      - Grid search on various hyper-parameters
      - Train set - 80%
      - Test set - 20%
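The two evaluation measures named on this slide can be computed as in the sketch below. It is a dependency-free illustration, not the thesis code: the function names are assumptions, labels are assumed to be 0/1, and tied scores are not given special treatment.

```python
def macro_f1(y_true, y_pred, labels=(0, 1)):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

def average_precision(y_true, y_score):
    """Mean of precision@k taken at the rank k of each positive example."""
    ranked = sorted(zip(y_score, y_true), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0

print(macro_f1([1, 1, 0, 0], [1, 0, 0, 0]))
print(average_precision([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1]))
```

Macro F1 weights both classes equally, which matters for the imbalanced datasets in the summary table; average precision summarizes the precision-recall curve without fixing a decision threshold.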
  • 30. All Text Model Results: performance on the Twitter2 dataset
      Method         Macro F1   Average Precision
      GRU            0.6977     0.6983
      LR_Badjatiya   0.7560     0.7260
      LR_machina     0.6772     0.3805
      CNN            0.6400     0.6960
      LSTM           0.7076     0.7057
      MLP            0.7300     0.6916
      CapsNet        0.8254     0.7695
  • 31. Text Model Results on all Datasets
      Dataset    Method    Macro F1   Average Precision
      Quora      CapsNet   0.6959     0.9269
      Quora      LSTM      0.6731     0.6560
      Reddit     CapsNet   0.7967     0.7373
      Reddit     LSTM      0.7306     0.7321
      Twitter1   CapsNet   0.7953     0.7695
      Twitter1   LSTM      0.8635     0.6748
      Twitter2   CapsNet   0.8254     0.7695
      Twitter2   LSTM      0.7076     0.7057
  • 32. Text Model Results on all Datasets
      Dataset      Method    Macro F1   Average Precision
      Whisper      CapsNet   0.9856     0.9783
      Whisper      LSTM      0.9816     0.9816
      Wikipedia1   CapsNet   0.8361     0.9195
      Wikipedia1   LSTM      0.7775     0.7413
      Wikipedia2   CapsNet   0.8361     0.9195
      Wikipedia2   LSTM      0.8098     0.7698
      Average      CapsNet   0.8244     0.8600
      Average      LSTM      0.7919     0.7516
  • 34. Fusion Model Results
      Dataset    Method       Macro F1   Average Precision
      Twitter1   LstmFusion   0.6711     0.8381
      Twitter1   CapsFusion   0.6968     0.8613
      Reddit     LstmFusion   0.7529     0.7566
      Reddit     CapsFusion   0.8141     0.8149
  • 35. Takeaways from Capsule Network Model ◆ Capsule networks perform better than LSTM by 10.54% in average precision. ◆ The CapsFusion model performs better than LstmFusion by 5.2% in average precision.
  • 36. Error Analysis ◆ Manual analysis of 50 instances marked wrong by LSTM but correctly by CapsNet ◆ Findings: ▶ False Positives by LSTM / correctly classified by CapsNet “ :I think ``YOU RACIST CUNT`` qualifies as defamation. diff. I didn't edit your post, that was someone else.” ▶ False Negatives by LSTM / correctly classified by CapsNet “` I had no interest in getting ``under your skin``. It's you and your fellow admins who got under my skin. So well done. It doesn't matter anymore. — `” 36
  • 37. Qualitative Analysis Text - “Where are the activists and foot soldiers when k'tak bleeds in silence.” Label - Positive DeepSHAP [8] results
  • 38. Qualitative Analysis Text - “The proud hero of kashmir! The hero of freedom struggle.” Label - Negative DeepSHAP results
  • 41. Example of Domain Adaptation: from the CODA paper (Chen et al.), on reviews from different domains
  • 42. Our Co-training for Domain Adaptation Algorithm
  • 43. Co-training for Domain Adaptation
      Model   Training                    Testing
      M1      Twitter1 + Reddit (20%)     Reddit (80%)
      M2      Twitter1 + Twitter2 (20%)   Twitter2 (80%)
      ...     Twitter1 + Whisper (20%)    Whisper (80%)
      ...
      M6      ...                         ...
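The train/test mixing in this diagram can be sketched as below: all of Domain1 plus a 20% sample of Domain2 for training, and the remaining 80% of Domain2 for testing. This covers only the data split; the co-training algorithm on the previous slide is not in the transcript and is not reproduced here. The function name `cotrain_split`, the seed, and the toy data are assumptions.

```python
import random

def cotrain_split(domain1, domain2, fraction=0.2, seed=0):
    """Train on all of domain1 plus `fraction` of domain2; test on the rest of domain2."""
    rng = random.Random(seed)
    shuffled = list(domain2)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * fraction)
    train = list(domain1) + shuffled[:cut]
    test = shuffled[cut:]
    return train, test

# Toy (text, label) pairs standing in for the real datasets.
twitter1 = [("tweet %d" % i, i % 2) for i in range(10)]
reddit = [("post %d" % i, i % 2) for i in range(20)]

train, test = cotrain_split(twitter1, reddit)
print(len(train), len(test))
```

Repeating the call with each of the other datasets as `domain2` would give one split per model in the diagram.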
  • 44. Co-training for Domain Adaptation Results
      Trained on (Domain1)   Co-trained on (Domain2)   Method    Macro F1   Average Precision
      Twitter1               Quora                     CapsNet   0.6739     0.6723
      Twitter1               Reddit                    CapsNet   0.6690     0.5581
      Twitter1               Twitter2                  CapsNet   0.6321     0.6232
      Twitter1               Whisper                   CapsNet   0.9167     0.9201
      Twitter1               Wikipedia1                CapsNet   0.7496     0.7480
      Twitter1               Wikipedia2                CapsNet   0.7341     0.7457
  • 46. Co-training Results ◆ Augmenting with only 20% of the samples reduces performance by 17% compared to a model trained on 100% of the samples. ◆ As the percentage of Domain2 samples added to Domain1 increases, the model's performance improves. ◆ Therefore, if only a small amount of labeled data is available, co-training for domain adaptation is a viable option.
  • 48. Conclusion ◆ We perform a multi-platform comparison for content moderation. ◆ Capsule Networks outperformed existing methods for content moderation. ◆ Co-training for domain adaptation is a cost-effective solution for annotating data.
  • 49. Challenges, Limitations, Future Work ◆ Some datasets use weak labels. ▶ Future work: see if stronger labels perform better than weak labels. ◆ Different platforms have different styles of expressing content and also have different moderation policies. ▶ Co-training may not work if the policies of the platforms do not align. ◆ It was challenging to collect the Reddit dataset. ▶ Quarantined subreddits are no longer available. ◆ We plan to extend the work to video content on various platforms.
  • 50. Acknowledgement ◆ Committee Members ◆ Indira Sen, GESIS ◆ Snehal Gupta, Asmit Kumar Singh, Shubham Singh ◆ Members of Precog ◆ Family and friends
  • 51. References
      [1] Twitter data - https://github.com/zeerakw/hatespeech
      [2] Quora data - https://www.kaggle.com/c/quora-insincere-questions-classification/data
      [3] Wikipedia data - https://figshare.com/articles/Wikipedia_Detox_Data/4054689
      [4] Whisper data - https://github.com/Mainack/hatespeech-data-HT-2017
      [5] ExMachina - https://arxiv.org/abs/1610.08914
      [6] Badjatiya - https://arxiv.org/abs/1706.00188
  • 52. References
      [7] LSTM - http://precog.iiitd.edu.in/pubs/empowering-first-responders.pdf
      [8] DeepShap - https://github.com/slundberg/shap
      [9] Co-training for domain adaptation - https://papers.nips.cc/paper/4433-co-training-for-domain-adaptation.pdf