Learning to Recognize Reliable Users and Content in Social Media with Coupled Mutual Reinforcement, Mohammad Ali Abbasi,
Arizona State University
http://dmml.asu.edu
1. DATA MINING AND MACHINE LEARNING
IN A NUTSHELL
LEARNING TO RECOGNIZE RELIABLE USERS AND CONTENT IN SOCIAL
MEDIA WITH COUPLED MUTUAL REINFORCEMENT
Mohammad-Ali Abbasi
http://www.public.asu.edu/~mabbasi2/
SCHOOL OF COMPUTING, INFORMATICS, AND DECISION SYSTEMS ENGINEERING
ARIZONA STATE UNIVERSITY
Arizona State University
Data Mining and Machine Learning Lab
http://dmml.asu.edu/
2. About the paper
• Learning to Recognize Reliable Users and Content in
Social Media with Coupled Mutual Reinforcement
– Jiang Bian, Georgia Institute of Technology
– Yandong Liu, Emory University
– Ding Zhou, Facebook Inc.
– Eugene Agichtein, Emory University
– Hongyuan Zha, Georgia Institute of Technology
• WWW 2009, April 20–24, 2009, Madrid, Spain.
3. Community Question Answering (CQA)
• A popular kind of forum where users pose questions
for other users to answer
• Users can ask questions in natural language
• Comparable to regular web search
4. Sample: Yahoo! Answers
• Introduction
5. What is the problem?
• Retrieve answers from a social media archive
that contains a large amount of information
– The quality, accuracy, and comprehensiveness of
the submitted questions and answers vary
widely
– A large fraction of the content is not useful for
answering queries
– Current approaches require large amounts of
manually labeled data
6. CQA environment
• Users
• Questions
• Answers
7. The goal
• Identify
– High quality Answers
– High quality Questions
– High reputation Users
• Simultaneously
• With minimal manual labeling
8. The contribution of this paper
• Develops a semi-supervised coupled mutual
reinforcement framework that calculates
content quality and user reputation
simultaneously and requires relatively few labeled
examples to initialize the training process
• Is more effective at finding high-quality
answers, questions, and users
• Improves the accuracy of search over CQA
archives
9. Current approaches
• Rely on users' reputation,
• OR require a large amount of supervision,
• OR focus on the network properties of the
CQA
• without considering the actual content of the
information exchanged
10. How to rank?
• Current approaches:
– Content Quality
OR
– User reputation
• This paper:
– Content Quality
AND
– User reputation
11. Definitions
• Question Quality
– A question's effectiveness at attracting high quality
answers
• Answer Quality
– the responsiveness, accuracy, and comprehensiveness of
the answer to a question.
• Question Reputation
– the expected quality of the questions posted by
a user
• Answer Reputation
– the expected quality of the answers posted by a user.
12. Model the problem
• Solution
13. Mutual reinforcement Principle
• Solution
14. Feature Space: X(Q), X(A), X(U)
• Solution
15. Learning quality and reputation (Coupled Mutual Reinforcement)
• P(x): probability of being “good”
• Model of P(x)
• B is the coefficient vector of the linear model and can be
found by maximizing:
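The formulas on this slide did not survive in the transcript. As a minimal sketch only, assume the model is a logistic regression: P(x) is the logistic function of a linear score, and the coefficients are chosen to maximize the conditional log-likelihood of the labels. The toy features, labels, learning rate, and step count below are all hypothetical, and plain gradient ascent is a stand-in optimizer, not necessarily the one used in the paper.

```python
import math

def sigmoid(z):
    """Logistic function mapping a linear score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def predict(beta, x):
    """P(x): modeled probability that item x is 'good' (assumed logistic form)."""
    return sigmoid(sum(b * v for b, v in zip(beta, x)))

def log_likelihood(beta, X, y):
    """Conditional log-likelihood of the labels y given the features X."""
    return sum(
        label * math.log(predict(beta, x))
        + (1 - label) * math.log(1.0 - predict(beta, x))
        for x, label in zip(X, y)
    )

def fit(X, y, learning_rate=0.1, steps=500):
    """Maximize the log-likelihood with plain gradient ascent."""
    beta = [0.0] * len(X[0])
    for _ in range(steps):
        gradient = [0.0] * len(beta)
        for x, label in zip(X, y):
            error = label - predict(beta, x)
            for j, value in enumerate(x):
                gradient[j] += error * value
        beta = [b + learning_rate * g for b, g in zip(beta, gradient)]
    return beta

# Hypothetical toy data: [bias term, one made-up quality feature].
X = [[1.0, 0.2], [1.0, 0.4], [1.0, 0.5], [1.0, 0.7], [1.0, 0.9], [1.0, 0.3]]
y = [0, 0, 1, 1, 1, 1]  # 1 = labeled "good"
beta = fit(X, y)
```

With these toy labels, a larger feature value should end up with a higher estimated probability of being "good" than a smaller one.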
16. Non-independent equations
• Conditional log-likelihood
• Objective function
17. CQA-MR Algorithm
• Solution
18. Experimental Setup: Data Collection
• Data collected from Yahoo! Answers through its API
• Questions from the TREC QA benchmark are used to crawl the QA
archives (http://trec.nist.gov/data.html)
• All available answers are retrieved for each question
– 107,293 users
– 27,354 questions
– 224,617 answers
19. Evaluation Metrics
• Mean Reciprocal Rank (MRR)
– the reciprocal of the rank at which the first relevant
answer was returned, or 0 if none of the top N results
contained a relevant answer, averaged over all queries
• Precision at K
– for a given query, P(K) reports the fraction of answers
ranked in the top K results that are labeled as relevant
• Mean Average Precision (MAP)
– for each query, the average of the precision at K values calculated after each
relevant answer was retrieved; MAP is the mean of these averages over all queries
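The three metrics can be sketched in a few lines of Python. This is an illustrative sketch, not the evaluation code used in the paper: each ranking is a list of 0/1 relevance labels in rank order, the example rankings are made up, and average precision is normalized by the number of relevant answers that appear in the list (an assumption).

```python
def reciprocal_rank(relevance, n=None):
    """1/rank of the first relevant result in the top n, or 0 if there is none."""
    top = relevance if n is None else relevance[:n]
    for rank, rel in enumerate(top, start=1):
        if rel:
            return 1.0 / rank
    return 0.0

def precision_at_k(relevance, k):
    """Fraction of the top k results that are labeled relevant."""
    return sum(relevance[:k]) / k

def average_precision(relevance):
    """Mean of the precision values taken at the rank of each relevant result."""
    hits, total = 0, 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0

def mean(values):
    values = list(values)
    return sum(values) / len(values)

# Hypothetical relevance labels for three queries, in ranked order.
rankings = [
    [0, 1, 0, 1],  # first relevant answer at rank 2
    [1, 0, 0, 0],  # first relevant answer at rank 1
    [0, 0, 0, 0],  # no relevant answer returned
]
mrr = mean(reciprocal_rank(r) for r in rankings)
map_score = mean(average_precision(r) for r in rankings)
p_at_2 = precision_at_k(rankings[0], 2)
```

MRR and MAP are averages over queries, while Precision at K is reported per query (and can be averaged the same way).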
20. User reputation methods
• Baseline
– users are ranked by "indegree" (number of answers
posted)
• HITS
– users are ranked by their authority scores
• CQA-Supervised
– a classifier trained over the features separates users with "high"
reputation from those with "low" reputation
• CQA-MR
– predicts user reputation with the mutual reinforcement
algorithm
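To make the first two baselines concrete, here is a small sketch that ranks users by indegree and by HITS authority score. The graph is a made-up example, and the edge direction (from the asker to the user who answered) is an assumption about how the user graph is built, not something stated on the slide.

```python
import math
from collections import Counter

def normalize(scores):
    """Scale scores to unit Euclidean length (left unchanged if all are zero)."""
    norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
    return {node: v / norm for node, v in scores.items()}

def hits(edges, iterations=50):
    """Power-iteration HITS over directed (asker, answerer) edges."""
    nodes = {node for edge in edges for node in edge}
    hub = {node: 1.0 for node in nodes}
    auth = {node: 1.0 for node in nodes}
    for _ in range(iterations):
        # Authority: sum of the hub scores of the users whose questions you answered.
        auth = {node: 0.0 for node in nodes}
        for asker, answerer in edges:
            auth[answerer] += hub[asker]
        auth = normalize(auth)
        # Hub: sum of the authority scores of the users who answered you.
        hub = {node: 0.0 for node in nodes}
        for asker, answerer in edges:
            hub[asker] += auth[answerer]
        hub = normalize(hub)
    return hub, auth

# Hypothetical graph: u3 answered questions from u1 and u2; u4 answered u1.
edges = [("u1", "u3"), ("u2", "u3"), ("u1", "u4")]

indegree = Counter(answerer for _, answerer in edges)
hub, auth = hits(edges)

by_indegree = sorted(indegree, key=indegree.get, reverse=True)
by_authority = sorted(auth, key=auth.get, reverse=True)
```

On this tiny graph both rankings put u3 first; the two methods can disagree on larger graphs, because HITS weights each answer by the hub score of the asker.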
21. CQA Retrieval methods
• Baseline
– score computed as the difference between up votes and down
votes
• GBrank
– does not include answer and question quality or user
reputation
• GBrank-HITS
– optimizes GBrank by adding user reputation calculated by
the HITS algorithm
• GBrank-Supervised
– uses supervised learning and optimizes GBrank by adding the
obtained quality
22. Precision at K for the top contributors
• Experiments
23. Precision at K
• Experiments
24. Accuracy
• Experiments
25. Training Labels
• Experiments
26. Training Labels
• Experiments
27. Mohammad-Ali Abbasi (Ali)
Ali is a Ph.D. student at the Data Mining
and Machine Learning Lab, Arizona
State University.
His research interests include Data
Mining, Machine Learning, Social
Computing, and Social Media Behavior
Analysis.
http://www.public.asu.edu/~mabbasi2/
Editor's notes
An answer is likely to be of high quality if the content is responsive and well-formed, the question has high quality, and the answerer is of high answer-reputation. At the same time, a user will have high answer-reputation if she posts high-quality answers, and high question-reputation if she tends to post high-quality questions. Finally, a question is likely to be of high quality if it is well stated, is posted by a user with high question-reputation, and attracts high-quality answers.
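The circular dependencies in this note can be illustrated with a toy fixed-point iteration. This is not the CQA-MR algorithm from the paper: the equal-weight averaging rule, the "content" scores, and the user, question, and answer names are all hypothetical, chosen only to show how quality and reputation estimates feed into each other.

```python
def mean(values):
    values = list(values)
    return sum(values) / len(values)

def by_user(items, quality):
    """Average quality of each user's items (a stand-in for reputation)."""
    grouped = {}
    for item_id, item in items.items():
        grouped.setdefault(item["user"], []).append(quality[item_id])
    return {user: mean(scores) for user, scores in grouped.items()}

def propagate(questions, answers, rounds=30):
    """Repeatedly mix content scores, reputations, and linked-item quality.

    Assumes every question has at least one answer.
    """
    q_quality = {q: info["content"] for q, info in questions.items()}
    a_quality = {a: info["content"] for a, info in answers.items()}
    for _ in range(rounds):
        question_rep = by_user(questions, q_quality)
        answer_rep = by_user(answers, a_quality)
        # Answer: own content, quality of its question, answerer's reputation.
        new_a = {
            a: mean([info["content"],
                     q_quality[info["question"]],
                     answer_rep[info["user"]]])
            for a, info in answers.items()
        }
        # Question: own content, asker's reputation, quality of its answers.
        new_q = {}
        for q, info in questions.items():
            received = [a_quality[a] for a, ans in answers.items()
                        if ans["question"] == q]
            new_q[q] = mean([info["content"],
                             question_rep[info["user"]],
                             mean(received)])
        q_quality, a_quality = new_q, new_a
    return (q_quality, a_quality,
            by_user(questions, q_quality), by_user(answers, a_quality))

# Hypothetical data; "content" is a made-up score in [0, 1] for how
# well-formed the text is.
questions = {
    "q1": {"user": "alice", "content": 0.8},
    "q2": {"user": "bob", "content": 0.3},
}
answers = {
    "a1": {"user": "bob", "question": "q1", "content": 0.9},
    "a2": {"user": "carol", "question": "q2", "content": 0.2},
    "a3": {"user": "carol", "question": "q1", "content": 0.4},
}
q_quality, a_quality, question_rep, answer_rep = propagate(questions, answers)
```

Because every update is an average of values in [0, 1], the scores stay in that range, and the well-formed question q1 and its strong answerer bob end up ranked above q2 and carol.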
Circular definition from user to content.
In previous work, question and answer quality were defined in terms of content, form, and style, as manually labeled by paid editors [2]. In contrast, our definitions focus on question effectiveness and answer accuracy, both quantities that can be measured automatically and do not necessarily require human judgments.
Proportional
User question-reputation and user answer-reputation
Question quality
Answer quality
Y_q(~a) denotes the quality of answer a's question
The initial set of queries is 3,000 factoid questions; from these, the 1,250 factoid questions that have at least one similar question in the Yahoo! Answers archive are selected.
GBrank-Supervised adds the obtained quality and reputation as extra features for learning the ranking function.