2. About Me
• 2000-2004, B.S. Math, Central South University
• 2004-2007, M.S. Computer Science, BUPT
• 2007-Present, Researcher, working on recommender systems and data mining
3. Agenda
• Social Tagging System and Its Features
• Tag Recommender
• Tag-based Recommender
4. Social Tagging
• A folksonomy is a system of classification derived from the practice and method of collaboratively creating and managing tags to annotate and categorize content; this practice is also known as collaborative tagging, social classification, social indexing, and social tagging. Folksonomy is a portmanteau of folk and taxonomy.
• Social tagging has boomed since 2004, with the wave of Web 2.0.
  - Delicious
  - CiteULike
  - BibSonomy
  - YouTube
  - Flickr
  - Dogear: an internal social bookmarking system at IBM
  - ...
5. Some Insights into Tagging Systems
• Shilad Sen et al., "tagging, communities, vocabulary, evolution", CSCW '06
  - Modeling vocabulary evolution
  - Tagging system features
  - Based on the MovieLens recommender system
  - Personal tendency and community influence
  - Tag displaying strategies and their effects
  - Tag utility
7. Tagging System Features
• Design Features
  - Tag Sharing
  - Tag Selection
  - Item Ownership
  - Tag Scope
    - Broad
    - Narrow
• Tag Class
  - Factual Tag
  - Subjective Tag
  - Personal Tag
9. Personal Tendency
• How strongly do investment and habit affect personal tagging behavior?
  1. Habit and investment influence a user's tag applications.
  2. The influence of habit and investment grows stronger as users apply more tags.
  3. Habit and investment cannot be the only factors that contribute to vocabulary evolution.
10. Community Influence
• How does the tagging community influence personal vocabulary?
  1. Community influence affects a user's personal vocabulary.
  2. Community influence on a user's first tag is stronger for users who have seen more tags.
13. Tag Recommender
• Purpose
  - Encourage users to tag more frequently, apply more tags to an individual resource, and reuse common tags.
  - Prompt users to use tags they had not previously considered.
  - Eliminate redundant tags.
  - Promote a core tag vocabulary, steering users toward adopting certain tags without imposing strict rules.
  - Avoid ambiguous tags in favor of tags that offer greater information value.
14. Tag Recommender - Technologies
• Naive Methods
  - Most Popular Tags on Resources
  - Most Popular Tags on Users
  - Most Popular Tags on Resources and Users
• Classical Collaborative Filtering
  - User-KNN
  - Item-KNN
• Adapted KNN Methods
  - Extend User-Item Matrix
  - Degrade User-Item-Tag Relationship
• Content-based Method
• Tensor Method
  - Tensor Factorization
• Graph Based
  - FolkRank
• Our Work
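The naive "most popular tags" baselines listed above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `most_popular_tags`, the `(user, item, tag)` triple layout, and the sample `POSTS` data are hypothetical, and the combined variant simply adds the per-resource and per-user counts with equal weight, which the slides do not specify.

```python
from collections import Counter

# Hypothetical sample data: (user, item, tag) assignments.
POSTS = [
    ("alice", "p1", "python"),
    ("bob", "p1", "python"),
    ("bob", "p1", "code"),
    ("alice", "p2", "web"),
    ("alice", "p3", "web"),
]

def most_popular_tags(posts, user=None, item=None, top_n=3):
    """Rank tags by how often they were applied to `item` and/or by `user`.

    Assumption: when both are given, the two counts are added with equal weight.
    """
    counts = Counter()
    for u, i, t in posts:
        if item is not None and i == item:
            counts[t] += 1  # popularity of the tag on this resource
        if user is not None and u == user:
            counts[t] += 1  # popularity of the tag for this user
    # Sort by descending count, breaking ties alphabetically for determinism.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [tag for tag, _ in ranked[:top_n]]

print(most_popular_tags(POSTS, item="p1"))
print(most_popular_tags(POSTS, user="alice"))
print(most_popular_tags(POSTS, user="alice", item="p1"))
```

Passing only `item` gives the resource baseline, only `user` gives the user baseline, and both gives the mixed baseline.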
16. Adapted KNN - Degrade User-Item-Tag Relationship
• Process
  - TF/IDF on UI, UT, IT
  - P-Core Processing
  - Remove noise data
  - Extract User Model by Hebbian Deflation
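The p-core step in the process above can be sketched as follows. This is a minimal illustration, assuming the common definition in which every remaining user, item, and tag must occur in at least p posts, where a post is one (user, item) pair; the function name `p_core` and the triple layout are hypothetical and not taken from the slides.

```python
from collections import defaultdict

def p_core(triples, p):
    """Iteratively drop (user, item, tag) triples until every user, item
    and tag that remains occurs in at least `p` distinct posts."""
    triples = set(triples)
    while True:
        user_posts = defaultdict(set)
        item_posts = defaultdict(set)
        tag_posts = defaultdict(set)
        for u, i, t in triples:
            post = (u, i)  # assumption: a post is one user tagging one item
            user_posts[u].add(post)
            item_posts[i].add(post)
            tag_posts[t].add(post)
        kept = {
            (u, i, t)
            for u, i, t in triples
            if len(user_posts[u]) >= p
            and len(item_posts[i]) >= p
            and len(tag_posts[t]) >= p
        }
        if kept == triples:  # fixed point reached: nothing more to remove
            return kept
        triples = kept

# Hypothetical sample data: a dense core plus two sparse triples.
DATA = [
    ("a", "x", "t"), ("a", "y", "t"), ("b", "x", "t"), ("b", "y", "t"),
    ("c", "x", "rare"),   # user c has only one post
    ("a", "x", "solo"),   # tag "solo" occurs in only one post
]
print(sorted(p_core(DATA, 2)))
```

The loop repeats because removing one sparse triple can push another user, item, or tag below the threshold.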
18. FolkRank
• PageRank

$$PR(p_i) = \frac{1-d}{N} + d \sum_{p_j \in M(p_i)} \frac{PR(p_j)}{L(p_j)} \tag{1}$$

• Personalized PageRank

$$PR(p_i) = (1-d)\,p[i] + d \sum_{p_j \in M(p_i)} \frac{PR(p_j)}{L(p_j)} \tag{2}$$

• FolkRank
  1. Compute global PageRank by (1)
  2. Then for each <user, item> pair, compute personalized PageRank by (2)
     - p[i] = 1, but p[u] = 1 + |U| and p[r] = 1 + |R|.
  3. FolkRank = Personalized PageRank - PageRank
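The three FolkRank steps above can be sketched in plain Python. This is a rough illustration rather than a reference implementation, and it makes several assumptions the slide does not state: the graph is an undirected, weighted co-occurrence graph over user, item, and tag nodes; the preference vector is normalized to sum to 1; the damping factor is d = 0.7; and a fixed 50 iterations are run. The names `build_graph`, `pagerank`, `folkrank_tags`, and the sample `POSTS` are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample data: (user, item, tag) assignments.
POSTS = [
    ("alice", "p1", "python"),
    ("bob", "p1", "python"),
    ("alice", "p2", "code"),
    ("carol", "p3", "cooking"),
    ("dave", "p3", "cooking"),
]

def build_graph(posts):
    """Undirected weighted graph; each triple links its user, item and tag."""
    weights = defaultdict(lambda: defaultdict(float))
    for user, item, tag in posts:
        u, r, t = ("user", user), ("item", item), ("tag", tag)
        for a, b in ((u, r), (u, t), (r, t)):
            weights[a][b] += 1.0
            weights[b][a] += 1.0
    return weights

def pagerank(weights, preference, d=0.7, iterations=50):
    """Iterate PR(i) = (1 - d) * p[i] + d * sum_j PR(j) * w(j, i) / L(j)."""
    total = sum(preference[n] for n in weights)
    pref = {n: preference[n] / total for n in weights}  # assumption: normalized
    out = {n: sum(nbrs.values()) for n, nbrs in weights.items()}
    rank = dict(pref)
    for _ in range(iterations):
        # The graph is symmetric, so weights[n][m] equals w(m, n).
        rank = {
            n: (1 - d) * pref[n]
            + d * sum(rank[m] * w / out[m] for m, w in weights[n].items())
            for n in weights
        }
    return rank

def folkrank_tags(posts, user, item, d=0.7, top_n=3):
    weights = build_graph(posts)
    n_users = len({u for u, _, _ in posts})
    n_items = len({i for _, i, _ in posts})
    uniform = {n: 1.0 for n in weights}         # step 1: global PageRank
    boosted = dict(uniform)                      # step 2: p[u], p[r] boosted
    boosted[("user", user)] = 1.0 + n_users
    boosted[("item", item)] = 1.0 + n_items
    base = pagerank(weights, uniform, d)
    personal = pagerank(weights, boosted, d)
    # Step 3: FolkRank = personalized PageRank - PageRank, kept for tags only.
    scores = {n[1]: personal[n] - base[n] for n in weights if n[0] == "tag"}
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_n]

print(folkrank_tags(POSTS, "alice", "p1"))
```

Subtracting the global ranking removes tags that are popular everywhere, so the remaining score reflects how much the chosen user and item pull a tag upward.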
19. Our Work
• Explored and Exploring Methods
  - Non-classical Tensor Fusion Factorization
  - Multi-label Classification by Random Decision Trees (high speed)
  - Both methods perform close to FolkRank
• Current Progress
  - Shiwan developed a simple graph model
  - Best precision and recall on several datasets compared to other methods
  - We are writing a paper targeting ACM RecSys 2010
20. Tag-based Recommender
• Our Work
  - IUI 2008 paper, "Improved Recommendation based on Collaborative Tagging Behaviors"
  - Explored Methods
    - Tensor Factorization
    - Non-classical Tensor and Matrix Fusion Factorization
• Other Works
  - Shilad Sen, Jesse Vig, and John Riedl, "Tagommenders: Connecting Users to Items through Tags", WWW 2009
21. IUI 2008 Paper Overview
• We propose a new collaborative filtering approach, TBCF (Tag-based Collaborative Filtering), which uses the semantic distance among tags assigned by different users to improve the effectiveness of neighbor selection.
• That is, two users can be considered similar not only if they rate items similarly, but also if they perceive those items similarly.
• Example
  - Both Bob and Tom may rate the movie Avatar with 5 stars, which indicates that they both like this movie very much.
  - Nevertheless, as a 3D fan, Bob appreciates this movie for its high-quality 3D animation, while Tom may think that it is a wonderful action movie.
22. Tag-based Collaborative Filtering
Tag-based User-Item Matrix
        Item1                Item2           Item3            Item4
Alice   Art, photo           Home, Products  Writing, Design  Learning, Education
Daniel  Photo, Album, Image  -               Typewriter       Tutorial, Training
Sherry  -                    Cleaning        -                Language, Study
Maggie  Photography          -               Ovens            -
Steps
1. Calculate the semantic similarity of tags based on WordNet (for the tags not
included in WordNet, calculate the edit-distance instead)
2. Calculate the similarity between tag sets
3. Calculate the similarity between user u and v by summing up the similarity of tag
sets on common pages (tagged by both u & v)
4. Find the top-N nearest neighbors of the active user to make the prediction
5. Return the top-M predicted items to the active user
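The five steps above can be sketched end to end using the tags from the example matrix. This is a hedged illustration, not the paper's implementation: WordNet is replaced by a string-similarity ratio from `difflib` (a stand-in for the edit-distance fallback in step 1), tag-set similarity is a symmetric best-match average rather than the Hungarian method, and the prediction score in step 4 is assumed to be the sum of the similarities of the neighbors who tagged an item. All function names are hypothetical.

```python
from difflib import SequenceMatcher

# Tags from the example matrix: user -> item -> tags (items without tags omitted).
TAGS = {
    "Alice": {"Item1": ["Art", "photo"], "Item2": ["Home", "Products"],
              "Item3": ["Writing", "Design"], "Item4": ["Learning", "Education"]},
    "Daniel": {"Item1": ["Photo", "Album", "Image"], "Item3": ["Typewriter"],
               "Item4": ["Tutorial", "Training"]},
    "Sherry": {"Item2": ["Cleaning"], "Item4": ["Language", "Study"]},
    "Maggie": {"Item1": ["Photography"], "Item3": ["Ovens"]},
}

def tag_similarity(x, y):
    """Step 1 stand-in: string similarity in [0, 1] instead of WordNet."""
    return SequenceMatcher(None, x.lower(), y.lower()).ratio()

def tag_set_similarity(a, b):
    """Step 2 (assumption): symmetric average of each tag's best match."""
    if not a or not b:
        return 0.0
    a_to_b = sum(max(tag_similarity(x, y) for y in b) for x in a) / len(a)
    b_to_a = sum(max(tag_similarity(x, y) for x in a) for y in b) / len(b)
    return (a_to_b + b_to_a) / 2

def user_similarity(u, v, tags=TAGS):
    """Step 3: sum tag-set similarity over items tagged by both users."""
    common = set(tags[u]) & set(tags[v])
    return sum(tag_set_similarity(tags[u][i], tags[v][i]) for i in common)

def recommend(active, tags=TAGS, top_n=2, top_m=2):
    """Steps 4 and 5: pick top-N neighbors, return top-M unseen items."""
    sims = {v: user_similarity(active, v, tags) for v in tags if v != active}
    neighbors = sorted((v for v in sims if sims[v] > 0),
                       key=lambda v: (-sims[v], v))[:top_n]
    scores = {}
    for v in neighbors:
        for item in tags[v]:
            if item not in tags[active]:
                scores[item] = scores.get(item, 0.0) + sims[v]
    return sorted(scores, key=lambda i: (-scores[i], i))[:top_m]

print(recommend("Maggie"))
```

Because `tag_similarity` is passed nothing but two strings, it could be swapped for a WordNet-based measure without touching the remaining steps.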
23. Tag Similarity Calculation
• Tag similarity
  - WordNet
  - LSA/PLSA
• Tag set similarity
  - Hungarian method
[Figure: WordNet concept tree]
Word similarity in WordNet: if x and y are both contained in WordNet, dis(x,y) is the shortest path length between x and y.
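To make dis(x,y) and the tag-set matching concrete, here is a small sketch on a made-up concept tree. Everything in it is an assumption for illustration: the `PARENT` tree is not real WordNet data, the conversion 1 / (1 + dis) from distance to similarity is one plausible choice that the slide does not specify, and the one-to-one matching is found by brute force over permutations, which gives the same optimal assignment the Hungarian method would but only scales to small tag sets.

```python
from collections import deque
from itertools import permutations

# Hypothetical miniature concept tree (child -> parent); not real WordNet data.
PARENT = {
    "image": "artifact", "text": "artifact",
    "photo": "image", "album": "image", "painting": "image",
    "tutorial": "text", "essay": "text",
}

def neighbors(node):
    """Parent and children of a node in the tree."""
    linked = {child for child, parent in PARENT.items() if parent == node}
    if node in PARENT:
        linked.add(PARENT[node])
    return linked

def dis(x, y):
    """Shortest path length between x and y, found by breadth-first search."""
    queue, seen = deque([(x, 0)]), {x}
    while queue:
        node, depth = queue.popleft()
        if node == y:
            return depth
        for nxt in neighbors(node) - seen:
            seen.add(nxt)
            queue.append((nxt, depth + 1))
    return None  # no path: at least one word is outside the tree

def tag_similarity(x, y):
    d = dis(x, y)
    return 0.0 if d is None else 1.0 / (1.0 + d)  # assumed conversion

def tag_set_similarity(a, b):
    """Best one-to-one matching between two tag sets (brute force)."""
    small, large = sorted((list(a), list(b)), key=len)
    if not small:
        return 0.0
    best = max(
        sum(tag_similarity(s, l) for s, l in zip(small, perm))
        for perm in permutations(large, len(small))
    )
    return best / len(large)  # unmatched tags in the larger set lower the score

print(dis("photo", "album"), dis("photo", "tutorial"))
print(tag_set_similarity(["photo", "tutorial"], ["album", "essay"]))
```

For larger tag sets, an assignment solver would be the natural replacement for the permutation loop.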
24. Experimental Evaluation
Data Set
A total of 8000 users, 5315 pages, and 7670 tags were extracted from web logs.

Algorithm  Average Precision  Average Ranking
TBCF       0.27               2.8
cosine     0.13               1.5

Randomly generated subset  Average Precision (TBCF)  Average Precision (cosine)
500                        0.208                     0.121
2000                       0.182                     0.118
4000                       0.202                     0.173
6000                       0.209                     0.180