SlideShare a Scribd company logo
1 of 89
Download to read offline
ContentContent--based Recommender Systemsbased Recommender Systems
problems, challengesproblems, challenges
and research directionsand research directions
Giovanni Semeraro & the SWAP group
http://www.di.uniba.it/~swap/
semeraro@di.uniba.it
Department of Computer Science
University of Bari “Aldo Moro”
UMAP 2010 – 8° Workshop on
INTELLIGENT TECHNIQUES FOR WEB PERSONALIZATION
& RECOMMENDER SYSTEMS (ITWP 2010)
BIG ISLAND OF HAWAII, JUNE 20 2010
Semantic
Web 
Access and 
Personalization 
research group
http://www.di.uniba.it/~swap
2/89
OutlineOutline
Content-based Recommender Systems (CBRS)
Basics
Advantages & Drawbacks
Drawback 1: Limited content analysis
Beyond keywords: Semantics into CBRS
Taking advantage of Web 2.0: Folksonomy-based
CBRS
Drawback 2: Overspecialization
Strategies for diversification of recommendations
3/89
ContentContent--based Recommender Systems (CBRS)based Recommender Systems (CBRS)
Recommend an item to a user
based upon a description of the item and
a profile of the user’s interests
Implement strategies for:
representing items
creating a user profile that describes the types of
items the user likes/dislikes
comparing the user profile to some reference
characteristics (with the aim to predict whether the
user is interested in an unseen item)
[Pazzani07] Pazzani, M. J., & Billsus, D. Content-Based Recommendation Systems. The Adaptive Web. Lecture Notes in
Computer Science vol. 4321, 325-341, 2007.
4/89
ContentContent--basedbased FilteringFiltering
User Profile User profile compared against items
for relevance computation
Information
Source
Target User
Items recommended to the user
5/89
ContentContent--based Filteringbased Filtering
Each user is assumed to operate independently
Items are represented by some features
Movies: actors, director, plot, …
The profile is often created and updated automatically in
response to feedback on the desirability of items that have been
presented to the user
Machine Learning for automated inference
Relevance judgment on items, e.g. ratings
Training on rated items user profile
Filtering based on the comparison between the content (features)
of the items and the user preferences as defined in the user
profile
Keyword-based representation for content and profiles
string matching or text similarity
6/89
General Architecture of CBRSGeneral Architecture of CBRS
CONTENTCONTENT
ANALYZERANALYZER
PROFILEPROFILE
LEARNERLEARNER
FILTERINGFILTERING
COMPONENTCOMPONENT
Information
Source
Represented
Items
Feedback
PROFILES
Structured
Item
Representation
Active user ua
Item
Descriptions
User ua
feedback
User ua
training
examples
User ua
feedback
List of
recommendations
User ua
Profile
New
Items
User ua
Profile
7/89
Advantages of CBRSAdvantages of CBRS
USER INDEPENDENCE
CBRS exploit solely ratings provided by the active user to build
her own profile
No need for data on other users
TRANSPARENCY
CBRS can provide explanations for recommended items by
listing content-features that caused an item to be
recommended
NEW ITEM (Item not yet rated by any user)
CBRS are capable of recommending new and unknown items
No first-rater problem
8/89
Drawbacks of CBRS: LIMITED CONTENTDrawbacks of CBRS: LIMITED CONTENT
ANALYSISANALYSIS
No suitable suggestions if the analyzed content does not contain enough
information to discriminate items the user likes from items the user does
not like
Content must be encoded as meaningful features
automatic/manually assignment of features to items might be
insufficient to define distinguishing aspects of items necessary for
the elicitation of user interests
keywords not appropriate for representing content, due to polysemy,
synonymy, multi-word concepts (homography, homophony,...) –
“Sator arepo eccetera” [Eco07]
SATOR
AREPO
TENET
OPERA
ROTAS
R
E
T
S
O
N
R
E
T
A
P
R
E
T
S
O
N
R
E
T
A
P
E R N O S T RETAP E R N O S T RETAP
AA
OOOO
AA
9/89
AI is a branch of
computer science
doc1
the 2011
International Joint
Conference on
Artificial
Intelligence will be
held in Spain
doc2
apple launches a
new product…
doc3
artificial 0.02
intelligence 0.01
apple 0.13
AI 0.15
…
USER PROFILE
MULTI-WORD CONCEPTS
KeywordKeyword--based Profilesbased Profiles
10/89
AI is a branch of
computer science
doc1
the 2011
International Joint
Conference on
Artificial
Intelligence will be
held in Spain
doc2
apple launches a
new product…
doc3
artificial 0.02
intelligence 0.01
apple 0.13
AI 0.15
…
USER PROFILE
SYNONYMY
KeywordKeyword--based Profilesbased Profiles
11/89
AI is a branch of
computer science
doc1
the 2011
International Joint
Conference on
Artificial
Intelligence will be
held in Spain
doc2
apple launches a
new product…
doc3
artificial 0.02
intelligence 0.01
apple 0.13
AI 0.15
…
USER PROFILE
POLYSEMY
KeywordKeyword--based Profilesbased Profiles
NLP methods are needed for the elicitation
of user interests
12/89
Drawbacks of CBRS: OVERSPECIALIZATIONDrawbacks of CBRS: OVERSPECIALIZATION
CBRS suggest items whose scores are high when matched
against the user profile
the user is going to be recommended items similar to those
already rated
No inherent method for finding something unexpected
Obviousness in recommendations
suggesting “STAR TREK” to a science-fiction fan:
accurate but not useful
users don’t want algorithms that produce better ratings, but
sensible recommendations
The Serendipity Problem
[McNee06] S.M. McNee, J. Riedl, and J. Konstan. Accurate is not always good: How accuracy metrics have hurt
recommender systems. In Extended Abstracts of the 2006 ACM Conference on Human Factors in Computing Systems,
pages 1-5, Canada, 2006.
13/89
The serendipity problem: mind cagesThe serendipity problem: mind cages
Homophily: the tendency to surround ourselves by
like-minded people
opinions taken to extremes cultural impoverishment
threat for biodiversity?
14/89
The homophily trapThe homophily trap
Does homophily hurt RS?
try to tell Amazon that you liked the movie
“War Games”…
[Zuckerman08] E. Zuckerman. Homophily, serendipity, xenophilia. April 25, 2008.
www.ethanzuckerman.com/blog/2008/04/25/homophily-serendipity-xenophilia/
15/89
The homophily trapThe homophily trap
Recommendations by other (ageing?)
COMPUTER GEEKS!
16/89
““ItemItem--toto--Item”Item” homophily…homophily…
Harry Potter for everHarry Potter for ever??
17/89
Novelty vs SerendipityNovelty vs Serendipity
Novelty: A novel recommendation helps the user find a
surprisingly interesting item she might have
autonomously discovered
Serendipity: A serendipitous recommendation helps
the user find a surprisingly interesting item she
might not have otherwise discovered
How to introduce serendipity in (CB)RS?
[Herlocker04] Herlocker, J.L., Konstan, J.A., Terveen, L.G., and Riedl, J.T. Evaluating Collaborative Filtering
Recommender Systems. ACM Transactions on Information Systems, 22(1): 39-49, 2004.
18/89
““Computational” serendipity? A motivatingComputational” serendipity? A motivating
exampleexample
for Star Trek fans: Did you try
“Star Trek – The experience”
in Las Vegas?
19/89
Putting Intelligence into CBRS:Putting Intelligence into CBRS:
Challenges & Research DirectionsChallenges & Research Directions
Semantic analysisSemantic analysis ofof
content by means ofcontent by means of
external knowledgeexternal knowledge
sourcessources
LanguageLanguage--independentindependent
CBRSCBRS
Beyond keywords:Beyond keywords:
novel strategies for thenovel strategies for the
representation ofrepresentation of
items and profilesitems and profilesLimited ContentLimited Content
AnalysisAnalysis
Defeating homophily:Defeating homophily:
recommendationrecommendation
diversificationdiversification
Taking advantage ofTaking advantage of
Web 2.0 for collectingWeb 2.0 for collecting
User Generated ContentUser Generated Content
CHALLENGESCHALLENGES
OverspecializationOverspecialization
PROBLEMSPROBLEMS
““computational”computational”
serendipityserendipity
programming forprogramming for
serendipityserendipity
Knowledge InfusionKnowledge Infusion
FolksonomyFolksonomy--based CBRSbased CBRS
RESEARCHRESEARCH
DIRECTIONSDIRECTIONS
20/89
Putting Intelligence into CBRS:Putting Intelligence into CBRS:
Challenges & Research DirectionsChallenges & Research Directions
Semantic analysis ofSemantic analysis of
content by means ofcontent by means of
external knowledgeexternal knowledge
sourcessources
LanguageLanguage--independentindependent
CBRSCBRS
Beyond keywords:Beyond keywords:
novel strategies for thenovel strategies for the
representation ofrepresentation of
items and profilesitems and profilesLimited ContentLimited Content
AnalysisAnalysis
Defeating homophily:Defeating homophily:
recommendationrecommendation
diversificationdiversification
Taking advantage ofTaking advantage of
Web 2.0 for collectingWeb 2.0 for collecting
User Generated ContentUser Generated Content
CHALLENGESCHALLENGES
OverspecializationOverspecialization
PROBLEMSPROBLEMS
““computational”computational”
serendipityserendipity
programming forprogramming for
serendipityserendipity
Knowledge InfusionKnowledge Infusion
FolksonomyFolksonomy--based CBRSbased CBRS
RESEARCHRESEARCH
DIRECTIONSDIRECTIONS
21/89
Semantic Analysis =
1. Semantics: concept identification in text-based representations
through advanced NLP techniques “beyond keywords”
+
2. Personalization: representation of user information needs in
an effective way “deep (high-accuracy) user profiles”
Semantic Analysis: beyond keywordsSemantic Analysis: beyond keywords
22/89
Apple Computer iPhone
Beyond keywordsBeyond keywords: Word Sense Disambiguation: Word Sense Disambiguation
(WSD)(WSD) -- from words to meaningsfrom words to meanings
WSD selects the proper meaning (sense) for a word in a
text by taking into account the context in which that
word occurs
#12567: computer brand #22999: fruit
Dictionaries, Ontologies, e.g. WordNetSense RepositorySense Repository
Apple
context
[Basile07] P. Basile, M. Degemmis, A. Gentile, P. Lops, and G. Semeraro. UNIBA: JIGSAW algorithm for Word Sense
Disambiguation. In Proceedings of the 4th ACL 2007 International Workshop on Semantic Evaluations (SemEval-2007),
Prague, Czech Republic, pages 398–401, Association for Computational Linguistics, June 23-24, 2007.
23/89
AI is a branch of
computer science
doc1
the 2011
International Joint
Conference on
Artificial
Intelligence will be
held in Spain
doc2
apple launches a
new product…
doc3
artificial 0.02
intelligence 0.01
apple 0.13
AI 0.15
…
USER PROFILE
MULTI-WORD CONCEPTS
ITR (ITem Recommender)ITR (ITem Recommender)
SenseSense--based Profilesbased Profiles
#12387 0.03
24/89
AI is a branch of
computer science
doc1
the 2011
International Joint
Conference on
Artificial
Intelligence will be
held in Spain
doc2
apple launches a
new product…
doc3
#12387 0.03
apple 0.13
AI 0.15
…
USER PROFILE
SYNONYMY
ITR (ITem Recommender)ITR (ITem Recommender)
SenseSense--based Profilesbased Profiles
#12387 0.15
0.18
25/89
AI is a branch of
computer science
doc1
the 2011
International Joint
Conference on
Artificial
Intelligence will be
held in Spain
doc2
apple launches a
new product…
doc3
#12387 0.18
apple 0.13
…
USER PROFILE
POLYSEMY
ITR (ITem Recommender)ITR (ITem Recommender)
SenseSense--based Profilesbased Profiles
#12567
SEMANTIC USER PROFILE
sense identifiers rather than
keywords
[Degemmis07] M. Degemmis, P. Lops, and G. Semeraro. A Content-collaborative Recommender that Exploits WordNet-
based User Profiles for Neighborhood Formation. User Modeling and User-Adapted Interaction: The Journal of
Personalization Research (UMUAI), 17(3):217–255, Springer Science + Business Media B.V., 2007.
[Semeraro07] G. Semeraro, M. Degemmis, P. Lops, and P. Basile. Combining Learning and Word Sense Disambiguation
for Intelligent User Profiling. In M. M. Veloso, editor, IJCAI 2007, Proceedings of the 20th International Joint
Conference on Artificial Intelligence, Hyderabad, India, January 6-12, 2007 , pages 2856–2861. Morgan Kaufmann,
2007.
26/89
Advantages of SenseAdvantages of Sense--based Representationsbased Representations
Semantic matching between items and profiles
computing semantic relatedness [Pedersen04] rather than string
matching (e.g., by using similarity measures between WordNet
synsets)
Senses are inherently multilingual
Concepts remain the same across different languages, while terms
used for describing them in each specific language change
Improving transparency
matched concepts can be used to justify suggestions
Collaborative Filtering could benefit too
finding better neighbors: similar users discovered by looking at
profile overlap even if they did not rate the same items
semantic profiles succeed where Pearson’s correlation coefficient fail
[Pedersen04] Pedersen, Ted and Patwardhan, Siddharth, and Michelizzi, Jason. WordNet::Similarity - Measuring the
Relatedness of Concepts. In Proceedings of the Nineteenth National Conference on Artificial Intelligence (AAAI-
2004), pp. 1024-1025, San Jose, CA, July, 2004.
27/89
SenseSense--based profiles in a hybrid CBbased profiles in a hybrid CB--CFCF
recommenderrecommender
Sense-based profiles obtained by applying WSD on
textual description of items
WordNet as sense repository
Synset-based user profiles
Hybrid CB-CF RS
[Degemmis07] M. Degemmis, P. Lops, and G. Semeraro. A Content-collaborative Recommender that Exploits WordNet-
based User Profiles for Neighborhood Formation. User Modeling and User-Adapted Interaction: The Journal of
Personalization Research (UMUAI), 17(3):217–255, Springer Science + Business Media B.V., 2007.
28/89
Clustering of sense-based profiles
User profiles
Active user
Active user
Clusters of profiles
Profiles in the cluster
used as neighbors
SenseSense--based profiles in a hybrid CBbased profiles in a hybrid CB--CFCF
recommenderrecommender
29/89
Experimental Evaluation on EachMovieExperimental Evaluation on EachMovie
datasetdataset
835 users selected from EachMovie dataset*
1,613 movies grouped into 10 categories,
180,356 ratings, user-item matrix 87% sparse
Each user rated between 30 and 100 movies
Discrete ratings between 0 and 5
Movie content crawled from the Internet Movie
Database (IMDb)
CF algorithm using Pearson’s correlation coefficient
vs. CF algorithm integrating clusters of semantic
user profiles
*2,811,983 ratings entered by 72,916 users for 1628 different movies. As of October, 2004, HP/Compaq Research
(formerly DEC Research) retired the EachMovie dataset. It is no longer available for download
30/89
SenseSense--based profiles improvebased profiles improve
recommendationsrecommendations
Rating scale: 0-5
31/89
Semantic Analysis: Ontologies in CBRSSemantic Analysis: Ontologies in CBRS
Recommendation of onRecommendation of on--line academicline academic
research papersresearch papers
Research paper topic ontology based on theResearch paper topic ontology based on the
computer science classification of the DMOZcomputer science classification of the DMOZ
open directory projectopen directory project
KK--NN classification used to associate classesNN classification used to associate classes
to previously browsed papersto previously browsed papers
Quickstep & FoxtrotQuickstep & Foxtrot
[Middleton04][Middleton04]
SEWePSEWeP (Semantic Enhancement(Semantic Enhancement
for Web Personalization)for Web Personalization)
[Eirinaki03][Eirinaki03]
Manually built domainManually built domain--specific taxonomy ofspecific taxonomy of
categories for the automated annotation ofcategories for the automated annotation of
Web pagesWeb pages
WordNetWordNet--based word similarity used to mapbased word similarity used to map
keywords to categorieskeywords to categories
Categories of interest discovered fromCategories of interest discovered from
navigational history of the usernavigational history of the user
DESCRIPTIONDESCRIPTIONSYSTEMSYSTEM
[Lops10] P. Lops, M. de Gemmis, G. Semeraro. Content-based Recommender Systems: State of the Art and Trends.
In: P. Kantor, F. Ricci, L. Rokach and B. Shapira (Eds.), Recommender Systems Handbook: A Complete Guide
for Research Scientists & Practitioners, Chapter 3, pages 73-105, BERLIN: Springer, 2010.
32/89
Semantic Analysis: Ontologies in CBRSSemantic Analysis: Ontologies in CBRS
OWL ontology for representing TV programs andOWL ontology for representing TV programs and
user profilesuser profiles
OWL representation allows reasoning on preferencesOWL representation allows reasoning on preferences
and discovering new knowledgeand discovering new knowledge
Spreading activation for matching items andSpreading activation for matching items and
preferencespreferences
RS for Interactive Digital TelevisionRS for Interactive Digital Television
[Blanco[Blanco--Fernandez08]Fernandez08]
OntologyOntology--based news recommenderbased news recommender
17 ontologies adapted from the IPTC ontology17 ontologies adapted from the IPTC ontology
((http://http://nets.ii.uam.es/neptuno/iptcnets.ii.uam.es/neptuno/iptc/)/)
Items and user profiles represented as vectors in theItems and user profiles represented as vectors in the
space of concepts defined by the ontologiesspace of concepts defined by the ontologies
News@handNews@hand
[Cantador08][Cantador08]
Informed RecommenderInformed Recommender
[Aciar07][Aciar07]
Consumer product reviews to makeConsumer product reviews to make
recommendationsrecommendations
Ontology used to convert consumers’ opinions into aOntology used to convert consumers’ opinions into a
structured formstructured form
TextText--mining for mapping sentences in the reviewsmining for mapping sentences in the reviews
into the ontology information structureinto the ontology information structure
SearchSearch--based recommendationsbased recommendations
DESCRIPTIONDESCRIPTIONSYSTEMSYSTEM
33/89
Semantic Analysis: WikipediaSemantic Analysis: Wikipedia
Do we really need only ontologies?
What about encyclopedic knowledge sources
available on the Web?
Is Wikipedia potentially useful for CBRS? How?
It is free
It covers many domains
It is under constant development by the
community
It can be seen as a multilingual corpus
Its accuracy rivals that of Encyclopaedia Britannica
[Giles05]
[Giles05] J. Giles. Internet Encyclopaedias Go Head to Head. Nature, 438:900–901, 2005.
34/89
EExplicitxplicit SSemanticemantic AAnalysis (ESA)nalysis (ESA)
Technique able to provide a fine-grained semantic representation
of natural language texts in a high-dimensional space of
comprehensible concepts derived from Wikipedia [Gabri06]
[Gabri06] E. Gabrilovich and S. Markovitch. Overcoming the Brittleness Bottleneck using Wikipedia: Enhancing Text
Categorization with Encyclopedic Knowledge. In Proceedings of the 21th National Conf. on Artificial Intelligence and
the 18th Innovative Applications of Artificial Intelligence Conference, pages 1301–1306. AAAI Press, 2006.
Panthera
World 
War II
World 
War II
Jane 
Fonda
Island
Wikipedia viewed as an ontology = 
a collection of ~1M concepts
[Egozi09] O. Egozi. Concept-Based Information Retrieval
using Explicit Semantic Analysis. M.Sc. Thesis, CS Dept.,
Technion, 2009.
35/89
Wikipedia is viewed as an ontology ‐ a collection of ~1M concepts
Every Wikipedia article represents a concept
Panthera
Explicit Semantic Analysis (ESA)Explicit Semantic Analysis (ESA)
Article words are associated with the concept (TF‐IDF)
Cat [0.92]
Leopard [0.84]
Roar [0.77]
36/89
Wikipedia is viewed as an ontology ‐ a collection of ~1M concepts
Every Wikipedia article represents a concept
Panthera
Explicit Semantic Analysis (ESA)Explicit Semantic Analysis (ESA)
Article words are associated with the concept (TF‐IDF)
Cat [0.92]
Leopard [0.84]
Roar [0.77]
37/89
Wikipedia is viewed as an ontology ‐ a collection of ~1M concepts
Every Wikipedia article represents a concept
Panthera
Explicit Semantic Analysis (ESA)Explicit Semantic Analysis (ESA)
Article words are associated with the concept (TF‐IDF)
Cat [0.92]
Leopard [0.84]
Roar [0.77]
The semantics of a word is the vector of 
its associations with Wikipedia concepts
Cat Panthera
[0.92]
Cat
[0.95]
Jane 
Fonda
[0.07]
38/89
Explicit Semantic Analysis (ESA)Explicit Semantic Analysis (ESA)
The semantics of a text fragment is the average
vector (centroid) of the semantics of its words
button
Dick 
Button
[0.84]
Button
[0.93]
Game 
Controller
[0.32]
Mouse 
(computing)
[0.81]
mouse
Mouse 
(computing)
[0.84]
Mouse 
(rodent)
[0.91]
John 
Steinbeck
[0.17]
Mickey 
Mouse 
[0.81]
mouse  button
Drag‐
and‐drop
[0.91]
Mouse 
(computing)
[0.95]
Mouse 
(rodent)
[0.56]
Game 
Controller
[0.64]
In practice – WSD…
mouse  button
39/89
ESA: concept spaceESA: concept space
D1 = 2C1 + 3C2 + 5C3
D2 = 3C1 + 7C2 + 1C3
ESA used for computing semantic relatedness [Gabri07]
C3
C1
C2
D1 = 2C1+ 3C2 + 5C3
D2 = 3C1 + 7C2 + 1C3
7
32
5
3
1
Ci = Wikipedia article
[Gabri07] E. Gabrilovich and S. Markovitch. Computing Semantic Relatedness Using Wikipedia-based Explicit
Semantic Analysis. In Manuela M. Veloso, editor, Proceedings of the 20th International Joint Conference on
Artificial Intelligence, pages 1606–1611, 2007.
40/89
Wikipedia and CBRS: recent ideasWikipedia and CBRS: recent ideas
Wikipedia used for computing the similarity
between movie descriptions for the Netflix prize
competition [Lees08]
ESA used for user profiling, spam detection and RSS
filtering [Smirnov08]
Wikipedia included in a Knowledge Infusion process
for recommendation diversification [Semeraro09a]
[Lees08] J. Lees-Miller, F. Anderson, B. Hoehn, and R. Greiner. Does Wikipedia Information Help Netflix Predictions?
Proceedings of the Seventh International Conference on Machine Learning and Applications (ICMLA), pages 337–343.
IEEE Computer Society, 2008.
[Smirnov08] A. V. Smirnov and A. Krizhanovsky. Information Filtering based on Wiki Index Database. CoRR,
abs/0804.2354, 2008.
[Semeraro09a] G. Semeraro, P. Lops, P. Basile, and M. de Gemmis. Knowledge Infusion into Content-based
Recommender Systems. In Proceedings of the 2009 ACM Conference on Recommender Systems, RecSys 2009, pages
301-304, New York, USA, October 22-25, 2009.
41/89
Putting Intelligence into CBRS:Putting Intelligence into CBRS:
Challenges & Research DirectionsChallenges & Research Directions
Semantic analysisSemantic analysis ofof
content by means ofcontent by means of
external knowledgeexternal knowledge
sourcessources
LanguageLanguage--independentindependent
CBRSCBRS
Beyond keywords:Beyond keywords:
novel strategies for thenovel strategies for the
representation ofrepresentation of
items and profilesitems and profilesLimited ContentLimited Content
AnalysisAnalysis
Defeating homophily:Defeating homophily:
recommendationrecommendation
diversificationdiversification
Taking advantage ofTaking advantage of
Web 2.0 for collectingWeb 2.0 for collecting
User Generated ContentUser Generated Content
CHALLENGESCHALLENGES
OverspecializationOverspecialization
PROBLEMSPROBLEMS
““computational”computational”
serendipityserendipity
programming forprogramming for
serendipityserendipity
Knowledge InfusionKnowledge Infusion
FolksonomyFolksonomy--based CBRSbased CBRS
RESEARCHRESEARCH
DIRECTIONSDIRECTIONS
42/89
MARS (MARS (MMultilultilAAnguagenguage RRecommenderecommender SSystem)ystem)
crosscross--language user profileslanguage user profiles
WSD for building language-independent user profiles
MultiWordNet as sense repository
Multilingual lexical database that supports English, Italian,
Spanish, Portuguese, Hebrew, Romanian, Latin
Alignment between synsets in the different languages
– Semantic relations imported and preserved
all of the inhabitants of the earth
world, human race, humanity,
humankind, human beings, humans,
mankind, man
LanguageLanguage SynsetSynset GlossGloss
mondo, umanità, uomo, genere
umano, terra
insieme degli abitanti della terra, il
complesso di tutti gli esseri umani
43/89
MARS (MARS (MMultilultilAAnguagenguage RRecommenderecommender SSystem)ystem)
crosscross--language user profileslanguage user profiles
CLOCKWORK ORANGE
Being the adventures of a young
man whose principal interests are
rape, ultra-violence and Beethoven
ARANCIA MECCANICA
Le avventure di un giovane i cui
principali interessi sono lo stupro,
l’ultra-violenza e Beethoven
“a12889641” “n5477412”
“n3652872” “a2584413”
“n3255687” “a3225896”
“n32256325” “n225784”
“n255632” “Beethoven”
“n5477412” “a1744532”
“a2584413” “n3652872”
“a3225722” “n32256325”
“n225784” “n255632”
“Beethoven”
ENGLISH description ITALIAN description
Bag of Synset Bag of Synset
44/89
MARS (MARS (MMultilultilAAnguagenguage RRecommenderecommender SSystem)ystem)
crosscross--language user profileslanguage user profiles
Target User
45/89
MARS (MARS (MultilultilAnguagenguage RRecommenderecommender SSystem)ystem)
preliminary resultspreliminary results
MovieLens 100k ratings dataset
613 users with ≥ 20 ratings selected from 943 different users
520 movies and 40,717 ratings
movie content crawled from Wikipedia (English and Italian)
same movie - different descriptions in English and Italian
Results in terms of Fß=0.5 measure
no statistically significant
difference wrt the baselines
Neither content translations
nor profile translations achieve the
same effectiveness (they cannot avoid
the negative impact of polysemy and
lack of context)
63.9864.91
63.70 63.71
Recommendations
Profiles
46/89
Putting Intelligence into CBRS:Putting Intelligence into CBRS:
Challenges & Research DirectionsChallenges & Research Directions
Semantic analysisSemantic analysis ofof
content by means ofcontent by means of
external knowledgeexternal knowledge
sourcessources
LanguageLanguage--independentindependent
CBRSCBRS
Beyond keywords:Beyond keywords:
novel strategies for thenovel strategies for the
representation ofrepresentation of
items and profilesitems and profilesLimited ContentLimited Content
AnalysisAnalysis
Defeating homophily:Defeating homophily:
recommendationrecommendation
diversificationdiversification
Taking advantage ofTaking advantage of
Web 2.0 for collectingWeb 2.0 for collecting
User Generated ContentUser Generated Content
CHALLENGESCHALLENGES
OverspecializationOverspecialization
PROBLEMSPROBLEMS
““computational”computational”
serendipityserendipity
programming forprogramming for
serendipityserendipity
Knowledge InfusionKnowledge Infusion
FolksonomyFolksonomy--based CBRSbased CBRS
RESEARCHRESEARCH
DIRECTIONSDIRECTIONS
47/89
Web 2.0 & UserWeb 2.0 & User--Generated Content (UGC)Generated Content (UGC)
47
48/89
Social Tagging & FolksonomiesSocial Tagging & Folksonomies
Users annotate resources of
interests with free keywords,
called tags
Social tagging activity builds a
bottom-up classification
schema, called a folksonomy
Folksonomy: “Folks” +
“Taxonomy”
How to exploit folksonomies
for advanced user profiling in
CBRS?
48
Resources
(Artworks) Tags
Users
(Visitors)
…
the cry, munch
vangogh,
girasoli
van gogh,suflowers
VanGogh
favorite,
the_scream
da
vinci,
m
onna
lisa
da vinci code,
favorite
…
49/89
Cultural Heritage fruition & e-learning applications
of new Advanced (multimodal) Technologies
In the context of cultural heritage personalization, does the 
integration of UGC and textual description of artwork 
collections cause an increase of the prediction accuracy in the 
process of recommending artifacts to users?
50/89
FIRSt:FIRSt:
FFolksonomyolksonomy--basedbased IItemtem RRecommender syecommender syStStemem
Artwork representation
Artist
Title
Description
Tags
Semantic Indexing
Change of text representation from vectors of words
(BOW) into vectors of WordNet synsets (BOS)
From tags to semantic tags
Supervised Learning
Bayesian Classifier learned from artworks labeled with
user ratings and tags
51/89
5‐point rating scale
Textual description of 
items (static content)
Personal Tags
FIRSt (FIRSt (FFolksonomyolksonomy--basedbased IItemtem RRecommender syecommender syStStem)em)
Learning from Ratings & TagsLearning from Ratings & Tags
51
Social Tags (from other users): caravaggio, deposition, christ, cross, suffering, religion
Social Tags
passion
52/89
caravaggio, deposition,
cross, christ, rome, …
passion
caravaggio, deposition,
christ, cross, suffering,
religion, …
USER PROFILE
FIRSt (FIRSt (FFolksonomyolksonomy--basedbased IItemtem RRecommender syecommender syStStem)em)
Tags within User ProfilesTags within User Profiles
Personal 
Tags
Static 
Content
Social Tags
collaborative part of
the user profile
[de Gemmis08] M. de Gemmis, P. Lops, G. Semeraro, and P. Basile.
Integrating Tags in a Semantic Content-based Recommender. In RecSys ’08,
Proceed. of the 2nd ACM Conference on Recommender Systems, pages 163–
170, October 23-25, 2008, Lausanne, Switzerland, ACM, 2008.
53/89
Experimental EvaluationExperimental Evaluation
Goal: Compare predictive accuracy of FIRSt when user
profiles are learned from:
Static content only, i.e., textual descriptions of
artifacts (content-based profiles)
both Static and Dynamic UGC (tag-based profiles).
UGC can be:
– Personal Tags, entered by a user for an artifact,
i.e., the user’s contribution to the whole
folksonomy
– Social Tags, i.e., the whole folksonomy of tags
added by all visitors
53
54/89
Experimental SetupExperimental Setup
Dataset
45 paintings from the Vatican
picture-gallery
Static content (i.e., title, artist
and description) captured using
screenscraping bots
Subjects
30 volunteers
average age ≈ 25
none reported to be an art expert
54
55/89
Experimental DesignExperimental Design
5 experiments designed
EXP#1: Static Content
EXP#2: Personal Tags
EXP#3: Social Tags
EXP#4: Static Content +
Personal Tags
EXP#5: Static Content +
Social Tags
5-fold cross validation
Evaluation Metrics: Precision (Pr),
Recall (Re), F1 measure
One run for each user:
1. Select the appropriate content
depending on the experiment
2. Split the selected data into a
training set Tr and a test set
Ts
3. Use Tr for learning the
corresponding user profile
4. Evaluate the predictive
accuracy of the induced
profile on Ts
55
56/89
Analysis of PrecisionAnalysis of Precision
Type of Content Precision* Recall* F1*
EXP#1: Static Content 75.86 94.27 84.07
EXP#2: Personal Tags 75.96 92.65 83.48
EXP#3: Social Tags 75.59 90.50 82.37
EXP#4: Static Content + Personal Tags 78.04 93.60 85.11
EXP#5: Static Content + Social Tags 78.01 93.19 84.93
56
* Results averaged over the 30 study subjects
Augmented
Profiles
Content-based
Profiles
Tag-based
Profiles
57/89
Analysis of PrecisionAnalysis of Precision
Type of Content Precision* Recall* F1*
EXP#1: Static Content 75.86 94.27 84.07
EXP#2: Personal Tags 75.96 92.65 83.48
EXP#3: Social Tags 75.59 90.50 82.37
EXP#4: Static Content + Personal Tags 78.04 93.60 85.11
EXP#5: Static Content + Social Tags 78.01 93.19 84.93
57
* Results averaged over the 30 study subjects
Augmented
Profiles
Content-based
Profiles
Tag-based
Profiles
Tag vs CB
Precision not
improved
58/89
Analysis of PrecisionAnalysis of Precision
Type of Content Precision* Recall* F1*
EXP#1: Static Content 75.86 94.27 84.07
EXP#2: Personal Tags 75.96 92.65 83.48
EXP#3: Social Tags 75.59 90.50 82.37
EXP#4: Static Content + Personal Tags 78.04 93.60 85.11
EXP#5: Static Content + Social Tags 78.01 93.19 84.93
58
* Results averaged over the 30 study subjects
Augmented
Profiles
Content-based
Profiles
Tag-based
Profiles
Augmented vs CB
Precision
Improvement ≈ 2%
59/89
Analysis of RecallAnalysis of Recall
Type of Content Precision* Recall* F1*
EXP#1: Static Content 75.86 94.27 84.07
EXP#2: Personal Tags 75.96 92.65 83.48
EXP#3: Social Tags 75.59 90.50 82.37
EXP#4: Static Content + Personal Tags 78.04 93.60 85.11
EXP#5: Static Content + Social Tags 78.01 93.19 84.93
59
* Results averaged over the 30 study subjects
Augmented
Profiles
Content-based
Profiles
Tag-based
Profiles
Tag vs CB
Recall decrease
1.62% – 3.77%
60/89
Analysis of RecallAnalysis of Recall
Type of Content Precision* Recall* F1*
EXP#1: Static Content 75.86 94.27 84.07
EXP#2: Personal Tags 75.96 92.65 83.48
EXP#3: Social Tags 75.59 90.50 82.37
EXP#4: Static Content + Personal Tags 78.04 93.60 85.11
EXP#5: Static Content + Social Tags 78.01 93.19 84.93
60
* Results averaged over the 30 study subjects
Augmented
Profiles
Content-based
Profiles
Tag-based
Profiles
Augmented vs CB
Recall decrease:
0.67% – 1.08%
61/89
Analysis of F1Analysis of F1
Type of Content Precision* Recall* F1*
EXP#1: Static Content 75.86 94.27 84.07
EXP#2: Personal Tags 75.96 92.65 83.48
EXP#3: Social Tags 75.59 90.50 82.37
EXP#4: Static Content + Personal Tags 78.04 93.60 85.11
EXP#5: Static Content + Social Tags 78.01 93.19 84.93
61
* Results averaged over the 30 study subjects
Augmented
Profiles
Content-based
Profiles
Tag-based
Profiles
Overall accuracy F1 ≈ 85%
62/89
Putting Intelligence into CBRS:Putting Intelligence into CBRS:
Challenges & Research DirectionsChallenges & Research Directions
Semantic analysisSemantic analysis ofof
content by means ofcontent by means of
external knowledgeexternal knowledge
sourcessources
LanguageLanguage--independentindependent
CBRSCBRS
Beyond keywords:Beyond keywords:
novel strategies for thenovel strategies for the
representation ofrepresentation of
items and profilesitems and profilesLimited ContentLimited Content
AnalysisAnalysis
Defeating homophily:Defeating homophily:
recommendationrecommendation
diversificationdiversification
Taking advantage ofTaking advantage of
Web 2.0 for collectingWeb 2.0 for collecting
User Generated ContentUser Generated Content
CHALLENGESCHALLENGES
OverspecializationOverspecialization
PROBLEMSPROBLEMS
““computationalcomputational””
serendipityserendipity
programming forprogramming for
serendipityserendipity
Knowledge InfusionKnowledge Infusion
FolksonomyFolksonomy--based CBRSbased CBRS
RESEARCHRESEARCH
DIRECTIONSDIRECTIONS
63/89
Serendipity: DefinitionsSerendipity: Definitions
Serendipity
Making discoveries, by accidents and sagacity, of things
which one were not in quest of (Horace Walpole, 1754)
The art of making an unsought finding (Pek van Andel,
1994) [vanAndel94]
Serendipitous ideas and findings
Gelignite by Alfred Nobel, when he accidentally mixed
collodium (gun cotton) with nitroglycerin
Penicillin by Alexander Fleming
The psychedelic effects of LSD by Albert Hofmann
Cellophane by Jacques Brandenberger
The structure of benzene by Friedric August Kekulé
[vanAndel94] van Andel, P. Anatomy of the Unsought Finding. Serendipity: Origin, History, Domains, Traditions,
Appearances, Patterns and Programmability. The British Journal for the Philosophy of Science, 45(2): 631-648, 994.
64/89
The challengeThe challenge
Serendipity in RSs is the experience of
receiving an unexpected and fortuitous, but
useful advice
it is a way to diversify recommendations
The challenge is programming for
serendipity
to find a manner to introduce
serendipity into the recommendation
process in an operational way
65/89
Strategies forStrategies for computationalcomputational serendipityserendipity [Toms00][Toms00]
“Blind Luck”: random recommendations
“Prepared Mind”: Pasteur principle (“chance favors the
prepared mind”) - deep user modeling
“Anomalies and Exceptions”: searching for dissimilarity
[Iaquinta10]
“Reasoning by Analogy”
[Iaquinta10] L. Iaquinta, M. de Gemmis, P. Lops, G. Semeraro, P. Molino (2010). Can a Recommender
System Induce Serendipitous Encounters? In: KYEONG KANG. E-Commerce, 229-246, VIENNA: IN-TECH,
2010.
[Toms00] Toms, E. Serendipitous Information Retrieval. In Proceedings of the First DELOS Network of
Excellence Workshop on Information Seeking, Searching and Querying in Digital Libraries, Zurich,
Switzerland: European Research Consortium for Informatics and Mathematics, 2000.
66/89
Programming for Serendipity into CBRS:Programming for Serendipity into CBRS:
“Anomalies and Exceptions”“Anomalies and Exceptions”
Basic recommendation list defined by the best N
items ranked according to the user profile
Idea for inducing serendipity
extending the basic list with items
programmatically supposed to be serendipitous
for the active user
67/89
ITem Recommender (ITR)ITem Recommender (ITR)
Content-based recommender developed at Univ. of
Bari [Semeraro07]
learns a probabilistic model of the interests of the
user from textual descriptions of items
user profile = binary text classifier able to
categorize items as interesting (LIKES) or not
(DISLIKES)
a-posteriori probabilities as classification scores for
LIKES and DISLIKES
[Semeraro07] G. Semeraro, M. Degemmis, P. Lops, and P. Basile. Combining Learning and Word Sense Disambiguation
for Intelligent User Profiling. In M. M. Veloso, editor, IJCAI 2007, Proceedings of the 20th International Joint Conference
on Artificial Intelligence, Hyderabad, India, January 6-12, 2007, pages 2856–2861, Morgan Kaufmann, 2007.
68/89
Recommendation process: Ranked list approachRecommendation process: Ranked list approach
Profile
Learner
DISLIKESLIKES
USER PROFILE
future violence
alien
…
blood
…
0.89
0.74
0.22
P(LIKES | ALF)
…
69/89
Programming for Serendipity into ITR: strategyProgramming for Serendipity into ITR: strategy
Potentially serendipitous items selected on the
ground of categorization scores for LIKES and
DISLIKES
difference of classification scores tends to zero
uncertain classification
| P(LIKES | ITEM) – P(DISLIKES | ITEM) | ≈ 0
assumption:
uncertain classification ≡ items not known by the user
70/89
Programming for Serendipity into ITR: exampleProgramming for Serendipity into ITR: example
Basic recommendation list = N most
interesting items
Ranked list of “unpredictable” items
obtained from ITR
Basic recommendation list augmented
with some serendipitous items
DISLIKESLIKES
USER PROFILE
future violence
alien blood
0.760.89 0.72
P(LIKES | ITEM)
… …
0.01 0.02
| P(LIKES | ITEM) –
P(DISLIKES | ITEM) |
71/89
What about evaluation?What about evaluation?
Classic evaluation metrics (Precision, Recall, F, MAE,…) don’t
take into account obviousness, novelty and serendipity
Accurate recommendation ≠ Useful recommendation
emotional response associated with serendipity difficult to
capture by conventional accuracy metrics
serendipity degree impossible to evaluate without
considering user feedback
Novel metrics required
planned as a future work
72/89
Programming for Serendipity:Programming for Serendipity:
crosscross--domain recommendationsdomain recommendations
73/89
““Reasoning by AnalogyReasoning by Analogy””: a serendipity strategy for: a serendipity strategy for
crosscross--domain recommendationsdomain recommendations
ONTOLOGY
user profile for Movies “parallel” user profile for Travels
74/89
Ongoing work: DEVIUSOngoing work: DEVIUS
Analogy engine for computing “parallel” user profiles
Spreading activation on DBpedia for mapping
between domains
Open source code of DEVIUS available in September
Experimental evaluation
books / movies
75/89
Putting Intelligence into CBRS:Putting Intelligence into CBRS:
Challenges & Research DirectionsChallenges & Research Directions
Semantic analysisSemantic analysis ofof
content by means ofcontent by means of
external knowledgeexternal knowledge
sourcessources
LanguageLanguage--independentindependent
CBRSCBRS
Beyond keywords:Beyond keywords:
novel strategies for thenovel strategies for the
representation ofrepresentation of
items and profilesitems and profilesLimited ContentLimited Content
AnalysisAnalysis
Defeating homophily:Defeating homophily:
recommendationrecommendation
diversificationdiversification
Taking advantage ofTaking advantage of
Web 2.0 for collectingWeb 2.0 for collecting
User Generated ContentUser Generated Content
CHALLENGESCHALLENGES
OverspecializationOverspecialization
PROBLEMSPROBLEMS
““computational”computational”
serendipityserendipity
programming forprogramming for
serendipityserendipity
Knowledge InfusionKnowledge Infusion
FolksonomyFolksonomy--based CBRSbased CBRS
RESEARCHRESEARCH
DIRECTIONSDIRECTIONS
76/89
Knowledge Infusion (KI)Knowledge Infusion (KI)
Humans typically have the linguistic and cultural
experience to comprehend the meaning of a text
How to realize this capability into machines?
In NLP tasks, computers require access to vast
amounts of common-sense and domain-specific
world knowledge
Infusing lexical knowledge Dictionaries
(e.g. WordNet)
Infusing cultural knowledge Wikipedia
…
77/89
Enhancing CBRS by KIEnhancing CBRS by KI
Modeling the unstructured information stored in several (open)
knowledge sources
Exploiting the acquired knowledge in order to better understand
the item descriptions and extract more meaningful features
Inspired by a language game: The Guillotine [Semeraro09b]
Cultural and Linguistic
Background Knowledge
[Semeraro09b] G. Semeraro, P. Lops, P. Basile, and M. de Gemmis. On the Tip of my Thought: Playing the
Guillotine Game. In Proceedings of the 21st International Joint Conference on Artificial Intelligence (IJCAI
2009), 1543-1548, Morgan Kaufmann, 2009.
78/89
The Guillotine: the gameThe Guillotine: the game
[Lops09] P. Lops, P. Basile, M. de Gemmis and G. Semeraro. "Language Is the Skin of My Thought":
Integrating Wikipedia and AI to Support a Guillotine Player. In: R. Serra, R. Cucchiara (Eds.), AI*IA 2009:
Emergent Perspectives in Artificial Intelligence, XIth International Conference of the Italian Association for
Artificial Intelligence, Reggio Emilia, Italy, December 9-12, 2009. LNCS 5883, 324-333, Springer 2009.
79/89
Let’s try to play the gameLet’s try to play the game
APPLE
JUDGMENT
SUNRISE
“An apple a day takes the doctor away”
Day of Judgment
Beginning of the day
INDEPENDENCE Independence day
SLEEPER Daysleeper, a famous song by R.E.M.
80/89
Clue#1
Clue#2
Clue#3
Clue#4
Clue#5
DictionaryDictionary
EncyclopediaEncyclopedia
ProverbsProverbs
DIC-WORD1
DIC-WORD2
…
LINGUISTIC
WORLD
SPREADING
ACTIVATION NET
ENC-WORD1
ENC-WORD2
…
PRO-WORD1
PRO-WORD2
…
CLUE-RELATED WORDSKNOWLEDGE
SOL-WORD1
SOL-WORD2
…
CANDIDATE
SOLUTION LIST
CLUES
81/89
What does OTTHO know aboutWhat does OTTHO know about ‘‘starsstars’’??
STAR
KNOWLEDGE
DICTIONARY MATRIX
0.55
LIGHT
STAR
Lemmas
…
1.45
SKY
…
…
TAG MATRIX
0.27
ALIEN
STAR
…
1.41
SPACE
…
…
Tagsinitems’
tagcloud
SKY 1.45
LIGHT 0.55
…
SPACE 1.41
ALIEN 0.27
…
Lemma: Definitions | Compound Forms
Star: any one of the distant bodies appearing as a point of light in the
sky at night | Fixed star, i.e. one which is not a planet
“STAR, SPACE, ALIEN”
82/89
KI@workKI@work for recommendation diversificationfor recommendation diversification
STAR
ROBOT
ALIEN
WAR
BATTLE
SPACE 0.36
FUTURE 0.10
EXTRATERRESTRIAL 0.08
CYBORG 0.07
FIGHT 0.02
JUSTICE 0.01
…
Plot Keywords
KI-LIST
Search Results
83/89
Concluding RemarksConcluding Remarks
Research directions for overcoming some CBRS drawbacks
main strategies adopted to introduce some semantics in the
recommendation process
main strategies for diversifying recommendations
Research agenda: glean meaning and user thought from the
precious boxes (brain, Web, social networks,…) they are hidden
into:
fMRI & Eye/Head-tracking technologies for a new generation of
evaluation metrics
Linked Open Data: interlinking user profiles with Semantic Web
data and LOD
Semantic Cross-system Personalization: semantic matching of
user profiles coming from heterogeneous systems
84/89
Thanks…Thanks…
…for your attention…
…Questions?
Semantic
Web 
Access and 
Personalization 
research group
http://www.di.uniba.it/~swap
Pierpaolo Basile
Marco de Gemmis
Leo Iaquinta
Piero Molino
Fedelucio Narducci
Eufemia Tinelli
Annalina Caputo
Michele Filannino
Pasquale Lops
Cataldo Musto
Giovanni Semeraro
85/89
+ The librarian + “A Logic Named Joe”
- Gaetano Bassolino
& Emanuele Vizzini
+ Arundhati
Roy
+ Milena Jole Gabanelli
CreditsCredits
+ Tullio
De Mauro
+ Ivonne
Bordelois
+ Umberto
Eco
+ Stefano Bartezzaghi
“Accavallavacca”
86/89
References 1/4References 1/4
[Aciar07] S. Aciar, D. Zhang, S. Simoff, and J. Debenham. Informed Recommender: Basing
Recommendations on Consumer Product Reviews. IEEE Intelligent Systems, 22(3):39–47, 2007.
[Basile07] P. Basile, M. Degemmis, A. Gentile, P. Lops, and G. Semeraro. UNIBA: JIGSAW algorithm for
Word Sense Disambiguation. In Proc.4th ACL 2007 International Workshop on Semantic
Evaluations (SemEval-2007), Prague, Czech Republic, 398–401, Association for Computational
Linguistics, June 23-24, 2007.
[BlancoFernandez08] Y. Blanco-Fernandez, A. Gil-Solla, J. J. Pazos-Arias, M. Ramos-Cabrer, and M.
Lopez-Nores. Providing Entertainment by Content-based Filtering and Semantic Reasoning in
Intelligent Recommender Systems. IEEE Trans. on Consumer Electronics, 54(2):727–735, 2008.
[Cantador08] I. Cantador, A. Bellog´ın, and P. Castells. News@hand: A Semantic Web Approach to
Recommending News. In Wolfgang Nejdl, Judy Kay, Pearl Pu, and Eelco Herderm (Eds.), Adaptive
Hypermedia and Adaptive Web-Based Systems, LNCS 5149, pages 279–283, Springer, 2008.
[Degemmis07] M. Degemmis, P. Lops, and G. Semeraro. A Content-collaborative Recommender that
Exploits WordNet-based User Profiles for Neighborhood Formation. User Modeling and User-
Adapted Interaction: The Journal of Personalization Research (UMUAI), 17(3):217–255, Springer
Science + Business Media B.V., 2007.
[de Gemmis08] M. de Gemmis, P. Lops, G. Semeraro, and P. Basile. Integrating Tags in a Semantic
Content-based Recommender. In RecSys ’08, Proc. of the 2nd ACM Conference on Recommender
Systems, pages 163–170, October 23-25, 2008, Lausanne, Switzerland, ACM, 2008.
[Eco07] U. Eco, Sator arepo eccetera. Bompiani, 2007 (in Italian).
87/89
References 2/4References 2/4
[Egozi09] O. Egozi. Concept-Based Information Retrieval using Explicit Semantic Analysis. M.Sc.
Thesis, CS Department, Technion, 2009.
[Eirinaki03] LM. Eirinaki, M. Vazirgiannis, and I. Varlamis. SEWeP: Using Site Semantics and a
Taxonomy to Enhance the Web Personalization Process. In Proceedings of the Ninth ACM
SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 99–108,
ACM, 2003.
[Gabri06] E. Gabrilovich and S. Markovitch. Overcoming the Brittleness Bottleneck using Wikipedia:
Enhancing Text Categorization with Encyclopedic Knowledge. In Proceed. of the 21th National
Conf. on Artificial Intelligence and the 18th Innovative Applications of Artificial Intelligence
Conf., pages 1301–1306, AAAI Press, 2006.
[Gabri07] E. Gabrilovich and S. Markovitch. Computing Semantic Relatedness Using Wikipedia-based
Explicit Semantic Analysis. In Manuela M. Veloso, editor, Proceedings of the 20th International
Joint Conference on Artificial Intelligence, pages 1606–1611, 2007.
[Giles05] J. Giles. Internet Encyclopaedias Go Head to Head. Nature, 438:900–901, 2005.
[Herlocker04] Herlocker, J.L., Konstan, J.A., Terveen, L.G., and Riedl, J.T. Evaluating Collaborative
Filtering Recommender Systems. ACM Transactions on Information Systems, 22(1): 39-49,
2004.
[Iaquinta10] L. Iaquinta, M. de Gemmis, P. Lops, G. Semeraro, P. Molino (2010). Can a Recommender
System Induce Serendipitous Encounters? In: KYEONG KANG. E-Commerce, 229-246,
VIENNA: IN-TECH, 2010.
[Lees08] J. Lees-Miller, F. Anderson, B. Hoehn, and R. Greiner. Does Wikipedia Information Help
Netflix Predictions? Proceedings of the Seventh International Conference on Machine Learning
and Applications (ICMLA), pages 337–343, IEEE Computer Society, 2008.
88/89
References 3/4References 3/4
[Lops09] P. Lops, P. Basile, M. de Gemmis and G. Semeraro. "Language Is the Skin of My Thought":
Integrating Wikipedia and AI to Support a Guillotine Player. In: R. Serra, R. Cucchiara (Eds.), AI*IA
2009: Emergent Perspectives in Artificial Intelligence, XIth International Conference of the Italian
Association for Artificial Intelligence, Reggio Emilia, Italy, December 9-12, 2009. LNCS 5883,
324-333, Springer 2009.
[Lops10] P. Lops, M. de Gemmis, G. Semeraro. Content-based Recommender Systems: State of the
Art and Trends. In: P. Kantor, F. Ricci, L. Rokach and B. Shapira, editors, Recommender Systems
Handbook: A Complete Guide for Research Scientists & Practitioners, Chapter 3, pages 73-105,
BERLIN: Springer, 2010.
[McNee06] S.M. McNee, J. Riedl, and J. Konstan. Accurate is not always good: How accuracy metrics
have hurt recommender systems. In Extended Abstracts of the 2006 ACM Conference on Human
Factors in Computing Systems, pages 1-5, Canada, 2006.
[Middleton04] S. E. Middleton, N. R. Shadbolt, and D. C. De Roure. Ontological User Profiling in
Recommender Systems. ACM Transactions on Information Systems, 22(1):54–88, 2004.
[Pazzani07] Pazzani, M. J., & Billsus, D. Content-Based Recommendation Systems. The Adaptive Web.
Lecture Notes in Computer Science vol. 4321, 325-341, 2007.
[Pedersen04] Pedersen, Ted and Patwardhan, Siddharth, and Michelizzi, Jason. WordNet::Similarity -
Measuring the Relatedness of Concepts. In Proceedings of the Nineteenth National Conference
on Artificial Intelligence (AAAI-2004), pp. 1024-1025, San Jose, CA, July, 2004.
[Semeraro07] G. Semeraro, M. Degemmis, P. Lops, and P. Basile. Combining Learning and Word
Sense Disambiguation for Intelligent User Profiling. In M. M. Veloso, editor, IJCAI 2007,
Proceedings of the 20th International Joint Conference on Artificial Intelligence, Hyderabad,
India, January 6-12, 2007, pages 2856–2861, Morgan Kaufmann, 2007.
89/89
References 4/4References 4/4
[Semeraro09a] G. Semeraro, P. Lops, P. Basile, and M. de Gemmis. Knowledge Infusion into
Content-based Recommender Systems. In Proceedings of the 2009 ACM Conf. on
Recommender Systems, RecSys 2009, pages 301-304, New York, USA, October 22-25, 2009.
[Semeraro09b] G. Semeraro, P. Lops, P. Basile, and M. de Gemmis. On the Tip of my Thought:
Playing the Guillotine Game. In Proceedings of the 21st International Joint Conference on
Artificial Intelligence (IJCAI 2009), 1543-1548, Morgan Kaufmann, 2009.
[Smirnov08] A. V. Smirnov and A. Krizhanovsky. Information Filtering based on Wiki Index
Database. CoRR, abs/0804.2354, 2008.
[Toms00] Toms, E. Serendipitous Information Retrieval. In Proceedings of the First DELOS
Network of Excellence Workshop on Information Seeking, Searching and Querying in Digital
Libraries, Zurich, Switzerland: European Research Consortium for Informatics and
Mathematics, 2000.
[vanAndel94] van Andel, P. Anatomy of the Unsought Finding. Serendipity: Origin, History,
Domains, Traditions, Appearances, Patterns and Programmability. The British Journal for the
Philosophy of Science, 45(2), pp. 631-648, 1994.
[Zuckerman08] E. Zuckerman. Homophily, serendipity, xenophilia. April 25, 2008.
www.ethanzuckerman.com/blog/2008/04/25/homophily-serendipity-xenophilia/

More Related Content

Similar to "Content-based RecSys: problems, challenges & research directions"-UMAP'10, ITWP workshop, Big Island of Hawaii, June'10

IRJET- Scalable Content Aware Collaborative Filtering for Location Recommenda...
IRJET- Scalable Content Aware Collaborative Filtering for Location Recommenda...IRJET- Scalable Content Aware Collaborative Filtering for Location Recommenda...
IRJET- Scalable Content Aware Collaborative Filtering for Location Recommenda...IRJET Journal
 
Multi Criteria Recommender Systems - Overview
Multi Criteria Recommender Systems - OverviewMulti Criteria Recommender Systems - Overview
Multi Criteria Recommender Systems - OverviewDavide Giannico
 
PhD defense - Exploiting distributional semantics for content-based and conte...
PhD defense - Exploiting distributional semantics for content-based and conte...PhD defense - Exploiting distributional semantics for content-based and conte...
PhD defense - Exploiting distributional semantics for content-based and conte...Victor Codina
 
Mixed Recommendation Algorithm Based on Content, Demographic and Collaborativ...
Mixed Recommendation Algorithm Based on Content, Demographic and Collaborativ...Mixed Recommendation Algorithm Based on Content, Demographic and Collaborativ...
Mixed Recommendation Algorithm Based on Content, Demographic and Collaborativ...IRJET Journal
 
Recommenders Systems
Recommenders SystemsRecommenders Systems
Recommenders SystemsTariq Hassan
 
HABIB FIGA GUYE {BULE HORA UNIVERSITY}(habibifiga@gmail.com
HABIB FIGA GUYE {BULE HORA UNIVERSITY}(habibifiga@gmail.comHABIB FIGA GUYE {BULE HORA UNIVERSITY}(habibifiga@gmail.com
HABIB FIGA GUYE {BULE HORA UNIVERSITY}(habibifiga@gmail.comHABIB FIGA GUYE
 
IRJET- Analyzing Sentiments in One Go
IRJET-  	  Analyzing Sentiments in One GoIRJET-  	  Analyzing Sentiments in One Go
IRJET- Analyzing Sentiments in One GoIRJET Journal
 
IRJET- Analysis on Existing Methodologies of User Service Rating Prediction S...
IRJET- Analysis on Existing Methodologies of User Service Rating Prediction S...IRJET- Analysis on Existing Methodologies of User Service Rating Prediction S...
IRJET- Analysis on Existing Methodologies of User Service Rating Prediction S...IRJET Journal
 
[SAC2014]Splitting Approaches for Context-Aware Recommendation: An Empirical ...
[SAC2014]Splitting Approaches for Context-Aware Recommendation: An Empirical ...[SAC2014]Splitting Approaches for Context-Aware Recommendation: An Empirical ...
[SAC2014]Splitting Approaches for Context-Aware Recommendation: An Empirical ...YONG ZHENG
 
Customer to Customer recommendation system
Customer to Customer recommendation systemCustomer to Customer recommendation system
Customer to Customer recommendation systemsksaif95
 
Study of Recommendation System Used In Tourism and Travel
Study of Recommendation System Used In Tourism and TravelStudy of Recommendation System Used In Tourism and Travel
Study of Recommendation System Used In Tourism and Travelijtsrd
 
Sweeny group think-ias2015
Sweeny group think-ias2015Sweeny group think-ias2015
Sweeny group think-ias2015Marianne Sweeny
 
Projection Multi Scale Hashing Keyword Search in Multidimensional Datasets
Projection Multi Scale Hashing Keyword Search in Multidimensional DatasetsProjection Multi Scale Hashing Keyword Search in Multidimensional Datasets
Projection Multi Scale Hashing Keyword Search in Multidimensional DatasetsIRJET Journal
 
Session 0.0 poster minutes madness
Session 0.0   poster minutes madnessSession 0.0   poster minutes madness
Session 0.0 poster minutes madnesssemanticsconference
 
Identifying Malicious Reviews Using NLP and Bayesian Technique on Ecommerce H...
Identifying Malicious Reviews Using NLP and Bayesian Technique on Ecommerce H...Identifying Malicious Reviews Using NLP and Bayesian Technique on Ecommerce H...
Identifying Malicious Reviews Using NLP and Bayesian Technique on Ecommerce H...IRJET Journal
 
[WI 2014]Context Recommendation Using Multi-label Classification
[WI 2014]Context Recommendation Using Multi-label Classification[WI 2014]Context Recommendation Using Multi-label Classification
[WI 2014]Context Recommendation Using Multi-label ClassificationYONG ZHENG
 
A Community Detection and Recommendation System
A Community Detection and Recommendation SystemA Community Detection and Recommendation System
A Community Detection and Recommendation SystemIRJET Journal
 
Recommender Systems
Recommender SystemsRecommender Systems
Recommender Systemsvivatechijri
 

Similar to "Content-based RecSys: problems, challenges & research directions"-UMAP'10, ITWP workshop, Big Island of Hawaii, June'10 (20)

IRJET- Scalable Content Aware Collaborative Filtering for Location Recommenda...
IRJET- Scalable Content Aware Collaborative Filtering for Location Recommenda...IRJET- Scalable Content Aware Collaborative Filtering for Location Recommenda...
IRJET- Scalable Content Aware Collaborative Filtering for Location Recommenda...
 
Multi Criteria Recommender Systems - Overview
Multi Criteria Recommender Systems - OverviewMulti Criteria Recommender Systems - Overview
Multi Criteria Recommender Systems - Overview
 
PhD defense
PhD defense PhD defense
PhD defense
 
PhD defense - Exploiting distributional semantics for content-based and conte...
PhD defense - Exploiting distributional semantics for content-based and conte...PhD defense - Exploiting distributional semantics for content-based and conte...
PhD defense - Exploiting distributional semantics for content-based and conte...
 
Recommender-technology-ReColl08
Recommender-technology-ReColl08Recommender-technology-ReColl08
Recommender-technology-ReColl08
 
Mixed Recommendation Algorithm Based on Content, Demographic and Collaborativ...
Mixed Recommendation Algorithm Based on Content, Demographic and Collaborativ...Mixed Recommendation Algorithm Based on Content, Demographic and Collaborativ...
Mixed Recommendation Algorithm Based on Content, Demographic and Collaborativ...
 
Recommenders Systems
Recommenders SystemsRecommenders Systems
Recommenders Systems
 
HABIB FIGA GUYE {BULE HORA UNIVERSITY}(habibifiga@gmail.com
HABIB FIGA GUYE {BULE HORA UNIVERSITY}(habibifiga@gmail.comHABIB FIGA GUYE {BULE HORA UNIVERSITY}(habibifiga@gmail.com
HABIB FIGA GUYE {BULE HORA UNIVERSITY}(habibifiga@gmail.com
 
IRJET- Analyzing Sentiments in One Go
IRJET-  	  Analyzing Sentiments in One GoIRJET-  	  Analyzing Sentiments in One Go
IRJET- Analyzing Sentiments in One Go
 
IRJET- Analysis on Existing Methodologies of User Service Rating Prediction S...
IRJET- Analysis on Existing Methodologies of User Service Rating Prediction S...IRJET- Analysis on Existing Methodologies of User Service Rating Prediction S...
IRJET- Analysis on Existing Methodologies of User Service Rating Prediction S...
 
[SAC2014]Splitting Approaches for Context-Aware Recommendation: An Empirical ...
[SAC2014]Splitting Approaches for Context-Aware Recommendation: An Empirical ...[SAC2014]Splitting Approaches for Context-Aware Recommendation: An Empirical ...
[SAC2014]Splitting Approaches for Context-Aware Recommendation: An Empirical ...
 
Customer to Customer recommendation system
Customer to Customer recommendation systemCustomer to Customer recommendation system
Customer to Customer recommendation system
 
Study of Recommendation System Used In Tourism and Travel
Study of Recommendation System Used In Tourism and TravelStudy of Recommendation System Used In Tourism and Travel
Study of Recommendation System Used In Tourism and Travel
 
Sweeny group think-ias2015
Sweeny group think-ias2015Sweeny group think-ias2015
Sweeny group think-ias2015
 
Projection Multi Scale Hashing Keyword Search in Multidimensional Datasets
Projection Multi Scale Hashing Keyword Search in Multidimensional DatasetsProjection Multi Scale Hashing Keyword Search in Multidimensional Datasets
Projection Multi Scale Hashing Keyword Search in Multidimensional Datasets
 
Session 0.0 poster minutes madness
Session 0.0   poster minutes madnessSession 0.0   poster minutes madness
Session 0.0 poster minutes madness
 
Identifying Malicious Reviews Using NLP and Bayesian Technique on Ecommerce H...
Identifying Malicious Reviews Using NLP and Bayesian Technique on Ecommerce H...Identifying Malicious Reviews Using NLP and Bayesian Technique on Ecommerce H...
Identifying Malicious Reviews Using NLP and Bayesian Technique on Ecommerce H...
 
[WI 2014]Context Recommendation Using Multi-label Classification
[WI 2014]Context Recommendation Using Multi-label Classification[WI 2014]Context Recommendation Using Multi-label Classification
[WI 2014]Context Recommendation Using Multi-label Classification
 
A Community Detection and Recommendation System
A Community Detection and Recommendation SystemA Community Detection and Recommendation System
A Community Detection and Recommendation System
 
Recommender Systems
Recommender SystemsRecommender Systems
Recommender Systems
 

Recently uploaded

THE FRAUD NETFLIX ORIGINAL MEDIA PITCH PROJECT
THE FRAUD NETFLIX ORIGINAL MEDIA PITCH PROJECTTHE FRAUD NETFLIX ORIGINAL MEDIA PITCH PROJECT
THE FRAUD NETFLIX ORIGINAL MEDIA PITCH PROJECT17mos052
 
Protecting Your Little Explorer at Home!
Protecting Your Little Explorer at Home!Protecting Your Little Explorer at Home!
Protecting Your Little Explorer at Home!andrekr997
 
Mastering Wealth with YouTube Content Marketing.pdf
Mastering Wealth with YouTube Content Marketing.pdfMastering Wealth with YouTube Content Marketing.pdf
Mastering Wealth with YouTube Content Marketing.pdfTirupati Social Media
 
O9654467111 Call Girls In Shahdara Women Seeking Men
O9654467111 Call Girls In Shahdara Women Seeking MenO9654467111 Call Girls In Shahdara Women Seeking Men
O9654467111 Call Girls In Shahdara Women Seeking MenSapana Sha
 
Call Girls In Dwarka ⏩7838079806 ⏩Escort Service In Patel Nagar Delhi
Call Girls In Dwarka ⏩7838079806 ⏩Escort Service In Patel Nagar DelhiCall Girls In Dwarka ⏩7838079806 ⏩Escort Service In Patel Nagar Delhi
Call Girls In Dwarka ⏩7838079806 ⏩Escort Service In Patel Nagar Delhidelhiescort
 
AI Virtual Influencers: The Future of Influencer Marketing
AI Virtual Influencers:  The Future of Influencer MarketingAI Virtual Influencers:  The Future of Influencer Marketing
AI Virtual Influencers: The Future of Influencer MarketingCut-the-SaaS
 
When-technology-and-Humanity-Cross-1.pptx
When-technology-and-Humanity-Cross-1.pptxWhen-technology-and-Humanity-Cross-1.pptx
When-technology-and-Humanity-Cross-1.pptxReaper61
 
定制(ENU毕业证书)英国爱丁堡龙比亚大学毕业证成绩单原版一比一
定制(ENU毕业证书)英国爱丁堡龙比亚大学毕业证成绩单原版一比一定制(ENU毕业证书)英国爱丁堡龙比亚大学毕业证成绩单原版一比一
定制(ENU毕业证书)英国爱丁堡龙比亚大学毕业证成绩单原版一比一ra6e69ou
 
Amplify Your Brand with Our Tailored Social Media Marketing Services
Amplify Your Brand with Our Tailored Social Media Marketing ServicesAmplify Your Brand with Our Tailored Social Media Marketing Services
Amplify Your Brand with Our Tailored Social Media Marketing ServicesNetqom Solutions
 
YouScan Company Overview - Social Media Listening with Visual Insights.pdf
YouScan Company Overview - Social Media Listening with Visual Insights.pdfYouScan Company Overview - Social Media Listening with Visual Insights.pdf
YouScan Company Overview - Social Media Listening with Visual Insights.pdfAlexander Sirach
 
social media advantages and disadvantages
social media advantages and disadvantagessocial media advantages and disadvantages
social media advantages and disadvantagesmehwishkhan1018786
 
VIP Moti Bagh Call Girls Free Doorstep Delivery 9873777170
VIP Moti Bagh Call Girls Free Doorstep Delivery 9873777170VIP Moti Bagh Call Girls Free Doorstep Delivery 9873777170
VIP Moti Bagh Call Girls Free Doorstep Delivery 9873777170Komal Khan
 
Unveiling SOCIO COSMOS: Where Socializing Meets the Stars
Unveiling SOCIO COSMOS: Where Socializing Meets the StarsUnveiling SOCIO COSMOS: Where Socializing Meets the Stars
Unveiling SOCIO COSMOS: Where Socializing Meets the StarsSocioCosmos
 
Music Video Codes and Conventions 2 .pptx
Music Video Codes and Conventions 2 .pptxMusic Video Codes and Conventions 2 .pptx
Music Video Codes and Conventions 2 .pptxjenrobinson12
 
Upgrade Your Twitter Presence with Socio Cosmos
Upgrade Your Twitter Presence with Socio CosmosUpgrade Your Twitter Presence with Socio Cosmos
Upgrade Your Twitter Presence with Socio CosmosSocioCosmos
 
Cosmic Conversations with Sociocosmos...
Cosmic Conversations with Sociocosmos...Cosmic Conversations with Sociocosmos...
Cosmic Conversations with Sociocosmos...SocioCosmos
 

Recently uploaded (20)

THE FRAUD NETFLIX ORIGINAL MEDIA PITCH PROJECT
THE FRAUD NETFLIX ORIGINAL MEDIA PITCH PROJECTTHE FRAUD NETFLIX ORIGINAL MEDIA PITCH PROJECT
THE FRAUD NETFLIX ORIGINAL MEDIA PITCH PROJECT
 
Protecting Your Little Explorer at Home!
Protecting Your Little Explorer at Home!Protecting Your Little Explorer at Home!
Protecting Your Little Explorer at Home!
 
Mastering Wealth with YouTube Content Marketing.pdf
Mastering Wealth with YouTube Content Marketing.pdfMastering Wealth with YouTube Content Marketing.pdf
Mastering Wealth with YouTube Content Marketing.pdf
 
O9654467111 Call Girls In Shahdara Women Seeking Men
O9654467111 Call Girls In Shahdara Women Seeking MenO9654467111 Call Girls In Shahdara Women Seeking Men
O9654467111 Call Girls In Shahdara Women Seeking Men
 
Call Girls In Dwarka ⏩7838079806 ⏩Escort Service In Patel Nagar Delhi
Call Girls In Dwarka ⏩7838079806 ⏩Escort Service In Patel Nagar DelhiCall Girls In Dwarka ⏩7838079806 ⏩Escort Service In Patel Nagar Delhi
Call Girls In Dwarka ⏩7838079806 ⏩Escort Service In Patel Nagar Delhi
 
AI Virtual Influencers: The Future of Influencer Marketing
AI Virtual Influencers:  The Future of Influencer MarketingAI Virtual Influencers:  The Future of Influencer Marketing
AI Virtual Influencers: The Future of Influencer Marketing
 
young call girls in Greater Noida 🔝 9953056974 🔝 Delhi escort Service
young call girls in  Greater Noida 🔝 9953056974 🔝 Delhi escort Serviceyoung call girls in  Greater Noida 🔝 9953056974 🔝 Delhi escort Service
young call girls in Greater Noida 🔝 9953056974 🔝 Delhi escort Service
 
When-technology-and-Humanity-Cross-1.pptx
When-technology-and-Humanity-Cross-1.pptxWhen-technology-and-Humanity-Cross-1.pptx
When-technology-and-Humanity-Cross-1.pptx
 
定制(ENU毕业证书)英国爱丁堡龙比亚大学毕业证成绩单原版一比一
定制(ENU毕业证书)英国爱丁堡龙比亚大学毕业证成绩单原版一比一定制(ENU毕业证书)英国爱丁堡龙比亚大学毕业证成绩单原版一比一
定制(ENU毕业证书)英国爱丁堡龙比亚大学毕业证成绩单原版一比一
 
Amplify Your Brand with Our Tailored Social Media Marketing Services
Amplify Your Brand with Our Tailored Social Media Marketing ServicesAmplify Your Brand with Our Tailored Social Media Marketing Services
Amplify Your Brand with Our Tailored Social Media Marketing Services
 
YouScan Company Overview - Social Media Listening with Visual Insights.pdf
YouScan Company Overview - Social Media Listening with Visual Insights.pdfYouScan Company Overview - Social Media Listening with Visual Insights.pdf
YouScan Company Overview - Social Media Listening with Visual Insights.pdf
 
social media advantages and disadvantages
social media advantages and disadvantagessocial media advantages and disadvantages
social media advantages and disadvantages
 
VIP Moti Bagh Call Girls Free Doorstep Delivery 9873777170
VIP Moti Bagh Call Girls Free Doorstep Delivery 9873777170VIP Moti Bagh Call Girls Free Doorstep Delivery 9873777170
VIP Moti Bagh Call Girls Free Doorstep Delivery 9873777170
 
young Call girls in Dwarka sector 23🔝 9953056974 🔝 Delhi escort Service
young Call girls in Dwarka sector 23🔝 9953056974 🔝 Delhi escort Serviceyoung Call girls in Dwarka sector 23🔝 9953056974 🔝 Delhi escort Service
young Call girls in Dwarka sector 23🔝 9953056974 🔝 Delhi escort Service
 
Unveiling SOCIO COSMOS: Where Socializing Meets the Stars
Unveiling SOCIO COSMOS: Where Socializing Meets the StarsUnveiling SOCIO COSMOS: Where Socializing Meets the Stars
Unveiling SOCIO COSMOS: Where Socializing Meets the Stars
 
Hot Sexy call girls in Ramesh Nagar🔝 9953056974 🔝 Delhi escort Service
Hot Sexy call girls in Ramesh Nagar🔝 9953056974 🔝 Delhi escort ServiceHot Sexy call girls in Ramesh Nagar🔝 9953056974 🔝 Delhi escort Service
Hot Sexy call girls in Ramesh Nagar🔝 9953056974 🔝 Delhi escort Service
 
Music Video Codes and Conventions 2 .pptx
Music Video Codes and Conventions 2 .pptxMusic Video Codes and Conventions 2 .pptx
Music Video Codes and Conventions 2 .pptx
 
Upgrade Your Twitter Presence with Socio Cosmos
Upgrade Your Twitter Presence with Socio CosmosUpgrade Your Twitter Presence with Socio Cosmos
Upgrade Your Twitter Presence with Socio Cosmos
 
Cosmic Conversations with Sociocosmos...
Cosmic Conversations with Sociocosmos...Cosmic Conversations with Sociocosmos...
Cosmic Conversations with Sociocosmos...
 
Enjoy ➥8448380779▻ Call Girls In Noida Sector 93 Escorts Delhi NCR
Enjoy ➥8448380779▻ Call Girls In Noida Sector 93 Escorts Delhi NCREnjoy ➥8448380779▻ Call Girls In Noida Sector 93 Escorts Delhi NCR
Enjoy ➥8448380779▻ Call Girls In Noida Sector 93 Escorts Delhi NCR
 

"Content-based RecSys: problems, challenges & research directions"-UMAP'10, ITWP workshop, Big Island of Hawaii, June'10

  • 1. ContentContent--based Recommender Systemsbased Recommender Systems problems, challengesproblems, challenges and research directionsand research directions Giovanni Semeraro & the SWAP group http://www.di.uniba.it/~swap/ semeraro@di.uniba.it Department of Computer Science University of Bari “Aldo Moro” UMAP 2010 – 8° Workshop on INTELLIGENT TECHNIQUES FOR WEB PERSONALIZATION & RECOMMENDER SYSTEMS (ITWP 2010) BIG ISLAND OF HAWAII, JUNE 20 2010 Semantic Web  Access and  Personalization  research group http://www.di.uniba.it/~swap
  • 2. 2/89 OutlineOutline Content-based Recommender Systems (CBRS) Basics Advantages & Drawbacks Drawback 1: Limited content analysis Beyond keywords: Semantics into CBRS Taking advantage of Web 2.0: Folksonomy-based CBRS Drawback 2: Overspecialization Strategies for diversification of recommendations
  • 3. 3/89 ContentContent--based Recommender Systems (CBRS)based Recommender Systems (CBRS) Recommend an item to a user based upon a description of the item and a profile of the user’s interests Implement strategies for: representing items creating a user profile that describes the types of items the user likes/dislikes comparing the user profile to some reference characteristics (with the aim to predict whether the user is interested in an unseen item) [Pazzani07] Pazzani, M. J., & Billsus, D. Content-Based Recommendation Systems. The Adaptive Web. Lecture Notes in Computer Science vol. 4321, 325-341, 2007.
  • 4. 4/89 ContentContent--basedbased FilteringFiltering User Profile User profile compared against items for relevance computation Information Source Target User Items recommended to the user
  • 5. 5/89 ContentContent--based Filteringbased Filtering Each user is assumed to operate independently Items are represented by some features Movies: actors, director, plot, … The profile is often created and updated automatically in response to feedback on the desirability of items that have been presented to the user Machine Learning for automated inference Relevance judgment on items, e.g. ratings Training on rated items user profile Filtering based on the comparison between the content (features) of the items and the user preferences as defined in the user profile Keyword-based representation for content and profiles string matching or text similarity
  • 6. 6/89 General Architecture of CBRSGeneral Architecture of CBRS CONTENTCONTENT ANALYZERANALYZER PROFILEPROFILE LEARNERLEARNER FILTERINGFILTERING COMPONENTCOMPONENT Information Source Represented Items Feedback PROFILES Structured Item Representation Active user ua Item Descriptions User ua feedback User ua training examples User ua feedback List of recommendations User ua Profile New Items User ua Profile
  • 7. 7/89 Advantages of CBRSAdvantages of CBRS USER INDEPENDENCE CBRS exploit solely ratings provided by the active user to build her own profile No need for data on other users TRANSPARENCY CBRS can provide explanations for recommended items by listing content-features that caused an item to be recommended NEW ITEM (Item not yet rated by any user) CBRS are capable of recommending new and unknown items No first-rater problem
  • 8. 8/89 Drawbacks of CBRS: LIMITED CONTENTDrawbacks of CBRS: LIMITED CONTENT ANALYSISANALYSIS No suitable suggestions if the analyzed content does not contain enough information to discriminate items the user likes from items the user does not like Content must be encoded as meaningful features automatic/manually assignment of features to items might be insufficient to define distinguishing aspects of items necessary for the elicitation of user interests keywords not appropriate for representing content, due to polysemy, synonymy, multi-word concepts (homography, homophony,...) – “Sator arepo eccetera” [Eco07] SATOR AREPO TENET OPERA ROTAS R E T S O N R E T A P R E T S O N R E T A P E R N O S T RETAP E R N O S T RETAP AA OOOO AA
  • 9. 9/89 AI is a branch of computer science doc1 the 2011 International Joint Conference on Artificial Intelligence will be held in Spain doc2 apple launches a new product… doc3 artificial 0.02 intelligence 0.01 apple 0.13 AI 0.15 … USER PROFILE MULTI-WORD CONCEPTS KeywordKeyword--based Profilesbased Profiles
  • 10. 10/89 AI is a branch of computer science doc1 the 2011 International Joint Conference on Artificial Intelligence will be held in Spain doc2 apple launches a new product… doc3 artificial 0.02 intelligence 0.01 apple 0.13 AI 0.15 … USER PROFILE SYNONYMY KeywordKeyword--based Profilesbased Profiles
  • 11. 11/89 AI is a branch of computer science doc1 the 2011 International Joint Conference on Artificial Intelligence will be held in Spain doc2 apple launches a new product… doc3 artificial 0.02 intelligence 0.01 apple 0.13 AI 0.15 … USER PROFILE POLYSEMY KeywordKeyword--based Profilesbased Profiles NLP methods are needed for the elicitation of user interests
  • 12. 12/89 Drawbacks of CBRS: OVERSPECIALIZATIONDrawbacks of CBRS: OVERSPECIALIZATION CBRS suggest items whose scores are high when matched against the user profile the user is going to be recommended items similar to those already rated No inherent method for finding something unexpected Obviousness in recommendations suggesting “STAR TREK” to a science-fiction fan: accurate but not useful users don’t want algorithms that produce better ratings, but sensible recommendations The Serendipity Problem [McNee06] S.M. McNee, J. Riedl, and J. Konstan. Accurate is not always good: How accuracy metrics have hurt recommender systems. In Extended Abstracts of the 2006 ACM Conference on Human Factors in Computing Systems, pages 1-5, Canada, 2006.
  • 13. 13/89 The serendipity problem: mind cagesThe serendipity problem: mind cages Homophily: the tendency to surround ourselves by like-minded people opinions taken to extremes cultural impoverishment threat for biodiversity?
  • 14. 14/89 The homophily trapThe homophily trap Does homophily hurt RS? try to tell Amazon that you liked the movie “War Games”… [Zuckerman08] E. Zuckerman. Homophily, serendipity, xenophilia. April 25, 2008. www.ethanzuckerman.com/blog/2008/04/25/homophily-serendipity-xenophilia/
  • 15. 15/89 The homophily trapThe homophily trap Recommendations by other (ageing?) COMPUTER GEEKS!
  • 17. 17/89 Novelty vs SerendipityNovelty vs Serendipity Novelty: A novel recommendation helps the user find a surprisingly interesting item she might have autonomously discovered Serendipity: A serendipitous recommendation helps the user find a surprisingly interesting item she might not have otherwise discovered How to introduce serendipity in (CB)RS? [Herlocker04] Herlocker, J.L., Konstan, J.A., Terveen, L.G., and Riedl, J.T. Evaluating Collaborative Filtering Recommender Systems. ACM Transactions on Information Systems, 22(1): 39-49, 2004.
  • 18. 18/89 ““Computational” serendipity? A motivatingComputational” serendipity? A motivating exampleexample for Star Trek fans: Did you try “Star Trek – The experience” in Las Vegas?
  • 19. 19/89 Putting Intelligence into CBRS:Putting Intelligence into CBRS: Challenges & Research DirectionsChallenges & Research Directions Semantic analysisSemantic analysis ofof content by means ofcontent by means of external knowledgeexternal knowledge sourcessources LanguageLanguage--independentindependent CBRSCBRS Beyond keywords:Beyond keywords: novel strategies for thenovel strategies for the representation ofrepresentation of items and profilesitems and profilesLimited ContentLimited Content AnalysisAnalysis Defeating homophily:Defeating homophily: recommendationrecommendation diversificationdiversification Taking advantage ofTaking advantage of Web 2.0 for collectingWeb 2.0 for collecting User Generated ContentUser Generated Content CHALLENGESCHALLENGES OverspecializationOverspecialization PROBLEMSPROBLEMS ““computational”computational” serendipityserendipity programming forprogramming for serendipityserendipity Knowledge InfusionKnowledge Infusion FolksonomyFolksonomy--based CBRSbased CBRS RESEARCHRESEARCH DIRECTIONSDIRECTIONS
  • 20. 20/89 Putting Intelligence into CBRS:Putting Intelligence into CBRS: Challenges & Research DirectionsChallenges & Research Directions Semantic analysis ofSemantic analysis of content by means ofcontent by means of external knowledgeexternal knowledge sourcessources LanguageLanguage--independentindependent CBRSCBRS Beyond keywords:Beyond keywords: novel strategies for thenovel strategies for the representation ofrepresentation of items and profilesitems and profilesLimited ContentLimited Content AnalysisAnalysis Defeating homophily:Defeating homophily: recommendationrecommendation diversificationdiversification Taking advantage ofTaking advantage of Web 2.0 for collectingWeb 2.0 for collecting User Generated ContentUser Generated Content CHALLENGESCHALLENGES OverspecializationOverspecialization PROBLEMSPROBLEMS ““computational”computational” serendipityserendipity programming forprogramming for serendipityserendipity Knowledge InfusionKnowledge Infusion FolksonomyFolksonomy--based CBRSbased CBRS RESEARCHRESEARCH DIRECTIONSDIRECTIONS
  • 21. 21/89 Semantic Analysis = 1. Semantics: concept identification in text-based representations through advanced NLP techniques “beyond keywords” + 2. Personalization: representation of user information needs in an effective way “deep (high-accuracy) user profiles” Semantic Analysis: beyond keywordsSemantic Analysis: beyond keywords
  • 22. 22/89 Apple Computer iPhone Beyond keywordsBeyond keywords: Word Sense Disambiguation: Word Sense Disambiguation (WSD)(WSD) -- from words to meaningsfrom words to meanings WSD selects the proper meaning (sense) for a word in a text by taking into account the context in which that word occurs #12567: computer brand #22999: fruit Dictionaries, Ontologies, e.g. WordNetSense RepositorySense Repository Apple context [Basile07] P. Basile, M. Degemmis, A. Gentile, P. Lops, and G. Semeraro. UNIBA: JIGSAW algorithm for Word Sense Disambiguation. In Proceedings of the 4th ACL 2007 International Workshop on Semantic Evaluations (SemEval-2007), Prague, Czech Republic, pages 398–401, Association for Computational Linguistics, June 23-24, 2007.
  • 23. 23/89 AI is a branch of computer science doc1 the 2011 International Joint Conference on Artificial Intelligence will be held in Spain doc2 apple launches a new product… doc3 artificial 0.02 intelligence 0.01 apple 0.13 AI 0.15 … USER PROFILE MULTI-WORD CONCEPTS ITR (ITem Recommender)ITR (ITem Recommender) SenseSense--based Profilesbased Profiles #12387 0.03
  • 24. 24/89 AI is a branch of computer science doc1 the 2011 International Joint Conference on Artificial Intelligence will be held in Spain doc2 apple launches a new product… doc3 #12387 0.03 apple 0.13 AI 0.15 … USER PROFILE SYNONYMY ITR (ITem Recommender)ITR (ITem Recommender) SenseSense--based Profilesbased Profiles #12387 0.15 0.18
  • 25. 25/89 AI is a branch of computer science doc1 the 2011 International Joint Conference on Artificial Intelligence will be held in Spain doc2 apple launches a new product… doc3 #12387 0.18 apple 0.13 … USER PROFILE POLYSEMY ITR (ITem Recommender)ITR (ITem Recommender) SenseSense--based Profilesbased Profiles #12567 SEMANTIC USER PROFILE sense identifiers rather than keywords [Degemmis07] M. Degemmis, P. Lops, and G. Semeraro. A Content-collaborative Recommender that Exploits WordNet- based User Profiles for Neighborhood Formation. User Modeling and User-Adapted Interaction: The Journal of Personalization Research (UMUAI), 17(3):217–255, Springer Science + Business Media B.V., 2007. [Semeraro07] G. Semeraro, M. Degemmis, P. Lops, and P. Basile. Combining Learning and Word Sense Disambiguation for Intelligent User Profiling. In M. M. Veloso, editor, IJCAI 2007, Proceedings of the 20th International Joint Conference on Artificial Intelligence, Hyderabad, India, January 6-12, 2007 , pages 2856–2861. Morgan Kaufmann, 2007.
  • 26. 26/89 Advantages of SenseAdvantages of Sense--based Representationsbased Representations Semantic matching between items and profiles computing semantic relatedness [Pedersen04] rather than string matching (e.g., by using similarity measures between WordNet synsets) Senses are inherently multilingual Concepts remain the same across different languages, while terms used for describing them in each specific language change Improving transparency matched concepts can be used to justify suggestions Collaborative Filtering could benefit too finding better neighbors: similar users discovered by looking at profile overlap even if they did not rate the same items semantic profiles succeed where Pearson’s correlation coefficient fail [Pedersen04] Pedersen, Ted and Patwardhan, Siddharth, and Michelizzi, Jason. WordNet::Similarity - Measuring the Relatedness of Concepts. In Proceedings of the Nineteenth National Conference on Artificial Intelligence (AAAI- 2004), pp. 1024-1025, San Jose, CA, July, 2004.
  • 27. 27/89 SenseSense--based profiles in a hybrid CBbased profiles in a hybrid CB--CFCF recommenderrecommender Sense-based profiles obtained by applying WSD on textual description of items WordNet as sense repository Synset-based user profiles Hybrid CB-CF RS [Degemmis07] M. Degemmis, P. Lops, and G. Semeraro. A Content-collaborative Recommender that Exploits WordNet- based User Profiles for Neighborhood Formation. User Modeling and User-Adapted Interaction: The Journal of Personalization Research (UMUAI), 17(3):217–255, Springer Science + Business Media B.V., 2007.
  • 28. 28/89 Clustering of sense-based profiles User profiles Active user Active user Clusters of profiles Profiles in the cluster used as neighbors SenseSense--based profiles in a hybrid CBbased profiles in a hybrid CB--CFCF recommenderrecommender
  • 29. 29/89 Experimental Evaluation on EachMovieExperimental Evaluation on EachMovie datasetdataset 835 users selected from EachMovie dataset* 1,613 movies grouped into 10 categories, 180,356 ratings, user-item matrix 87% sparse Each user rated between 30 and 100 movies Discrete ratings between 0 and 5 Movie content crawled from the Internet Movie Database (IMDb) CF algorithm using Pearson’s correlation coefficient vs. CF algorithm integrating clusters of semantic user profiles *2,811,983 ratings entered by 72,916 users for 1628 different movies. As of October, 2004, HP/Compaq Research (formerly DEC Research) retired the EachMovie dataset. It is no longer available for download
  • 30. 30/89 SenseSense--based profiles improvebased profiles improve recommendationsrecommendations Rating scale: 0-5
  • 31. 31/89 Semantic Analysis: Ontologies in CBRSSemantic Analysis: Ontologies in CBRS Recommendation of onRecommendation of on--line academicline academic research papersresearch papers Research paper topic ontology based on theResearch paper topic ontology based on the computer science classification of the DMOZcomputer science classification of the DMOZ open directory projectopen directory project KK--NN classification used to associate classesNN classification used to associate classes to previously browsed papersto previously browsed papers Quickstep & FoxtrotQuickstep & Foxtrot [Middleton04][Middleton04] SEWePSEWeP (Semantic Enhancement(Semantic Enhancement for Web Personalization)for Web Personalization) [Eirinaki03][Eirinaki03] Manually built domainManually built domain--specific taxonomy ofspecific taxonomy of categories for the automated annotation ofcategories for the automated annotation of Web pagesWeb pages WordNetWordNet--based word similarity used to mapbased word similarity used to map keywords to categorieskeywords to categories Categories of interest discovered fromCategories of interest discovered from navigational history of the usernavigational history of the user DESCRIPTIONDESCRIPTIONSYSTEMSYSTEM [Lops10] P. Lops, M. de Gemmis, G. Semeraro. Content-based Recommender Systems: State of the Art and Trends. In: P. Kantor, F. Ricci, L. Rokach and B. Shapira (Eds.), Recommender Systems Handbook: A Complete Guide for Research Scientists & Practitioners, Chapter 3, pages 73-105, BERLIN: Springer, 2010.
  • 32. 32/89 Semantic Analysis: Ontologies in CBRSSemantic Analysis: Ontologies in CBRS OWL ontology for representing TV programs andOWL ontology for representing TV programs and user profilesuser profiles OWL representation allows reasoning on preferencesOWL representation allows reasoning on preferences and discovering new knowledgeand discovering new knowledge Spreading activation for matching items andSpreading activation for matching items and preferencespreferences RS for Interactive Digital TelevisionRS for Interactive Digital Television [Blanco[Blanco--Fernandez08]Fernandez08] OntologyOntology--based news recommenderbased news recommender 17 ontologies adapted from the IPTC ontology17 ontologies adapted from the IPTC ontology ((http://http://nets.ii.uam.es/neptuno/iptcnets.ii.uam.es/neptuno/iptc/)/) Items and user profiles represented as vectors in theItems and user profiles represented as vectors in the space of concepts defined by the ontologiesspace of concepts defined by the ontologies News@handNews@hand [Cantador08][Cantador08] Informed RecommenderInformed Recommender [Aciar07][Aciar07] Consumer product reviews to makeConsumer product reviews to make recommendationsrecommendations Ontology used to convert consumers’ opinions into aOntology used to convert consumers’ opinions into a structured formstructured form TextText--mining for mapping sentences in the reviewsmining for mapping sentences in the reviews into the ontology information structureinto the ontology information structure SearchSearch--based recommendationsbased recommendations DESCRIPTIONDESCRIPTIONSYSTEMSYSTEM
  • 33. 33/89 Semantic Analysis: WikipediaSemantic Analysis: Wikipedia Do we really need only ontologies? What about encyclopedic knowledge sources available on the Web? Is Wikipedia potentially useful for CBRS? How? It is free It covers many domains It is under constant development by the community It can be seen as a multilingual corpus Its accuracy rivals that of Encyclopaedia Britannica [Giles05] [Giles05] J. Giles. Internet Encyclopaedias Go Head to Head. Nature, 438:900–901, 2005.
  • 34. 34/89 EExplicitxplicit SSemanticemantic AAnalysis (ESA)nalysis (ESA) Technique able to provide a fine-grained semantic representation of natural language texts in a high-dimensional space of comprehensible concepts derived from Wikipedia [Gabri06] [Gabri06] E. Gabrilovich and S. Markovitch. Overcoming the Brittleness Bottleneck using Wikipedia: Enhancing Text Categorization with Encyclopedic Knowledge. In Proceedings of the 21th National Conf. on Artificial Intelligence and the 18th Innovative Applications of Artificial Intelligence Conference, pages 1301–1306. AAAI Press, 2006. Panthera World  War II World  War II Jane  Fonda Island Wikipedia viewed as an ontology =  a collection of ~1M concepts [Egozi09] O. Egozi. Concept-Based Information Retrieval using Explicit Semantic Analysis. M.Sc. Thesis, CS Dept., Technion, 2009.
  • 35. 35/89 Wikipedia is viewed as an ontology ‐ a collection of ~1M concepts Every Wikipedia article represents a concept Panthera Explicit Semantic Analysis (ESA)Explicit Semantic Analysis (ESA) Article words are associated with the concept (TF‐IDF) Cat [0.92] Leopard [0.84] Roar [0.77]
  • 36. 36/89 Wikipedia is viewed as an ontology ‐ a collection of ~1M concepts Every Wikipedia article represents a concept Panthera Explicit Semantic Analysis (ESA)Explicit Semantic Analysis (ESA) Article words are associated with the concept (TF‐IDF) Cat [0.92] Leopard [0.84] Roar [0.77]
  • 37. 37/89 Wikipedia is viewed as an ontology ‐ a collection of ~1M concepts Every Wikipedia article represents a concept Panthera Explicit Semantic Analysis (ESA)Explicit Semantic Analysis (ESA) Article words are associated with the concept (TF‐IDF) Cat [0.92] Leopard [0.84] Roar [0.77] The semantics of a word is the vector of  its associations with Wikipedia concepts Cat Panthera [0.92] Cat [0.95] Jane  Fonda [0.07]
  • 38. 38/89 Explicit Semantic Analysis (ESA)Explicit Semantic Analysis (ESA) The semantics of a text fragment is the average vector (centroid) of the semantics of its words button Dick  Button [0.84] Button [0.93] Game  Controller [0.32] Mouse  (computing) [0.81] mouse Mouse  (computing) [0.84] Mouse  (rodent) [0.91] John  Steinbeck [0.17] Mickey  Mouse  [0.81] mouse  button Drag‐ and‐drop [0.91] Mouse  (computing) [0.95] Mouse  (rodent) [0.56] Game  Controller [0.64] In practice – WSD… mouse  button
  • 39. 39/89 ESA: concept spaceESA: concept space D1 = 2C1 + 3C2 + 5C3 D2 = 3C1 + 7C2 + 1C3 ESA used for computing semantic relatedness [Gabri07] C3 C1 C2 D1 = 2C1+ 3C2 + 5C3 D2 = 3C1 + 7C2 + 1C3 7 32 5 3 1 Ci = Wikipedia article [Gabri07] E. Gabrilovich and S. Markovitch. Computing Semantic Relatedness Using Wikipedia-based Explicit Semantic Analysis. In Manuela M. Veloso, editor, Proceedings of the 20th International Joint Conference on Artificial Intelligence, pages 1606–1611, 2007.
  • 40. 40/89 Wikipedia and CBRS: recent ideasWikipedia and CBRS: recent ideas Wikipedia used for computing the similarity between movie descriptions for the Netflix prize competition [Lees08] ESA used for user profiling, spam detection and RSS filtering [Smirnov08] Wikipedia included in a Knowledge Infusion process for recommendation diversification [Semeraro09a] [Lees08] J. Lees-Miller, F. Anderson, B. Hoehn, and R. Greiner. Does Wikipedia Information Help Netflix Predictions? Proceedings of the Seventh International Conference on Machine Learning and Applications (ICMLA), pages 337–343. IEEE Computer Society, 2008. [Smirnov08] A. V. Smirnov and A. Krizhanovsky. Information Filtering based on Wiki Index Database. CoRR, abs/0804.2354, 2008. [Semeraro09a] G. Semeraro, P. Lops, P. Basile, and M. de Gemmis. Knowledge Infusion into Content-based Recommender Systems. In Proceedings of the 2009 ACM Conference on Recommender Systems, RecSys 2009, pages 301-304, New York, USA, October 22-25, 2009.
  • 41. 41/89 Putting Intelligence into CBRS:Putting Intelligence into CBRS: Challenges & Research DirectionsChallenges & Research Directions Semantic analysisSemantic analysis ofof content by means ofcontent by means of external knowledgeexternal knowledge sourcessources LanguageLanguage--independentindependent CBRSCBRS Beyond keywords:Beyond keywords: novel strategies for thenovel strategies for the representation ofrepresentation of items and profilesitems and profilesLimited ContentLimited Content AnalysisAnalysis Defeating homophily:Defeating homophily: recommendationrecommendation diversificationdiversification Taking advantage ofTaking advantage of Web 2.0 for collectingWeb 2.0 for collecting User Generated ContentUser Generated Content CHALLENGESCHALLENGES OverspecializationOverspecialization PROBLEMSPROBLEMS ““computational”computational” serendipityserendipity programming forprogramming for serendipityserendipity Knowledge InfusionKnowledge Infusion FolksonomyFolksonomy--based CBRSbased CBRS RESEARCHRESEARCH DIRECTIONSDIRECTIONS
  • 42. 42/89 MARS (MARS (MMultilultilAAnguagenguage RRecommenderecommender SSystem)ystem) crosscross--language user profileslanguage user profiles WSD for building language-independent user profiles MultiWordNet as sense repository Multilingual lexical database that supports English, Italian, Spanish, Portuguese, Hebrew, Romanian, Latin Alignment between synsets in the different languages – Semantic relations imported and preserved all of the inhabitants of the earth world, human race, humanity, humankind, human beings, humans, mankind, man LanguageLanguage SynsetSynset GlossGloss mondo, umanità, uomo, genere umano, terra insieme degli abitanti della terra, il complesso di tutti gli esseri umani
  • 43. 43/89 MARS (MARS (MMultilultilAAnguagenguage RRecommenderecommender SSystem)ystem) crosscross--language user profileslanguage user profiles CLOCKWORK ORANGE Being the adventures of a young man whose principal interests are rape, ultra-violence and Beethoven ARANCIA MECCANICA Le avventure di un giovane i cui principali interessi sono lo stupro, l’ultra-violenza e Beethoven “a12889641” “n5477412” “n3652872” “a2584413” “n3255687” “a3225896” “n32256325” “n225784” “n255632” “Beethoven” “n5477412” “a1744532” “a2584413” “n3652872” “a3225722” “n32256325” “n225784” “n255632” “Beethoven” ENGLISH description ITALIAN description Bag of Synset Bag of Synset
  • 44. 44/89 MARS (MARS (MMultilultilAAnguagenguage RRecommenderecommender SSystem)ystem) crosscross--language user profileslanguage user profiles Target User
  • 45. 45/89 MARS (MARS (MultilultilAnguagenguage RRecommenderecommender SSystem)ystem) preliminary resultspreliminary results MovieLens 100k ratings dataset 613 users with ≥ 20 ratings selected from 943 different users 520 movies and 40,717 ratings movie content crawled from Wikipedia (English and Italian) same movie - different descriptions in English and Italian Results in terms of Fß=0.5 measure no statistically significant difference wrt the baselines Neither content translations nor profile translations achieve the same effectiveness (they cannot avoid the negative impact of polysemy and lack of context) 63.9864.91 63.70 63.71 Recommendations Profiles
  • 46. 46/89 Putting Intelligence into CBRS:Putting Intelligence into CBRS: Challenges & Research DirectionsChallenges & Research Directions Semantic analysisSemantic analysis ofof content by means ofcontent by means of external knowledgeexternal knowledge sourcessources LanguageLanguage--independentindependent CBRSCBRS Beyond keywords:Beyond keywords: novel strategies for thenovel strategies for the representation ofrepresentation of items and profilesitems and profilesLimited ContentLimited Content AnalysisAnalysis Defeating homophily:Defeating homophily: recommendationrecommendation diversificationdiversification Taking advantage ofTaking advantage of Web 2.0 for collectingWeb 2.0 for collecting User Generated ContentUser Generated Content CHALLENGESCHALLENGES OverspecializationOverspecialization PROBLEMSPROBLEMS ““computational”computational” serendipityserendipity programming forprogramming for serendipityserendipity Knowledge InfusionKnowledge Infusion FolksonomyFolksonomy--based CBRSbased CBRS RESEARCHRESEARCH DIRECTIONSDIRECTIONS
  • 47. 47/89 Web 2.0 & UserWeb 2.0 & User--Generated Content (UGC)Generated Content (UGC) 47
  • 48. 48/89 Social Tagging & FolksonomiesSocial Tagging & Folksonomies Users annotate resources of interests with free keywords, called tags Social tagging activity builds a bottom-up classification schema, called a folksonomy Folksonomy: “Folks” + “Taxonomy” How to exploit folksonomies for advanced user profiling in CBRS? 48 Resources (Artworks) Tags Users (Visitors) … the cry, munch vangogh, girasoli van gogh,suflowers VanGogh favorite, the_scream da vinci, m onna lisa da vinci code, favorite …
  • 49. 49/89 Cultural Heritage fruition & e-learning applications of new Advanced (multimodal) Technologies In the context of cultural heritage personalization, does the  integration of UGC and textual description of artwork  collections cause an increase of the prediction accuracy in the  process of recommending artifacts to users?
  • 50. 50/89 FIRSt:FIRSt: FFolksonomyolksonomy--basedbased IItemtem RRecommender syecommender syStStemem Artwork representation Artist Title Description Tags Semantic Indexing Change of text representation from vectors of words (BOW) into vectors of WordNet synsets (BOS) From tags to semantic tags Supervised Learning Bayesian Classifier learned from artworks labeled with user ratings and tags
  • 51. 51/89 5‐point rating scale Textual description of  items (static content) Personal Tags FIRSt (FIRSt (FFolksonomyolksonomy--basedbased IItemtem RRecommender syecommender syStStem)em) Learning from Ratings & TagsLearning from Ratings & Tags 51 Social Tags (from other users): caravaggio, deposition, christ, cross, suffering, religion Social Tags passion
  • 52. 52/89 caravaggio, deposition, cross, christ, rome, … passion caravaggio, deposition, christ, cross, suffering, religion, … USER PROFILE FIRSt (FIRSt (FFolksonomyolksonomy--basedbased IItemtem RRecommender syecommender syStStem)em) Tags within User ProfilesTags within User Profiles Personal  Tags Static  Content Social Tags collaborative part of the user profile [de Gemmis08] M. de Gemmis, P. Lops, G. Semeraro, and P. Basile. Integrating Tags in a Semantic Content-based Recommender. In RecSys ’08, Proceed. of the 2nd ACM Conference on Recommender Systems, pages 163– 170, October 23-25, 2008, Lausanne, Switzerland, ACM, 2008.
  • 53. 53/89 Experimental EvaluationExperimental Evaluation Goal: Compare predictive accuracy of FIRSt when user profiles are learned from: Static content only, i.e., textual descriptions of artifacts (content-based profiles) both Static and Dynamic UGC (tag-based profiles). UGC can be: – Personal Tags, entered by a user for an artifact, i.e., the user’s contribution to the whole folksonomy – Social Tags, i.e., the whole folksonomy of tags added by all visitors 53
  • 54. 54/89 Experimental SetupExperimental Setup Dataset 45 paintings from the Vatican picture-gallery Static content (i.e., title, artist and description) captured using screenscraping bots Subjects 30 volunteers average age ≈ 25 none reported to be an art expert 54
  • 55. 55/89 Experimental DesignExperimental Design 5 experiments designed EXP#1: Static Content EXP#2: Personal Tags EXP#3: Social Tags EXP#4: Static Content + Personal Tags EXP#5: Static Content + Social Tags 5-fold cross validation Evaluation Metrics: Precision (Pr), Recall (Re), F1 measure One run for each user: 1. Select the appropriate content depending on the experiment 2. Split the selected data into a training set Tr and a test set Ts 3. Use Tr for learning the corresponding user profile 4. Evaluate the predictive accuracy of the induced profile on Ts 55
  • 56. 56/89 Analysis of PrecisionAnalysis of Precision Type of Content Precision* Recall* F1* EXP#1: Static Content 75.86 94.27 84.07 EXP#2: Personal Tags 75.96 92.65 83.48 EXP#3: Social Tags 75.59 90.50 82.37 EXP#4: Static Content + Personal Tags 78.04 93.60 85.11 EXP#5: Static Content + Social Tags 78.01 93.19 84.93 56 * Results averaged over the 30 study subjects Augmented Profiles Content-based Profiles Tag-based Profiles
  • 57. 57/89 Analysis of PrecisionAnalysis of Precision Type of Content Precision* Recall* F1* EXP#1: Static Content 75.86 94.27 84.07 EXP#2: Personal Tags 75.96 92.65 83.48 EXP#3: Social Tags 75.59 90.50 82.37 EXP#4: Static Content + Personal Tags 78.04 93.60 85.11 EXP#5: Static Content + Social Tags 78.01 93.19 84.93 57 * Results averaged over the 30 study subjects Augmented Profiles Content-based Profiles Tag-based Profiles Tag vs CB Precision not improved
  • 58. 58/89 Analysis of PrecisionAnalysis of Precision Type of Content Precision* Recall* F1* EXP#1: Static Content 75.86 94.27 84.07 EXP#2: Personal Tags 75.96 92.65 83.48 EXP#3: Social Tags 75.59 90.50 82.37 EXP#4: Static Content + Personal Tags 78.04 93.60 85.11 EXP#5: Static Content + Social Tags 78.01 93.19 84.93 58 * Results averaged over the 30 study subjects Augmented Profiles Content-based Profiles Tag-based Profiles Augmented vs CB Precision Improvement ≈ 2%
  • 59. 59/89 Analysis of RecallAnalysis of Recall Type of Content Precision* Recall* F1* EXP#1: Static Content 75.86 94.27 84.07 EXP#2: Personal Tags 75.96 92.65 83.48 EXP#3: Social Tags 75.59 90.50 82.37 EXP#4: Static Content + Personal Tags 78.04 93.60 85.11 EXP#5: Static Content + Social Tags 78.01 93.19 84.93 59 * Results averaged over the 30 study subjects Augmented Profiles Content-based Profiles Tag-based Profiles Tag vs CB Recall decrease 1.62% – 3.77%
  • 60. 60/89 Analysis of RecallAnalysis of Recall Type of Content Precision* Recall* F1* EXP#1: Static Content 75.86 94.27 84.07 EXP#2: Personal Tags 75.96 92.65 83.48 EXP#3: Social Tags 75.59 90.50 82.37 EXP#4: Static Content + Personal Tags 78.04 93.60 85.11 EXP#5: Static Content + Social Tags 78.01 93.19 84.93 60 * Results averaged over the 30 study subjects Augmented Profiles Content-based Profiles Tag-based Profiles Augmented vs CB Recall decrease: 0.67% – 1.08%
  • 61. 61/89 Analysis of F1Analysis of F1 Type of Content Precision* Recall* F1* EXP#1: Static Content 75.86 94.27 84.07 EXP#2: Personal Tags 75.96 92.65 83.48 EXP#3: Social Tags 75.59 90.50 82.37 EXP#4: Static Content + Personal Tags 78.04 93.60 85.11 EXP#5: Static Content + Social Tags 78.01 93.19 84.93 61 * Results averaged over the 30 study subjects Augmented Profiles Content-based Profiles Tag-based Profiles Overall accuracy F1 ≈ 85%
  • 62. 62/89 Putting Intelligence into CBRS:Putting Intelligence into CBRS: Challenges & Research DirectionsChallenges & Research Directions Semantic analysisSemantic analysis ofof content by means ofcontent by means of external knowledgeexternal knowledge sourcessources LanguageLanguage--independentindependent CBRSCBRS Beyond keywords:Beyond keywords: novel strategies for thenovel strategies for the representation ofrepresentation of items and profilesitems and profilesLimited ContentLimited Content AnalysisAnalysis Defeating homophily:Defeating homophily: recommendationrecommendation diversificationdiversification Taking advantage ofTaking advantage of Web 2.0 for collectingWeb 2.0 for collecting User Generated ContentUser Generated Content CHALLENGESCHALLENGES OverspecializationOverspecialization PROBLEMSPROBLEMS ““computationalcomputational”” serendipityserendipity programming forprogramming for serendipityserendipity Knowledge InfusionKnowledge Infusion FolksonomyFolksonomy--based CBRSbased CBRS RESEARCHRESEARCH DIRECTIONSDIRECTIONS
  • 63. 63/89 Serendipity: DefinitionsSerendipity: Definitions Serendipity Making discoveries, by accidents and sagacity, of things which one were not in quest of (Horace Walpole, 1754) The art of making an unsought finding (Pek van Andel, 1994) [vanAndel94] Serendipitous ideas and findings Gelignite by Alfred Nobel, when he accidentally mixed collodium (gun cotton) with nitroglycerin Penicillin by Alexander Fleming The psychedelic effects of LSD by Albert Hofmann Cellophane by Jacques Brandenberger The structure of benzene by Friedric August Kekulé [vanAndel94] van Andel, P. Anatomy of the Unsought Finding. Serendipity: Origin, History, Domains, Traditions, Appearances, Patterns and Programmability. The British Journal for the Philosophy of Science, 45(2): 631-648, 994.
  • 64. 64/89 The challengeThe challenge Serendipity in RSs is the experience of receiving an unexpected and fortuitous, but useful advice it is a way to diversify recommendations The challenge is programming for serendipity to find a manner to introduce serendipity into the recommendation process in an operational way
  • 65. 65/89 Strategies forStrategies for computationalcomputational serendipityserendipity [Toms00][Toms00] “Blind Luck”: random recommendations “Prepared Mind”: Pasteur principle (“chance favors the prepared mind”) - deep user modeling “Anomalies and Exceptions”: searching for dissimilarity [Iaquinta10] “Reasoning by Analogy” [Iaquinta10] L. Iaquinta, M. de Gemmis, P. Lops, G. Semeraro, P. Molino (2010). Can a Recommender System Induce Serendipitous Encounters? In: KYEONG KANG. E-Commerce, 229-246, VIENNA: IN-TECH, 2010. [Toms00] Toms, E. Serendipitous Information Retrieval. In Proceedings of the First DELOS Network of Excellence Workshop on Information Seeking, Searching and Querying in Digital Libraries, Zurich, Switzerland: European Research Consortium for Informatics and Mathematics, 2000.
  • 66. 66/89 Programming for Serendipity into CBRS:Programming for Serendipity into CBRS: “Anomalies and Exceptions”“Anomalies and Exceptions” Basic recommendation list defined by the best N items ranked according to the user profile Idea for inducing serendipity extending the basic list with items programmatically supposed to be serendipitous for the active user
  • 67. 67/89 ITem Recommender (ITR)ITem Recommender (ITR) Content-based recommender developed at Univ. of Bari [Semeraro07] learns a probabilistic model of the interests of the user from textual descriptions of items user profile = binary text classifier able to categorize items as interesting (LIKES) or not (DISLIKES) a-posteriori probabilities as classification scores for LIKES and DISLIKES [Semeraro07] G. Semeraro, M. Degemmis, P. Lops, and P. Basile. Combining Learning and Word Sense Disambiguation for Intelligent User Profiling. In M. M. Veloso, editor, IJCAI 2007, Proceedings of the 20th International Joint Conference on Artificial Intelligence, Hyderabad, India, January 6-12, 2007, pages 2856–2861, Morgan Kaufmann, 2007.
  • 68. 68/89 Recommendation process: Ranked list approachRecommendation process: Ranked list approach Profile Learner DISLIKESLIKES USER PROFILE future violence alien … blood … 0.89 0.74 0.22 P(LIKES | ALF) …
  • 69. 69/89 Programming for Serendipity into ITR: strategyProgramming for Serendipity into ITR: strategy Potentially serendipitous items selected on the ground of categorization scores for LIKES and DISLIKES difference of classification scores tends to zero uncertain classification | P(LIKES | ITEM) – P(DISLIKES | ITEM) | ≈ 0 assumption: uncertain classification ≡ items not known by the user
  • 70. 70/89 Programming for Serendipity into ITR: exampleProgramming for Serendipity into ITR: example Basic recommendation list = N most interesting items Ranked list of “unpredictable” items obtained from ITR Basic recommendation list augmented with some serendipitous items DISLIKESLIKES USER PROFILE future violence alien blood 0.760.89 0.72 P(LIKES | ITEM) … … 0.01 0.02 | P(LIKES | ITEM) – P(DISLIKES | ITEM) |
  • 71. 71/89 What about evaluation?What about evaluation? Classic evaluation metrics (Precision, Recall, F, MAE,…) don’t take into account obviousness, novelty and serendipity Accurate recommendation ≠ Useful recommendation emotional response associated with serendipity difficult to capture by conventional accuracy metrics serendipity degree impossible to evaluate without considering user feedback Novel metrics required planned as a future work
  • 72. 72/89 Programming for Serendipity:Programming for Serendipity: crosscross--domain recommendationsdomain recommendations
  • 73. 73/89 ““Reasoning by AnalogyReasoning by Analogy””: a serendipity strategy for: a serendipity strategy for crosscross--domain recommendationsdomain recommendations ONTOLOGY user profile for Movies “parallel” user profile for Travels
  • 74. 74/89 Ongoing work: DEVIUSOngoing work: DEVIUS Analogy engine for computing “parallel” user profiles Spreading activation on DBpedia for mapping between domains Open source code of DEVIUS available in September Experimental evaluation books / movies
  • 75. 75/89 Putting Intelligence into CBRS:Putting Intelligence into CBRS: Challenges & Research DirectionsChallenges & Research Directions Semantic analysisSemantic analysis ofof content by means ofcontent by means of external knowledgeexternal knowledge sourcessources LanguageLanguage--independentindependent CBRSCBRS Beyond keywords:Beyond keywords: novel strategies for thenovel strategies for the representation ofrepresentation of items and profilesitems and profilesLimited ContentLimited Content AnalysisAnalysis Defeating homophily:Defeating homophily: recommendationrecommendation diversificationdiversification Taking advantage ofTaking advantage of Web 2.0 for collectingWeb 2.0 for collecting User Generated ContentUser Generated Content CHALLENGESCHALLENGES OverspecializationOverspecialization PROBLEMSPROBLEMS ““computational”computational” serendipityserendipity programming forprogramming for serendipityserendipity Knowledge InfusionKnowledge Infusion FolksonomyFolksonomy--based CBRSbased CBRS RESEARCHRESEARCH DIRECTIONSDIRECTIONS
  • 76. 76/89 Knowledge Infusion (KI)Knowledge Infusion (KI) Humans typically have the linguistic and cultural experience to comprehend the meaning of a text How to realize this capability into machines? In NLP tasks, computers require access to vast amounts of common-sense and domain-specific world knowledge Infusing lexical knowledge Dictionaries (e.g. WordNet) Infusing cultural knowledge Wikipedia …
  • 77. 77/89 Enhancing CBRS by KIEnhancing CBRS by KI Modeling the unstructured information stored in several (open) knowledge sources Exploiting the acquired knowledge in order to better understand the item descriptions and extract more meaningful features Inspired by a language game: The Guillotine [Semeraro09b] Cultural and Linguistic Background Knowledge [Semeraro09b] G. Semeraro, P. Lops, P. Basile, and M. de Gemmis. On the Tip of my Thought: Playing the Guillotine Game. In Proceedings of the 21st International Joint Conference on Artificial Intelligence (IJCAI 2009), 1543-1548, Morgan Kaufmann, 2009.
  • 78. 78/89 The Guillotine: the gameThe Guillotine: the game [Lops09] P. Lops, P. Basile, M. de Gemmis and G. Semeraro. "Language Is the Skin of My Thought": Integrating Wikipedia and AI to Support a Guillotine Player. In: R. Serra, R. Cucchiara (Eds.), AI*IA 2009: Emergent Perspectives in Artificial Intelligence, XIth International Conference of the Italian Association for Artificial Intelligence, Reggio Emilia, Italy, December 9-12, 2009. LNCS 5883, 324-333, Springer 2009.
  • 79. 79/89 Let’s try to play the gameLet’s try to play the game APPLE JUDGMENT SUNRISE “An apple a day takes the doctor away” Day of Judgment Beginning of the day INDEPENDENCE Independence day SLEEPER Daysleeper, a famous song by R.E.M.
  • 81. 81/89 What does OTTHO know aboutWhat does OTTHO know about ‘‘starsstars’’?? STAR KNOWLEDGE DICTIONARY MATRIX 0.55 LIGHT STAR Lemmas … 1.45 SKY … … TAG MATRIX 0.27 ALIEN STAR … 1.41 SPACE … … Tagsinitems’ tagcloud SKY 1.45 LIGHT 0.55 … SPACE 1.41 ALIEN 0.27 … Lemma: Definitions | Compound Forms Star: any one of the distant bodies appearing as a point of light in the sky at night | Fixed star, i.e. one which is not a planet “STAR, SPACE, ALIEN”
  • 82. 82/89 KI@workKI@work for recommendation diversificationfor recommendation diversification STAR ROBOT ALIEN WAR BATTLE SPACE 0.36 FUTURE 0.10 EXTRATERRESTRIAL 0.08 CYBORG 0.07 FIGHT 0.02 JUSTICE 0.01 … Plot Keywords KI-LIST Search Results
  • 83. 83/89 Concluding RemarksConcluding Remarks Research directions for overcoming some CBRS drawbacks main strategies adopted to introduce some semantics in the recommendation process main strategies for diversifying recommendations Research agenda: glean meaning and user thought from the precious boxes (brain, Web, social networks,…) they are hidden into: fMRI & Eye/Head-tracking technologies for a new generation of evaluation metrics Linked Open Data: interlinking user profiles with Semantic Web data and LOD Semantic Cross-system Personalization: semantic matching of user profiles coming from heterogeneous systems
  • 84. 84/89 Thanks…Thanks… …for your attention… …Questions? Semantic Web  Access and  Personalization  research group http://www.di.uniba.it/~swap Pierpaolo Basile Marco de Gemmis Leo Iaquinta Piero Molino Fedelucio Narducci Eufemia Tinelli Annalina Caputo Michele Filannino Pasquale Lops Cataldo Musto Giovanni Semeraro
  • 85. 85/89 + The librarian + “A Logic Named Joe” - Gaetano Bassolino & Emanuele Vizzini + Arundhati Roy + Milena Jole Gabanelli CreditsCredits + Tullio De Mauro + Ivonne Bordelois + Umberto Eco + Stefano Bartezzaghi “Accavallavacca”
  • 86. 86/89 References 1/4References 1/4 [Aciar07] S. Aciar, D. Zhang, S. Simoff, and J. Debenham. Informed Recommender: Basing Recommendations on Consumer Product Reviews. IEEE Intelligent Systems, 22(3):39–47, 2007. [Basile07] P. Basile, M. Degemmis, A. Gentile, P. Lops, and G. Semeraro. UNIBA: JIGSAW algorithm for Word Sense Disambiguation. In Proc.4th ACL 2007 International Workshop on Semantic Evaluations (SemEval-2007), Prague, Czech Republic, 398–401, Association for Computational Linguistics, June 23-24, 2007. [BlancoFernandez08] Y. Blanco-Fernandez, A. Gil-Solla, J. J. Pazos-Arias, M. Ramos-Cabrer, and M. Lopez-Nores. Providing Entertainment by Content-based Filtering and Semantic Reasoning in Intelligent Recommender Systems. IEEE Trans. on Consumer Electronics, 54(2):727–735, 2008. [Cantador08] I. Cantador, A. Bellog´ın, and P. Castells. News@hand: A Semantic Web Approach to Recommending News. In Wolfgang Nejdl, Judy Kay, Pearl Pu, and Eelco Herderm (Eds.), Adaptive Hypermedia and Adaptive Web-Based Systems, LNCS 5149, pages 279–283, Springer, 2008. [Degemmis07] M. Degemmis, P. Lops, and G. Semeraro. A Content-collaborative Recommender that Exploits WordNet-based User Profiles for Neighborhood Formation. User Modeling and User- Adapted Interaction: The Journal of Personalization Research (UMUAI), 17(3):217–255, Springer Science + Business Media B.V., 2007. [de Gemmis08] M. de Gemmis, P. Lops, G. Semeraro, and P. Basile. Integrating Tags in a Semantic Content-based Recommender. In RecSys ’08, Proc. of the 2nd ACM Conference on Recommender Systems, pages 163–170, October 23-25, 2008, Lausanne, Switzerland, ACM, 2008. [Eco07] U. Eco, Sator arepo eccetera. Bompiani, 2007 (in Italian).
  • 87. 87/89 References 2/4References 2/4 [Egozi09] O. Egozi. Concept-Based Information Retrieval using Explicit Semantic Analysis. M.Sc. Thesis, CS Department, Technion, 2009. [Eirinaki03] LM. Eirinaki, M. Vazirgiannis, and I. Varlamis. SEWeP: Using Site Semantics and a Taxonomy to Enhance the Web Personalization Process. In Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 99–108, ACM, 2003. [Gabri06] E. Gabrilovich and S. Markovitch. Overcoming the Brittleness Bottleneck using Wikipedia: Enhancing Text Categorization with Encyclopedic Knowledge. In Proceed. of the 21th National Conf. on Artificial Intelligence and the 18th Innovative Applications of Artificial Intelligence Conf., pages 1301–1306, AAAI Press, 2006. [Gabri07] E. Gabrilovich and S. Markovitch. Computing Semantic Relatedness Using Wikipedia-based Explicit Semantic Analysis. In Manuela M. Veloso, editor, Proceedings of the 20th International Joint Conference on Artificial Intelligence, pages 1606–1611, 2007. [Giles05] J. Giles. Internet Encyclopaedias Go Head to Head. Nature, 438:900–901, 2005. [Herlocker04] Herlocker, J.L., Konstan, J.A., Terveen, L.G., and Riedl, J.T. Evaluating Collaborative Filtering Recommender Systems. ACM Transactions on Information Systems, 22(1): 39-49, 2004. [Iaquinta10] L. Iaquinta, M. de Gemmis, P. Lops, G. Semeraro, P. Molino (2010). Can a Recommender System Induce Serendipitous Encounters? In: KYEONG KANG. E-Commerce, 229-246, VIENNA: IN-TECH, 2010. [Lees08] J. Lees-Miller, F. Anderson, B. Hoehn, and R. Greiner. Does Wikipedia Information Help Netflix Predictions? Proceedings of the Seventh International Conference on Machine Learning and Applications (ICMLA), pages 337–343, IEEE Computer Society, 2008.
  • 88. 88/89 References 3/4References 3/4 [Lops09] P. Lops, P. Basile, M. de Gemmis and G. Semeraro. "Language Is the Skin of My Thought": Integrating Wikipedia and AI to Support a Guillotine Player. In: R. Serra, R. Cucchiara (Eds.), AI*IA 2009: Emergent Perspectives in Artificial Intelligence, XIth International Conference of the Italian Association for Artificial Intelligence, Reggio Emilia, Italy, December 9-12, 2009. LNCS 5883, 324-333, Springer 2009. [Lops10] P. Lops, M. de Gemmis, G. Semeraro. Content-based Recommender Systems: State of the Art and Trends. In: P. Kantor, F. Ricci, L. Rokach and B. Shapira, editors, Recommender Systems Handbook: A Complete Guide for Research Scientists & Practitioners, Chapter 3, pages 73-105, BERLIN: Springer, 2010. [McNee06] S.M. McNee, J. Riedl, and J. Konstan. Accurate is not always good: How accuracy metrics have hurt recommender systems. In Extended Abstracts of the 2006 ACM Conference on Human Factors in Computing Systems, pages 1-5, Canada, 2006. [Middleton04] S. E. Middleton, N. R. Shadbolt, and D. C. De Roure. Ontological User Profiling in Recommender Systems. ACM Transactions on Information Systems, 22(1):54–88, 2004. [Pazzani07] Pazzani, M. J., & Billsus, D. Content-Based Recommendation Systems. The Adaptive Web. Lecture Notes in Computer Science vol. 4321, 325-341, 2007. [Pedersen04] Pedersen, Ted and Patwardhan, Siddharth, and Michelizzi, Jason. WordNet::Similarity - Measuring the Relatedness of Concepts. In Proceedings of the Nineteenth National Conference on Artificial Intelligence (AAAI-2004), pp. 1024-1025, San Jose, CA, July, 2004. [Semeraro07] G. Semeraro, M. Degemmis, P. Lops, and P. Basile. Combining Learning and Word Sense Disambiguation for Intelligent User Profiling. In M. M. Veloso, editor, IJCAI 2007, Proceedings of the 20th International Joint Conference on Artificial Intelligence, Hyderabad, India, January 6-12, 2007, pages 2856–2861, Morgan Kaufmann, 2007.
  • 89. 89/89 References 4/4References 4/4 [Semeraro09a] G. Semeraro, P. Lops, P. Basile, and M. de Gemmis. Knowledge Infusion into Content-based Recommender Systems. In Proceedings of the 2009 ACM Conf. on Recommender Systems, RecSys 2009, pages 301-304, New York, USA, October 22-25, 2009. [Semeraro09b] G. Semeraro, P. Lops, P. Basile, and M. de Gemmis. On the Tip of my Thought: Playing the Guillotine Game. In Proceedings of the 21st International Joint Conference on Artificial Intelligence (IJCAI 2009), 1543-1548, Morgan Kaufmann, 2009. [Smirnov08] A. V. Smirnov and A. Krizhanovsky. Information Filtering based on Wiki Index Database. CoRR, abs/0804.2354, 2008. [Toms00] Toms, E. Serendipitous Information Retrieval. In Proceedings of the First DELOS Network of Excellence Workshop on Information Seeking, Searching and Querying in Digital Libraries, Zurich, Switzerland: European Research Consortium for Informatics and Mathematics, 2000. [vanAndel94] van Andel, P. Anatomy of the Unsought Finding. Serendipity: Origin, History, Domains, Traditions, Appearances, Patterns and Programmability. The British Journal for the Philosophy of Science, 45(2), pp. 631-648, 1994. [Zuckerman08] E. Zuckerman. Homophily, serendipity, xenophilia. April 25, 2008. www.ethanzuckerman.com/blog/2008/04/25/homophily-serendipity-xenophilia/