The Mechanical Librarian
       Recommending Journal Articles
        in a Scientific Digital Library
                            Andre Vellino
                        andre.vellino@cnrc.ca

Group Leader, CISTI Research
Canada Institute for Scientific and Technical Information
(Institut canadien de l'information scientifique et technique, ICIST)
Outline of Talk



             • The Mechanical Librarian
             • How Recommenders Work
             • Recommenders in Digital Libraries
             • Problems for Science Article Recommenders and
               Strategies for CISTI’s Recommender Research
             • Synthese on CISTI Lab
             • Alternative Approaches
             • Future Work



Acknowledgements to: Glen Newton, Jeff Demaine and Greg Kresko &
Students: Dave Zeber, Matthew Rutledge-Taylor and Aurel Constantinescu
The Human (Reference)
      Librarian


                  Experience
World Knowledge
                         Vocabularies




                          Databases


                  Authoritative
                  Trustworthy
                  References
The Mechanical Librarian




The Web, they say, is leaving the era of search and entering
one of discovery. What's the difference? Search is what you
do when you're looking for something. Discovery is when
something wonderful that you didn't know existed, or didn't
know how to ask for, finds you.
                         Jeffrey M. O'Brien, Fortune Magazine
Knowledge Discovery
                                    Technologies


• Text Mining
   – Enhances the researcher’s ability to
     discover new and meaningful information
     from existing text repositories
• Network Analysis
   – Distills the structural relationships among
     bibliographic elements to reveal trends
     and patterns in science
• User Behaviour
   – Infers “wisdom of the crowds” from
     usage statistics
What is a “Recommender”?



• A recommender is a software system which attempts to predict
  items that a user may be interested in, given information about
    – the user's interests
    – the content in the items
    – the usage patterns of other users
• Items may be:
    – Merchandise: movies, music, books
    – Text: news, blogs, web pages, and, why not,


          Scientific Journal Articles
Amazon Recommender
System
User
       Control
Category Filter



   Personalized



   User Ratings


   Explanations
Companies That Offer
         Recommenders to Users


Movies     Web Sites




Books      Music


Companies That Sell
        Recommender Services


Product Merchandise Placement


Database Mining


Advertising / Product Placement


Software as a Service Platform
Recommendation is Hard
                                  Netflix Prize: $1M


• Netflix Prize
   – To develop a recommender that improves quality of
      recommendations by 10% over Netflix’s own system
   – http://www.netflixprize.com/
• Current Leader Board
   – BellKor (9.6%)
   – … + 39 others
• NY Times Magazine Article
   http://www.nytimes.com/2008/11/23/magazine/23Netflix-t.html



Good Recommendations
are REALLY Hard




Taxonomy of
                                                       Recommender Systems

Collaborative Filtering
• Usage based, with item-ratings
     – User-Based (“similar users”)
     – Item-Based (“like items”)
• Algorithms
     – Memory-based
     – Model-based
Content-Based Filtering
• Content (text / waveform / pixel) analysis to
     – Find “similar users”
     – Find “similar items”


 J. Breese, D. Heckerman, C. Kadie, et al. Empirical Analysis of Predictive Algorithms for Collaborative
 Filtering. Proceedings of the Fourteenth Conference on Uncertainty in Artificial Intelligence, 461, 1998.
How Collaborative
                                              Filtering (CF) Works


• User-Based CF
   – Given user A find all the other users {U} that have the most
     “similar” item-rating patterns
   – For each item I not yet rated by A, predict the likely rating A will
     assign to I given the ratings for I given by {U}
   – Present the Top-N ordered list of items {I} to the user
• Item-Based CF
   – Given user A and the set of items {I} to which A has given
     ratings, find all the other items {O} that are “similar” to {I}
   – Present the Top-N ordered list of items {O} to the user
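A minimal sketch of the item-based variant, assuming cosine similarity between item rating vectors and a score that sums similarity weighted by the user's own ratings. The data, the function names (`item_vectors`, `cosine`, `similar_items`) and the scoring rule are illustrative assumptions, not the method of Sarwar et al.

```python
from math import sqrt


def item_vectors(ratings):
    """Turn {user: {item: rating}} into {item: {user: rating}}."""
    items = {}
    for user, profile in ratings.items():
        for item, rating in profile.items():
            items.setdefault(item, {})[user] = rating
    return items


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[u] * b[u] for u in common)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)


def similar_items(ratings, user, n=2):
    """Rank items the user has not rated by similarity to the items they rated."""
    items = item_vectors(ratings)
    rated = ratings[user]
    scores = {}
    for candidate, vector in items.items():
        if candidate in rated:
            continue
        scores[candidate] = sum(
            cosine(vector, items[i]) * rated[i] for i in rated
        )
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda i: (-scores[i], i))[:n]


# Hypothetical ratings, invented for illustration only.
ratings = {
    "u1": {"A": 5, "B": 5},
    "u2": {"A": 4, "B": 5, "C": 1},
    "u3": {"C": 5, "D": 4},
    "me": {"A": 5},
}
top = similar_items(ratings, "me")
```

Here item B is rated highly by the same users who liked A, so it ranks ahead of C and D.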

Sarwar, Badrul M., George Karypis, Joseph A. Konstan, and John Riedl. “Item-based
collaborative filtering recommendation algorithms.” World Wide Web. 2001, 285-295.
Find “Nearest Neighbour”
                                   and Predict Rating


• Find Nearest Neighbours (e.g. cosine similarity)




• Predict Rating (item i for user u)
   – Weighted average of the ratings given to item i by the N users most similar to u
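One common way to write these two steps, assuming plain cosine similarity over co-rated items and a similarity-weighted average (the slide's own formulas may differ). Here r_{u,i} is the rating user u gave item i, I_{uv} is the set of items rated by both u and v, and N(u) is the set of nearest neighbours of u who rated item i; all of these symbols are introduced for illustration.

```latex
\mathrm{sim}(u,v) =
  \frac{\sum_{j \in I_{uv}} r_{u,j}\, r_{v,j}}
       {\sqrt{\sum_{j \in I_{uv}} r_{u,j}^{2}}\;\sqrt{\sum_{j \in I_{uv}} r_{v,j}^{2}}}
\qquad
\hat{r}_{u,i} =
  \frac{\sum_{v \in N(u)} \mathrm{sim}(u,v)\, r_{v,i}}
       {\sum_{v \in N(u)} \lvert \mathrm{sim}(u,v) \rvert}
```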




User-Based
                                        Collaborative Filtering

Users \ Movies   Milk   Doubt   Dark Knight   Bolt   Reader
Alice             5      4          3          5       2
Bob               -      1          5          -       5
Carol             4      -          3          4       -
Ted               4      4          -          ?       -

(“-” = not rated; the predicted value for “?” is 5)


 • Goal: predict the rating Ted will give to the movie “Bolt”
 • Step 1 – eliminate the user-profiles of users who didn’t rate “Bolt”
 • Step 2 – find Ted’s “K-nearest neighbours” who rated “Bolt” and at
   least 2 other movies that Ted also rated (Alice)
 • R(Ted,Bolt) ~= 5.
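The steps above can be sketched in a few lines of Python. This is a minimal illustration, assuming cosine similarity over co-rated movies, K = 1, and a minimum overlap of two co-rated movies; the function names and parameters are invented for the example.

```python
from math import sqrt

# Ratings table from the slide; missing entries mean "not rated".
ratings = {
    "Alice": {"Milk": 5, "Doubt": 4, "Dark Knight": 3, "Bolt": 5, "Reader": 2},
    "Bob": {"Doubt": 1, "Dark Knight": 5, "Reader": 5},
    "Carol": {"Milk": 4, "Dark Knight": 3, "Bolt": 4},
    "Ted": {"Milk": 4, "Doubt": 4},
}


def cosine(a, b):
    """Cosine similarity computed over the movies both users rated."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[m] * b[m] for m in common)
    norm_a = sqrt(sum(a[m] ** 2 for m in common))
    norm_b = sqrt(sum(b[m] ** 2 for m in common))
    return dot / (norm_a * norm_b)


def predict(ratings, user, item, k=1, min_overlap=2):
    """Predict a rating from the k most similar users who rated the item."""
    target = ratings[user]
    candidates = []
    for other, profile in ratings.items():
        # Step 1: skip the target user and anyone who did not rate the item.
        if other == user or item not in profile:
            continue
        # Step 2: require enough co-rated movies to compare the two users.
        overlap = (set(profile) & set(target)) - {item}
        if len(overlap) < min_overlap:
            continue
        candidates.append((cosine(target, profile), other))
    neighbours = sorted(candidates, reverse=True)[:k]
    if not neighbours:
        return None, []
    total = sum(sim for sim, _ in neighbours)
    estimate = sum(sim * ratings[name][item] for sim, name in neighbours) / total
    return estimate, [name for _, name in neighbours]


prediction, used = predict(ratings, "Ted", "Bolt")
```

Bob is dropped in step 1 (no rating for “Bolt”) and Carol in step 2 (only one movie in common with Ted), so Alice is the only neighbour and her rating of 5 becomes the prediction.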
Things that can go wrong
                                           with Collaborative Filtering


• False “product ratings” to artificially boost ranking (spamming)
• Losing the diversity in the “Long Tail” – converges to “Top N”.




Fleder, D. and K. Hosanagar. 2008. Blockbuster culture's next rise or fall: The effect of
recommender systems on sales diversity. NET Institute Working Paper 07-10.
Content-Based
                                  Recommenders


“These things are similar (in content) to that”.
• Depends only on a measure of similarity between the content in
   the items (text, music, images)
• Typical Steps for Content Based Recommenders
    1. Cluster the user’s purchased or highly-rated items by
         content-similarity
    2. Find other similar items not purchased or rated by the user
    3. Recommend the “Top N” to the user
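A minimal sketch of these steps, assuming bag-of-words vectors over item text and cosine similarity. For brevity it replaces the clustering in step 1 with a single summed profile of the user's items, which is a simplification; the titles and function names are invented for illustration.

```python
from collections import Counter
from math import sqrt


def vectorize(text):
    """Bag-of-words term counts for a piece of text."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def recommend(liked, catalogue, n=2):
    """Return up to n catalogue items most similar to the user's liked items."""
    # Step 1 (simplified): one profile vector instead of several clusters.
    profile = Counter()
    for text in liked:
        profile.update(vectorize(text))
    # Step 2: score every item the user has not already selected.
    scored = [(cosine(profile, vectorize(t)), t) for t in catalogue if t not in liked]
    # Step 3: keep the Top N with a non-zero similarity.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [title for score, title in scored[:n] if score > 0]


# Hypothetical titles, invented for illustration only.
liked = ["protein folding simulation", "molecular dynamics and protein structure"]
catalogue = [
    "protein structure prediction",
    "citation analysis for physics journals",
    "molecular simulation methods",
    "library history",
]
suggestions = recommend(liked, catalogue)
```

Raw word overlap is a crude similarity measure; a real system would at least remove stop words and weight terms, for example with TF-IDF.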




Search Engine as
                          “Content-Based
                          Recommender”

Collaborative filtering
“Similar Pages” is a
Content-Based
Recommender
What can go wrong with
                               Content Based
                               Recommenders
                               that use only Metadata



• Bad Men Do What Good Men Dream: A Forensic Psychiatrist
  Illuminates the Darker Side of Human Behavior
• Do Animals Dream?: Children's Questions about Animals Most
  Often asked of the Natural History Museum
• All I Do is Dream of You
• The other end of the leash : why we do
  what we do around dogs
• Why do Catholics do that : a guide to the teachings and practices
  of the Catholic Church
• Electric universe : the shocking true story of electricity
• The Island of Sheep
Value of Recommenders
                                         in a Digital Library

• For the Researcher
   – Provide serendipity in a Browse / Search / Retrieve portal
       • Broaden scope of search to cognate but otherwise disparate domains
• For the Library
   – Increase customer loyalty by creating dynamic, adaptive,
     customized services
       • Alerts & notifications based on usage and collaborative filtering rather
         than stored queries
• For Authors
   – Given a draft article (with citations), find additional citations
• For Publishers & Journal Reviewers
   – Given a submitted article, recommend peer reviewers

Recommender Systems in
                                       Digital Libraries

– TechLens (University of Minnesota) (2002)
    • Uses ACM DL, full text Mixed Hybrid
– BibTip (University of Karlsruhe) (2003)
    • Uses OPAC (Library Catalog) usage data for collaborative filtering
– IngentaConnect (2007)
    • Uses Baynote (SaaS) customer tracking
– DSpace (2008)
    • Content-based recommender based on user-bookmarks
– CiteULike (academic experiment 2008)
    • Collaborative filtering on user bookmarks from CiteULike
– “bX” system from Ex Libris (2009)
    • Uses SFX resolver logs
– NextBio (to be announced in March 2009)
    • Life sciences search engine that uses collaborative filtering + ontologies
      to suggest new content (trials / abstracts / data)

TechLens




“bX”
                             Recommender (Jan ’09)


Features
   • Uses log data from SFX resolvers
   • Applies Collaborative Filtering
   • Uses lots of aggregated data
   • Developed w/ the Los Alamos National Laboratory.
Possible issues
   • Infers identity of users only through IP address
   • May not be accurate when http proxies are used
   • Same IP address can have several “IR objectives”
   • Identical resolved objects may not be recognized

Typical Problems with CF
                                      Recommenders in General

• Data Sparsity
    – Ratio of Users / Items is low (~ 1:10)
    – Number of Ratings per User is low
    – Ratings matrix sparsity ~ 95%
• Cold Start Problem
    – First-time users get poor or no recommendations because CF matrix
      has no entries
• Rating Items
    – CF recommender must be trained (explicitly or implicitly) by providing
      ratings to items
• Principle of Induction
    – People who exhibited similar behaviour in the past will tend to exhibit
      similar behaviour in the future.
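To make the sparsity figure concrete, here is a small worked calculation. The counts are hypothetical, chosen only to match the 1:10 user-to-item ratio and the roughly 95% sparsity quoted above.

```python
def sparsity(num_users, num_items, num_ratings):
    """Fraction of cells in the users x items ratings matrix that are empty."""
    return 1.0 - num_ratings / (num_users * num_items)


# Hypothetical collection: 1,000 users, 10,000 items (ratio 1:10),
# and 500 ratings per user on average, i.e. 500,000 ratings in total.
value = sparsity(1_000, 10_000, 500_000)
print(f"{value:.0%} of the matrix is empty")
```

Only 5% of the ten million possible user-item pairs carry a rating in this example, so most pairs of users share few or no rated items.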

Specific Problems for
                                        Collaborative Filtering in
                                        Science Digital Libraries
• Data Sparsity
    – Many More Articles & Far Fewer Users (10x)
    – Fewer Ratings per Item (~ 99% sparsity)
• Rating Articles
    – Explicit ratings are more difficult to obtain
        • DL users have less need to “express themselves” by explicitly rating
          items than movie watchers
    – Implicit ratings depend on UI features of DL
        • No reliable method for inferring ratings from browsing and query
          behaviour
• Principle of Induction (that past is a good predictor of the future) not
  necessarily true in digital libraries
    – Interest drift
    – Context shifts


Recommender Research
                                               Strategy @ CISTI


• Follow in footsteps of TechLens+
    – Collaborative Filtering (CF) among users
    – Seed CF recommender with citation matrix
    – Extended with
         • PageRank on Citations
         • User Contexts
    – Future Extensions
         • Add Content-Based Filtering (“Fusion Mixed Hybrid” model)
         • Distributed Multi-Dimensional Recommender
         • Explanation-based interface

A. Vellino and D. Zeber. (2007) “A Hybrid, Multi-dimensional Recommender for Journal
Articles in a Scientific Digital Library.” Conference Proceedings on Web Intelligence and
Intelligent Agent Technology
Making a Reference   Rating




Recommender Citation
                                    Seeding


TechLens approach to Cold Start / Data Sparsity problem




   • Articles either cite or don’t cite other articles
   • Some articles that are cited are not in collection
   • Users’ “article collection profile” citations        33
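A minimal sketch of citation seeding, assuming each article is treated as a pseudo-user that gives a rating of 1 to every article it cites, with real users' article collections added as further rows. The row-key prefixes and the function name `seed_ratings` are illustrative assumptions, not TechLens code.

```python
def seed_ratings(citations, collection, user_baskets):
    """Build a Boolean ratings matrix seeded with citation data.

    citations:    {article_id: [cited_article_id, ...]}
    collection:   set of article ids held in the digital library
    user_baskets: {user_id: [article_id, ...]}
    """
    ratings = {}
    for article, cited in citations.items():
        # Cited articles that are not in the collection cannot be recommended.
        row = {c: 1 for c in cited if c in collection}
        if row:
            ratings["article:" + article] = row
    for user, basket in user_baskets.items():
        # A user's collected articles play the same role as a citation list.
        ratings["user:" + user] = {a: 1 for a in basket if a in collection}
    return ratings


# Hypothetical identifiers, invented for illustration only.
citations = {"p1": ["p2", "x9"], "p2": ["p3"], "p3": []}
collection = {"p1", "p2", "p3"}
seeded = seed_ratings(citations, collection, {"u1": ["p1", "p3"]})
```

Because every article with references contributes a row, the matrix is populated before any real user has interacted with the system.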
Synthese Recommender
on CISTI Lab

Query Index

Add Important Articles to
“Basket” (1)

Add Important Articles to
“Basket” (2)

Add Important Articles to
“Basket” (3)

Add Important Articles to
“Basket” (4)

Query Again

Add More Articles to
“Basket” (1)

Add More Articles to
“Basket” (2)

Recommend Based on
Current “Basket”

View Recommendations

Evaluate Recommender

Search and Basket
History

Multiple Profiles
Synthese Performance


[Figure: bar chart, “Ratings of Recommendations”; x-axis: Ratings (1 to 5); y-axis: Percentage (0 to 35)]
Recommender Citation
                              Seeding


Can we improve on 0 / 1 (Boolean) citation seeding?




Apply PageRank to
                                               Citation Matrix

PageRank algorithm applied to citations




Aurel Constantinescu. “Ranking Full-Text Articles using Citation Based Methods”.
Master’s Thesis, University of Ottawa
PageRank-weighted
                                              Citation matrix

[Figure: ratings matrix with article rows p1 to p4 (citations) and user rows u1 and u2, against article columns p1 to p8. Cells hold either a weight (values from 0.2 to 0.7) or a marker meaning “constant”.]

• Apply PageRank on Citations
    – Use citation data (as in TechLens+)
    – Apply PageRank to weight the citation-based “ratings”
• Done before but only at the Journal level (http://www.eigenfactor.org/)
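A minimal sketch of the idea, assuming a standard power-iteration PageRank with damping 0.85 over the article citation graph. The weighting step (dividing each cited article's rank by the highest rank) is an illustrative assumption; the scheme actually used in the research may differ.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over {source: [target, ...]} citation links."""
    nodes = set(links) | {t for targets in links.values() for t in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by articles with no outgoing citations is spread evenly.
        dangling = sum(rank[node] for node in nodes if not links.get(node))
        new_rank = {
            node: (1.0 - damping) / n + damping * dangling / n for node in nodes
        }
        for source, targets in links.items():
            if targets:
                share = damping * rank[source] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical citation graph: every other article cites p3.
citations = {"p1": ["p3"], "p2": ["p3", "p4"], "p3": [], "p4": ["p3"]}
ranks = pagerank(citations)

# Replace each Boolean citation "rating" with a weight in (0, 1].
top = max(ranks.values())
weighted = {
    article: {cited: ranks[cited] / top for cited in cited_list}
    for article, cited_list in citations.items()
    if cited_list
}
```

With these weights a citation to a heavily cited article counts for more in the ratings matrix than a citation to an obscure one.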
PageRank Experimental
                                               Results




A. Vellino. “The Effect of PageRank on the Collaborative Filtering of Journal Articles”.
NRC Research Report, 2008.
What is a Holographic
                                             Memory System?


• A Holographic Memory System (HMS) stores information in
  a manner analogous to the storage of an image on a
  holographic plate.
• HMS is composed of units called items
   – Each item represents some content
       • e.g., a concept, a word, a bibliographic item
   – Items are analogous to points on the surface of
     holographic film (or, plate)
   – Each item stores information about the associations it
     has with other items
T. A. Plate, 2003 Holographic Reduced Representations: Distributed Representations for
Cognitive Structures (Stanford, CA: CSLI Publications)
Holographic Memory
                                      System (HMS)

                                                  HMS
         Holography


                                                         Red
                                       Fruit



                                                               Spherical
                                                   Apple
Each point on the Holographic plate
stores information about many parts    Each item stores information about
of the image                           many other items in the system
HMS Recommender for
                                            Journal Articles


• We compared DSHM (Dynamically Structured Holographic Memory) and user-based CF
  for journal article recommendation on 2 small collections
             Medicine                       Biology
             7495 articles                  38,667 articles
             0.55 references per article    1.15 references per article

• 90% - 10% Cross Validation
   • systematically removed one reference at a time
   • tested whether recommender predicts the reference.
   • compared DSHM and user-based CF
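The hold-out test described above can be sketched as follows. This is a simplified leave-one-out loop, without the 90% / 10% split, and the co-occurrence recommender is only a toy stand-in so the example runs; neither is the DSHM or CF implementation used in the study.

```python
from collections import Counter


def cooccurrence_recommender(profiles, basket, n):
    """Toy stand-in: rank references by how often they co-occur with the basket."""
    scores = Counter()
    for refs in profiles:
        if basket & refs:
            for ref in refs - basket:
                scores[ref] += 1
    return [ref for ref, _ in scores.most_common(n)]


def leave_one_out_hit_rate(profiles, recommend, n=3):
    """Remove one reference at a time and check whether it is recommended back."""
    hits = trials = 0
    for i, refs in enumerate(profiles):
        others = profiles[:i] + profiles[i + 1:]
        for held_out in sorted(refs):
            basket = refs - {held_out}
            if not basket:
                continue  # nothing left to base a recommendation on
            trials += 1
            if held_out in recommend(others, basket, n):
                hits += 1
    return hits / trials if trials else 0.0


# Hypothetical reference lists, invented for illustration only.
profiles = [{"a", "b"}, {"a", "b"}]
rate = leave_one_out_hit_rate(profiles, cooccurrence_recommender)
```

Any recommender with the same call signature can be passed in, which allows two systems to be compared on identical held-out references.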
M. F. Rutledge-Taylor, A. Vellino and R. L. West. “A Holographic Associative Memory
Recommender System” 3rd Int. Conference on Digital Information Management, London, 2008.
Experimental Results




Holographic Recommender:
                             Discussion


• Advantages
   – Holographic System outperformed standard user-based
     CF on very sparse bibliographic datasets
   – DSHM is better able to exploit the available information
   – The uniformly consistent model of DSHM gives it good
     potential for success on multi-dimensional datasets
• Disadvantages
   – Requires a lot of computational resources
   – Unclear how well it scales to large collections.
Multi-Dimensional Ratings
                                                      Matrix




G. Adomavicius, R. Sankaranarayanan, S. Sen and A. Tuzhilin. Incorporating Contextual Information in
Recommender Systems Using a Multidimensional Approach. ACM Transactions on Information Systems, 2005.
Scaling Strategy:
                                                  Distributed
                                                  Recommenders
• Multiple ratings matrices decomposed by subject area
• Merge separate recommendations by subject




• Reduces matrix sparsity
• Improves accuracy of recommendations
S. Berkovsky, T. Kuflik, and F. Ricci. Distributed Collaborative Filtering with
Domain Specialization. Proceedings of Recommender Systems 2007
Importance of Quality and
                                                Trust


[Figure: bar chart, “What predicts overall usefulness of a System?”; y-axis: Correlation (0 to 0.6); bars: Good Rec., Useful Rec., Trust-Generating Rec., Adequate Item Description, Ease of Use]
Rashmi Sinha & Kirsten Swearingen – UC Berkeley
UI for Navigating
                                               Recommendations


• Explanation-based
  Recommendations
   – Provide transparency,
     increase user trust
   – Allow users to cluster by
     type of reason
   – Filter out unwanted
     recommendations




  P. Pu and L. Chen. Trust Building with Explanation Interfaces. In IUI ’06: Proceedings of
  the 11th International Conference On Intelligent User Interfaces, pages 93–100
Conclusions


• Recommender technology is only 12 years old, but mature
  enough for widespread commercial use.
• Digital Libraries / Web 2.0 Bibliographic applications are
  beginning to use recommenders.
• Digital Libraries create new problems for recommenders
  (“context drift” / “data sparsity” / “multiple dimensions”)
• Recommenders insufficiently understood in Digital Libraries.
• Recommender as mechanism for enhancing the process of
  scientific discovery promising but still uncertain.



Thank You!
              Questions?
http://lab.cisti-icist.nrc-cnrc.gc.ca/synthese/

 
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Krashi Coaching
 
Concept of Vouching. B.Com(Hons) /B.Compdf
Concept of Vouching. B.Com(Hons) /B.CompdfConcept of Vouching. B.Com(Hons) /B.Compdf
Concept of Vouching. B.Com(Hons) /B.CompdfUmakantAnnand
 
Separation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and ActinidesSeparation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and ActinidesFatimaKhan178732
 
URLs and Routing in the Odoo 17 Website App
URLs and Routing in the Odoo 17 Website AppURLs and Routing in the Odoo 17 Website App
URLs and Routing in the Odoo 17 Website AppCeline George
 
CARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxCARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxGaneshChakor2
 

KĂźrzlich hochgeladen (20)

Introduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher EducationIntroduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher Education
 
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions  for the students and aspirants of Chemistry12th.pptxOrganic Name Reactions  for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introduction
 
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptxContemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
 
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdfEnzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
 
Crayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon ACrayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon A
 
The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13
 
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
 
_Math 4-Q4 Week 5.pptx Steps in Collecting Data
_Math 4-Q4 Week 5.pptx Steps in Collecting Data_Math 4-Q4 Week 5.pptx Steps in Collecting Data
_Math 4-Q4 Week 5.pptx Steps in Collecting Data
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
 
Employee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxEmployee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptx
 
Staff of Color (SOC) Retention Efforts DDSD
Staff of Color (SOC) Retention Efforts DDSDStaff of Color (SOC) Retention Efforts DDSD
Staff of Color (SOC) Retention Efforts DDSD
 
mini mental status format.docx
mini    mental       status     format.docxmini    mental       status     format.docx
mini mental status format.docx
 
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdfTataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
 
A Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy ReformA Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy Reform
 
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
 
Concept of Vouching. B.Com(Hons) /B.Compdf
Concept of Vouching. B.Com(Hons) /B.CompdfConcept of Vouching. B.Com(Hons) /B.Compdf
Concept of Vouching. B.Com(Hons) /B.Compdf
 
Separation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and ActinidesSeparation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and Actinides
 
URLs and Routing in the Odoo 17 Website App
URLs and Routing in the Odoo 17 Website AppURLs and Routing in the Odoo 17 Website App
URLs and Routing in the Odoo 17 Website App
 
CARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxCARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptx
 

Mechanical Librarian

  • 1. The Mechanical Librarian: Recommending Journal Articles in a Scientific Digital Library. Andre Vellino, andre.vellino@cnrc.ca. Group Leader, CISTI Research, Canada Institute for Scientific and Technical Information (Chef de groupe, Recherche ICIST, Institut canadien de l'information scientifique et technique)
  • 2. Outline of Talk • The Mechanical Librarian • How Recommenders Work • Recommenders in Digital Libraries • Problems for Science Article Recommenders and Strategies for CISTI’s Recommender Research • Synthese on CISTI Lab • Alternative Approaches • Future Work. Acknowledgements to: Glen Newton, Jeff Demaine and Greg Kresko, and students Dave Zeber, Matthew Rutledge-Taylor and Aurel Constantinescu
  • 3. The Human (Reference) Librarian • World Knowledge • Experience • Vocabularies • Databases • Authoritative, Trustworthy References
  • 4. The Mechanical Librarian: “The Web, they say, is leaving the era of search and entering one of discovery. What's the difference? Search is what you do when you're looking for something. Discovery is when something wonderful that you didn't know existed, or didn't know how to ask for, finds you.” (Jeffrey M. O'Brien, Fortune Magazine)
  • 5. Knowledge Discovery Technologies • Text Mining – Enhances the researcher’s ability to discover new and meaningful information from existing text repositories • Network Analysis – Distills the structural relationships among bibliographic elements to reveal trends and patterns in science • User Behaviour – Infers “wisdom of the crowds” from usage statistics
  • 6. What is a “Recommender”? • A recommender is a software system which attempts to predict items that a user may be interested in, given information about – the user's interests – the content in the items – the usage patterns of other users • Items may be: – Merchandise: movies, music, books – Text: news, blogs, web pages, and, why not, Scientific Journal Articles
  • 8. User Control Category Filter Personalized User Ratings Explanations
  • 9. Companies That Offer Recommenders to Users • Movies • Web Sites • Books • Music
  • 10. Companies That Sell Recommender Services Product Merchandise Placement Database Mining Advertising / Product Placement Software as a Service Platform
  • 11. Recommendation is Hard: Netflix Prize ($1M) • Netflix Prize – Goal: develop a recommender that improves the quality of recommendations by 10% over Netflix’s own system – http://www.netflixprize.com/ • Current Leader Board – BellKor (9.6%) – … + 39 others • NY Times Magazine Article http://www.nytimes.com/2008/11/23/magazine/23Netflix-t.html
  • 13. Outline of Talk • The Mechanical Librarian • How Recommenders Work • Recommenders in Digital Libraries • Problems for Science Article Recommenders and Strategies for CISTI’s Recommender Research • Demonstration of Synthese on CISTI Lab • Alternative Approaches • Future Work
  • 14. Taxonomy of Recommender Systems Collaborative Filtering • Usage based, with item-ratings – User-Based (“similar users”) – Item-Based (“like items”) • Algorithms – Memory-based – Model-based Content-Based Filtering • Content (text / waveform / pixel) analysis to – Find “similar users” – Find “similar items” J. Breese, D. Heckerman, C. Kadie, et al. Empirical Analysis of Predictive Algorithms for Collaborative Filtering. Proceedings of the Fourteenth Conference on Uncertainty in Artificial Intelligence, 461, 1998.
  • 15. How Collaborative Filtering (CF) Works • User-Based CF – Given user A, find all the other users {U} that have the most “similar” item-rating patterns – For each item I not yet rated by A, predict the likely rating A will assign to I given the ratings for I given by {U} – Present the Top-N ordered list of items {I} to the user • Item-Based CF – Given user A and the set of items {I} to which A has given ratings, find all the other items {O} that are “similar” to {I} – Present the Top-N ordered list of items {O} to the user. Sarwar, Badrul M., George Karypis, Joseph A. Konstan, and John Riedl. "Item-based collaborative filtering recommendation algorithms." World Wide Web. 2001, 285-295.
  • 16. Find “Nearest Neighbour” and Predict Rating • Find Nearest Neighbours (e.g. cosine similarity) • Predict Rating (item i for user u) – Weighted average of the ratings given to item i by the N users most similar to u
  • 17. User-Based Collaborative Filtering • Ratings (Users × Movies: Milk, Doubt, Dark Knight, Bolt, Reader): Alice 5 4 3 5 2; Bob 1 5 5; Carol 4 3 4; Ted 4 4 ? 5 • Goal: predict the rating Ted will give to the movie “Bolt” • Step 1 – eliminate the user-profiles of users who didn’t rate “Bolt” • Step 2 – find Ted’s “K-nearest neighbours” who rated “Bolt” and at least 2 other movies (Alice) • R(Ted,Bolt) ~= 5.
  • 18. Things that can go wrong with Collaborative Filtering • False “product ratings” to artificially boost ranking (spamming) • Losing the diversity in the “Long Tail” – converges to “Top N”. Fleder, D. and K. Hosanagar. 2008. Blockbuster culture's next rise or fall: The effect of recommender systems on sales diversity. NET Institute Working Paper 07-10.
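The two steps on slides 16 and 17 (find nearest neighbours, then take a weighted average of their ratings) can be sketched as follows. This is a minimal illustration, not the implementation used in the talk: the ratings are hypothetical (they are not the table from the slide), and the function names `cosine` and `predict` and the parameter `k` are assumptions made for the example.

```python
from math import sqrt

# Hypothetical ratings (user -> {item: rating}); NOT the data from the slide.
ratings = {
    "u1": {"a": 5, "b": 4, "c": 5},
    "u2": {"a": 1, "b": 2, "c": 1},
    "u3": {"a": 5, "b": 5},
}

def cosine(r1, r2):
    """Cosine similarity computed over the items both users have rated."""
    common = set(r1) & set(r2)
    if not common:
        return 0.0
    dot = sum(r1[i] * r2[i] for i in common)
    n1 = sqrt(sum(r1[i] ** 2 for i in common))
    n2 = sqrt(sum(r2[i] ** 2 for i in common))
    return dot / (n1 * n2)

def predict(user, item, k=2):
    """Similarity-weighted average of the ratings of the k nearest neighbours."""
    # Step 1: keep only the users who rated the target item.
    neighbours = [
        (cosine(ratings[user], r), r[item])
        for other, r in ratings.items()
        if other != user and item in r
    ]
    # Step 2: keep the k most similar of those users.
    neighbours.sort(reverse=True)
    top = neighbours[:k]
    total = sum(sim for sim, _ in top)
    if total == 0:
        return None  # no usable neighbour: the cold-start case
    return sum(sim * rating for sim, rating in top) / total
```

Calling `predict("u3", "c", k=1)` uses only the single closest neighbour, which mirrors the slide's example where one neighbour determines the prediction. Raw cosine similarity on positive ratings tends to be high for every pair of users, so practical systems often mean-centre the ratings first.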
  • 19. Content-Based Recommenders “These things are similar (in content) to that”. • Depends only on a measure of similarity between the content in the items (text, music, images) • Typical Steps for Content Based Recommenders 1. Cluster the user’s purchased or highly-rated items by content-similarity 2. Find other similar items not purchased or rated by the user 3. Recommend the “Top N” to the user
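A rough sketch of the content-based steps above, using bag-of-words cosine similarity over titles. It simplifies step 1 by merging the user's liked items into a single profile vector instead of clustering them, and the catalogue, titles and function names are all hypothetical.

```python
from collections import Counter
from math import sqrt

def vectorize(text):
    """Bag-of-words term counts for a piece of text."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def recommend(liked, catalogue, n=2):
    """Rank items the user has not seen by similarity to the liked items."""
    # Simplification (assumption): one merged profile instead of clusters.
    profile = Counter()
    for title in liked:
        profile.update(vectorize(title))
    candidates = [t for t in catalogue if t not in liked]
    candidates.sort(key=lambda t: cosine(profile, vectorize(t)), reverse=True)
    return candidates[:n]

# Hypothetical catalogue of article titles.
catalogue = [
    "protein folding simulation",
    "protein structure prediction",
    "medieval trade routes",
    "folding dynamics of membrane protein",
]
liked = ["protein folding simulation"]
top = recommend(liked, catalogue, n=1)
```

Because the similarity depends only on shared words, a title that happens to reuse common words can rank highly even when the subject is unrelated, which is the failure mode of metadata-only recommenders.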
  • 20. Search Engine as “Content-Based Recommender” Collaborative filtering
  • 21. “Similar Pages” is a Content-Based Recommender
  • 22. What can go wrong with Content Based Recommenders that use only Metadata • Bad Men Do What Good Men Dream: A Forensic Psychiatrist Illuminates the Darker Side of Human Behavior • Do Animals Dream?: Children's Questions about Animals Most Often asked of the Natural History Museum • All I Do is Dream of You • The other end of the leash : why we do what we do around dogs • Why do Catholics do that : a guide to the teachings and practices of the Catholic Church • Electric universe : the shocking true story of electricity • The Island of Sheep
  • 23. Outline of Talk • The Mechanical Librarian • How Recommenders Work • Recommenders in Digital Libraries • Problems for Science Article Recommenders and Strategies for CISTI’s Recommender Research • Demonstration of Synthese on CISTI Lab • Alternative Approaches • Future Work
  • 24. Value of Recommenders in a Digital Library • For the Researcher – Provide serendipity in a Browse / Search / Retrieve portal • Broaden scope of search to cognate but otherwise disparate domains • For the Library – Increase customer loyalty by creating dynamic, adaptive, customized services • Alerts & notifications based on usage and collaborative filtering rather than stored queries • For Authors – Given a draft article (with citations), find additional citations • For Publishers & Journal Reviewers – Given a submitted article, recommend peer reviewers
  • 25. Recommender Systems in Digital Libraries – TechLens (University of Minnesota) (2002) • Uses ACM DL, full text Mixed Hybrid – BibTip (University of Karlsruhe) (2003) • Uses OPAC (Library Catalog) usage data for collaborative filtering – IngentaConnect (2007) • Uses Baynote (SaaS) customer tracking – DSpace (2008) • Content-based recommender based on user-bookmarks – CiteULike (academic experiment 2008) • Collaborative filtering on user bookmarks from CiteULike – “bX” system from Ex Libris (2009) • Uses SFX resolver logs – NextBio (to be announced in March 2009) • Life sciences search engine that uses collaborative filtering + ontologies to suggest new content (trials / abstracts / data)
  • 26. TechLens
  • 27. “bX” Recommender (Jan '09) • Features – Uses log data from SFX resolvers – Applies Collaborative Filtering – Uses lots of aggregated data – Developed with the Los Alamos National Laboratory • Possible issues – Infers identity of users only through IP address – May not be accurate when HTTP proxies are used – Same IP address can have several “IR objectives” – Identical resolved objects may not be recognized
  • 28. Outline of Talk • The Mechanical Librarian • How Recommenders Work • Recommenders in Digital Libraries • Problems for Science Article Recommenders and Strategies for CISTI’s Recommender Research • Demonstration of Synthese on CISTI Lab • Alternative Approaches • Future Work
  • 29. Typical Problems with CF Recommenders in General • Data Sparsity – Ratio of Users / Items is low (~ 1:10) – Number of Ratings per User is low – Ratings matrix sparsity ~ 95% • Cold Start Problem – First-time users get poor or no recommendations because CF matrix has no entries • Rating Items – CF recommender must be trained (explicitly or implicitly) by providing ratings to items • Principle of Induction – People who exhibited similar behaviour in the past will tend to exhibit similar behaviour in the future.
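Matrix sparsity, as quoted above, is the fraction of user-item cells that hold no rating. A small sketch with hypothetical data and a hypothetical function name:

```python
def sparsity(ratings, n_users, n_items):
    """Fraction of empty cells in an n_users x n_items ratings matrix."""
    filled = sum(len(items) for items in ratings.values())
    return 1.0 - filled / (n_users * n_items)

# Hypothetical data: 2 users, 10 items, 3 ratings in total.
ratings = {"u1": {"a": 5}, "u2": {"a": 3, "b": 4}}
value = sparsity(ratings, n_users=2, n_items=10)  # 1 - 3/20 = 0.85
```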
  • 30. Specific Problems for Collaborative Filtering in Science Digital Libraries • Data Sparsity – Many More Articles & Far Fewer Users (10x) – Fewer Item / Ratings (~ 99% sparsity) • Rating Articles – Explicit ratings are more difficult to obtain • DL users have less need to “express themselves” by explicitly rating items than movie watchers – Implicit ratings depend on UI features of DL • No reliable method for inferring ratings from browsing and query behaviour • Principle of Induction (that past is a good predictor of the future) not necessarily true in digital libraries – Interest drift – Context shifts
  • 31. Recommender Research Strategy @ CISTI • Follow in footsteps of TechLens+ – Collaborative Filtering (CF) among users – Seed CF recommender with citation matrix – Extended with • PageRank on Citations • User Contexts – Future Extensions • Add Content-Based Filtering (“Fusion Mixed Hybrid” model) • Distributed Multi-Dimensional Recommender • Explanation-based interface. A. Vellino and D. Zeber. (2007) “A Hybrid, Multi-dimensional Recommender for Journal Articles in a Scientific Digital Library.” Conference Proceedings on Web Intelligence and Intelligent Agent Technology
  • 32. Making a Reference Rating
  • 33. Recommender Citation Seeding TechLens approach to Cold Start / Data Sparsity problem • Articles either cite or don’t cite other articles • Some articles that are cited are not in collection • Users’ “article collection profile” citations
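One way to read the citation-seeding idea is sketched below: every citing article acts as a pseudo-user whose Boolean "ratings" are the in-collection articles it cites, and a real user's profile is built from the citations of the articles in their basket. The data, the function names and the exact profile-building rule are assumptions for illustration, not the TechLens or Synthese implementation.

```python
# Hypothetical citation lists: citing article -> cited articles.
citations = {"p1": ["p3", "p4"], "p2": ["p3"], "p3": ["p4", "p9"]}
# Articles held in the collection; "p9" is cited but not held.
collection = {"p1", "p2", "p3", "p4"}

def seed_matrix(citations, collection):
    """Boolean pseudo-ratings: each citing article is treated as a 'user'
    that has rated (with 1) every in-collection article it cites."""
    return {
        citing: {cited: 1 for cited in refs if cited in collection}
        for citing, refs in citations.items()
    }

def add_user(matrix, user_id, basket, citations, collection):
    """Assumed rule: a user's profile is the union of the citations made
    by the articles in the user's basket."""
    profile = {}
    for article in basket:
        for cited in citations.get(article, []):
            if cited in collection:
                profile[cited] = 1
    matrix[user_id] = profile
    return matrix

matrix = seed_matrix(citations, collection)
matrix = add_user(matrix, "u1", ["p1", "p2"], citations, collection)
```

Seeding the matrix this way gives a collaborative filter something to work with before any real user has rated anything, which is how it addresses cold start and sparsity.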
  • 34. Outline of Talk • The Mechanical Librarian • How Recommenders Work • Recommenders in Digital Libraries • Problems for Science Article Recommenders and Strategies for CISTI’s Recommender Research • Demonstration of Synthese on CISTI Lab • Alternative Approaches • Future Work
  • 37. Add Important Articles to “Basket” (1)
  • 38. Add Important Articles to “Basket” (2)
  • 39. Add Important Articles to “Basket” (3)
  • 40. Add Important Articles to “Basket” (4)
  • 42. Add More Articles to “Basket” (1)
  • 43. Add More Articles to “Basket” (2)
  • 44. Recommend Based on Current “Basket”
  • 49. Synthese Performance: Ratings of Recommendations [bar chart: Percentage (y-axis, 0 to 35) by Ratings (x-axis, 1 to 5)]
  • 50. Recommender Citation Seeding Can we improve on 0 / 1 (Boolean) citation seeding?
  • 51. Apply PageRank to Citation Matrix PageRank algorithm applied to citations. Aurel Constantinescu, “Ranking Full-Text Articles using Citation Based Methods”, Master’s Thesis, University of Ottawa
  • 52. PageRank-weighted Citation matrix [matrix figure: rows are articles p1 to p4 and users u1 and u2; columns are cited articles p1 to p8; cells hold citation weights between 0.2 and 0.7, and some cells hold a constant] • Apply PageRank on Citations – Use citation data (as in TechLens+) – Apply PageRank to weight the citation-based “ratings” • Done before but only at the Journal level (http://www.eigenfactor.org/)
  • 53. PageRank Experimental Results. A. Vellino, “The Effect of PageRank on the Collaborative Filtering of Journal Articles”, NRC Research Report, 2008.
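A hedged sketch of replacing Boolean citation "ratings" with PageRank-derived weights. The power-iteration PageRank below is the standard textbook form; the damping factor, the iteration count, the toy citation graph and the choice to normalise each weight by the highest rank are all assumptions, since the slides do not give the exact weighting scheme.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a citation graph (citing -> cited list)."""
    nodes = set(links) | {t for targets in links.values() for t in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        new = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                share = damping * rank[node] / len(targets)
                for t in targets:
                    new[t] += share
            else:
                # Dangling article (cites nothing): spread its rank evenly.
                for t in nodes:
                    new[t] += damping * rank[node] / n
        rank = new
    return rank

def weighted_seed(links, rank):
    """Replace each Boolean citation with the cited article's rank,
    scaled so the highest-ranked article gets weight 1.0 (an assumption)."""
    top = max(rank.values())
    return {
        citing: {cited: rank[cited] / top for cited in refs}
        for citing, refs in links.items()
    }

# Hypothetical citation graph.
links = {"p1": ["p3"], "p2": ["p3"], "p3": ["p4"]}
rank = pagerank(links)
weights = weighted_seed(links, rank)
```

In this toy graph a citation to a heavily cited article counts for more than a citation to an obscure one, which is the intuition behind moving beyond 0 / 1 seeding.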
  • 54. Outline of Talk • The Mechanical Librarian • How Recommenders Work • Recommenders in Digital Libraries • Problems for Science Article Recommenders and Strategies for CISTI’s Recommender Research • Demonstration of Synthese on CISTI Lab • Alternative Approaches • Future Work
  • 55. What is a Holographic Memory System? • A Holographic Memory System (HMS) stores information in a manner analogous to the storage of an image on a holographic plate. • HMS is composed of units called items – Each item represents some content • e.g., a concept, a word, a bibliographic item – Items are analogous to points on the surface of holographic film (or plate) – Each item stores information about the associations it has with other items. T. A. Plate (2003). Holographic Reduced Representations: Distributed Representations for Cognitive Structures. Stanford, CA: CSLI Publications.
  • 56. Holographic Memory System (HMS) • HMS: each item stores information about many other items in the system (example items: Apple, Red, Fruit, Spherical) • Holography: each point on the holographic plate stores information about many parts of the image
  • 57. HMS Recommender for Journal Articles • We compared DSHM and user-based CF on journal article recommendation on 2 small collections – Medicine: 7,495 articles, 0.55 references per article – Biology: 38,667 articles, 1.15 references per article • 90% - 10% Cross Validation • systematically removed one reference at a time • tested whether recommender predicts the reference. • compared DSHM and user-based CF. M. F. Rutledge-Taylor, A. Vellino and R. L. West. “A Holographic Associative Memory Recommender System.” 3rd Int. Conference on Digital Information Management, London, 2008.
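The idea that one item can store associations with many others can be illustrated with the binding operation from Plate's Holographic Reduced Representations: circular convolution binds two vectors, and circular correlation approximately recovers one from the other. This is a generic HRR sketch, not the DSHM system evaluated in the talk; the role names (`colour`, `category`, `shape`), the vector size and the random seed are assumptions.

```python
import random
from math import sqrt

N = 256                      # vector dimensionality (assumed)
rng = random.Random(0)       # fixed seed so the sketch is repeatable

def random_vector():
    """Random item vector with elements drawn from N(0, 1/N)."""
    return [rng.gauss(0.0, sqrt(1.0 / N)) for _ in range(N)]

def bind(a, b):
    """Circular convolution: associates two item vectors."""
    n = len(a)
    return [sum(a[k] * b[(j - k) % n] for k in range(n)) for j in range(n)]

def unbind(trace, cue):
    """Circular correlation: approximate inverse of bind for a given cue."""
    n = len(cue)
    return [sum(cue[k] * trace[(j + k) % n] for k in range(n)) for j in range(n)]

def add(*vectors):
    """Superpose several vectors into one memory trace."""
    return [sum(vals) for vals in zip(*vectors)]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))

fillers = {name: random_vector() for name in ("red", "fruit", "spherical")}
roles = {name: random_vector() for name in ("colour", "category", "shape")}

# A single vector for "apple" that superposes three associations.
apple = add(
    bind(roles["colour"], fillers["red"]),
    bind(roles["category"], fillers["fruit"]),
    bind(roles["shape"], fillers["spherical"]),
)

# Probe the trace with a cue and find the closest known filler.
decoded = unbind(apple, roles["colour"])
best = max(fillers, key=lambda name: cosine(decoded, fillers[name]))
```

The decoded vector is only a noisy approximation of the stored filler, so retrieval works by comparing it against the known items and keeping the closest match.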
  • 59. Holographic Recommender: Discussion • Advantages – Holographic System outperformed standard user-based CF on very sparse bibliographic datasets – DSHM is better able to exploit the available information – The uniformly consistent model of DSHM gives it good potential for success on multi-dimensional datasets • Disadvantages – Requires a lot of computational resources – Unclear about how it works on a large scale.
  • 60. Outline of Talk • The Mechanical Librarian • How Recommenders Work • Recommenders in Digital Libraries • Problems for Science Article Recommenders and Strategies for CISTI’s Recommender Research • Demonstration of Synthese on CISTI Lab • Alternative Approaches • Future Work
  • 61. Multi-Dimensional Ratings Matrix. G. Adomavicius, R. Sankaranarayanan, S. Sen, A. Tuzhilin. Incorporating Contextual Information in Recommender Systems Using a Multidimensional Approach. ACM Transactions on Information Systems, 2005.
  • 62. Scaling Strategy: Distributed Recommenders • Multiple ratings matrices decomposed by subject area • Merge separate recommendations by subject • Reduces matrix sparsity • Improves accuracy of recommendations. S. Berkovsky, T. Kuflik, and F. Ricci. Distributed Collaborative Filtering with Domain Specialization. Proceedings of Recommender Systems 2007
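The "merge separate recommendations by subject" step could look something like the sketch below, where each subject-specific recommender returns scored items and the merged list keeps the best score per item. The merge rule, the scores and the function name are assumptions for illustration; the cited paper may combine the lists differently.

```python
def merge_recommendations(per_subject, n=3):
    """Merge per-subject (item, score) lists, keeping each item's best score."""
    best = {}
    for subject, recs in per_subject.items():
        for item, score in recs:
            if item not in best or score > best[item][0]:
                best[item] = (score, subject)
    ranked = sorted(best.items(), key=lambda kv: kv[1][0], reverse=True)
    return [(item, score, subject) for item, (score, subject) in ranked[:n]]

# Hypothetical output of two subject-specific recommenders.
per_subject = {
    "biology": [("p1", 0.9), ("p2", 0.4)],
    "medicine": [("p2", 0.7), ("p3", 0.5)],
}
merged = merge_recommendations(per_subject, n=2)
```

Keeping the subject alongside each merged item also gives the interface something to show when explaining where a recommendation came from.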
  • 63. Importance of Quality and Trust: What predicts overall usefulness of a System? [bar chart: Correlation (y-axis, 0 to 0.6) for Good Rec., Useful Rec., Trust Generating Rec., Adequate Item Description, Ease of Use] Rashmi Sinha & Kirsten Swearingen – UC Berkeley
  • 64. UI for Navigating Recommendations • Explanation-based Recommendations – Provide transparency to increase user trust – Allow users to cluster by type of reason – Filter out unwanted recommendations. P. Pu and L. Chen. Trust Building with Explanation Interfaces. In IUI ’06: Proceedings of the 11th International Conference On Intelligent User Interfaces, pages 93–100
  • 65. Conclusions • Recommender technology is only 12 years old, but mature enough for widespread commercial use. • Digital Libraries / Web 2.0 Bibliographic applications are beginning to use recommenders. • Digital Libraries create new problems for recommenders (“context drift” / “data sparsity” / “multiple dimensions”) • Recommenders are insufficiently understood in Digital Libraries. • Recommenders as a mechanism for enhancing the process of scientific discovery are promising but still uncertain.
  • 66. Thank You! Questions? http://lab.cisti-icist.nrc-cnrc.gc.ca/synthese/