Mendeley: crowdsourcing and recommending research on a large scale

Kris Jack, PhD
Data Mining Team Lead

Summary

➔ what is Mendeley?
➔ crowdsourcing on a large scale
➔ recommendations on a large scale
➔ data for you

Mendeley is...

...a startup company
...going to change the way that we do research...

Mendeley provides tools to help users...

...organise their research
...collaborate with one another
...discover new research

Mendeley / Last.fm

Last.fm works like this:

1) Install “Audioscrobbler”
2) Listen to music
3) Last.fm builds your music profile and recommends music you might also like

...and it’s the world’s largest open music database!

Mendeley / Last.fm

Last.fm                         Mendeley
music libraries                 research libraries
artists                         researchers
songs                           papers
genres                          disciplines

Mendeley is the world’s largest crowdsourced research catalogue!
(Screenshot taken from www.mendeley.com on 04/09/11)

Catalogue Crowdsourcing: System Requirements

→ assimilate research artefacts into catalogue in real time (PDFs + citation metadata)
→ recognise duplicate and non-duplicate artefacts in noisy input

Main sources of input:
→ Mendeley Desktop
→ Mendeley Web Importer
→ External catalogue imports (e.g. arXiv)
→ External catalogue lookups (e.g. CrossRef)

Main types of input:
→ article PDFs
→ article metadata (e.g. reference)

articles → catalogue generator → catalogue

Catalogue generator aims:
→ Cluster documents together
→ Generate catalogue entries

Catalogue generator process:
→ Filehash check (SHA-1)
→ Identifier check (e.g. PubMed id)
→ Document fingerprint (full text)
→ Metadata similarity check
→ Update individual article page
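
The checks in the process above can be read as a cascade, from the cheapest exact match to the fuzziest. The following is only a rough sketch under stated assumptions: the names `Article` and `find_match`, the 0.8 threshold, the text-hash fingerprint and the title-overlap score are all hypothetical stand-ins, since the slides do not say how fingerprinting or metadata similarity is actually computed.

```python
import hashlib
import re
from dataclasses import dataclass, field


@dataclass
class Article:
    # Hypothetical record; field names are illustrative only.
    pdf_bytes: bytes = b""
    identifiers: dict = field(default_factory=dict)  # e.g. {"pmid": "123"}
    full_text: str = ""
    title: str = ""


def file_hash(a):
    # Filehash check: SHA-1 of the raw PDF bytes (None if there is no file).
    return hashlib.sha1(a.pdf_bytes).hexdigest() if a.pdf_bytes else None


def same_identifier(a, b):
    # Identifier check: any shared identifier type with an equal value.
    shared = set(a.identifiers) & set(b.identifiers)
    return any(a.identifiers[k] == b.identifiers[k] for k in shared)


def fingerprint(a):
    # Assumed fingerprint: SHA-1 of lower-cased text with non-word characters removed.
    text = re.sub(r"\W+", "", a.full_text.lower())
    return hashlib.sha1(text.encode("utf-8")).hexdigest() if text else None


def title_similarity(a, b):
    # Assumed metadata similarity: overlap of title word sets (shared / union).
    x, y = set(a.title.lower().split()), set(b.title.lower().split())
    return len(x & y) / len(x | y) if x | y else 0.0


def find_match(new, catalogue, threshold=0.8):
    """Return (index, reason) for the first matching catalogue entry, else None."""
    checks = [
        ("filehash", lambda old: file_hash(new) is not None
            and file_hash(new) == file_hash(old)),
        ("identifier", lambda old: same_identifier(new, old)),
        ("fingerprint", lambda old: fingerprint(new) is not None
            and fingerprint(new) == fingerprint(old)),
        ("metadata", lambda old: title_similarity(new, old) >= threshold),
    ]
    # Try each check over the whole catalogue before falling back to the next one.
    for reason, matches in checks:
        for i, old in enumerate(catalogue):
            if matches(old):
                return i, reason
    return None
```

An incoming article that matches would be clustered with the existing entry; one that returns `None` would start a new catalogue entry. A linear scan is used here for clarity only; a catalogue of this size would need indexed lookups.
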
Catalogue with:
→ article metadata
→ aggregated statistics
→ support for recommendations, etc.

Article Recommendation: System Requirements

→ generate personal article recommendations for users (i.e. “here are some articles that may interest you”)
→ update recommendations every 24 hours

Input: user libraries
Output: recommend 10 articles to each user
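
To make the input and output concrete, here is a minimal sketch of an item-based recommender that takes user libraries and returns up to 10 articles per user. It scores candidates by raw co-occurrence counts; the function name `recommend` and the data layout (a dict of user id to a set of article ids) are assumptions made for illustration, not the system described in the talk.

```python
from collections import Counter, defaultdict
from itertools import permutations


def recommend(libraries, n=10):
    """libraries: dict of user -> set of article ids.

    Returns a dict of user -> list of up to n recommended article ids.
    """
    # Count, for every ordered pair of articles, how many libraries hold both.
    cooc = defaultdict(Counter)
    for articles in libraries.values():
        for a, b in permutations(articles, 2):
            cooc[a][b] += 1

    recs = {}
    for user, articles in libraries.items():
        scores = Counter()
        for a in articles:
            for b, count in cooc[a].items():
                if b not in articles:  # never recommend what the user already has
                    scores[b] += count
        # Highest score first; ties broken by article id so output is deterministic.
        ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
        recs[user] = [article for article, _ in ranked[:n]]
    return recs
```

The pair counting is quadratic in library size, which is fine for a toy example but is exactly the part that needs distributing at larger scale.
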
Recommendation through collaborative filtering (16 months ago)

→ Article is in library or not (i.e. binary input)
→ Various similarity metrics (e.g. cooccurrence, loglikelihood, tanimoto)

Test:
→ 10-fold cross validation
→ 50,000 user libraries

Results:
→ <0.025 precision at 10

Recommendation through collaborative filtering (10 months ago, i.e. + 6 months)

→ Article is in library or not (i.e. binary input)
→ Various similarity metrics (e.g. cooccurrence, loglikelihood, tanimoto)

Test:
→ 10-fold cross validation
→ 50,000 user libraries

Results:
→ ~0.1 precision at 10
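
With binary input, each article can be represented by the set of libraries that contain it, and the three named metrics become simple set computations. The sketch below uses the textbook definitions (the log-likelihood version is Dunning's ratio on a 2x2 contingency table); whether the system in the talk used exactly these forms is not stated, and the function names are illustrative.

```python
import math


def cooccurrence(readers_a, readers_b):
    # Number of libraries that contain both articles.
    return len(readers_a & readers_b)


def tanimoto(readers_a, readers_b):
    # Libraries containing both, divided by libraries containing either.
    union = len(readers_a | readers_b)
    return len(readers_a & readers_b) / union if union else 0.0


def _x_log_x(x):
    return x * math.log(x) if x > 0 else 0.0


def _entropy(*counts):
    # Unnormalised entropy of a group of counts.
    return _x_log_x(sum(counts)) - sum(_x_log_x(c) for c in counts)


def log_likelihood(readers_a, readers_b, n_libraries):
    # Dunning's log-likelihood ratio; assumes both sets are drawn from
    # the same n_libraries libraries.
    k11 = len(readers_a & readers_b)     # libraries with both articles
    k12 = len(readers_a - readers_b)     # libraries with a only
    k21 = len(readers_b - readers_a)     # libraries with b only
    k22 = n_libraries - k11 - k12 - k21  # libraries with neither
    row = _entropy(k11 + k12, k21 + k22)
    col = _entropy(k11 + k21, k12 + k22)
    mat = _entropy(k11, k12, k21, k22)
    # Clamp tiny negative values caused by floating-point rounding.
    return max(0.0, 2.0 * (row + col - mat))
```

Co-occurrence favours popular articles, Tanimoto normalises for popularity, and the log-likelihood ratio measures how surprising the overlap is given how often each article appears overall.
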
Recommendation through collaborative filtering (10 months ago, i.e. + 6 months)

→ Article is in library or not (i.e. binary input)
→ Various similarity metrics (e.g. cooccurrence, loglikelihood, tanimoto)

Test:
→ Release to a subset of users

Results:
→ ~0.4 precision at 10
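
Precision at 10, the figure reported in these results, is the fraction of the 10 recommended articles that turn out to be relevant. A minimal sketch follows, assuming relevance means membership in a held-out set (for offline tests, articles hidden from the user's library; for a live test, articles the user accepted). The function names and the choice to divide by k even when fewer than k recommendations exist are assumptions, not details from the talk.

```python
def precision_at_k(recommended, relevant, k=10):
    """Fraction of the top-k recommendations that are in `relevant`."""
    top_k = recommended[:k]
    hits = sum(1 for article in top_k if article in relevant)
    # Dividing by k penalises users who receive fewer than k recommendations.
    return hits / k


def mean_precision_at_k(recs_by_user, relevant_by_user, k=10):
    """Average precision at k over all users that have a relevant set."""
    scores = [
        precision_at_k(recs_by_user.get(user, []), relevant, k)
        for user, relevant in relevant_by_user.items()
    ]
    return sum(scores) / len(scores) if scores else 0.0
```

Read this way, a precision at 10 of 0.1 corresponds to one relevant article per 10 recommended on average, and 0.4 to four.
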
[Figure: Article Recommendation Acceptance Rates. y-axis: acceptance rate (i.e. accept/reject clicks); x-axis: number of months live.]

Article Recommendation: System Requirements

→ generate personal article recommendations for users (i.e. “here are some articles that may interest you”)
→ update recommendations every 24 hours

Slide annotations: “1 million users!” and “days!”

How to scale up?

Test:
→ 10-fold cross validation
→ 50,000 user libraries

So, results comparable to non-distributed recommender.
Completely distributed, so can easily run on EC2 within 24 hours...

[Figure: Article Recommendation Precision Across User Library Sizes (using cooccurrence). y-axis: precision at 10 articles; x-axis: number of articles in user library.]

How will real users react?

Public Data

→ user libraries (50,000 libraries; 4,848,724 articles; 3,652,285 unique articles)
→ library readership
→ library stars

Obtain from: http://dev.mendeley.com/datachallenge

Mendeley's API
www.mendeley.com
