The Mosaic Search Engine
Mark van Harmelen, Hedtek Ltd
markvanharmelen@gmail.com, hedtek.com

Aim
Provide a proof of concept that:
  • Users can have personalised search results according to their place and stage of studies
  • Users can adopt other personas or points of view to explore academic resources
  • We can exploit ‘mass’ attention data as revealed by library circulation information
So far only working with ISBN-identified books

[Diagram: data flow. Labels: HEI circulation data (three HEIs, each with an anonymise step); reading lists; partial Copac records annotated with use and reading list data; build Solr index; Solr; front-end]

Anonymisation
  • Level 1: current prototype, enables faceting
  • Level 2: with extra information, enables “people who borrowed this also borrowed” and “people who borrowed this went on to borrow”
  • Anonymisation utility provided
  • DPA compliant; can also use fair processing agreements
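
The slides do not say what the "extra information" for Level 2 is. The sketch below assumes it is a salted one-way borrower token that lets loans by the same person be linked without naming them, and that Level 1 simply drops the borrower. The record fields, the salt and all function names are hypothetical, and whether such a token meets data-protection requirements is outside the scope of the sketch.

```python
import hashlib
from collections import Counter

# Hypothetical raw loan records; the field names are illustrative only.
LOANS = [
    {"borrower": "u1", "isbn": "A", "institution": "X", "course": "CS", "level": 1, "year": 2008},
    {"borrower": "u1", "isbn": "B", "institution": "X", "course": "CS", "level": 1, "year": 2008},
    {"borrower": "u2", "isbn": "A", "institution": "X", "course": "CS", "level": 2, "year": 2008},
    {"borrower": "u2", "isbn": "B", "institution": "X", "course": "CS", "level": 2, "year": 2008},
    {"borrower": "u2", "isbn": "C", "institution": "X", "course": "CS", "level": 2, "year": 2008},
    {"borrower": "u3", "isbn": "C", "institution": "Y", "course": "Maths", "level": 1, "year": 2007},
]


def anonymise_level1(loans):
    """Drop the borrower entirely; keep only context usable for faceting."""
    return [{k: v for k, v in loan.items() if k != "borrower"} for loan in loans]


def anonymise_level2(loans, salt):
    """Replace the borrower with a salted one-way token (an assumption)."""
    out = []
    for loan in loans:
        digest = hashlib.sha256((salt + loan["borrower"]).encode("utf-8")).hexdigest()
        record = {k: v for k, v in loan.items() if k != "borrower"}
        record["token"] = digest[:16]
        out.append(record)
    return out


def also_borrowed(level2_loans, isbn):
    """Count loans of other ISBNs made by tokens that also borrowed `isbn`."""
    tokens = {loan["token"] for loan in level2_loans if loan["isbn"] == isbn}
    counts = Counter(
        loan["isbn"]
        for loan in level2_loans
        if loan["token"] in tokens and loan["isbn"] != isbn
    )
    return counts.most_common()
```

With Level 1 output there is nothing to link loans together, which is why only faceting is possible; with the Level 2 token, `also_borrowed(anonymise_level2(LOANS, "some-salt"), "A")` can rank the other titles borrowed by the same (unnamed) people.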

Augmenting Solr’s index
  • Solr’s search index is loaded with items and any associated use information
  • Use information is:
      - institution
      - course
      - progression level
      - year of use
      - count of number of uses in that year
  • Use information enables faceting
  • Also add reading list info to items
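
One possible shape for such an augmented index document is sketched below as a plain dictionary, together with a local stand-in for the facet counts a search engine would report. The field names (`use_institution`, `use_course` and so on) and the flattening of use records into multi-valued fields are assumptions for illustration, not the prototype's actual schema.

```python
from collections import Counter


def make_item_doc(isbn, title, uses, reading_lists=()):
    """Build one index document for an item from its use records.

    Each use record is a dict with institution, course, level, year, count.
    """
    return {
        "id": isbn,
        "title": title,
        "use_institution": sorted({u["institution"] for u in uses}),
        "use_course": sorted({u["course"] for u in uses}),
        "use_level": sorted({u["level"] for u in uses}),
        "use_year": sorted({u["year"] for u in uses}),
        "use_total": sum(u["count"] for u in uses),
        "reading_list": list(reading_lists),
    }


def facet_counts(docs, field):
    """Count documents per distinct value of a multi-valued field."""
    counts = Counter()
    for doc in docs:
        counts.update(set(doc.get(field, [])))
    return dict(counts)


# Hypothetical sample data.
DOCS = [
    make_item_doc("isbn-1", "Title one", [
        {"institution": "X", "course": "CS", "level": 1, "year": 2008, "count": 3},
        {"institution": "Y", "course": "CS", "level": 2, "year": 2008, "count": 2},
    ]),
    make_item_doc("isbn-2", "Title two", [
        {"institution": "X", "course": "Maths", "level": 1, "year": 2007, "count": 1},
    ], reading_lists=["MA101"]),
]
```

Flattening is the simplest option but loses the link between fields within one use record (for example, which course goes with which year); keeping each use record as its own composite value would preserve it at the cost of a more complex schema.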

[Diagram: query flow. Labels: client-side front-end (browser); query; resultset; Solr; item query; item data; OPAC]

Narrowing and broadening
  • Thoughts (NB, ‘thoughts’) of narrowing of choice led to two features to broaden choice
  • Don’t believe that the Mosaic demo in itself narrows when used for browsing
  • Broadening features:
      - ‘More like this’ link
      - Reading lists

The Harry Potter ‘problem’ and scale
  • The Harry Potter ‘problem’: Balderdash! We can control this using Library of Congress subject categories and Dewey Decimal shelfmarks
  • Paul Miller raises questions of scale
  • Dave Pattern has shown successful use of use data at a single (small) institution
  • We want to leverage reasonably large scale: 3.5-4M students in HE, over, say, the last five years

User context and attention
  • Has been relatively simple to parameterise an open source search engine with user context:
      - Institution, course, progression level, academic year
  • This is only part of the user context; can add:
      - Location
      - Attention data, e.g., search history
      - Further social search information
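
As a minimal sketch of what parameterising a search with user context might look like, the function below turns a context into Solr request parameters (`q`, `fq`, `facet`, `facet.field` and `wt` are standard Solr parameters). The `use_*` field names are hypothetical, context values are assumed not to contain double quotes, and the slides do not say whether the prototype filters on context or only re-ranks; filtering is used here for simplicity.

```python
from urllib.parse import urlencode

# Context keys taken from the slide; each maps to a hypothetical use_* field.
CONTEXT_KEYS = ("institution", "course", "level", "year")


def context_query(terms, context, facet_fields=("use_course", "use_level")):
    """Return a URL query string for a context-restricted, faceted search."""
    params = [("q", terms), ("wt", "json"), ("facet", "true")]
    for field in facet_fields:
        params.append(("facet.field", field))
    for key in CONTEXT_KEYS:
        value = context.get(key)
        if value is not None:
            # One filter query per context dimension that is present.
            params.append(("fq", f'use_{key}:"{value}"'))
    return urlencode(params)


# A user's own context, and a different persona adopted for exploration.
own_view = context_query("databases", {"institution": "X", "course": "Computer Science", "level": 2})
other_view = context_query("databases", {"course": "Statistics"})
```

Adopting another persona or point of view then amounts to passing a different context dictionary with the same search terms.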

Disclaimer
  • The next slide is independent of any decisions on a pure data approach
  • Could be a pure data approach in there
  • Or maybe not

Where is this going? A personal view
Bind together [items not preserved]

  • Mosaic search
  • Personalised/point-of-view search
  • Massively parallel search for blindingly fast response times
  • Data mining for library ‘stewardship’
We have prototypes for the first two, and we’re about to start experimenting with parallel search using Hadoop+Lucene

Building institutional contributions
  • Propose union-cat-local:
      - Search in local library
      - Mosaic-like search utilises local loan data if it is available
  • Two ways to encourage library contribution of loan data (thoughts in progress):
      - Narrow: libraries which contribute loan data to the pool get Mosaic search over the pool
      - Broad: make the contextual/PoV search available everywhere; users will agitate if they don’t see local data

This is a Just Do It moment
  • A national union catalogue with contextual search and local library interfaces
  • Relatively cheap to do
  • Potentially massive gains for learners, teachers and researchers
  • Portends the development of shared services across the library domain and large cost savings
  • Doesn’t preclude, and is agnostic on, an open data approach
  • Could incorporate a pure data service approach and/or a centralised service
