6. Traditional (Web)OPAC
Pros
  Keyword search!
    Author, title, subject
    ISBN/LCCN search
    Boolean queries
    Proximity search
  Browse index
    Authority headings
    Title, Call Number
  Real-time item status!
    Copies & availability info
    Link to URL (tag 856)
Cons
  Uses database queries
    “LIKE” statements
    Exact/partial match
  Limited use of search algorithms
  No relevance ranking
  Only physical collection and e-books
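The "LIKE statements" and "No relevance ranking" points can be illustrated with a minimal sketch. It assumes a hypothetical in-memory SQLite table named bibs; the table, columns, and sample titles are invented for illustration and are not taken from any particular ILS.

```python
import sqlite3

# Hypothetical in-memory catalog; table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bibs (id INTEGER PRIMARY KEY, title TEXT, author TEXT)")
conn.executemany(
    "INSERT INTO bibs (title, author) VALUES (?, ?)",
    [
        ("A History of Fishing", "Smith, J."),
        ("Fish", "Jones, A."),
        ("The Fisherman's Handbook", "Brown, K."),
        ("Cooking with Shellfish", "Lee, M."),
    ],
)

def keyword_search(term):
    """Partial match via LIKE: a row either matches or it does not."""
    pattern = f"%{term}%"
    rows = conn.execute(
        "SELECT title FROM bibs WHERE title LIKE ? ORDER BY id", (pattern,)
    )
    return [title for (title,) in rows]

results = keyword_search("fish")
print(results)                   # all four titles, in insertion order
print(keyword_search("fished"))  # no stemming, so nothing matches
```

Every title containing the substring comes back with equal standing, so the exact title "Fish" is not ranked above "Cooking with Shellfish", and a variant such as "fished" finds nothing.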
7. Integrated OPAC Portal
[Diagram: Web Server Application; Website content; ILS Database (Bibs); ILS Database (Patrons); Enrichment Services via Web services]
9. Integrated OPAC Portal
Pros
  All WebOPAC features
    Keyword search
    Headings browse
    Availability info
  Library website integration
  Patron empowerment
    Circ/Account details
    Online renewal
    Online hold placement
  SDI services
    New arrivals
  OPAC enrichment
    Book cover/reviews
    Thesaurus integration
Cons
  Uses database queries
    “LIKE” statements
    Exact/partial match
  Limited use of search algorithms
  No relevance ranking
  Still limited to only physical collection & e-books
10. Federated Search Service
Examples: 360 Search, dbWiz, Research Pro, Pazpar2
[Diagram: Web Server Application; Library Catalog; Digital Repository; Science Direct; ProQuest; EBSCO; PubMed; …; Emerald; Full-text links]
13. Federated Search Service
Pros
  Single search broadcast
  Real-time search results
  Based on standards
    Z39.50, SRU/W
    MARC, ISO2709, XML
  Supports large set of databases
    7000+ in “360 Search”
    6300+ in Muse platform
  Merging and sorting
  No local index (maintenance free!)
Cons
  Not all databases are standards compliant
  Requires custom search scripts
  Requires metadata crosswalk
  Network intensive
  Performance issues
  Mostly available as hosted service
  Annual subscription
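The broadcast, crosswalk, and merge steps listed above can be sketched as follows. This is only an illustration under stated assumptions: the two target functions are hypothetical stand-ins that return canned records (a real connector would speak Z39.50 or SRU, or use a custom script), and the field keys such as "245a" are shorthand invented here for MARC subfields.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for remote targets returning canned records.
def search_catalog(query):
    return [{"245a": "Fish biology", "100a": "Smith, J.", "260c": "2009"}]

def search_repository(query):
    return [{"dc:title": "Fish farming", "dc:creator": "Jones, A.", "dc:date": "2012"}]

# Metadata crosswalk: map each source's field names onto one common schema.
CROSSWALKS = {
    "catalog": {"245a": "title", "100a": "author", "260c": "year"},
    "repository": {"dc:title": "title", "dc:creator": "author", "dc:date": "year"},
}
TARGETS = {"catalog": search_catalog, "repository": search_repository}

def normalize(source, record):
    """Apply the crosswalk for one source and tag the record with its origin."""
    mapping = CROSSWALKS[source]
    common = {mapping[k]: v for k, v in record.items() if k in mapping}
    common["source"] = source
    return common

def federated_search(query):
    """Broadcast the query to all targets in parallel, then merge and sort."""
    merged = []
    with ThreadPoolExecutor(max_workers=len(TARGETS)) as pool:
        futures = {name: pool.submit(fn, query) for name, fn in TARGETS.items()}
        for name, future in futures.items():
            # The slowest target bounds the response time (the "network
            # intensive" con), so each result is awaited with a timeout.
            merged.extend(normalize(name, r) for r in future.result(timeout=5))
    return sorted(merged, key=lambda r: r["year"], reverse=True)

hits = federated_search("fish")
for hit in hits:
    print(hit["year"], hit["title"], "(" + hit["source"] + ")")
```

Because nothing is indexed locally, the only ordering available here is a sort on a shared field such as year; relevance would have to be estimated from whatever each target returns.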
15. Discovery Interface
[Diagram: Web Server Application; Enrichment Services via Web services; Central Index (Solr/Lucene); Digital Repository (DC XML data, Full-text link); ILS Database (MARC Bib data, Availability/Holds)]
17. Discovery Interface
Word stemming
  “fishing”, “fished”, “fish”, “fisher” => “fish”
Fuzzy search
  insertion: cot => coat
  deletion: coat => cot
  substitution: coat => cost
Auto-suggest
  N-gram, Edge N-gram analysis
Phrase query
“Did you mean?”
  Spell Checker
Relevance ranking
  TF-IDF / Term Vector
  Term weights
  Lucene scores
Faceted browsing
  Who are main authors and their count?
  What are main subjects and their count?
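The fuzzy search, spell checking, and auto-suggest ideas on this slide can be sketched in a few lines. This is a minimal illustration using plain Levenshtein distance and prefix generation; it is not how Solr/Lucene implements these features, and the function names and the small vocabulary are assumptions made for the example.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]

def edge_ngrams(term, min_len=1, max_len=5):
    """Prefixes of a term, the units an auto-suggest index would store."""
    return [term[:n] for n in range(min_len, min(max_len, len(term)) + 1)]

def suggest(prefix, vocabulary):
    """Type-ahead: terms whose edge n-grams contain the typed prefix."""
    return [t for t in vocabulary if prefix in edge_ngrams(t, max_len=len(t))]

def did_you_mean(word, vocabulary, max_distance=1):
    """Vocabulary terms within max_distance edits, closest first."""
    scored = [(edit_distance(word, t), t) for t in vocabulary]
    return [t for d, t in sorted(scored) if 0 < d <= max_distance]

VOCAB = ["coat", "cost", "cot", "fish", "fisher"]  # hypothetical index terms
print(edit_distance("cot", "coat"))   # insertion
print(edit_distance("coat", "cot"))   # deletion
print(edit_distance("coat", "cost"))  # substitution
print(suggest("fi", VOCAB))
print(did_you_mean("cot", VOCAB))
```

Each of the three slide examples is a single edit, so a fuzzy query with a maximum distance of 1 would match all of them.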
18. Discovery Interface
Pros
  Google-like search box
  Advanced features
    Fuzzy searching
    Relevance ranking
    Word stemming algorithms
    Social tagging/reviews
    “Did you mean?” feature
    Auto-suggest (type ahead)
    Faceted browsing
  Availability/Hold requests
  Metadata enrichment
    Linking Amazon/Google/Wikipedia
  Digital repository integration
Cons
  Searches only locally hosted collections
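Relevance ranking and faceted browsing, two of the advanced features listed for the discovery interface, can be sketched as below. This is a simplified TF-IDF score over titles and a plain count of field values; it is not Lucene's actual scoring formula, and the sample records and function names are hypothetical.

```python
import math
from collections import Counter

# Hypothetical records; a real central index would hold MARC and DC metadata.
DOCS = [
    {"title": "fish fish biology", "author": "Smith", "subject": "Zoology"},
    {"title": "fish farming methods", "author": "Jones", "subject": "Aquaculture"},
    {"title": "river ecology", "author": "Smith", "subject": "Zoology"},
]

def tf_idf_search(query, docs):
    """Score docs by the sum of tf * idf over query terms; highest first."""
    tokenized = [d["title"].split() for d in docs]
    n = len(docs)
    scores = []
    for doc, tokens in zip(docs, tokenized):
        counts = Counter(tokens)
        score = 0.0
        for term in query.split():
            df = sum(1 for t in tokenized if term in t)
            if df == 0:
                continue  # term appears nowhere in the collection
            idf = math.log(n / df) + 1.0  # +1 so common terms still count
            score += (counts[term] / len(tokens)) * idf
        if score > 0:
            scores.append((score, doc))
    scores.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scores]

def facet_counts(docs, field):
    """Facet values and their counts, e.g. the main authors in a result set."""
    return Counter(d[field] for d in docs).most_common()

ranked = tf_idf_search("fish", DOCS)
print([d["title"] for d in ranked])
print(facet_counts(DOCS, "author"))
```

The record that mentions the query term more often relative to its length ranks first, and the facet counts answer the "who are the main authors and their count?" question for whatever set of records is passed in.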
19. Can we combine the two?
Modern discovery interface
Local collections +
Remote databases
Unified search result
20. Web-scale Discovery Services
[Diagram: Web Server Application; Central Index (Full-text and metadata) built from EBSCO, ProQuest, ABI Inform, PubMed, Science Direct, Lexis-Nexis, …, Library Catalog (MARC data), and Digital Repository (DC data); Availability; Full-text link]
22. Web-scale Discovery Services
Summon Service
Content types include:
  Library catalog records
  E-journal articles
  Institutional repositories
  Newspaper articles
  E-books
  Dissertations
  Conference proceedings
  Grey literature
  Cited references
  Reports
  Digital library
  Databases and more.
24. Web-scale Discovery Services
Pros
  Google-like single search box
  Pre-indexed licensed content
  Inclusion of local collection
    OAI-PMH, MARC updates
  Advanced features
    Relevance ranking
    “Did you mean?”
    Auto-suggest (type ahead)
    Faceted navigation
  Availability/Full-text links
  Mobile friendly
  Web-service APIs
  Easier off-campus access
  No installation/maintenance
Cons
  Supports limited number of databases (1000-1500)
  Requires huge investment to maintain centralized index
    Publisher partnerships (Licensing/legal issues)
    Regular pre-publication indexing
  Mostly hosted-only service
  Content bias? (ranking)
  Vendor lock-in?
  Annual subscription
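Including a local collection through OAI-PMH amounts to harvesting records and feeding them to the central index. The sketch below builds a ListRecords request and parses a Dublin Core response using only the standard library. The repository URL, the sample XML, and the function names are assumptions for illustration; the verb, parameter names, and XML namespaces are those of OAI-PMH 2.0 and Dublin Core.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None):
    """Build a ListRecords request; from_date allows incremental updates."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date
    return base_url + "?" + urlencode(params)

def parse_records(xml_text):
    """Extract identifier, title, and creator from each harvested record."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        records.append({
            "id": rec.findtext("oai:header/oai:identifier", namespaces=NS),
            "title": rec.findtext(".//dc:title", namespaces=NS),
            "creator": rec.findtext(".//dc:creator", namespaces=NS),
        })
    return records

# Hand-written sample response standing in for a real harvest.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Fish farming</dc:title>
          <dc:creator>Jones, A.</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

url = list_records_url("https://repository.example.org/oai", from_date="2013-01-01")
harvested = parse_records(SAMPLE)
print(url)
print(harvested)
```

A production harvester would also follow resumption tokens and handle deleted records, which this sketch leaves out.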
25. Can we have best of both worlds?
Modern discovery interface
  Local collections + Remote databases
  Unified search result
Supports large number of databases
Based on open standards (extensible)
Can be maintained locally (No subscription!)
[Diagram: Web Server Application; Digital Repository; ILS database (Bibs); multiple remote databases]
28. Integrated Discovery Platform
Pazpar2 Architecture
Open source (GPL)
Build your own connector!
https://www.indexdata.com/pazpar2
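A client typically talks to Pazpar2 over its HTTP web service by opening a session, submitting a query, and then polling for merged results. The sketch below only builds the request URLs for that sequence. The endpoint address and the session value are assumptions, and the command and parameter names (init, search, show, session, query, start, num) are written from memory of the Pazpar2 protocol, so they should be checked against the official documentation before use.

```python
from urllib.parse import urlencode

# Hypothetical local endpoint; adjust to wherever Pazpar2 is deployed.
BASE = "http://localhost:9004/search.pz2"

def pz2_url(command, **params):
    """Build a Pazpar2 web-service request URL for the given command."""
    return BASE + "?" + urlencode({"command": command, **params})

# 1. Open a session (the response is expected to carry a session id).
init_url = pz2_url("init")
# 2. Start a search across the configured targets within that session.
search_url = pz2_url("search", session="123", query="fish")
# 3. Fetch the first page of merged results.
show_url = pz2_url("show", session="123", start=0, num=20)

for u in (init_url, search_url, show_url):
    print(u)
```

The session value "123" is a placeholder; in practice it would be read from the response to the init request.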
29. Conclusion
Each platform has its own goals:
  A pure library catalog can provide expressive search (high precision)
  Federated search improves content coverage in a single search
  Discovery interfaces are designed to improve the user experience for local collections
  Web-scale discovery provides a unified search experience for local and remote collections (still well short in content coverage)
  An integrated platform provides extensibility (but requires significant effort in development and maintenance)
One size does not fit all. No single system is perfect.
As content becomes more open, the focus of discovery solutions should be on open platforms that are extensible as well as affordable.