Web scale discovery

WEB TECHNOLOGIES FOR LIBRARIES - 2 WORKSHOPS
June 28-29, 2011 in Petrozavodsk

Web-scale discovery systems
Karen J. Buset, NTNU University Library
Trondheim, Norway

WHY
Google has set the standard for searching not only for our users, but
also for a lot of librarians.

Federated search was implemented at libraries in an attempt to
compete with Google/Google Scholar.

This failed because of the limitations of the technology:
• the small number of resources that could be searched
simultaneously
• slow response times and problems merging results
• the need to deal with many different and constantly changing
interfaces

WHAT
Content
• harvest content from local and remotely hosted repositories
• create a centralized index, down to the article level
• the index is suited for rapid search and retrieval of results ranked
by relevancy
• harvesting of local library resources, combined with brokered
agreements with publishers and aggregators allowing access to
metadata and/or full-text content
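
The idea of one centralized index can be sketched in a few lines. This is only an illustration, not how any vendor builds its index: the class name `CentralIndex`, the record identifiers and the scoring by simple term counts are all assumptions made for the example.

```python
from collections import defaultdict
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


class CentralIndex:
    """Toy inverted index over records harvested from several sources."""

    def __init__(self):
        self.postings = defaultdict(dict)  # term -> {record_id: term count}
        self.records = {}

    def add(self, record_id, title, source):
        """Store the record and index every term in its title."""
        self.records[record_id] = {"title": title, "source": source}
        for term in tokenize(title):
            counts = self.postings[term]
            counts[record_id] = counts.get(record_id, 0) + 1

    def search(self, query):
        """Return record ids, best match first (score = summed term counts)."""
        scores = defaultdict(int)
        for term in tokenize(query):
            for record_id, count in self.postings.get(term, {}).items():
                scores[record_id] += count
        return sorted(scores, key=lambda rid: (-scores[rid], rid))


# Hypothetical records from three kinds of sources, all in one index.
index = CentralIndex()
index.add("cat:1", "Introduction to library discovery", "local catalogue")
index.add("art:7", "Discovery systems and discovery layers in libraries", "publisher feed")
index.add("rep:3", "Arctic marine biology thesis", "institutional repository")

results = index.search("discovery")
print(results)
```

Because every source is indexed in advance, a query is a single lookup rather than a set of live requests to remote systems.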

Discovery
• single search box providing a Google-like search experience

Delivery
• quick results ranked by relevancy
• modern interface offering functionality such as faceted navigation
to drill down to more specific results
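
Faceted navigation can be illustrated with a small sketch: count the values of a field across the result list, then filter on the value the user picks. The field names (`type`, `year`) and the sample hits are hypothetical.

```python
from collections import Counter


def facet_counts(records, field):
    """Count how many hits fall under each value of the given field."""
    return Counter(r[field] for r in records if field in r)


def drill_down(records, field, value):
    """Keep only the hits whose field matches the selected facet value."""
    return [r for r in records if r.get(field) == value]


# Hypothetical result list for one query.
hits = [
    {"title": "A", "type": "article", "year": 2010},
    {"title": "B", "type": "book", "year": 2010},
    {"title": "C", "type": "article", "year": 2011},
]

type_facet = facet_counts(hits, "type")
articles = drill_down(hits, "type", "article")
print(type_facet)
print([r["title"] for r in articles])
```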

Flexibility
• agnostic to underlying systems
• more open than traditional library systems, giving a library
greater scope to customize the services

HOW
Each vendor has agreements with several content suppliers from
whom they harvest materials. In addition, they harvest locally held
material such as existing library catalogues and institutional
repositories within the library using protocols such as OAI-PMH and
FTP.
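
As a rough illustration of harvesting over OAI-PMH, the sketch below builds a `ListRecords` request URL and parses a response into simple records. The base URL, the sample XML and the helper names are made up for the example; only the verb, the `metadataPrefix` and `resumptionToken` arguments and the XML namespaces come from the protocol. No network request is made.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

OAI = "{http://www.openarchives.org/OAI/2.0/}"
DC = "{http://purl.org/dc/elements/1.1/}"


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request; a resumption token replaces other arguments."""
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def parse_records(xml_text):
    """Extract identifier and title per record, plus the next-page token if any."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iter(OAI + "record"):
        identifier = rec.findtext(OAI + "header/" + OAI + "identifier")
        title = rec.find(".//" + DC + "title")
        records.append({"id": identifier,
                        "title": title.text if title is not None else None})
    token = root.findtext(".//" + OAI + "resumptionToken")
    return records, token or None


# Made-up response, shaped like an OAI-PMH ListRecords reply.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample thesis</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
    <resumptionToken>page2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

first_url = list_records_url("https://repo.example.org/oai")
records, token = parse_records(SAMPLE)
next_url = list_records_url("https://repo.example.org/oai", resumption_token=token)
print(first_url)
print(records, token)
```

A real harvester would fetch each URL, repeat until no resumption token is returned, and handle errors and rate limits.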

Pre-harvesting eliminates the need to merge results as was the case
with federated search, which in turn makes de-duplication and
relevancy ranking easier.
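
A minimal sketch of de-duplication at harvest time, assuming records carry an optional `doi` field and otherwise match on normalized title plus year. The matching rule, field names and sample records (including the DOI) are assumptions for illustration, not any vendor's method.

```python
import re


def match_key(record):
    """Prefer the DOI as match key; fall back to normalized title and year."""
    doi = record.get("doi")
    if doi:
        return "doi:" + doi.strip().lower()
    title = re.sub(r"[^a-z0-9]+", " ", record.get("title", "").lower()).strip()
    return "title:{}|{}".format(title, record.get("year"))


def deduplicate(records):
    """Merge records sharing a match key, remembering every source."""
    merged = {}
    for rec in records:
        key = match_key(rec)
        if key in merged:
            merged[key]["sources"].append(rec["source"])
        else:
            merged[key] = {**rec, "sources": [rec["source"]]}
    return list(merged.values())


# Hypothetical harvested records; the first two describe the same article.
harvested = [
    {"title": "Web Scale Discovery", "year": 2011, "doi": "10.1000/XYZ", "source": "publisher"},
    {"title": "Web scale discovery.", "year": 2011, "doi": "10.1000/xyz", "source": "aggregator"},
    {"title": "Local thesis", "year": 2009, "source": "repository"},
]

unique = deduplicate(harvested)
print(len(unique), unique[0]["sources"])
```

Since all records are available before any query is run, this merging happens once at index time instead of on every search.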

Users can search all available metadata, but authentication is needed
to get access to full text. In this way, Google-like functionality is
provided to a delimited collection of resources.

PROBLEMS
The system vendors agree that there will still be a need for direct
access to specialised search interfaces because:
• Some resources are not indexed
• Some resources are not full-text indexed
• Some resources are not available
• Some databases might offer specialised search tools not available
in web-scale-discovery systems

SYSTEMS
«The big 4»

• OCLC's WorldCat Local
• Ex Libris' Primo
• Serials Solutions' Summon
• EBSCO Discovery Service

Find links to the systems here:
https://sites.google.com/site/urd2comparison/home

USER OPINION
Several surveys at the library show that the users want a simple,
Google-like interface that will quickly provide them with relevant
results.

It seems probable that any of these systems has enough
coverage that most users will be satisfied.

Relevance ranking in these systems cannot be compared with Google's
PageRank; providing good relevance ranking may be a challenge in
a service aggregating such a diversity of metadata.

SOURCES
A brief overview of three web-scale-discovery systems: Summon,
Primo and OCLC WorldCat. NTNU University Library, 2011.

Jason Vaughan. Web Scale Discovery What and Why? Library
Technology Reports, 2011.
