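Abstract (translated from Chinese)
Since the term "resource discovery service" emerged around 2009, it has been widely discussed in the library community, and many such services have appeared in response. A resource discovery service consists of a unified index and a discovery layer. Together they give users a single search interface that simultaneously searches the diverse resources a library owns, subscribes to, or can reach through open access, and they provide functions such as relevance ranking, faceted browsing, personalization, and community services. This article explains the characteristics of resource discovery services and compiles functionality indicators that libraries can consult when selecting one.
Keywords: resource discovery service; unified index; discovery layer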
ABSTRACT
A Web-scale discovery service (discovery service, for short) is a new type of service that may enable the discovery and delivery of high-quality information in the library. A discovery service is composed of a unified index and a discovery layer. The unified index pre-harvests and pre-indexes a variety of information resources, including the MARC records created by the library, the metadata of the library's institutional repository or digital content management system, the metadata and full text (for indexing) of the databases and electronic journals to which the library subscribes, and the metadata and full text (for indexing) of open-access systems. A user can then search the contents of the unified index through the discovery layer. The discovery layer incorporates functionality such as relevance ranking, faceted navigation, personalized services, and social networking services. This article aims to explicate the features of a discovery service and to derive functionality indicators that may be adopted and/or amended by any library that wants to purchase a discovery service.
1. Exploring Functionality Indicators for Web-Scale Discovery Service
資源探索服務之功能評估指標
柯皓仁 Hao-Ren Ke
國立臺灣師範大學圖書資訊學研究所
Graduate Institute of Library and Information Studies,
National Taiwan Normal University
4. Introduction
Expenditure on electronic resources and materials is soaring
Users choose search engines as their “portal” for information
One-stop service
Simple search interface
Enormous and diverse information in one system
The principle of least effort
Quality vs. availability
6. Electronic Resources and Materials Expenditures in ARL University Libraries, 1992-2010
                  2005-06         2006-07         2007-08         2008-09         2009-10
a. Computer File Expenditures (monographic/one-time)
Total         $48,793,981     $59,808,658     $73,102,024     $69,148,203     $78,775,329
Average          $478,372        $558,959        $676,574        $628,620        $709,688
Median           $336,338        $352,802        $410,202        $363,746        $511,334
N                     102             107             108             110             111
b. Electronic Serial Expenditures
Total        $383,127,163    $476,225,086    $554,637,844    $637,458,376    $714,622,502
Average        $3,547,474      $4,290,316      $5,042,162      $5,691,593      $6,268,618
Median         $3,349,709      $4,240,530      $4,899,366      $5,337,237      $6,044,532
N                     108             111             110             112             114
c. Total Electronic Resources (Total a+b)
Total        $431,921,144    $536,033,744    $627,707,869    $706,606,579    $793,397,831
Average        $3,962,579      $4,786,016      $5,655,026      $6,253,156      $6,959,630
Median         $3,792,873      $4,661,123      $5,410,421      $5,854,147      $6,689,378
N                     109             112             111             113             114
Total Library Materials Expenditures**
Total      $1,109,340,878  $1,213,082,871  $1,279,690,962  $1,315,122,261  $1,335,309,871
Average       $10,177,439     $10,831,097     $11,528,747     $11,638,250     $11,713,244
Median         $9,156,974      $9,597,677     $10,416,077     $10,364,778     $10,529,327
N                     109             112             111             113             114
Electronic Resources Expenditures as a Percent of Total Materials Expenditures
Average             40.93           46.55           51.46           56.33           62.24
Median              43.14           47.68           53.06           57.03           62.70
N                     109             112             111             113             114
Expenditures for Bibliographic Utilities, Networks, etc. (External)
Total         $15,930,476     $18,931,797     $21,079,241     $21,695,047     $22,546,140
Average          $318,610        $225,379        $242,290        $235,816        $230,063
Median           $143,649         $33,247         $54,750         $44,745         $19,326
N                      50              84              87              92              98
7. Discovery and Delivery
Discovery – let me find relevant information
Bibliographic information, citation metadata, subject descriptors, (author) keywords, (author) abstract, full-text indexing
OPAC, A&I databases, citation databases
Delivery – let me GET the information I find
Physical information – Book shelves, ILL
Electronic information – E-journal systems, aggregator databases, ILL
ONE-STOP SERVICE?
8. Discovery and Delivery Tools
WebOPAC + 856
A-Z database / e-journal list
Federated search system
OpenURL Link Resolver
ONE-STOP SERVICE?
9. Federated Search
Metasearch, parallel search, federated search, broadcast search, cross-database search, search portal
Allows search and retrieval across multiple databases, sources, platforms, protocols, and vendors at once
[Diagram: Unified UI → Broker/Agent/Value-added Service → Electronic Resource 1, Electronic Resource 2, Electronic Resource 3]
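The broker pattern on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three resources, their records, and their delays are made up, and `make_source` and `federated_search` are hypothetical names. A real broker would speak each vendor's own search protocol instead of calling local functions.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait


def make_source(name, records, delay):
    """Build a hypothetical resource: a search function with simulated latency."""
    def search(query):
        time.sleep(delay)  # stands in for network and remote search time
        return [(name, title) for title in records if query.lower() in title.lower()]
    return search


# Made-up resources and records, used only for illustration.
SOURCES = {
    "Electronic Resource 1": make_source(
        "Electronic Resource 1", ["Library discovery tools", "Metadata basics"], 0.01),
    "Electronic Resource 2": make_source(
        "Electronic Resource 2", ["Discovery and delivery", "Link resolvers"], 0.02),
    "Electronic Resource 3": make_source(
        "Electronic Resource 3", ["Discovery layers"], 1.0),  # deliberately slow
}


def federated_search(query, sources, timeout=0.3):
    """Broadcast the query to all sources in parallel.

    Returns (hits, timed_out): merged hits from the sources that answered
    within `timeout` seconds, and the names of the sources that did not.
    """
    pool = ThreadPoolExecutor(max_workers=len(sources))
    futures = {pool.submit(search, query): name for name, search in sources.items()}
    done, not_done = wait(futures, timeout=timeout)
    pool.shutdown(wait=False)  # do not block on slow sources
    hits = sorted(hit for future in done for hit in future.result())
    timed_out = sorted(futures[future] for future in not_done)
    return hits, timed_out


hits, timed_out = federated_search("discovery", SOURCES)
```

Because every query is sent out at search time, the slowest source bounds the response: here the third resource misses the timeout and its hits are simply absent from the merged list.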
10. Complaints about Federated Search
Complicated interface
Which databases can be cross-searched?
Maximum number of databases that can be cross-searched
Slow response time (or connection timeout) because the distributed search runs on the fly
Poor relevance ranking
Poor de-duplication
11. Web-Scale Discovery Service
Superstar for one-stop service in the library?
Google can. WE can too. (Ah, who are “we”?)
Web-scale discovery (WSD) service
Google (Scholar)-like one-stop service, simple search (discovery) interface, excellent relevance ranking, effective information delivery
Two important characteristics
Pre-harvested central index
Discovery layer
12. WSD Products
Innovative Interfaces Encore Synergy
EBSCO Discovery Service
Ex Libris Primo Central
OCLC WorldCat Local
Serials Solutions Summon
14. Sites Used for Testing
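EBSCO Discovery Service (EDS)
Macau University of Science and Technology Library
National Taiwan Normal University Library
Ex Libris Primo Central (and Primo)
National Chiao Tung University Library
Serials Solutions Summon
Academia Sinica
Because of user authentication, I may not have been able to explore every function of each system.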
16. Pre-harvested Central Index
The central index periodically pre-harvests metadata and full text from various information sources, normalizes them into a unified schema, and indexes them using information retrieval techniques
The content harvested into the central index makes up the content that can be discovered through a Web-scale discovery service
Information sources
Local collection
Global resources
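The harvest, normalize, and index steps described above can be illustrated with a toy inverted index. This is a minimal sketch, not any vendor's implementation: the unified schema fields (`id`, `title`, `creator`, `scope`), the shape of the incoming records, and the sample data are all assumptions made for the example.

```python
import re
from collections import defaultdict


def normalize(source_type, raw):
    """Map a source-specific record onto one unified schema (hypothetical fields)."""
    if source_type == "ils":  # assumed shape of a record exported from the ILS
        return {"id": "ils:" + raw["control_no"], "title": raw["title"],
                "creator": raw.get("author", ""), "scope": "local"}
    if source_type == "publisher":  # assumed shape of a publisher metadata feed
        return {"id": "pub:" + raw["doi"], "title": raw["article_title"],
                "creator": "; ".join(raw.get("authors", [])), "scope": "global"}
    raise ValueError(f"unknown source type: {source_type}")


def tokenize(text):
    return re.findall(r"\w+", text.lower())


class CentralIndex:
    """Toy pre-harvested index: token -> set of record ids."""

    def __init__(self):
        self.records = {}
        self.postings = defaultdict(set)

    def harvest(self, source_type, raw_records):
        for raw in raw_records:
            record = normalize(source_type, raw)
            self.records[record["id"]] = record
            for token in tokenize(record["title"] + " " + record["creator"]):
                self.postings[token].add(record["id"])

    def search(self, query):
        """Return ids of records containing every query token (Boolean AND)."""
        tokens = tokenize(query)
        if not tokens:
            return []
        ids = set.intersection(*(self.postings.get(t, set()) for t in tokens))
        return sorted(ids)


# Made-up sample data from a local and a global source.
index = CentralIndex()
index.harvest("ils", [{"control_no": "0001", "title": "Handbook of information retrieval",
                       "author": "Doe, Jane"}])
index.harvest("publisher", [{"doi": "10.0000/example", "article_title":
                             "Discovery services in libraries", "authors": ["Roe, Richard"]}])
```

Because indexing happens before any user searches, `index.search("discovery")` is a lookup against local postings instead of a live query to the publisher, which is the main contrast with federated search.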
17. Pre-harvested central Index
[Figure: local collection (physical holdings (ILS), institutional repository (IR), digital archives (DA), various CMSs) and global resources (A&I DB, citation DB, e-journals, e-books, DA, IR, library collections, …) both feed the pre-harvested central index]
18. Information Sources of the Central Index (CI)
Library-supplied data
MARC records from ILS
Metadata records from IRs, DAs, and CMSs
Open access data
arXiv.org, e-Prints, Hindawi Publishing, DOAJ
Publisher metadata and full text
WSD-licensed material
Mutually licensed material
Ask WSD vendors to give you an overlap report (Hoeppner, 2012)
19. Metadata and Full Text in CI
Metadata types
MARC, Dublin Core, EAD
Generic XML
Levels of metadata
Citation metadata: identifier, contributor, title, date, edition, place published, publisher, URL, context
Subject descriptors
(Author-supplied) keywords and abstracts
Full text
Full texts are used for (full-text) indexing/search
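As a rough illustration of these metadata levels, the sketch below reads a small Dublin Core record and sorts its elements into citation metadata, subject descriptors, and abstract. The sample record is invented, and both the `metadata_levels` function and the output dictionary layout are assumptions for this example. Only the Dublin Core namespace and element names are standard.

```python
import xml.etree.ElementTree as ET

DC = "{http://purl.org/dc/elements/1.1/}"  # Dublin Core element set namespace

# An invented record, used only to show the shape of the data.
SAMPLE = """
<record xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:identifier>example-0001</dc:identifier>
  <dc:title>A made-up article about discovery layers</dc:title>
  <dc:creator>Doe, Jane</dc:creator>
  <dc:date>2012</dc:date>
  <dc:publisher>Example Press</dc:publisher>
  <dc:subject>Library catalogs</dc:subject>
  <dc:subject>Information retrieval</dc:subject>
  <dc:description>A short invented abstract.</dc:description>
</record>
"""


def metadata_levels(xml_text):
    """Group Dublin Core elements by the metadata levels listed on the slide."""
    root = ET.fromstring(xml_text.strip())

    def values(element):
        return [el.text.strip() for el in root.iter(DC + element) if el.text]

    def first(element):
        found = values(element)
        return found[0] if found else None

    return {
        "citation": {
            "identifier": first("identifier"),
            "title": first("title"),
            "contributor": values("creator"),
            "date": first("date"),
            "publisher": first("publisher"),
        },
        "subject_descriptors": values("subject"),
        "abstract": first("description"),
        "has_full_text": False,  # this sample carries metadata only
    }


levels = metadata_levels(SAMPLE)
```

A record that stops at the citation level can still be found by title or author, but it cannot match queries on subject terms or abstract words, which is why the depth of metadata matters when comparing central indexes.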
22. Factors Affecting the Content Available to MY Library
The five types of content in the CI
How much of the content MY library subscribes to is covered by the CI?
Does MY library want users to DISCOVER content that it does not subscribe to?
Difficult to DELIVER?
Use an OpenURL link resolver to connect to NDDS or Rapid ILL?
(Hoeppner, 2012)
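One way to approach the coverage question is to compare the library's subscribed titles with a title list supplied by the vendor. The sketch below assumes both lists can be reduced to sets of ISSNs. The ISSNs shown are placeholders, not real journals, and `coverage_report` is a hypothetical helper.

```python
def coverage_report(subscribed, indexed):
    """Compare subscribed titles with titles in the central index (both by ISSN)."""
    subscribed, indexed = set(subscribed), set(indexed)
    covered = subscribed & indexed
    percent = 100.0 * len(covered) / len(subscribed) if subscribed else 0.0
    return {
        "percent_covered": round(percent, 1),
        "covered": sorted(covered),
        "not_covered": sorted(subscribed - indexed),
        # Content users could discover but the library cannot deliver directly.
        "discoverable_but_not_subscribed": sorted(indexed - subscribed),
    }


# Placeholder ISSNs for illustration only.
my_subscriptions = {"0000-0001", "0000-0002", "0000-0003", "0000-0004"}
central_index_titles = {"0000-0002", "0000-0003", "0000-0004", "0000-0009"}

report = coverage_report(my_subscriptions, central_index_titles)
```

The last key in the report corresponds to the delivery question on the slide: titles in that list would need a link resolver or an interlibrary loan route if the library chooses to expose them.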
23. Watch Out! Coverage!
A WSD vendor may claim that its content covers X% of the content in database Y
The WSD vendor may negotiate with a publisher directly to license the publisher’s content
Do you appreciate the value-added processing done by the vendor of Y?
Levels of metadata
So… can MY library cancel Y?
Some WSD services may recommend databases according to a user’s query
24. Steps for Importing Library-Supplied Data into the CI
Data mapping: MARC → WSD schema
Flexible MARC mapping mechanism
Search and display fields
CMARC, USMARC, MARC21 to WSD schema?
How to mark NEW, DELETE, UPDATE records?
Data extraction (Daniels & Roth, 2012)
OAI-PMH? FTP?
Automated process? Frequency? De-duplication? Report?
Metadata quality is essential
Verification
Integrate the verification process into the daily routine
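A flexible mapping mechanism can be as simple as one lookup table per MARC flavor. The sketch below is an assumption-laden illustration: the WSD field names are hypothetical, records are simplified to dictionaries keyed by (tag, subfield), the sample values are invented, and the CMARC tags are assumed to follow UNIMARC-style numbering. A production mapping would cover far more fields and repeated subfields.

```python
# One mapping table per MARC flavor: WSD field -> (tag, subfield) candidates.
# The WSD field names are hypothetical; CMARC tags assume UNIMARC-style numbering.
MAPPINGS = {
    "MARC21": {"title": [("245", "a")], "creator": [("100", "a")],
               "isbn": [("020", "a")], "publisher": [("260", "b")]},
    "CMARC": {"title": [("200", "a")], "creator": [("700", "a")],
              "isbn": [("010", "a")], "publisher": [("210", "c")]},
}


def map_record(marc_fields, marc_format):
    """Map a simplified MARC record onto the (hypothetical) WSD schema."""
    wsd = {}
    for field, candidates in MAPPINGS[marc_format].items():
        for tag, subfield in candidates:
            value = marc_fields.get((tag, subfield))
            if value:
                # Drop trailing ISBD punctuation such as " /" or ","
                wsd[field] = value.strip(" /:,")
                break
    return wsd


# Invented records describing the same item in two formats.
marc21_record = {("245", "a"): "An invented title /", ("100", "a"): "Doe, Jane",
                 ("020", "a"): "0000000000", ("260", "b"): "Example Press,"}
cmarc_record = {("200", "a"): "An invented title", ("700", "a"): "Doe, Jane",
                ("010", "a"): "0000000000", ("210", "c"): "Example Press"}

from_marc21 = map_record(marc21_record, "MARC21")
from_cmarc = map_record(cmarc_record, "CMARC")
```

Keeping the mapping in data instead of code lets a library adjust search and display fields without touching the import routine, and makes it easy to verify that both formats yield the same WSD record.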
26. Record Coding System
Record code    Match on MARC ID    Action
N or D         YES                 Remove record from CI
X or -         YES                 Update record in CI
N or D         NO                  Don’t add record to CI
X or -         NO                  Add record to CI
• N = suppressed from public view
• D = record ready for delete
• X = available for public view
• - = available for public view
(Daniels & Roth, 2012)
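The table above is effectively a two-input decision rule, which can be written out as a small function. The function name `ci_action` and the action labels are hypothetical; the codes and outcomes are the ones listed in the table.

```python
HIDDEN_CODES = {"N", "D"}   # suppressed from public view / ready for delete
VISIBLE_CODES = {"X", "-"}  # available for public view


def ci_action(record_code, matched_in_ci):
    """Return the action for one catalog record, following the table above."""
    if record_code in HIDDEN_CODES:
        return "remove" if matched_in_ci else "skip"
    if record_code in VISIBLE_CODES:
        return "update" if matched_in_ci else "add"
    raise ValueError(f"unknown record code: {record_code!r}")


# Walk through every row of the table.
decisions = {(code, matched): ci_action(code, matched)
             for code in ("N", "D", "X", "-") for matched in (True, False)}
```

Raising an error on an unknown code is a design choice made for this sketch: it surfaces unexpected catalog codes during the verification step instead of silently adding or dropping records.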
28. Discovery Layer
The user interface and search system for discovering, displaying, and interacting with the content in library systems, such as a WSD central index
Functionality
Google-like simple search and advanced search
Query refinement and faceted browsing
Relevance ranking
Display and delivery
Branding and customization
Personalized and community service
29. Google-like Simple Search and Advanced Search
Query fields
Boolean logic, relation logic, truncation, wildcards
Contain, equal to, start with
Phrase, adjacent, stopwords
Spelling suggestion / did you mean?
Integration with Federated Search System
Can the search box be embedded into library web sites, LCMSs, subject guides?
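Two of the query features listed above, spelling suggestion and truncation, can be approximated with the Python standard library. This is a sketch under assumptions: the vocabulary is a made-up list standing in for the terms of a real index, the similarity cutoff of 0.8 is arbitrary, and actual products may use quite different techniques.

```python
import difflib
from fnmatch import fnmatchcase

# A made-up vocabulary standing in for the terms of a real index.
VOCABULARY = ["catalog", "delivery", "discovery", "librarian", "library", "metadata"]


def did_you_mean(term, vocabulary=VOCABULARY):
    """Suggest the closest indexed term, or None if the term is known or too far off."""
    term = term.lower()
    if term in vocabulary:
        return None
    matches = difflib.get_close_matches(term, vocabulary, n=1, cutoff=0.8)
    return matches[0] if matches else None


def expand_truncation(pattern, vocabulary=VOCABULARY):
    """Expand a truncated or wildcard query such as 'librar*' into indexed terms."""
    return sorted(word for word in vocabulary if fnmatchcase(word, pattern.lower()))


suggestion = did_you_mean("discovry")
expanded = expand_truncation("librar*")
```

In this sketch `*` matches any run of characters and `?` matches exactly one, which mirrors the usual distinction between truncation and single-character wildcards.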
33. Why Federated Search System?
[Figure: local collection (physical holdings (ILS), institutional repository (IR), digital archives (DA), various CMSs) and global resources (A&I DB, citation DB, e-journals, e-books, DA, IR, library collections, …) both feed the pre-harvested central index]
37. Search History and Result Export
Search history
Query strategies of current session
Query strategies combination
Query strategy modification
Save query strategies
Create SDI from query strategies
Result export
Mark and save search results
Print and email search results
Export search results to bibliographic management software
Support APA, MLA, Chicago
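Export to bibliographic management software usually means writing a tagged text format that such software can import. The sketch below writes RIS, chosen here only as one widely supported example; the slide does not say which formats any product supports. The `to_ris` helper and the record layout are hypothetical, and the sample record is taken from this deck's reference list.

```python
# RIS tag -> key in our (hypothetical) record dictionary.
RIS_FIELDS = (("TI", "title"), ("JO", "journal"), ("PY", "year"),
              ("VL", "volume"), ("IS", "issue"))


def to_ris(record):
    """Serialize one search result as an RIS entry."""
    lines = ["TY  - " + record.get("type", "JOUR")]
    lines += ["AU  - " + author for author in record.get("authors", [])]
    for tag, key in RIS_FIELDS:
        if record.get(key):
            lines.append(f"{tag}  - {record[key]}")
    lines.append("ER  - ")  # end-of-record marker
    return "\n".join(lines)


result = {
    "authors": ["Hoeppner, A."],
    "title": "The ins and outs of evaluating Web-scale discovery services",
    "journal": "Computers in Libraries",
    "year": "2012",
    "volume": "32",
    "issue": "3",
}

ris_text = to_ris(result)
```

Citation styles such as APA, MLA, and Chicago are a separate concern: they govern how a reference is displayed, while a format like RIS carries the underlying fields from which any style can be rendered.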
40. Relevance Ranking
TF*IDF
Term frequency * Inverse Document Frequency
Occurrence of query terms in important metadata fields
Adjacency of query terms in metadata/full text
Currency of information
Type-specific parameters
Increase the ranking of library-supplied data
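Three of the factors above (TF*IDF, field importance, and a boost for library-supplied data) can be combined in a toy scoring function. Everything specific here is an assumption: the documents are invented, the field weights and the multiplicative local boost are arbitrary, and adjacency, currency, and type-specific parameters are left out. Real products do not publish their formulas in this form.

```python
import math

FIELD_BOOST = {"title": 3.0, "abstract": 1.0}  # assumed field weights

# Invented documents; "local" marks library-supplied data.
DOCS = {
    "d1": {"title": "web scale discovery",
           "abstract": "discovery services for libraries", "local": False},
    "d2": {"title": "library catalog",
           "abstract": "the catalog supports discovery", "local": True},
    "d3": {"title": "cataloging rules",
           "abstract": "rules for descriptive cataloging", "local": False},
}


def tokens(text):
    return text.lower().split()


def idf(term, docs):
    """Inverse document frequency: log(N / df), or 0 if the term is unseen."""
    df = sum(1 for d in docs.values()
             if any(term in tokens(d[field]) for field in FIELD_BOOST))
    return math.log(len(docs) / df) if df else 0.0


def score(doc, query, docs, local_boost):
    total = 0.0
    for term in tokens(query):
        weighted_tf = sum(boost * tokens(doc[field]).count(term)
                          for field, boost in FIELD_BOOST.items())
        total += weighted_tf * idf(term, docs)
    return total * (local_boost if doc["local"] else 1.0)


def rank(query, docs=DOCS, local_boost=1.0):
    """Return ids of matching documents, best first."""
    scored = [(score(d, query, docs, local_boost), doc_id) for doc_id, d in docs.items()]
    return [doc_id for s, doc_id in sorted(scored, key=lambda p: (-p[0], p[1])) if s > 0]


neutral = rank("discovery")
favor_local = rank("discovery", local_boost=5.0)
```

With no boost, the title match in `d1` outranks the abstract-only match in the local record `d2`; raising `local_boost` reverses the order, which is the kind of tuning the last bullet on the slide refers to.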
41. Branding and Customization
Template
Header/footer customization: naming, logo, hyperlinks
Color scheme
Customize interfaces for different groups of users
Provide API/Web services for customization
Widget
Value-added contents (Google Books Preview, Amazon…)
43. Personalized Services
Integrate with the library’s authentication mechanism (EZproxy, LDAP, ILS…)
Personalized page layout
Save query
Selective Dissemination of Information (SDI)
Integrate with the personalized service of ILS
44. Community Services
Review / comment
Tagging
Share with social networks (Facebook, Twitter)
Share with social bookmarks (Delicious, Connotea)
49. Central Index
Scope and depth of content being indexed, including CHINESE content
Fit between the indexed content and the requirements of MY library
Licenses between the WSD vendor and publishers, database vendors, aggregators
Richness and consistency of metadata included in CI
Frequency of content updates
Ease of incorporating local content
50. Discovery Layer
Usability of discovery layer
Simple, easy-to-use query interface
Quality of query results (like relevance ranking)
Customization of query and relevance ranking
Query refinement and faceted navigation
Integration with the library’s existing systems
New user environment support (like mobile WSD and community services)
51. Fit with the Library
Ease of implementation
Compatibility with existing software and content
Responsiveness to user requirements and problems
Mid- and long-term development plan
Overall evaluation of the vendor
52. Pricing and Implementation Model
Purchase or subscription
Local implementation or cloud service (SaaS)
Pricing model (FTE, size of library-supplied data…)
Maintenance or subscription fee
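One possible way to apply the four indicator groups (central index, discovery layer, fit with the library, pricing) is a weighted score per candidate product. The sketch below is not part of the original evaluation: the weights, the 1-to-5 scale, the product names, and the scores are all invented, and each library would set its own.

```python
# Invented weights for the four indicator groups; they must sum to 1.
WEIGHTS = {"central_index": 0.4, "discovery_layer": 0.3,
           "fit_with_library": 0.2, "pricing": 0.1}


def weighted_score(scores, weights=WEIGHTS):
    """Combine per-group scores (e.g. on a 1-5 scale) into one number."""
    if set(scores) != set(weights):
        raise ValueError("scores must cover exactly the weighted indicator groups")
    return sum(weights[group] * scores[group] for group in weights)


# Invented scores for two hypothetical products.
CANDIDATES = {
    "Product A": {"central_index": 4, "discovery_layer": 3,
                  "fit_with_library": 5, "pricing": 2},
    "Product B": {"central_index": 3, "discovery_layer": 4,
                  "fit_with_library": 3, "pricing": 4},
}

ranking = sorted(CANDIDATES, key=lambda name: weighted_score(CANDIDATES[name]),
                 reverse=True)
```

A single number hides trade-offs, so it is probably best treated as a conversation starter for the selection committee and not as the decision itself.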
55. References
ARL (n.d.). Electronic Resources and Materials Expenditures in ARL University Libraries, 1992-2010. Retrieved from http://www.arl.org/bm~doc/t7_emat_intro.xls.
ANDS (n.d.). Citation of datasets and collections. Retrieved from http://ands.org.au/guides/cpguide/cpgcitation.html.
Breeding, M. (2011). Automation marketplace 2011: The new frontier. Library Journal, 136(6). Retrieved from http://www.libraryjournal.com/lj/home/889533-264/automation_marketplace_2011_the_new.html.csp.
Daniels, J. & Roth, P. (2012). Incorporating Millennium catalog records into Serials Solutions' Summon. Technical Services Quarterly, 29, 193-199.
Gross, J. & Sheridan, L. (2011). Web scale discovery: the user experience. New Library World, 112(5/6), 236-247.
Hoeppner, A. (2012). The ins and outs of evaluating Web-scale discovery services. Computers in Libraries, 32(3), 6-10, 38-40.
Luther, J. & Kelly, M. C. (2011). The next generation of discovery. Library Journal, 136(5), 66-71. Retrieved from http://www.libraryjournal.com/lj/home/889250-264/the_next_generation_of_discovery.html.csp.
Manning, C. D., Raghavan, P., & Schütze, H. (2008). Introduction to Information Retrieval. Cambridge University Press.
Miller, P. (2006). Library 2.0: The challenge of disruptive innovation. Retrieved from http://cmapspublic2.ihmc.us/rid=1211299379745_1806224281_20373/447_Library_2_prf1.pdf
OCLC (2005). Perceptions of library and information resources. Retrieved from http://www.oclc.org/reports/pdfs/Percept_all.pdf.
Vaughan, J. (2011). Web scale discovery services. Library Technology Reports, 47(1).
柯皓仁(2011)。圖書館自動化與數位化—綜述。中華民國一百年圖書館年鑑。頁157-164。
黃明居(2011)。圖書館自動化與數位化—次世代圖書館館藏整合查詢系統。中華民國一百年圖書館年鑑。頁164-166。
黃鴻珠(2011)。大專校院圖書館—綜述。中華民國一百年圖書館年鑑。頁95-108。
麥綺雯(2012)。如何挑選合適的探索工具—香港教育學院圖書館的經驗分享。2012年第十一屆海峽兩岸圖書資訊學學術
研討會論文集A輯(頁295-306)。