1. Old silos, new silos, no silos
From redundancy to aggregation or distribution?
Lukas Koster
Library Systems Coordinator
Library of the University of Amsterdam
@lukask
l.koster@uva.nl
SWIB12, Köln, November 26-28, 2012
http://www.flickr.com/photos/alpoma/4786094569
2. What is a silo?
“System incapable of reciprocal operation with other, related information systems”
http://en.wikipedia.org/wiki/Information_silo
Old silos, new silos, no silos - SWIB12 -
2
@lukask
http://www.flickr.com/photos/7716310@N03/1472185959
3. “System containing data, only accessible via internal system calls”
[Diagram: SILO versus NO SILO, showing access to the data]
4. Application Programming Interface
API: system call
Direct access (LOD): data call
http://twitter.com/hochstenbach/status/268033173366124544
http://twitter.com/jindrichmynarz/status/269089964103434242
5. Focus
Traditional library content: publications
http://commons.wikimedia.org/wiki/File:Latin_dictionary.jpg
7. http://igelu.org/wp-content/uploads/2010/10/smug4eu_issue2.pdf
8. ! 7 years ago!
A: Federated search systems are dead
But you won’t get those kids back that you have already lost. They will stay
with Google
B: But what is the alternative? Build big databases like Google does?
Harvest data till you drop?
Probably we do need both. Big databases as well as federated search.
C: Perhaps only a limited number of distributed harvested indexes
around the world.
Federated search can survive and be used for accessing those indexes!
10. http://igelu.org/wp-content/uploads/2010/10/smug4eu_issue4.pdf
11. If libraries describe their philosophy they’ll certainly also talk about integration
– integration of services, integration of assets and resources.
To make an excellent service all these tools need to function as an integrated
system. Not only with each other, but also interacting with the rest, such as the
institutional website, the institutional portal, third party’s learning environments.
Technically speaking, integration may result in a functioning or a unified whole.
A “functioning whole” may be understood as an integrated structure of
cooperating tools in which the different parts (like in this case SFX, MetaLib or
other tools) can be distinguished as such, whereas a “unified whole” would
mean that the end users experience one “new” system or user interface that
conceals the integrated parts.
Combining different databases in real or virtual collections that can be
accessed by a number of integrated or nonintegrated tools.
When libraries are talking about integration, they are not only thinking about
the user’s perspective, but also about the backend of their business, where
internal workflows are in focus. In general, our aim is to reduce duplicating
work, to store the same information only once, to cooperate closely.
12. Libraries – integration of services, assets and resources.
Integrated systems
• Functioning whole: an integrated structure of cooperating,
distinguishable tools
• Unified whole: one new system of concealed integrated parts
Combined databases
• Real collections
• Virtual collections
Data management
• Harvest data
• Big data
• Distributed harvested indexes
Backend integration of internal workflows
• Reduce duplicating work
• Store the same information only once
• Cooperate closely
21. Silos
Data traps
http://www.flickr.com/photos/msueasrecords/6242093298/
22. Silo origins
Technology
Economics
Licenses
Tradition
Psychology
http://www.flickr.com/photos/kotomi-jewelry/5035479051/
23. Serious Story
24. ILS = Library Catalogue
REPO = Institutional Repository
DB = Database
EJ = EJournal
Bibliographic data
ILS REPO ILS REPO ILS REPO DB DB DB EJ EJ EJ
25. ILS = Library Catalogue
REPO = Institutional Repository
DB = Database
EJ = EJournal
Federated search
ILS REPO ILS REPO ILS REPO DB DB DB EJ EJ EJ
26. ILS = Library Catalogue
REPO = Institutional Repository
DB = Database
EJ = EJournal
AGG = Aggregation
Aggregations
Google Scholar
Worldcat
Union Catalogue
Repository Gateway
AGG AGG AGG
ILS REPO ILS REPO ILS REPO DB DB DB EJ EJ EJ
27. ILS = Library Catalogue
REPO = Institutional Repository
DB = Database
EJ = EJournal
AGG = Aggregation
Federated search
Google Scholar
Worldcat
Union Catalogue
Repository Gateway
AGG AGG AGG
ILS REPO ILS REPO ILS REPO DB DB DB EJ EJ EJ
28. ILS = Library Catalogue
REPO = Institutional Repository
DB = Database
EJ = EJournal
AGG = Aggregation
Aggregated search
Google Scholar
Worldcat
Union Catalogue
Repository Gateway
AGG AGG AGG
ILS REPO ILS REPO ILS REPO DB DB DB EJ EJ EJ
29. ILS = Library Catalogue
REPO = Institutional Repository
DB = Database
EJ = EJournal
AGG = Aggregation
DI = Discovery Index
DL = Discovery Layer
Discovery
DL DL DL
DI DI DI
Google Scholar
Worldcat
Union Catalogue
Repository Gateway
AGG AGG AGG
ILS REPO ILS REPO ILS REPO DB DB DB EJ EJ EJ
30. ILS = Library Catalogue
REPO = Institutional Repository
DB = Database
EJ = EJournal
AGG = Aggregation
DI = Discovery Index
DL = Discovery Layer
Bibliographic data
DL DL DL
DI DI DI
Google Scholar
Worldcat
Union Catalogue
Repository Gateway
AGG AGG AGG
ILS REPO ILS REPO ILS REPO DB DB DB EJ EJ EJ
31. ILS = Library Catalogue
REPO = Institutional Repository
DB = Database
EJ = EJournal
AGG = Aggregation
DI = Discovery Index
DL = Discovery Layer
B = Bibliographic data
H = Holdings data
C = Circulation data
[Diagram: DL, DI, Google Scholar, Worldcat, Union Catalogue, Repository Gateway, AGG, ILS, REPO, DB and EJ, each marked with the data types (B, H, C) it stores]
32. ILS = Library Catalogue
REPO = Institutional Repository
DB = Database
EJ = EJournal
AGG = Aggregation
DI = Discovery Index
DL = Discovery Layer
NG = Next Generation ILS
B = Bibliographic data
H = Holdings data
C = Circulation data
[Diagram: DL, DI, Google Scholar, Worldcat, Union Catalogue, Repository Gateway, AGG, ILS, REPO, DB and EJ, each marked with the data types (B, H, C) it stores; three NG systems are added]
33. ILS = Library Catalogue
REPO = Institutional Repository
DB = Database
EJ = EJournal
AGG = Aggregation
DI = Discovery Index
DL = Discovery Layer
NG = Next Generation ILS
B = Bibliographic data
H = Holdings data
C = Circulation data
[Diagram: DL, DI, Google Scholar, Worldcat, Union Catalogue, Repository Gateway, AGG, NG, ILS, REPO, DB and EJ, each marked with the data types (B, H, C) it stores; the NG systems now carry B, H and C]
34. ILS = Library Catalogue
REPO = Institutional Repository
DB = Database
EJ = EJournal
AGG = Aggregation
DI = Discovery Index
DL = Discovery Layer
NG = Next Generation ILS
B = Bibliographic data
H = Holdings data
C = Circulation data
[Diagram: same layers, each marked with the data types (B, H, C) it stores; the bottom row now reads REPO REPO ILS REPO DB DB DB EJ EJ EJ, with two of the three ILS gone]
35. Not very efficient
Free market leads to efficiency?
Not really!
Huge redundancies
Missing data
Obscure, cloudy situation
Unequal access
36. Library crisis
245 $ a
38. [Diagram labels: Stand-alone; Aggregation; Redundant; Complementary; Stand-alone aggregation; Redundant aggregations; Complementary aggregations; Redundant complementary]
39. Completely distributed
• Less redundancy
• Up to date
• Trust
• Share workload
• …

• Redundancy risk
• Performance risk
• Persistence risk
• Inconsistent indexes
• Trust
• …
http://en.wikipedia.org/wiki/Fallacies_of_Distributed_Computing
40. Is linked data the answer?
Karen Coyle at EMTACL12:
“The web is awash in bibliographic
data”
“Most of what we would contribute
would be duplication of data that already
exists”
“It is library holdings that is key to
providing service and furthering
knowledge creation”
http://www.kcoyle.net/presentations/thinkDiff.pdf
44. Next Generation ILS in the cloud
Community/shared zone: FRBR Work, Expression, Manifestation (compare Worldcat, Union Catalogue)
Library zone: Holdings, Items (compare ILS)
http://thoughts.care-affiliates.com/2012/10/impressions-of-new-library-service.html
45. LoC Bibliographic Framework
Work, Expression, Manifestation
Holdings, Items
http://www.loc.gov/marc/transition/
46. Community/shared zone: Worldcat, Union Catalogue
Library zone: ILS
47. Community/shared zone: Worldcat, Google Scholar
Library zone
48. Brave new world
49. Library roles
• Reconsider objectives
• Shift focus
• Communicate with system vendors
• Communicate with content vendors
• Engage in framework transition
• Cooperate
50. Don’t think systems
Think data
http://www.flickr.com/photos/57402879@N00/261930788/
Editor's notes
Questions, questions, questions
(Wikipedia) Information silos: systems incapable of reciprocal operation with other, related information systems. For whom is this a problem in the library world?
Silo: data within a system, only accessible via internal system calls. Better now: direct access to data without going through the system.
Lots of “silo” systems provide APIs, but they only give access to system functionality, not directly to the data.
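The difference between a system call and a data call can be sketched in a few lines of Python. This is an illustration only: the classes `SiloCatalogue` and `OpenCatalogue`, the `Record` fields and the sample record are hypothetical and do not describe any real library system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    identifier: str
    title: str
    author: str


class SiloCatalogue:
    """Hypothetical silo: the data is reachable only through system calls."""

    def __init__(self, records):
        self._records = list(records)  # internal, not offered to clients

    def search_titles(self, term):
        # The API decides what may be asked and what comes back:
        # here only titles, never the full records.
        return [r.title for r in self._records if term.lower() in r.title.lower()]


class OpenCatalogue:
    """Hypothetical non-silo: the records themselves are published."""

    def __init__(self, records):
        self.records = tuple(records)  # direct access to the data


sample = [Record("rec-1", "Latin dictionary", "Unknown")]

silo = SiloCatalogue(sample)
open_cat = OpenCatalogue(sample)

# System call: only the question the API foresaw can be asked.
via_api = silo.search_titles("latin")

# Data call: any question can be asked of the data itself.
via_data = [r for r in open_cat.records if r.author == "Unknown"]
```

With the silo, a client that wants to filter on author has to wait for the vendor to add that function; with direct access, the client simply reads the data.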
Focus in this talk will mainly be on traditional library content. This is where the silos are.
A couple of stories. Here is one; it starts 7 years ago. SFX/MetaLib User Group: an informal organisation of Ex Libris customers. Smug = “selbstgefällig” (the German word for smug). SMUG 4 EU Newsletter initiative. This: a “funny” conversation between 3 conference delegates, addressing issues facing libraries and distributed search.
Summary of issues from SMUG-4-EU 2005-2007. These issues are still very much alive.
The origins of silos in the library data world
Paper, physical catalogues transposed to the digital world, standalone at first. MARC was actually the first attempt to solve the library silo problem ;-)
Catalogues moved to the online web world
Scholarly databases appeared, first standalone, CD-ROM
Scholarly databases and eJournals moved to the online web world
All these systems kept the closed, introverted nature of the pre-web era: silos, data trapped inside systems
Summary of the origins of silos in the library world. Besides “old” technology and commercial reasons, the psychological mindset and traditions are also important. Libraries want to defend their “authoritative” positions as much as publishers and vendors want to defend their commercial positions.
This is about data, including metadata. Starting with traditional bibliographic data containers.
On top of the low-level systems: aggregations. Two areas: local library data (books, journals), and shared databases and ejournals. Extra “super” levels: WorldCat etc., Google Scholar.
Federated search is also possible with aggregations
Or a new level of federated search: aggregated search?
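A minimal Python sketch of the two search styles, under stated assumptions: the source names reuse the ILS and REPO labels from the slides, but the titles, the function names and the substring matching are made up for illustration.

```python
def federated_search(sources, term):
    """Query every source at search time and merge the answers."""
    hits = []
    for name, search in sources.items():
        for title in search(term):
            hits.append((name, title))
    return hits


def harvest(sources_data):
    """Copy all records into one central index ahead of time."""
    index = []
    for name, titles in sources_data.items():
        for title in titles:
            index.append((name, title))
    return index


def aggregated_search(index, term):
    """Search the single harvested index; no source is contacted."""
    return [(name, title) for name, title in index if term.lower() in title.lower()]


def make_search(titles):
    """Wrap a list of titles in a simple substring search function."""
    return lambda term: [t for t in titles if term.lower() in t.lower()]


# Two made-up sources, each holding a list of titles.
data = {
    "ILS": ["Linked data for libraries", "Cataloguing rules"],
    "REPO": ["Linked data in repositories"],
}
live_sources = {name: make_search(titles) for name, titles in data.items()}

federated_hits = federated_search(live_sources, "linked data")
index = harvest(data)
aggregated_hits = aggregated_search(index, "linked data")
```

Both approaches return the same hits here. The difference is where the data lives: federated search depends on every source answering at query time, while the harvested index is a second copy of the data that can go stale.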
New level of aggregation: discovery layers. Combining local and global data sources: local + shared indexes. New super level of silos + redundancy.
Illustration of the huge redundancy of bibliographic data for one ‘book’, one ‘article’.
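The redundancy for one ‘book’ and one ‘article’ can be made concrete with a small count. The sketch below is an assumption-laden illustration: the system names follow the slide legend, but which system holds which record, and the identifiers `book-1` and `article-1`, are invented.

```python
from collections import defaultdict

# Made-up distribution of records: system name -> identifiers it stores.
systems = {
    "ILS": {"book-1"},
    "Union catalogue": {"book-1"},
    "Discovery index": {"book-1", "article-1"},
    "EJ platform": {"article-1"},
    "Database": {"article-1"},
}


def copies_per_record(systems):
    """Map each identifier to the list of systems that store a copy."""
    locations = defaultdict(list)
    for system, identifiers in systems.items():
        for identifier in sorted(identifiers):
            locations[identifier].append(system)
    return dict(locations)


def redundant_copies(locations):
    """Count every copy beyond the first one for each record."""
    return sum(len(stored_in) - 1 for stored_in in locations.values())


locations = copies_per_record(systems)
extra = redundant_copies(locations)
```

In this toy setup two records exist as six copies, so four of them are redundant. Each extra copy also has to be kept in sync with the others.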
Different perspective: where are library data of different types stored? Bibliographic data: everywhere. Holdings data: local systems, aggregated systems, commercial vendors. Circulation data (usage data): local systems, commercial vendors, open access?
And another type of system/silo emerges: (multiple) next generation ILS, a combination of aggregations of shared bibliographic data + shared ‘local’ holdings + local circulation data.
Some local silos + redundancy will disappear; data moved to Next Gen silos
Intermediate conclusion: this situation is not very efficient. For librarians, library customers, system librarians.
Like the banking crisis: a library crisis. No EU to take measures. Maybe we need a “Bad Library” (like a “bad bank”) to put all the MARC records in, and get on with restructuring.
All types of combinations possible. All kinds of technologies possible.
One universal silo ;-) Ideally no redundancy.
Next Gen ILS (Alma, Worldshare, Intota, Kuali OLE, etc.). Basic design: shared zone (with data for FRBR type Work, Expression, Manifestation). Library zone: holdings, items etc. For print + electronic. Similar to existing union catalogues/shared cataloguing/Worldcat.
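The two-zone design can be sketched as a small data model. This is a hypothetical illustration, not the schema of Alma, Worldshare or any other product: the class and field names, the library names and the identifiers are all assumptions. The point is that a holding refers to a shared manifestation instead of copying its bibliographic description.

```python
from dataclasses import dataclass, field


# Shared zone: bibliographic description, stored once for everybody.
@dataclass
class Manifestation:
    identifier: str
    form: str  # for example "print" or "electronic"


@dataclass
class Expression:
    language: str
    manifestations: list = field(default_factory=list)


@dataclass
class Work:
    title: str
    expressions: list = field(default_factory=list)


# Library zone: what one library owns, pointing into the shared zone.
@dataclass
class Holding:
    library: str
    manifestation_id: str  # a reference, not a copy of the description
    items: list = field(default_factory=list)


shared_zone = [
    Work(
        "Example work",
        [Expression("en", [Manifestation("m-1", "print"), Manifestation("m-2", "electronic")])],
    )
]

library_zone = [
    Holding("Library A", "m-1", ["item-1", "item-2"]),
    Holding("Library B", "m-1", ["item-3"]),
]


def holdings_for(manifestation_id, holdings):
    """List the libraries that hold a given shared manifestation."""
    return [h.library for h in holdings if h.manifestation_id == manifestation_id]


print_holders = holdings_for("m-1", library_zone)
```

Two libraries hold the same print manifestation, yet its description exists only once, in the shared zone.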
LoC Bibliographic Framework Transition Initiative: mapping of bibliographic data to one Work level. Just a framework, no model/implementation.
The design of Next Gen ILS and existing union/shared cataloguing can be mapped to the Framework. All kinds of aggregation possible.
For instance: all next gen ILS, instead of becoming new silos, use Worldcat as Community/Shared zone for print and use Google Scholar as Community/Shared zone for electronic/articles.
A new framework based on linked open data makes it possible to connect library data to anything else, eliminating the old silo mindset.