An invited talk to 40+ directors of national libraries worldwide at the annual ExLibris member meeting at IFLA (Helsinki, Finland) on August 15th, 2012.
4. Worldwide Presence
• Redmond, Washington, Sept 1991
• Cambridge, United Kingdom, July 1997 (MSR Cambridge, UK)
• Beijing, China, Nov 1998 (MSR Asia)
• Silicon Valley, California, July 2001
• Bangalore, India, Jan 2005 (MSR India)
• Cambridge, Massachusetts, July 2008 (MSR New England)
• New York City, NY, May 2012
8. Engagement and Collaboration Focus
• Core Computer Science
• Natural User Interface
• Earth, Energy & Environment
• Education & Scholarly Communication
• Health & Wellbeing
11. …Thus far we seem to be worse off than before—for we can
enormously extend the record; yet even in its present bulk we
can hardly consult it. This is a much larger matter than merely
the extraction of data for the purposes of scientific research; it
involves the entire process by which man profits by his
inheritance of acquired knowledge. The prime action of use is
selection, and here we are halting indeed. There may be
millions of fine thoughts, and the account of the experience on
which they are based, all encased within stone walls of
acceptable architectural form; but if the scholar can get at
only one a week by diligent search, his syntheses are not likely
to keep up with the current scene…
As We May Think
by Vannevar Bush
The Atlantic, July 1945
http://www.theatlantic.com/doc/194507/bush
12. According to a study called How Much Information by
the University of California at San Diego,
“…consumption totaled 3.6 zettabytes and 10,845
trillion words, corresponding to 100,500 words and
34 gigabytes for an average person on an average
day. A zettabyte is 10 to the 21st power bytes, a
million million gigabytes. These estimates are from
an analysis of more than 20 different sources of
information, from very old (newspapers and books)
to very new (portable computer games, satellite
radio, and Internet video).”
[Note: Information at work is not included!]
13. “It’s not information overload.
It’s filter failure.”
Clay Shirky
at Web 2.0 Expo 2008
14. Big data (http://en.wikipedia.org/wiki/Big_data)
Key terms: information technology, data sets, petabytes, exabytes,
meteorology, genomics, connectomics, Internet search, finance,
business informatics, remote sensing, radio-frequency identification,
quintillion
16. Present
The Future: an Explosion of Data
Experiments, Simulations, Archives, Literature, Instruments
Petabytes
The Challenge: Enable Discovery
Deliver the capability to mine, search and analyze this
data in near real-time.
21. • Figshare is the first online repository for
storing and sharing all of your preliminary
findings in the form of individual figures,
datasets, media or filesets. Post preprint
figures on Figshare to claim priority and
receive feedback on your findings prior to
formal publication.
• Figshare allows researchers to publish all
of their research outputs in seconds in an
easily citable, sharable and discoverable
manner. All file formats can be published,
including videos and datasets that are
often demoted to the supplemental
materials section in current publishing
models.
• Figshare uses Creative Commons
licensing to allow frictionless sharing of
research data while allowing users to
maintain their ownership. Figshare gives
users unlimited public space and 1GB of
private storage space for free.
http://figshare.com/
http://www.digital-science.com/
22. http://datacite.org/
• [Digital Object Identifier]
http://databib.org/
• Registry of research data repositories (hosted by Purdue University)
23. Developed by the Institute
for Quantitative Social
Science (IQSS) at Harvard
University
http://thedata.org/
29. WorldWideScience.org is a global science gateway connecting you to national and
international scientific databases and portals. WorldWideScience.org accelerates
scientific discovery and progress by providing one-stop searching of global science
sources. The WorldWideScience Alliance, a multilateral partnership, consists of
participating member countries and provides the governance structure for
WorldWideScience.org.
WorldWideScience.org was developed and is maintained by the Office of Scientific and
Technical Information (OSTI), an element of the Office of Science within the U.S.
Department of Energy. Please contact webmaster@worldwidescience.org if you
represent a national or international science database or portal and would like your
source searched by WorldWideScience.org.
In 3+ years since launch, the site has grown to 65+
countries, 400+ million pages – 96.5% of which is *not*
available via commercial search engines – and can now be
translated into multiple world languages (on demand).
31. “Future Career Opportunities and
Educational Requirements for Digital Curation”
http://sites.nationalacademies.org/PGA/brdi/PGA_069853/
41. “Data Services for the Sciences: A Needs Assessment”
Study by Brian Westra (University of Oregon, July 2010)
1. Data storage and backup
2. Making this data findable by others
3. Connecting data acquisition to data storage
4. Allowing or controlling access to this data by others
5. Documenting and tracking updates to the asset
6. Data analysis/manipulation
7. Finding and accessing related data from others
8. Connecting data storage to data analysis
9. Linking this data to publications or other assets
10. Ensuring data is secure/trustworthy
11. Other
Westra, B. “Data Services for the Sciences: A Needs Assessment” (30-July-2010) Ariadne Issue 64
[URL: http://www.ariadne.ac.uk/issue64/westra/]
43. Workforce Demand
and Career Opportunities
in University and
Research Libraries
NAS Symposium on Digital Curation
Anne R. Kenney
July 19, 2012
44. 7 NEW ROLES FOR LIBRARIANS*
1. Acquisitions and Rights Advisors
2. Instructional Partners in Learning Spaces
3. Observers/anthropologists of Information Users and
Producers
4. Systems Builders
5. Content Producers and Disseminators
6. Organizational Designers
7. Collaborative Network Creators and Participants
Walters and Skinner, New Roles for New Times:
Digital Curation for Preservation, ARL, Mar 2011
45. RATINGS OF IMPORTANCE AND FREQUENCY OF
ESCIENCE INTERNSHIP TASKS
From Youngseek Kim, et al, “Education for eScience Professionals”,
IJDC 6:1 (2011) http://www.ijdc.net/index.php/ijdc/article/view/168
47. MOST SIGNIFICANT SKILLS GAPS
(CONTINUED)
5. Knowledge to advise on data mining
6. Knowledge to advocate, and advise on, the use of metadata
7. Ability to advise on the preservation of project records, e.g.
correspondence
8. Knowledge of sources of research funding to assist
researchers to identify potential funders
9. Skills to develop metadata schema, and advise on
discipline/subject standards and practices, for individual
research projects
Mary Auckland, “Re-skilling for Research,” RLUK, January 2012
48. REQUISITE EXPERTISE FOR DIGITAL
HUMANITIES AND SOCIAL SCIENCES
Requisite Expertise
Domain/subject expertise
Analytical expertise
Data expertise
Project management expertise
Williford and Henry, “One Culture: Computationally Intensive Research in the Humanities
and Social Sciences,” CLIR, 2012
49. 39 schools worldwide and growing
http://www.ischools.org/
The iSchools organization was founded in 2005 by a collective of Information Schools dedicated to advancing the
information field in the 21st Century. These schools, colleges, and departments have been newly created or are evolving
from programs formerly focused on specific tracks such as information technology, library science, informatics, and
information science. While each individual iSchool has its own strengths and specializations, together they share a
fundamental interest in the relationships between information, people, and technology.
50. Digital Curation
as a Core Competency
Symposium on Digital Curation in the Era of Big Data:
Career Opportunities & Educational Requirements
July 19, 2012
Dean Elizabeth D. Liddy
iSchool, Syracuse University
51. 5 Stages of the Data Life Cycle
Data Archiving / Preservation
Data Presentation / Visualization
Data Analytics
Data Management
Data Collection
53. Three Additional Vital Competencies
Data Archiving / Preservation
Data Presentation / Visualization
Data Analytics
Data Management
Data Collection
56. Preserving Access to Our Digital Future:
Building an International Digital Curation Curriculum
DigCCurr Matrix
Competencies for Curators
http://www.ils.unc.edu/digccurr/digccurr-matrix.html/
60. The Opportunity Before Us
• Seek out and initiate data projects
– Cross-domain partnerships
– Enhance broad availability
• Pursue value-added services
– Data storage and backup services
– Enhancing data mark-up and findability
– Securing/controlling access to data
– Maintaining provenance
– Developing analytical and visualization tools
– Seeking related data/research
– Hosting and linking data to publications/assets
– Ensuring that data is preserved for the long-term
• Grow your people
– Invest in training your existing staff
– Change the technical profile of who you hire
– Support the evolution of how we educate the field
62. “If you don't like change, you're
going to like irrelevance even less.”
—General Eric Shinseki
Retired United States Army four-star general,
currently US Secretary of Veterans Affairs
63. Paintings by
Xiaoze Xie
Xiaoze Xie was born in the People’s Republic of
China, where he studied art and architecture, and
emigrated in 1992. He holds MFA degrees from
Beijing and the University of North Texas, and taught
at Bucknell University before assuming his current
post at Stanford. His works are in the collections of
the Museum of Fine Arts, Houston, the Scottsdale
Museum of Contemporary Art, and distinguished
private collections.
Xie’s oil paintings bring together serene qualities of
traditional still-life painting and photography. From
the long tradition of still-life painting he employs a
rich and selected palette to represent the books,
which take on a nearly symbolic role.
Webpage: http://art.stanford.edu/profile/Xiaoze+Xie/
Email: xzxie@stanford.edu
64. “We’d now like to open the floor
to shorter speeches disguised as questions.”
Published in The New Yorker 10/18/2010
by Steve Macone
65. Thank you!
Lee Dirks
Director, Portfolio Strategy
Microsoft Research | Connections
ldirks@microsoft.com
Editor's Notes
Xiaoze Xie, Los Angeles Public Library (R740), 2009, oil on canvas, 32 x 64 inches
Rob Vargas created it from a study called How Much Information by the University of California at San Diego
Xiaoze Xie, Stanford Art Library (NA7764-NA8206), 2009, oil on canvas, 30 x 60 inches (76.2 x 152.4 cm)
Chinese Library No. 42, 2009, oil on canvas, 32” x 61”
The MoMA Library (3), 2007, oil on canvas, 40 x 60 inches