VIVO at the University of Idaho

Shiny Happy People Holding Nodes: Using VIVO (a Semantic Web Application)
to Reveal University of Idaho Research and Researchers

What is VIVO?


An Open-Source …
Freely available with a community of librarians and web developers

Semantic Web application …
Data structured so that it can be shared and reused

using Linked Data practices and standards…
RDF (Resource Description Framework) triples, which are controlled
subject-predicate-object expressions that produce consistent
relationships

and Data Harvesting procedures
Collecting, ingesting and publishing (public/private) data in batches

to create a searchable, browseable, and reusable network of
information on research and researchers.
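
To make the subject-predicate-object idea concrete, here is a minimal Python sketch that holds two triples about one researcher and prints them in N-Triples syntax. The rdf, rdfs, and foaf URIs are the standard ones; the example.org researcher URI and the name are hypothetical placeholders, not data from our instance.

```python
# Each RDF statement is a (subject, predicate, object) triple.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
FOAF_PERSON = "http://xmlns.com/foaf/0.1/Person"

# Hypothetical researcher URI, for illustration only.
person = "http://example.org/individual/n1001"

triples = [
    (person, RDF_TYPE, FOAF_PERSON),      # object is another resource (a URI)
    (person, RDFS_LABEL, '"Doe, Jane"'),  # object is a literal value
]

def to_ntriples(triples):
    """Serialize triples as N-Triples, one statement per line."""
    lines = []
    for s, p, o in triples:
        # Literals are already quoted; URIs get angle brackets.
        obj = o if o.startswith('"') else f"<{o}>"
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)

print(to_ntriples(triples))
```

Because every statement has the same three-part shape, data from different sources can be merged simply by pooling their triples.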

Early History of VIVO



1997-2005: VIVO Network idea developed at Cornell
for life and social sciences.


Intended to provide a view of sciences and research
“across disciplinary and administrative boundaries.”



2005: Released for Life Sciences



2007: Expanded to all of Cornell University (through the
Library)



2009: $12.2 million NIH grant provided to develop a
national version with several other partners



2010 – Present: More and more institutions adopting
and developing VIVO instances
from “VIVO: Enabling National Networking of Scientists”

VIVO at the University of Idaho



Spring 2012 – Fall 2012


Approached by Idaho INBRE (a Biomedical Researcher
network in Idaho) with question about possibly installing
VIVO instance



Installed VIVO, began setting up and learning the
system, while gathering feedback from INBRE and other
stakeholders



Garnered approval from INBRE faculty to publish their
information in the system



Harvested INBRE related information from public
resources: PubMed and NIH and NSF grants database




Spring 2013


Began to pursue expanded VIVO



Received approval from institutional IT evaluation group
to go forward



Re-branded instance



Presented VIVO to library faculty and administration as
possible project going forward



Presented instance and proposal for new position to VP
of Research




Summer 2013


VP approved expanded use of VIVO for Research
Groups on campus and funding for position



Annie Gaines begins as Scholarly Communication
Librarian



Ingest, ingest, ingest:


Added three additional research groups, as well as the Law
School, and associated faculty



Added thousands of grants, publications, and people into
the system.




Fall 2013


Presented VIVO publicly on campus for first time



VIVO goes live (accessible from off campus)



Additional organizational descriptions added
(Department, College, Grant Structures, etc.)



Gained approval and access to use campus database
system, Banner




VIVO Today


Beginning to explore VIVO as front-end for historical
documents



Adding all University Faculty



Creating applications and access points for data



Cleaning, always cleaning …



Using this presentation as a prompt for further
development of application, as well as further defining:


the system’s presentation



our data’s preservation



and our mission and goals in using the system

Hosting



Provided by the Northwest Knowledge Network


www.northwestknowledge.net



NKN focuses on providing technical support to
researchers



Division of UI’s Office of Research



Strong relationship with the UI Library (they are in the
building)



Data is replicated to a data center at Idaho National
Laboratory



Presents future opportunities for integrating VIVO’s
information with other research-related tools/systems

Technical Specs



Our installation



Apache Web Server



MySQL





Red Hat Linux

Tomcat

Current Version of VIVO


1.5.2



Probably upgrade to 1.6 in March 2014

Building VIVO – Two Approaches



Approach #1 – the high-resource approach (ideal)


Requires



Available programmers and developers





Discrete IT department
Formal IT project management

Advantages



Advanced customization and configuration





High level of integration into existing systems/services
Reasonably short time from inception to production

Disadvantages


Red tape


Represents a large commitment by the unit

Building VIVO – Two Approaches



Approach #2 – the low-resource approach (practical)


Requires



Experimental mindset





Minimum recommended staff identified in the VIVO implementation guide
View VIVO as a series of small projects, rather than one large integration into
university activities

Advantages





Simple
Manageable

Disadvantages


Time (takes much longer)


Integration with existing services



Creation of custom data ingest tools

Implementation Goals


Start with low-hanging fruit. It is easier to collect



When considering custom tools and processes, our priorities:


1 – re-use from community or locally



2 – buy if possible



3 – build as needed



Build institutional interest in the existing data before soliciting more
resources to further our development



Investigate third-party solutions (Symplectic Elements) as
alternatives to custom-building internal methods of collecting data

Data Ingestion - General
Typical workflow:
1. Receive data in source format
2. Convert to RDF (usually RDF/XML or Turtle)
3. Associate with VIVO ontology (as needed)
4. Reconcile against existing database
5. Load into the application
6. Re-index if needed
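
As a rough illustration of steps 1 through 4, the sketch below reads a small CSV source, converts each row to RDF (N-Triples here, for brevity), types it with a standard class, and skips rows whose label already exists. The column names, the example.org namespace, and the set of existing labels are all assumptions made for the example; loading and re-indexing (steps 5 and 6) happen inside VIVO and are not shown.

```python
import csv
import io

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
FOAF_PERSON = "http://xmlns.com/foaf/0.1/Person"
BASE = "http://example.org/individual/"  # hypothetical namespace

def escape_literal(text):
    """Escape backslashes and double quotes for an N-Triples literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')

def rows_to_ntriples(source_csv, existing_labels):
    """Convert CSV rows to N-Triples, skipping labels already in the database."""
    reader = csv.DictReader(io.StringIO(source_csv))  # step 1: source format
    lines = []
    for row in reader:
        label = row["name"].strip()
        if label.lower() in existing_labels:          # step 4: reconcile
            continue
        uri = f"<{BASE}{row['id'].strip()}>"
        lines.append(f"{uri} <{RDF_TYPE}> <{FOAF_PERSON}> .")  # step 3: ontology class
        lines.append(f'{uri} <{RDFS_LABEL}> "{escape_literal(label)}" .')  # step 2: RDF
    return lines

# Made-up sample data: one new person, one already in the system.
source = 'id,name\nn1,"Doe, Jane"\nn2,"Roe, Richard"\n'
existing = {"roe, richard"}

for line in rows_to_ntriples(source, existing):
    print(line)
```

Exact-match reconciliation like this is the simplest case; real sources usually need fuzzier matching and manual review.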

Data Ingestion - Sources



Public Sources





NSF, NIH, USDA Awards
PubMed

Commercial Sources





Web of Science
Must remove “intellectual effort”

CVs, Publication Lists




Must have some means of soliciting them

Local Databases (central university, research groups)


Several institutional sources


Must work through the gatekeepers of each



Need data security review to ensure that institutional concerns are met before
public exposure

Data Ingestion - Tools


VIVO Harvester




Extract, Transform, and Load (ETL) tool that takes data from
a source and loads it into VIVO automatically

OpenRefine


Data cleaning tool



Very flexible for different datatypes



Extension enables export in RDF format



Reconciliation service allows us to match and deduplicate entries before export

Custom Conversion Tools (in Python)


Used for CRIS reports output, as well as other consistent,
but unusual formats
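
The matching idea behind reconciliation can be sketched in a few lines of Python. This is not how OpenRefine or our own tools are implemented; it is only an illustration using the standard library's difflib, and the 0.85 threshold and the sample names are assumptions.

```python
from difflib import SequenceMatcher

def normalize(name):
    """Lowercase and drop punctuation so 'Doe, J.' and 'doe j' compare equal."""
    kept = "".join(ch for ch in name.lower() if ch.isalnum() or ch.isspace())
    return " ".join(kept.split())

def best_match(candidate, existing, threshold=0.85):
    """Return the closest existing name, or None if nothing is similar enough."""
    best, best_score = None, 0.0
    for name in existing:
        score = SequenceMatcher(None, normalize(candidate), normalize(name)).ratio()
        if score > best_score:
            best, best_score = name, score
    return best if best_score >= threshold else None

# Made-up names standing in for people already in the system.
known = ["Doe, Jane A", "Roe, Richard"]
print(best_match("Doe, Jane A.", known))
print(best_match("Smith, Pat", known))
```

A match above the threshold is a candidate for merging, not a confirmed duplicate; ambiguous names still need a human check.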

Ontology Extensions



Custom University of Idaho model prefixed with
“uidaho:”



Goals with our extensions



Establish the local need before creating





Re-use as much as possible
Always associate classes within the VIVO hierarchy so
that data is not fully reliant on uidaho for context

Examples


Members of Idaho EPSCoR, Idaho INBRE, REACCH-PNA



Non-UI/Courtesy Faculty
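
A minimal sketch of what such an extension might look like, generated as Turtle from Python. The VIVO core namespace is the published one; the uidaho namespace URI, the class name CourtesyFaculty, and the choice of vivo:FacultyMember as parent are assumptions for illustration, not our actual model.

```python
VIVO_NS = "http://vivoweb.org/ontology/core#"
UIDAHO_NS = "http://example.org/ontology/uidaho#"  # hypothetical namespace URI

def local_class(name, label, parent):
    """Return Turtle declaring a uidaho: class as a subclass of an existing class."""
    return (
        f"uidaho:{name} a owl:Class ;\n"
        f'    rdfs:label "{label}" ;\n'
        f"    rdfs:subClassOf {parent} .\n"
    )

prefixes = (
    f"@prefix uidaho: <{UIDAHO_NS}> .\n"
    f"@prefix vivo: <{VIVO_NS}> .\n"
    "@prefix owl: <http://www.w3.org/2002/07/owl#> .\n"
    "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .\n"
)

# Hypothetical local class anchored under an existing VIVO class.
turtle = prefixes + "\n" + local_class(
    "CourtesyFaculty", "Courtesy Faculty", "vivo:FacultyMember"
)
print(turtle)
```

The rdfs:subClassOf line is what keeps the data meaningful to consumers who know the VIVO ontology but have never seen the uidaho model.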

Data Re-use - Fuseki


Apache Jena - Fuseki project




jena.apache.org/documentation/serving_data/

Enables external access to VIVO data


Without Fuseki, data re-use is limited to those authenticated
with the system



Created examples of data re-use to assist in marketing efforts



Goal: to establish the added value of putting data in VIVO


Example: Labs that need to report research results through
publication lists, or to display spatial, temporal, or
conceptual aspects of UI research to stakeholders or students,
could use this feature
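
For a sense of what external access involves, the sketch below builds a SPARQL GET request of the kind a Fuseki query service accepts. The endpoint URL (host, port, and dataset name) is a hypothetical placeholder, and the query is a generic one listing people and their labels rather than one of our production queries.

```python
from urllib.parse import urlencode

ENDPOINT = "http://localhost:3030/vivo/query"  # hypothetical Fuseki endpoint

QUERY = """
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?person ?name
WHERE {
  ?person a foaf:Person ;
          rdfs:label ?name .
}
ORDER BY ?name
LIMIT 25
"""

def build_request_url(endpoint, query):
    """Encode a SPARQL query as a GET request (the 'query' parameter of the SPARQL protocol)."""
    return endpoint + "?" + urlencode({"query": query.strip()})

url = build_request_url(ENDPOINT, QUERY)
print(url)

# Sending the request is left out so the sketch runs without a server.
# With a live endpoint, one would fetch `url` with an
# "Accept: application/sparql-results+json" header and parse the JSON body.
```

Any client that can issue an HTTP GET can reuse the data this way, which is what makes the browser-based examples possible.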

Example 1:
A very simple way to look at awards data. This presents the number
of awards by agency. It uses a JavaScript library called Sgvizler to
turn JSON data from Fuseki into a Google Charts visualization.
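
Sgvizler does this conversion in the browser; the Python sketch below only shows the shape of the data involved. It flattens a response in the standard SPARQL JSON results format into a header row plus one row per agency, which is the tabular form a charting library expects. The variable names, agencies, and counts are made up for illustration.

```python
import json

# Sample response in the W3C SPARQL Query Results JSON Format.
# The agencies and counts are made-up illustration values.
sample = json.loads("""
{
  "head": {"vars": ["agency", "awards"]},
  "results": {"bindings": [
    {"agency": {"type": "literal", "value": "Agency A"},
     "awards": {"type": "literal", "value": "12"}},
    {"agency": {"type": "literal", "value": "Agency B"},
     "awards": {"type": "literal", "value": "7"}}
  ]}
}
""")

def to_chart_rows(result):
    """Flatten SPARQL JSON bindings into a header row plus [label, count] rows."""
    label_var, count_var = result["head"]["vars"]
    rows = [[label_var, count_var]]
    for binding in result["results"]["bindings"]:
        rows.append([binding[label_var]["value"], int(binding[count_var]["value"])])
    return rows

print(to_chart_rows(sample))
```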

Example 2:
Another simple view using Sgvizler. This shows a comparison of
two variables (awards and publications) for personnel in a specific
research group. It would need work as a formal graph, but it
points to the way that the data can be reused.

Example 3:
Another simple example of data re-use, using a JavaScript/Ajax technique
to display a list of journal titles and faculty within a
specific research group. Each faculty member's name links to
that person's VIVO profile.

VIVO as Institutional Repository

Background



When Annie was brought on for Scholarly
Communications, one of her tasks was to develop
an IR for the UI.



Some potential platforms to use for UI IR:


CONTENTdm – too flat



Bepress – too expensive



VIVO?

‘Institutional repositories’
“A set of services that a university offers to the
members of its community for the management
and dissemination of digital materials created by
the institution and its community members.”
Clifford Lynch, ARL Bimonthly Report 226, Feb. 2003.

“Digital collections that capture and preserve the
intellectual output of university communities.”
Raym Crow, The Case for Institutional Repositories, SPARC,
2002

‘Institutional repositories’



Are:



Collection of scholarly work



Both cumulative and perpetual





Institutionally defined and managed

Open

Provide:


Long term preservation



Wide dissemination



Showcase for scholars and the institution

Challenges



Copyright issues, varying access



Buy-in from faculty, voluntary submissions



Getting people to care

VIVO as IR?



Not your typical IR interface



Interconnectedness in a large network



Includes diverse materials, not just article pre-prints



Includes citations for all works, not just the ones hosted
in the IR





Dynamic browsing and searching

Linked data format allows for reuse of data for a variety
of purposes

The following page shows a thesis document in
VIVO

Theory vs. Practice



Although VIVO can act as a front end, the
documents must be hosted elsewhere



We deposit our docs in CONTENTdm and link to the
PDF in VIVO



This makes things easier, but also more complicated



See example of the same thesis document in
CONTENTdm on the next page

Theory vs. Practice



We wanted to close this presentation by asking
some questions to the group. If you have any
advice for us on this project we would love to hear
from you!


Are more access points better or more confusing?



Should we include historical documents in the VIVO IR?



Which page should be the main collection?



Should we provide links to all collections? Or link from
one into the other?



What are best practices with unusually constructed IRs?
