Taming the Wilderness of Open Research Information

Results of a project with students of Hochschule Hannover, as well as current development work on research information (VIVO) at the Open Science Lab of TIB

Talk by Dr. Ina Blümel and Gabriel Birke at the i-Know conference 2014 (https://i-know.tugraz.at/) on September 18, 2014 in Graz

  1. Taming the Wilderness of Open Research Information
     Student project at HS Hannover, participants: Wendinda Carine Donessonne, Felix Kommnick, Elena Liventsova, Rahima Medshid, Bengt Olschewski, Anna Petersmeier, Tatiana Walther, Jana Wolf
     Dr. Ina Blümel, Gabriel Birke
     i-Know conference, September 18, 2014
  2. Research Information: Paradigms
     • Institutional
       • research management as driving force: reporting tools, etc.
       • mostly proprietary CRIS implementations at institutional and partly national level (Pure, Converis, et al.)
       • “closed world”
     • Community-based / discovery layer
       • merging and linking research information from various sources
       • supporting scientists in establishing networks; see the success of ResearchGate, academia.edu, etc.
  3. VIVO
     • Model for linkable research information based on LOD ontologies
     • Open source software
     • Originally developed at Cornell with NSF funding, now supported by a consortium at DuraSpace
     • Numerous implementations, so far mainly in the English-speaking biomedical domain (CTSA)
     • Research profiles, visualisations, …
  4. “Feed” VIVO
     1. External data sources, esp. websites (harvesting)
     2. Internal data sources (Web API or other type of access)
     3. Individual customization to suit professional needs
     Challenge:
     • From the vast array of research information objects on the web to structured research information
     • If possible, automatically
  5. Sources: Science 2.0 community
     • Websites with publications, projects, information about organizations, persons, ...
     • with structured and unstructured information
     → Identify websites with repetitive, similarly structured content that are worth setting up a harvesting pipeline for!
  6. Setting, Task
     • 16-week project
     • 6th-semester bachelor's students of library and information science
     • supported by an information scientist and a computer scientist
     • identify and document research information items on the websites
     • map to the VIVO ontology
     • certain steps redefined or split up during the project according to the students' needs and prior knowledge
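
The "map to the VIVO ontology" task can be pictured with a minimal Python sketch. It assumes that a harvested person is expressed with `foaf:Person` and `rdfs:label`, two vocabulary terms that VIVO reuses. The function names and the example URI are hypothetical and not taken from the project.

```python
# Vocabulary terms (real W3C / FOAF URIs).
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
FOAF_PERSON = "http://xmlns.com/foaf/0.1/Person"


def escape_literal(value):
    """Escape backslashes, quotes and newlines for an N-Triples literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def person_to_ntriples(uri, name):
    """Return N-Triples lines describing one harvested person.

    `uri` is a hypothetical individual URI chosen by the harvester.
    """
    return [
        f"<{uri}> <{RDF_TYPE}> <{FOAF_PERSON}> .",
        f'<{uri}> <{RDFS_LABEL}> "{escape_literal(name)}" .',
    ]
```

Lines produced this way could be written to a file and loaded into a triple store. How the project actually ingested data into VIVO is not stated in the slides.
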
  11. Steps
  13. Challenges
     • inconsistent publication data, entered as free-form text in the CMS, e.g., up to 13 different representations of the journal volume
     • templates don’t provide research information (RI) in machine-readable formats
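
One way to cope with free-form volume notations is a small list of regular expressions tried in order. The sketch below is illustrative only: the slides do not list the 13 variants that were found, so the patterns and the function name `extract_volume` are assumptions.

```python
import re
from typing import Optional

# Hypothetical notations; the variants actually found on the websites
# are not listed in the slides.
VOLUME_PATTERNS = [
    # "Vol. 12", "Volume 12", "Jg. 7", "Jahrgang 7", "Bd. 3", "Band 3"
    re.compile(r"\b(?:vol(?:ume)?|jg|jahrgang|bd|band)\.?\s*(\d+)", re.IGNORECASE),
    # "12(3)": volume 12, issue 3
    re.compile(r"\b(\d+)\s*\(\s*\d+\s*\)"),
]


def extract_volume(citation: str) -> Optional[int]:
    """Return the journal volume found in a free-form citation, or None."""
    for pattern in VOLUME_PATTERNS:
        match = pattern.search(citation)
        if match:
            return int(match.group(1))
    return None
```

The patterns are checked in order, so explicit labels such as "Vol." take precedence over the bare `12(3)` form. Citations that match no pattern return `None` and can be set aside for manual review.
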
  14. Challenges
     • Variable content, stable structures
     • Duplicates with different structure (publications, persons, …)
       http://www.hiig.de/ausgewahlte-publikationen/
       http://www.hiig.de/ausgewaehlte-veroffentlichungen/
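
Duplicates that are marked up differently can often still be found by comparing a normalized form of the title. The following is a minimal sketch, not the project's method: it assumes each harvested record is a dict with a `"title"` field, and the names `title_key` and `group_duplicates` are hypothetical.

```python
import re
import unicodedata
from collections import defaultdict


def title_key(title):
    """Reduce a title to a comparison key: no accents, case or punctuation."""
    decomposed = unicodedata.normalize("NFKD", title)
    without_marks = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return re.sub(r"[^a-z0-9]+", " ", without_marks.casefold()).strip()


def group_duplicates(records):
    """Return groups of records (dicts with a "title") sharing the same key."""
    groups = defaultdict(list)
    for record in records:
        groups[title_key(record["title"])].append(record)
    return [group for group in groups.values() if len(group) > 1]
```

This catches differences in case, punctuation, and diacritics only. Transliterations such as "ae" for "ä" are not unified, and near-duplicates with differing wording would need fuzzy matching.
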
  15. Challenges: Man and machine drawing the same conclusions?
     http://www.hiig.de/kooperationen/
     • Partners are marked with a logo (image)
     • Luckily, “alt” attributes are available
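
When partners appear only as logo images, the `alt` text is the one machine-readable hint to their names. A minimal sketch using Python's standard `html.parser` is given here. The class and function names are hypothetical, and the actual markup of the page is not reproduced in the slides.

```python
from html.parser import HTMLParser


class AltTextCollector(HTMLParser):
    """Collect the non-empty alt texts of all <img> elements."""

    def __init__(self):
        super().__init__()
        self.alt_texts = []

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <img ... />.
        if tag != "img":
            return
        alt = dict(attrs).get("alt")
        if alt and alt.strip():
            self.alt_texts.append(alt.strip())


def partner_names(html):
    """Return the alt texts of the images in an HTML fragment, in page order."""
    collector = AltTextCollector()
    collector.feed(html)
    return collector.alt_texts
```

Images with a missing or empty `alt` attribute are skipped. For those, a human still has to read the name off the logo.
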
  16. Results
     • Discovery layer with aggregated research information
     • Also an approach for bootstrapping institutional research information systems from available web sources
     • Not a substitute for those systems, but a complement to them
  17. Some activities (besides VIVO implementation)
     • Community building
     • VIVOcamp13, first workshop for the EU VIVO community, SWIB13 satellite event, November 2013
     • Hands-on VIVO Bootcamp at the ELAG Conference (European Library Automation Group) in Bath, June 2014
     • euroCRIS LOD group participation
     • Policy & Standards Making: Position paper DINI AG FIS
     • Supervising a bachelor's thesis on extending the VIVO ontology
     • DFG application: “German Academic Web”
  18. Thank you for your attention!