Building an electronic repository and archives on Dataverse in the European Open Science Cloud

205 views

Presentation for the XVIII International Scientific and Practical Conference
"BUILDING OF INFORMATION SOCIETY: RESOURCES AND TECHNOLOGIES" in Kyiv

Published in: Technology

  1. Building an electronic repository and archives on Dataverse in the European Open Science Cloud
     Vyacheslav Tykhonov, Senior Information Scientist, Data Archiving and Networked Services (DANS-KNAW, Netherlands)
     XVIII International Scientific and Practical Conference "BUILDING OF INFORMATION SOCIETY: RESOURCES AND TECHNOLOGIES", September 19, 2019, Kyiv
     dans.knaw.nl (DANS is an institute of KNAW and NWO)
  2. About me
     • born in Kyiv in 1979
     • studied at the National Technical University of Ukraine – Kyiv Polytechnic Institute (MSc, 2002)
     • worked for international search engine companies and media monitoring agencies (1999-2010)
     • joined the Royal Netherlands Academy of Arts and Sciences (KNAW) in 2011
     • Senior Data Scientist at DANS-KNAW since 2016
     • currently leading the technical development of the DataverseEU cloud efforts in SSHOC Dataverse and other projects
  3. DANS-KNAW core services
  4. Why Dataverse?
     • Open source project developed by IQSS at Harvard University and published on GitHub
     • Great product with a long history (since 2006)
     • Very dynamic and experienced development team working in an Agile environment (community call every two weeks)
     • Clear vision and understanding of research communities' requirements, with a public roadmap
     • The strong community behind Dataverse helps to improve the basic functionality and develop it further
     • Dataverse has been selected as data repository infrastructure by countries on all continents
     • Well-developed architecture with rich API endpoints for building application layers around Dataverse
  5. Dataverse and API economy
     Dataverse is a data repository platform with four API endpoints:
     - Native API
     - SWORD API
     - Search API
     - Data Access API
     An API token is the key to connecting Dataverse with an unlimited number of tools developed by different research communities, and to integrating it with other repositories.
  6. DataverseNL as a shared service
  7. Datasets container for Leiden University
  8. DataverseNL as a collaboration platform
     • DataverseNL is a shared service provided by the participating institutions and DANS. DANS performs back-office tasks, including server and software maintenance and administrative support.
     • The participating institutions are responsible for managing the deposited data and the content. Every institution has its own data manager.
     • User-friendly: users at participating institutions simply log in and DataverseNL is ready for use.
     • Reliable and safe: in cooperation with the participating institutions and universities, standard procedures have been established to ensure sound data management. Data are stored in the Netherlands.
     • Accessible: the service can be accessed online, from anywhere and at any time. Just open dataverse.nl!
  9. Dataset submission form
  10. Published dataset in Ukrainian
  11. SSHOC DataverseEU project
      SSHOC is the Social Sciences and Humanities Open Cloud.
      The goal of the SSHOC Dataverse project (CESSDA, DARIAH and CLARIN) is to create a reliable, production-ready open source data infrastructure that everybody can install and reuse for their own needs and requirements.
      We are developing a multilingual web interface and localizing metadata fields, and have developed a data standardization technique based on APIs for the CESSDA CVs, Topic Classification and CESSDA CV Manager services.
      DataverseEU countries:
      • Hungary (TARKI)
      • Sweden (SND)
      • Slovenia (ADP)
      • Germany (GESIS)
      • France (Sciences Po)
      • Austria (AUSSDA)
      • United Kingdom (UKDA)
      • Italy (UniData)
      • Belgium (SODA)
      • Latvia (LSZDA)
      • Netherlands (DANS-KNAW)
  12. Development process
      The SSHOC Dataverse project has two parallel development tracks:
      • The core development team is working on the modification and extension of the Dataverse core functionality.
      • The application development team will create new tools, or integrate existing ones, to be published on the Dataverse App Store website.
      Our goal is to build a distributed and mature data infrastructure based on sustainable microservices.
  13. Maturity evaluation of DataverseEU services
      • The testing process should comply with the CESSDA services maturity model: https://zenodo.org/record/2591055#.XKR6ny2B2u5
      • Every change to Dataverse functionality should be accompanied by a unit test; changes to external functionality should be covered by Selenium scenarios.
      • The service should score as high as possible on the CESSDA maturity model.
  14. Services in the European Open Science Cloud (EOSC)
      • EOSC requires at least maturity level 8
      • we need the highest software quality to be accepted as a service
      • clear and transparent evaluation of services is essential
      • evidence of technical maturity is the key to success
      • a limited warranty will allow out-of-warranty services to be stopped
  15. Research data management
      Data standardization plays a key role in the data management plan of any organization, but the current situation in research data management is very complex:
      • too much data chaos in datasets
      • no data transparency
      • sometimes no standards available
      • no provenance information attached to data
      • homonyms, synonyms, generalizations, specializations, spelling variations and mistakes, and language versions all complicate keyword-based search and retrieval of information
  16. Controlled vocabulary and thesaurus
      • Linked data is one step forward (or actually backward in the right direction) in solving some of the standardization problems.
      • When shared controlled vocabularies (CVs) are created and maintained by experts in various domains, digital items can be annotated with them and easily retrieved by other experts from the same domain without being a librarian. This gives a clear indication of which vocabularies are good enough and shared by a critical mass.
      • A thesaurus is a semantic network of unique concepts, including relationships between synonyms, broader and narrower (parent/child) contexts, and other related concepts. A thesaurus provides a hierarchy for controlled vocabularies.
  17. CESSDA CV Service
  18. External controlled vocabularies in Dataverse
  19. Standardized metadata in Dataverse
  20. Weblate as a multilingual support service
  21. Managing translations with Weblate
  22. Questions?
      Contact me: Slava Tykhonov, vyacheslav.tykhonov@dans.knaw.nl
      • https://www.linkedin.com/in/vyacheslavtikhonov/
      • https://twitter.com/4tykhonov
      Watch the SSHOC Dataverse presentation at Harvard University: https://www.youtube.com/watch?v=vAPpKuDQUDY
      Try it now:
      • https://dataverse.harvard.edu and https://dataverse.nl
      • http://dataverse.org.ua (Ukrainian portal)
      • http://github.com/IQSS/dataverse (application source code)
      • http://github.com/IQSS/dataverse-docker (cloud release for Kubernetes)
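
Slide 5 describes the API token as the key that connects external tools to Dataverse. As a minimal sketch of what that might look like for the Search API, the snippet below builds a token-authenticated request with the Python standard library. The base URL is one of the installations listed on the last slide; the token value, the query string and the helper names `build_search_request` and `search` are placeholders introduced here for illustration and are not part of the presentation.

```python
import json
import urllib.parse
import urllib.request

# Assumption: any Dataverse installation works here; this one is from the slides.
BASE_URL = "https://dataverse.nl"
# Placeholder only: a real token is generated per user account in the Dataverse UI.
API_TOKEN = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"


def build_search_request(query, base_url=BASE_URL, api_token=API_TOKEN, per_page=10):
    """Build (but do not send) a Search API request that carries the API token."""
    params = urllib.parse.urlencode(
        {"q": query, "type": "dataset", "per_page": per_page}
    )
    url = f"{base_url}/api/search?{params}"
    # Dataverse accepts the token in the X-Dataverse-key request header.
    return urllib.request.Request(url, headers={"X-Dataverse-key": api_token})


def search(query, **kwargs):
    """Send the request and return the decoded JSON body (needs network access)."""
    with urllib.request.urlopen(build_search_request(query, **kwargs)) as response:
        return json.load(response)
```

Keeping request construction separate from sending makes the token handling easy to inspect without touching a live server; calling `search("survey")` would perform the actual query against the installation named in `BASE_URL`.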

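Slides 15 and 16 argue that synonyms and spelling variants complicate keyword search, and that a thesaurus addresses this by linking unique concepts through synonym and broader/narrower relations. The sketch below illustrates the idea with a tiny in-memory thesaurus. All concept labels, the `Concept` class and the helper functions are invented for illustration; this is not the data model of the CESSDA CV service or of Dataverse.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Concept:
    """One unique concept in a toy thesaurus (labels are made up)."""
    label: str
    synonyms: set = field(default_factory=set)
    broader: Optional[str] = None  # label of the parent concept, if any


THESAURUS = {
    c.label: c
    for c in [
        Concept("economics"),
        Concept("labour market", {"labor market", "job market"}, broader="economics"),
        Concept("unemployment", {"joblessness"}, broader="labour market"),
    ]
}


def build_index(thesaurus):
    """Map every preferred label and synonym (case-insensitive) to its concept."""
    index = {}
    for concept in thesaurus.values():
        for term in {concept.label, *concept.synonyms}:
            index[term.casefold()] = concept.label
    return index


INDEX = build_index(THESAURUS)


def normalize(keyword, index=INDEX):
    """Return the preferred label for a free-text keyword, or None if unknown."""
    return index.get(keyword.strip().casefold())


def broader_chain(label, thesaurus=THESAURUS):
    """Walk the parent links upward and return all broader concepts in order."""
    chain = []
    parent = thesaurus[label].broader
    while parent is not None:
        chain.append(parent)
        parent = thesaurus[parent].broader
    return chain
```

With such a structure, datasets tagged "job market" and "labor market" resolve to the same concept, and a search for a broader term can be expanded to its narrower concepts by following the hierarchy in the opposite direction.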