Integration – the heart of
researcher centric research
data management systems
Steve Mackey
15 January 2015 1
Agenda
• Who we are, what we do
• How it works
• RDM systems, where it fits
• Workflows
• Integrations
21 October 2014 2
Archive storage with a difference
• Flagship Arkivum100 service with 100% data integrity guarantee
• World-wide professional indemnity insurance – Arkivum100
• Long term contracts for enterprise data archiving
• Fully automated and managed solution
• Audited and certified to ISO27001
• Data escrow, exit plan, no lock-in
Keeping Data Alive for 25+ Years
• Adding media – effectively continual process
• Monthly checks and maintenance updates
• Annual data retrieval and integrity checks
• Hardware refresh
• Software migration
• Hardware migration
• Tape format migration – LTO n to LTO n+2
• Support and admin staff migration
• Change of supplier of products and services
• 3-5 year obsolescence of servers, operating systems and software
Arkivum Appliance
• CIFS/NFS presentation (integrates easily to local file systems)
• Simple administration of user access permissions and storage allocations
• Robust REST API for application integration
• GUI for file ingest status, recovery pre-staging, security
• Ingest triggered by: timeout, checksum exchange, manifest (bulk)
• Checksum/fixity chain of custody from ingest through replication
• Immutable (WORM)
• Regular (6 monthly) data copy read verify
• Offline Escrow data copy (open source, self describing)
• Data encryption throughout; keys only held by customer
[Slides 6 to 14: build-up diagram of the ingest process. Labels, in order of first appearance: Arkivum Service (labelled Arkivum/100 from slide 10); Arkivum Gateway on Appliance; Original Datasets & Files; Copy for ingest; Encrypted Archive; Validated Archive; Decrypted object; Archive Copy 1; Archive Copy 2; Escrow Copy; Cached Copy.]
http://datablog.is.ed.ac.uk/2013/12/06/the-four-quadrants-of-research-data-curation-systems/
[Diagram labels: PURE, Elements, Converis; EPrints, DSpace, Hydra; Figshare; Re3data.org; Landing pages; CKAN; Institutional storage.]
Workflows
• RDM Workflow - The sequence of repeatable
processes (steps) through which Research
Data passes during its lifecycle, including the
steps involved in its creation, curation,
preservation, access and eventual disposal.
RDM Workflows Report
• JISC Research Data Spring
• A Consortial Approach to Building an Integrated RDM System – “Small and Specialist”
• http://dx.doi.org/10.6084/m9.figshare.1476832
Researcher Centric Workflow
[Workflow diagram. Components: Researcher; Web browser; HR system; CRIS (Elements); Local Research Data; Figshare (Amazon); DataCite (BL); Journal; Repository (DSpace); Archive (Arkivum). One additional unnumbered label reads "Article DOI". Numbered flows:]
1. Researcher details
2. Data files
3. Data Description
4. Mint DOI
5. Data DOI
6. Data DOI
7. Article
8. Data DOI
9. Article and Article DOI
10. Article and Article DOI
11. Article DOI
12. Dataset Description and Data DOI
13. Dataset Description and Data DOI
14. Data files
15. Data is safe
16. Data is safe
Why integrate?
• Simpler and easier RDM processes from a Researcher perspective, which both
encourages adoption and lowers the cost of institutional support to the research
base.
• Clear and repeatable RDM processes that help ensure higher levels of quality and
consistency in RDM across the research base.
• Ability to deploy RDM as community-driven shared service(s) so that smaller
institutions can ‘join forces’ to benefit from having access to a common RDM
infrastructure.
• Scaling RDM up across a large research base using automation and ‘factory’ type
approaches to achieve ‘economies of scale’ and move away from RDM being a
manual and labour intensive endeavour.
• Specifically for Archive layer storage this may include:
– Confirmation of integrity of received files via checksums/fixity
– File archive status reporting
– Trigger for original file deletion
– File location, data pool management
– File recovery staging
– Encryption key management
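As a rough illustration of how an RDM system might combine the archive-layer signals listed above (integrity confirmation, archive status reporting and the trigger for original file deletion), the sketch below only allows deletion of the original once the archive reports a matching checksum and enough verified copies. The `ArchiveStatus` record, its fields and the default of two copies are hypothetical and are not taken from any Arkivum API.

```python
from dataclasses import dataclass


@dataclass
class ArchiveStatus:
    """Hypothetical status record an archive layer might report for one file."""
    checksum: str         # digest the archive holds for the file
    verified_copies: int  # replicas whose integrity has been confirmed


def safe_to_delete_original(local_checksum: str, status: ArchiveStatus,
                            required_copies: int = 2) -> bool:
    """Allow deletion only when the archive's checksum matches the local one
    and at least `required_copies` replicas have been verified."""
    return (status.checksum == local_checksum
            and status.verified_copies >= required_copies)
```

A caller would typically poll the archive for status and remove the local file only when this check returns true.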
Data Archiving - Integrations
Questions?

Integration - the heart of researcher centric research data management systems - Steve Mackey, Arkivum

Editor's notes

  1. speakers notes
  2. These are just some of the things that will happen over 25 years of trying to retain data. In the diagram, a change from blue to yellow is when something happens that has to be managed. In a growing archive, adding or replacing media, e.g. tapes or discs, can be a daily process, so is effectively continual. The archive system needs regular monitoring and maintenance, which might mean monthly checks and updates. Data integrity needs to be actively verified, for example through annual retrievals and integrity tests. Then comes obsolescence of hardware and software, meaning refreshes or upgrades that will typically come every 3 to 5 years, for example of servers, operating systems and application software. The format of the data being held may need to change so it can still be read, and even long-lived formats such as PDF/A will eventually become obsolete as they are replaced with something better and applications no longer provide backwards compatibility. In addition to technical change, there will be the need to manage staff transitions among those who run the system, for example support staff and administrators. And suppliers of products and services will come and go too. There are very few vendors that have been around for a long time in the IT industry, and mergers, acquisitions, changes in direction and companies simply going bust are all commonplace. Basically, the lifetime of the data is longer than the lifetime of almost everything that's used to keep that data safe and accessible. The key point is that long-term archiving is an active process and there's always some form of change going on. And when change happens there's always a risk that something goes wrong, and there's always the need to validate that the change has been effected properly. This all requires time, expertise and money. Digital archiving is a case of continual interventions to keep content alive and accessible.
  3. A file is copied on to the appliance; how it gets there may vary depending on the application and integration method. It’s worth remembering that you should confirm the data got onto the appliance safely. Some partner products perform checksum validation to ensure that copying the file in hasn’t introduced data corruption.
  4. The appliance watches for the file being closed, so that it doesn’t try to process incomplete files. To ensure no further changes are going to be made, it will wait for two complete ‘ingest periods’ to pass before the process begins, at which point the file is marked as ‘Red’. The duration of the ingest period is set on a per ‘data pool’ basis and defaults to ten minutes.
  5. Multiple checksums are taken of the original file and stored within the service. The file is then encrypted; to ensure the efficiency of the service, larger files are split into ‘chunks’ of up to 1GB in size before being encrypted. A key can be set at any point in the file tree and applies to any object below that point. It is important to note that a custom key must be applied to a folder before any data is added below it. Any keys that are used with the service must be kept safe by the client, as Arkivum never has access to these. In addition to keeping digital copies of the keys, it is also recommended that a hard copy is made and stored securely. Without the keys, it would be impossible to retrieve data from the service. An encrypted version of the file is created and then immediately decrypted and compared with the original. If the encrypted archive is validated, the decrypted copy is removed, and multiple checksums of the validated archive are taken and passed for replication into the service.
  6. The archive is replicated to our first datacentre. Once the transfer has completed, its integrity is confirmed using the checksums created earlier.
  7. The archive is then replicated on to our second datacentre, where again the integrity of the transfer is confirmed using the checksums.
  8. Once we have two validated copies in the service, the status of the file is updated to ‘Amber’. The file is pretty well protected at this point, but the 100% guarantee does not apply until we reach the ‘Green’ state.
  9. A third copy is queued to be written for escrow. The tape is not written until a complete tape’s worth has been queued. Currently this is 2.2 TB, so depending on the rate at which data is archived this can mean files remain in the ‘Amber’ state for some time. Where this risk is unacceptable, ‘escrow events’ can be purchased.
  10. Once a tape is written and verified, it is scheduled to be couriered to the escrow site. Once a receipt confirming its safe arrival has been received, the status is updated to ‘Green’. At this point the 100% guarantee comes into effect.
  11. Only now is it safe for any copies of the file outside of the service to be disposed of, or for it to be excluded from any conventional backups. The validated archive remains in the appliance cache but is now marked as being available for deletion as and when the cache high water mark is reached.
  12. But more than just archiving is required, of course, to achieve these benefits. This is a diagram from the University of Edinburgh RDM blog from just before Christmas. It shows the components required, including: a Current Research Information System (CRIS) for tracking grants, projects, equipment, research results, etc.; a Data Asset Register, which might be an Institutional Repository, which provides a public gateway to research done at an institution, both publications and data; the multitude of public data repositories where open data can be deposited; and finally a Data Vault as a safe storage facility for research data at various stages in its lifecycle. http://datablog.is.ed.ac.uk/2013/12/06/the-four-quadrants-of-research-data-curation-systems/
  13. One data-centric way to look at Research Data Management is to consider the processes and infrastructure when research data is created and used, which is the ‘research’ side of the diagram, and the processes and infrastructure that is also needed so that some or all of this research data can be kept and made accessible for future reuse, which is the ‘reuse’ side of the diagram. You’ve got live, active and changing data on the left and then curated, retained and highly managed data assets on the right. Traditionally, Researchers occupy the left-hand side and the Library, Research Office etc. occupy the right-hand side. Research Data Management spans the whole space as it covers all aspects of the data lifecycle and should be considered as part of Good Research Practice and hence part of what Researchers do as a matter of course. We might not be there yet, but this is where I think we’d like to be. It’s also true that the boundaries are likely to get blurred as increasing amounts of research are data-driven, based on existing and shared data sets. One of the challenges comes when thinking about all the tools and systems involved. So, for example, on the left-hand side, you might be using a CRIS when developing and bidding for a project. When the project is live, Researchers might be using their own devices, collaboration and sharing platforms, lab systems and a host of other tools or platforms to do their research. There might be HPC systems to process data, or do simulations and modelling, and if data sets are large there could be big data analytics and other funky stuff. At some point, publications are made and the outcomes of the work are released. Then comes the question of what to keep, why, who for, and everything needed to ensure that enough context is captured for any data that should be retained for future use.
Data might be kept because it’s needed for repeatability and verification of the research, or it might be kept because it has value to the researcher or others in future research. Tied in with publication, access and meeting funding body requirements are things like minting DOIs, adding records to the IR and storing data in vaults or other facilities that ensure the data is held safely and securely for future access. Then come activities around ensuring data remains usable, which is digital preservation, and that access and retention continue to meet policies, and then finally, and last but certainly not least, that use and citation of the data is tracked so impact can be assessed and decisions made on whether to continue keeping it. This might feed back into the CRIS, e.g. for REF, and also for further selection/curation. And again this is an ongoing and cyclic activity. What we’re seeing in working with a wide range of Universities is the challenge of how to make these circles meet and work smoothly together. You can’t expect the library or research data service part of an institution to get intimately involved with all the ways in which data is created and used. Likewise, you can’t expect Researchers to have to know, understand and use a whole host of systems and tools on the long-term research data management side, i.e. the right. What we’re seeing is a desire and need for the simplest interface between the two, a kind of meeting in the middle, which provides a very simple solution for the Researchers. Almost like a one-stop shop, and crucially one that has value to the Researcher and so helps motivate their engagement with Research data management. For example, helping them get more citations, downloads and collaboration requests based on their data. And it’s this simple one-stop shop and clear process that I think is so interesting about the Loughborough approach.
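
Going back to the ingest workflow in notes 3 to 11, the fixity checks and the Red, Amber and Green states can be sketched in a few lines of Python. This is only an illustration of the logic described in the notes, not Arkivum's implementation: every function and class name here is hypothetical, MD5 plus SHA-256 is an assumed choice (the notes only say "multiple checksums"), and the encryption step is left out entirely.

```python
import hashlib
from enum import Enum

# Notes say larger files are split into chunks of "up to 1GB" before encryption.
CHUNK_SIZE = 1024 ** 3


class Status(Enum):
    RED = "red"      # ingest has begun, fewer than two validated copies
    AMBER = "amber"  # two validated datacentre copies, escrow still pending
    GREEN = "green"  # escrow receipt received, guarantee in effect


def file_checksums(data: bytes) -> dict:
    """Take multiple checksums of the same content (algorithm choice is assumed)."""
    return {
        "md5": hashlib.md5(data).hexdigest(),
        "sha256": hashlib.sha256(data).hexdigest(),
    }


def split_into_chunks(data: bytes, chunk_size: int = CHUNK_SIZE) -> list:
    """Split content into consecutive chunks no larger than chunk_size."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def transfer_is_valid(received: bytes, expected: dict) -> bool:
    """Confirm a replicated copy against the checksums taken at ingest."""
    return file_checksums(received) == expected


def status_for(validated_copies: int, escrow_receipt: bool) -> Status:
    """Map the replication and escrow progress onto the traffic-light states."""
    if validated_copies >= 2 and escrow_receipt:
        return Status.GREEN
    if validated_copies >= 2:
        return Status.AMBER
    return Status.RED


def safe_to_delete_originals(status: Status) -> bool:
    """Per note 11, copies outside the service should only go once Green."""
    return status is Status.GREEN


if __name__ == "__main__":
    original = b"example dataset contents"
    expected = file_checksums(original)
    copies = sum(transfer_is_valid(original, expected) for _ in range(2))
    print(status_for(copies, escrow_receipt=False))
```

The point of keeping `safe_to_delete_originals` separate is that any integrating system (a repository, a CRIS, a lab tool) only needs to ask that one question before removing its own copy.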