Preserving linked data: sustainability and
organizational infrastructure
Mariella Guercio
Sapienza Università di Roma
sustainability and organizational infrastructure:
assumptions for preserving LOD
• Dynamic and distributed environments (like LOD)
are always complex to preserve, for reasons
inextricably connected with both the nature of LOD
and the goals of preservation
• The main challenges can be addressed if an
adequate and accurate organizational
infrastructure is put in place as early as possible
• The questions to be solved are in many cases
conflicting and still open
sustainability and organizational infrastructure:
relevant issues for a LOD preservation roadmap
• LOD imply changes, which are the most challenging issue for digital preservation
(DP): they have to be tracked, documented and maintained for future assessment
• links are essential LOD components, but according to preservation
rules and standards the main/significant links cannot be preserved as
simple references to external resources: they must be part of the ingestion
process or, at least, be well documented and assessed with reference to their
impact
• defining what is significant requires coordination among
stakeholders and agreements with institutions of memory to ensure
continuity of access over time and sufficient documentation for presuming
authenticity, whereas LOD are web-based and not strictly subject to
institutional control
• documentation, procedures and policies are recognized as crucial tools for
preservation, to be created and preserved from the creation phase, but
awareness of the need to document persistency is not, at the moment, a relevant
issue for the LOD communities
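As an illustration of the point about links, the sketch below shows one way a repository might inventory the links that point outside a dataset, so that they can be documented or considered for ingestion. It is only a minimal sketch: the function name `external_links`, the `local_hosts` parameter and the example URIs are hypothetical, and triples are assumed to be plain (subject, predicate, object) string tuples rather than the output of any particular RDF library.

```python
from urllib.parse import urlparse


def external_links(triples, local_hosts):
    """Return the object URIs whose host is not controlled by the dataset provider.

    `triples` is an iterable of (subject, predicate, object) strings and
    `local_hosts` is the set of host names the provider controls. Both are
    illustrative assumptions, not part of any standard.
    """
    found = set()
    for _subject, _predicate, obj in triples:
        host = urlparse(obj).netloc
        # Literals such as "Rome" have no host and are skipped.
        if host and host not in local_hosts:
            found.add(obj)
    return sorted(found)


# Hypothetical example data
triples = [
    ("http://data.example.org/city/1", "http://www.w3.org/2002/07/owl#sameAs",
     "http://other.example.com/resource/Rome"),
    ("http://data.example.org/city/1", "http://www.w3.org/2000/01/rdf-schema#label",
     "Rome"),
    ("http://data.example.org/city/1", "http://data.example.org/vocab/partOf",
     "http://data.example.org/region/7"),
]
to_document = external_links(triples, {"data.example.org"})
```

Which of the resulting links count as significant remains a matter of policy and agreement among stakeholders. The code only produces the list to be assessed.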
sustainability and organizational infrastructure: the
standards point of view (ISO 14721 OAIS and 16363) - 1
• The basic requirements imply the capacity to manage datasets and
organizational changes through
– early identification of the representation information to collect, ingest,
archive and, where necessary, transform according to the designated community
(DC) involved (OAIS): this is not a technical question, and the definition of
the DC cannot be ignored even if it can have varying degrees of completeness
and clarity
– early definition of boundaries to ensure a sustainable approach to
ingestion into the repositories (related to an adequate description of
the DC): which links, how much Preservation Description Information (PDI),
which vocabularies (OAIS)
– careful choice of the trustworthy digital repository (TDR) with reference to
governance, policies, systems and certification processes (ISO 16363) and a
possible federated network
– clear identification of the profiles involved in the preservation
processes and of the crucial responsibilities (OAIS and ISO 16363)
sustainability and organizational infrastructure: the
standards point of view (ISO 14721 OAIS and 16363) - 2
In particular, according to ISO 16363 for certification processes, the competences
for organizational infrastructure (3.3) imply documenting the reliability of the
LOD repository/provider (?) and include the ability to:
• evaluate the process by which a designated community is defined
• determine whether system documentation is adequate for all aspects of the
TDR
• determine whether preservation plans are adequate and match the
preservation policies
• determine whether preservation policies are accurately captured in system
workflows
• determine whether workflows are adequately documented
• recognise whether an adequate level of detail has been recorded about
system changes
• evaluate the organisation’s commitment to transparency and accountability
sustainability and organizational infrastructure: how
to apply the preservation standards to LOD - 1
• the chain of responsibility should be based on a governance
system able to attest to the commitment and transparency
of the LOD system and/or its provider (ISO 16363,
requirements 3.1): responsibilities must be well defined, clearly
regulated and well documented, and include:
– who takes the main responsibilities,
– who defines the policies,
– on which basis the policies are approved and disseminated,
– how to support long-term persistency when the original LOD
sets are no longer curated (as in the case of disappearance of the
LOD providers), etc.
• best practices must be identified (for multiple scenarios)
• the repository should, at least, provide evidence that it is aware
of these risks
sustainability and organizational infrastructure: how
to apply the preservation standards to LOD - 2
• to ensure the continuity of the preservation services, the preservation strategy
should define the type of repository used for the long term, such as
archives/repositories held by institutions of memory (as opposed to in-house
solutions); in the case of private archives/repositories, policies must be in
place and must attest to the dataset providers’ awareness of this critical aspect
• policies and procedural workflows have a key role but must be further detailed in
the case of LOD (see the recommendations of APARSEN for policies as defined by D35
and sections 3.1 and 3.3 of ISO 16363):
– general preservation policy
– policies for vocabularies, related changes and standards of reference
– policies for privacy
– policies for managing links and networks of SIPs/AIPs to ingest/manage
– policies for change management
– policies for appraisal and retention
– manuals which describe the fixity mechanisms
– a system for updating all the policies included in the preservation plan
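To make the item on fixity mechanisms concrete, a fixity check is commonly implemented by recording a cryptographic digest at ingest and recomputing it later. The sketch below assumes SHA-256 and in-memory byte strings. The function names `fixity_digest` and `verify_fixity` are hypothetical and are not taken from OAIS, ISO 16363 or APARSEN.

```python
import hashlib


def fixity_digest(data: bytes, algorithm: str = "sha256") -> str:
    """Compute the hexadecimal digest to be recorded for a preserved object."""
    h = hashlib.new(algorithm)
    h.update(data)
    return h.hexdigest()


def verify_fixity(data: bytes, recorded_digest: str, algorithm: str = "sha256") -> bool:
    """Return True if the object still matches the digest recorded at ingest."""
    return fixity_digest(data, algorithm) == recorded_digest


# Hypothetical usage: record a digest at ingest, then check it later.
serialized_dataset = b"<http://data.example.org/a> <http://data.example.org/p> \"x\" .\n"
recorded = fixity_digest(serialized_dataset)
still_intact = verify_fixity(serialized_dataset, recorded)
tampered_detected = not verify_fixity(serialized_dataset + b" ", recorded)
```

A manual describing the mechanism would also need to state which algorithm is used, where the digests are stored and how often they are rechecked. Those choices belong to the policies and are not fixed by this sketch.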
sustainability and organizational infrastructure: how
to apply the preservation standards to LOD - 3
• the level of granularity and the functions to be preserved must
be supported by a strategic preservation plan whose specific
strategies are developed according to the datasets’ nature
and function (to maintain the correct degree of data
intelligibility)
• economic sustainability has to be based on a cost model
which takes into account the specific role of the
stakeholders/providers and the custodians, as well as the
definition of risk assessment
sustainability and organizational infrastructure:
open questions
• Many questions are still open:
– Who is going to pay for preserving LOD (also
within an organization)?
– Why should institutional repositories and institutions of
memory be interested? On which basis?
Which role can be designed for them?
– Can a network of federated repositories be
accepted and supported?
– What level of service agreement is required
between institutions of memory and LOD
providers?
