Adopting DSpace 7 and 8: Challenges and Solutions from Real Migration Experiences
AGENDA
• 4Science: who we are
• It is not just an update, it is always a migration
• A couple of hints about your data model
• There are more data that need to be migrated than what you expect
• Plan, Do, Check, Finalize
• Common pitfalls & Solution strategies
• Takeaways
Today's speaker
Susanna Mornati,
Chief Operating Officer at 4Science
susanna.mornati@4science.com
Who we are
OUR AIM: to enable implementation of the transnationally important policies of Open Research, Research Impact and Digital Preservation.
We provide solutions for research information & data management and for cultural heritage:
• DSpace (CRIS/GLAM)
• OJS
• Dataverse
Our services:
• Installation
• Configuration
• Hosting and maintenance
• System integration, customization and consultancy
Our solutions support compliance with key international standards:
✓ OpenAIRE
✓ ORCID
✓ CERIF
✓ IIIF
Certified Platinum Provider and leading contributor to DSpace 7
What we believe in
Security is not a matter of compromise: our solutions are secure by design; openness without security would be counterproductive; security without openness would be unproductive.
Certification: ISO/IEC 27001:2013, 27017:2015, 27018:2019, and ISO 9001:2015
Our solutions support the key defining transnational policies, Open Research and open digital cultural heritage, and are based on:
• Open-source software
• Open standards
• Interoperability
• Preservation
• Collaboration
• Innovation
A fast-growing organization
Over 100 clients
in 5 continents
worldwide
https://www.4science.com/we-work-for/
The context in which we have operated since 2016
We are driven by serving the open knowledge ecosystem.
Proprietary products often come with expensive licenses and pricing fluctuations, can become obsolete and can result in vendor lock-in.
Our open solutions (open standards, open protocols, open source) are designed to support open science.
Open knowledge helps to solve, by collaboration, the world’s most pressing problems, and creates new opportunities, especially when cross-disciplinary.
4Science's role in the Open Science and DSpace community
Certified Platinum
Provider and leading
contributor of DSpace
Our goal is to anticipate
the future making it
more accessible
2023 DSpace worldwide
community leaders for
hours donated for
DSpace development
Experts in the field and
enablers that can help
with any situation
At 4Science we are driven by serving the open knowledge ecosystem.
Open knowledge
Empowering open access, supporting open science, advancing open scholarly communication.
FAIR data
Our solutions enable your data to be Findable, Accessible, Interoperable and Reusable
Interoperable solutions
ORCID and DataCite Certified Service Provider, CERIF and IIIF enabler
Compliance & Quality
COAR-NGR, OpenAIRE, Certified Platinum Provider of DSpace, ISO 9001:2015
Security
Battle-tested solutions, secure by design; Trusted Providers of the Cloud Security Alliance
«Migration» or «update»? Not so different?
In this session we will look at some insights from best practices that we have learned moving from DSpace 5, DSpace 6, EPrints, Digital Commons, OPUS or even custom solutions, but the first thing we would like to share is…
Even when you are about to upgrade from an old version of DSpace to the new one, keep in mind that it has been completely reengineered compared to the previous ones: any update to a major release should therefore be understood (and planned with the appropriate timing) as if it were a migration to an entirely new platform, in addition to the integrations with systems already in your ecosystem.
Consider it as if it were a migration to a completely different system, although the main paradigms and approaches are preserved.
Entities are the foundation of the new data model
An effective data model should also be flexible
Entities are a pivotal part of defining a whole data model and contribute to its design; they enable flexibility to reflect your data in a more granular way
Your data model should be as close as possible to international standards to enhance interoperability
The current design of DSpace 7 provides the foundation for flexibility, ensuring that it can be tailored to your requirements
Relations complete the definition of your data model: authors, publications, organizations and more can be interconnected with each other
Entities should reflect your data model, enabling relations and exploring connections
ENTITIES ARE A WAY OF REPRESENTING DATA AND THEIR RELATIONS IN A STRUCTURED MANNER
ENTITIES ARE CONSTITUTED BY RECORDS THAT CAN BE DESCRIBED, IDENTIFIED, AND RELATED TO OTHER RECORDS IN A REPOSITORY
ENTITIES ARE USED TO REPRESENT REAL-WORLD OBJECTS SUCH AS PEOPLE, ORGANIZATIONS, PUBLICATIONS AS WELL AS ABSTRACT CONCEPTS SUCH AS SUSTAINABILITY GOALS, RESEARCH LINES, THEMATIC COLLECTIONS
ENTITIES ARE USED TO PROVIDE CONTEXT, CONNECTIONS, AND RELATIONSHIPS BETWEEN OBJECTS IN THE REPOSITORY, SUPPORTING DISCOVERY AND COMPREHENSION OF THE CONTEXT
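As a rough illustration of the idea of entities and relations (not DSpace's actual implementation), records can be pictured as typed objects connected by named links. The class names, identifiers and the `isAuthorOfPublication` label below are used purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """A repository record with an entity type and descriptive metadata."""
    entity_type: str          # e.g. "Person", "Publication", "OrgUnit"
    identifier: str           # any stable identifier (illustrative)
    metadata: dict = field(default_factory=dict)


@dataclass(frozen=True)
class Relation:
    """A typed, directed link between two entities."""
    relation_type: str        # illustrative label
    left_id: str
    right_id: str


def related(entity_id, relations, relation_type):
    """Return identifiers linked to entity_id through the given relation type."""
    return [r.right_id for r in relations
            if r.left_id == entity_id and r.relation_type == relation_type]


paper = Entity("Publication", "pub-1", {"dc.title": "A sample paper"})
author = Entity("Person", "per-1", {"person.familyName": "Rossi"})
links = [Relation("isAuthorOfPublication", paper.identifier, author.identifier)]

print(related("pub-1", links, "isAuthorOfPublication"))
```

Keeping the relation as its own record, instead of a string inside the publication metadata, is what lets the same person be reached from many publications.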
But with a correct balance:
when you’re about to migrate…
• You could have processes that
you would like to drop
• Customizations that affect your
maintenance costs
• Metadata representing
information that is no longer
useful
• And processes…you’d like to add,
or change
• New features that can substitute
your old customizations
• Opportunity to add new
information to your repository
How to enable entities during the upgrade: pt. 1
How to enable entities during the upgrade: pt. 2
This step/job may be slow!
How to enable entities during the “migration” from other platforms
Follow the DSpace documentation, YES, but... how do you import all the metadata, relationships and files?
• The SAF import could be an option (single records), BUT... you cannot set a relationship with not-yet-created entities: it is preferable to create all entities individually first, making sure to store a local.legacyid value for each
• Use the CSV Bulk edit (manually or automatically updated) to create the relationship(s)
Warning: CSV Bulk cannot manage ordering between entities and simple strings (e.g. the ordering of Authors when only a few of them have a profile)
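The two-pass approach above (create all entities first, then link them) could be sketched as follows. This is only a sketch under assumptions: the `legacy_to_uuid` mapping is presumed to have been collected from the `local.legacyid` values after the first pass, and the `relation.isAuthorOfPublication` column name and the `||` multi-value separator should be checked against the DSpace batch metadata editing documentation for your version.

```python
import csv
import io


def build_relation_csv(legacy_to_uuid, authorship,
                       relation_column="relation.isAuthorOfPublication"):
    """Build CSV text linking publications to authors.

    legacy_to_uuid: legacy identifier -> identifier of the already created item
    authorship: publication legacy id -> ordered list of author legacy ids
    Rows containing an unresolved legacy id are skipped and reported.
    """
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["id", relation_column])
    unresolved = []
    for pub_legacy, author_legacies in authorship.items():
        ids = [pub_legacy] + list(author_legacies)
        missing = [i for i in ids if i not in legacy_to_uuid]
        if missing:
            unresolved.extend(missing)
            continue
        # Assumed multi-value separator for a single CSV cell.
        authors = "||".join(legacy_to_uuid[a] for a in author_legacies)
        writer.writerow([legacy_to_uuid[pub_legacy], authors])
    return out.getvalue(), unresolved


mapping = {"old-10": "uuid-pub", "old-20": "uuid-a", "old-21": "uuid-b"}
csv_text, unresolved = build_relation_csv(
    mapping, {"old-10": ["old-20", "old-21"], "old-11": ["old-20"]})
print(csv_text)
print(unresolved)
```

Reporting unresolved legacy IDs instead of failing makes it easier to see which entities still have to be created before the relationship pass is repeated.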
All of that is easier in DSpace-CRIS thanks to the possibility to use…
• Denormalized tables where you can prepare your data for import (like the CSV but on the database) → easier!
• Enhanced Bulk import from Excel instead of CSV (yes, it is a non-standard format but easier to work with, available for non-technical people → new lines can be created)
• Promise for future reference that will be resolved once the target item is created (i.e. you can say will be referenced:ORCID:XXXXX to create a relation with the item AUTHOR using person.identifier.orcid = XXXXX)
• You can manage files directly providing a remote URL (no SAF process needed)
• Ordering between Entities and strings is supported (column with the specific relationship.type can be ordered by value/promise)
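The "promise" idea can be pictured with a small, purely hypothetical resolver: a reference is stored as pending until an item carrying the matching identifier is created. This is not the DSpace-CRIS implementation; the class and method names are invented for illustration.

```python
class PromiseResolver:
    """Collect references to items that do not exist yet and resolve them later."""

    def __init__(self):
        self._known = {}      # (field, value) -> item id
        self._pending = []    # (source id, relation, field, value)
        self.relations = []   # resolved (source id, relation, target id)

    def register_item(self, item_id, field, value):
        """Record a created item and resolve any promise waiting for it."""
        self._known[(field, value)] = item_id
        still_pending = []
        for source, relation, f, v in self._pending:
            if (f, v) == (field, value):
                self.relations.append((source, relation, item_id))
            else:
                still_pending.append((source, relation, f, v))
        self._pending = still_pending

    def reference(self, source_id, relation, field, value):
        """Link now if the target exists, otherwise keep a promise."""
        target = self._known.get((field, value))
        if target is not None:
            self.relations.append((source_id, relation, target))
        else:
            self._pending.append((source_id, relation, field, value))

    def pending(self):
        """Return the promises that are still waiting for their target."""
        return list(self._pending)


resolver = PromiseResolver()
resolver.reference("pub-1", "author", "person.identifier.orcid", "XXXXX")
pending_before = resolver.pending()
resolver.register_item("per-1", "person.identifier.orcid", "XXXXX")
print(pending_before)
print(resolver.relations)
```

With this pattern the import order stops mattering: publications can be loaded before their authors and the links appear once the authors arrive.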
Not enough said, but…
Do not customize your DSpace database tables/structure, nor backport any feature that changes it
Why?
Because it could cause your automated database upgrade process to fail
Create new tables (instead of modifying existing ones)
ALREADY DID?
Consider replacing your additional contents (tables) → new entities enabled by DSpace 7
Yes, your institution has a lot of data
…and not all of them are visible in plain sight (as metadata of your
items)
There will be more data emerging that you did not imagine
So…please keep this in mind
OAI Identifiers should be preserved. This is currently not supported without code change (we plan to generalize the solution and open a PR → DSpace 8)
OAI URLs should be preserved as well: redirection is (almost) good but you should check it at least with your known harvesters → Easy to do in Apache or nginx (light web server)
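When checking redirects with harvesters, it may help to compute the target each old OAI URL is expected to reach and compare it with the `Location` header your web server returns. The sketch below assumes the old endpoint lived under `/oai/` and the new one under `/server/oai/`; both prefixes are assumptions to adapt to your installation.

```python
from urllib.parse import urlsplit, urlunsplit

OLD_OAI_PREFIX = "/oai/"          # assumed legacy path
NEW_OAI_PREFIX = "/server/oai/"   # assumed new path


def expected_redirect(old_url, new_host=None):
    """Return the URL an old OAI request should be redirected to."""
    parts = urlsplit(old_url)
    if not parts.path.startswith(OLD_OAI_PREFIX):
        raise ValueError(f"not an OAI URL: {old_url}")
    new_path = NEW_OAI_PREFIX + parts.path[len(OLD_OAI_PREFIX):]
    host = new_host or parts.netloc
    # The query string (verb, metadataPrefix, set, ...) must survive unchanged.
    return urlunsplit((parts.scheme, host, new_path, parts.query, ""))


def redirect_is_correct(old_url, location_header, new_host=None):
    """Compare the Location header sent by the web server with the expected target."""
    return location_header == expected_redirect(old_url, new_host)


old = "https://repo.example.org/oai/request?verb=ListRecords&metadataPrefix=oai_dc"
print(expected_redirect(old))
```

A redirect that drops the query string is a typical way for a harvest to break silently, which is why the comparison is made on the full URL.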
Statistics can be migrated
Upgrade procedures, if followed, will result in a full migration of the data... well, not really: deleted items / bitstreams are lost
When you migrate from another platform you can bulk import your statistics data directly into Solr via CSV. Data need to be prepared, so a local.legacyid metadata value will be crucial to translate your legacy ID into the new one
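Preparing the statistics data could look like the sketch below, which rewrites the identifier column of a legacy export using a legacy-ID-to-new-ID mapping built from `local.legacyid`. The column names (`id`, `time`, `type`) are hypothetical and do not represent the actual Solr statistics schema.

```python
import csv
import io


def translate_statistics(rows_csv, legacy_to_uuid, id_column="id"):
    """Replace legacy identifiers in a statistics CSV with the new ones.

    Returns the translated CSV text and the list of legacy ids with no match.
    """
    reader = csv.DictReader(io.StringIO(rows_csv))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    unmatched = []
    for row in reader:
        new_id = legacy_to_uuid.get(row[id_column])
        if new_id is None:
            # No migrated item carries this legacy id: keep track of it.
            unmatched.append(row[id_column])
            continue
        row[id_column] = new_id
        writer.writerow(row)
    return out.getvalue(), unmatched


legacy_stats = (
    "id,time,type\n"
    "101,2023-01-05T10:00:00Z,view\n"
    "999,2023-01-06T11:00:00Z,view\n"
)
translated, unmatched = translate_statistics(legacy_stats, {"101": "uuid-101"})
print(translated)
print(unmatched)
```

The list of unmatched IDs is worth reviewing before the import: it usually points to records that were deleted or never migrated.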
Step 1: PLAN - ask yourself all relevant questions
Make sure to sync your activities and preparatory/interdependent tasks...
Prepare a new, separate environment for DSpace 7
Do you use the Handle Server?
Do you mint DOIs?
Integration: what application extracts data from DSpace? What application pushes data into DSpace? Using which technology: SWORD, REST API? How much time will 3rd parties need to switch from the old integration to the new one?
Plan to put your repository in READ ONLY mode for enough time to perform the final migration
Prepare your UATs, which should take into account your customizations, configurations and top-priority functionalities
You need to run the migration at least two times, and usually you cannot afford to have your current repository locked down for a long period
This means that the two runs will use slightly different data!
Even if the repository is in READ ONLY mode, some data is still changing... Statistics will grow!
Step 2: DO
Verify: check the timing for execution/import/indexing during this phase: you’ll benefit from it for the final migration
Note: remember to keep track of all of your steps (you’ll have to repeat them exactly for the final migration)
Do: run your first test migration
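One lightweight way to keep track of steps and timings during the test migration is a small recorder like the one below. It is a generic sketch, not a DSpace tool; the step names and the placeholder actions are invented.

```python
import time


class Runbook:
    """Record each migration step with its duration so the run can be repeated."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self.steps = []   # list of (name, seconds), in execution order

    def run(self, name, action):
        """Execute one step, timing it and logging it in order."""
        start = self._clock()
        result = action()
        self.steps.append((name, self._clock() - start))
        return result

    def total_seconds(self):
        """Total time spent in the recorded steps."""
        return sum(seconds for _, seconds in self.steps)

    def step_names(self):
        """Names of the recorded steps, in the order they were run."""
        return [name for name, _ in self.steps]


book = Runbook()
book.run("export legacy data", lambda: None)          # placeholder actions
book.run("import into new repository", lambda: None)
book.run("full reindex", lambda: None)
print(book.step_names())
```

The recorded durations give a first estimate of how long the read-only window has to be for the final migration.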
Step 3: CHECK
Perform UATs to validate and flag possible issues (and the related fixes you applied)
If you notice something wrong that was not covered by UATs, you should not ignore it: UATs should be amended to cover that path
Verify that the timing of the first migration allows you to meet the deadlines you were expecting
Verify which tasks could be optimized/reviewed
Check data integrity: run the checksum checker (fixed by 4Science in 7.6)
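Independently of the DSpace checksum checker, files copied from a legacy platform can be compared against a manifest of expected digests. The sketch below assumes MD5 digests and a manifest keyed by relative path; both are assumptions, and it does not replace the built-in checker.

```python
import hashlib
import tempfile
from pathlib import Path


def md5_of(path, chunk_size=1 << 20):
    """Compute the MD5 digest of a file without loading it fully in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_manifest(manifest, base_dir):
    """Return the relative paths whose file is missing or whose digest differs."""
    failures = []
    for relative_path, expected in manifest.items():
        target = Path(base_dir) / relative_path
        if not target.is_file() or md5_of(target) != expected:
            failures.append(relative_path)
    return failures


# Small demonstration with a temporary directory and made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "a.bin").write_bytes(b"hello")
    manifest = {"a.bin": hashlib.md5(b"hello").hexdigest(), "b.bin": "0" * 32}
    failures = verify_manifest(manifest, tmp)
print(failures)
```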
Temporarily disable indexing during intermediate milestones/steps to save some time…
(…but be careful of the interdependencies in further steps and keep in mind that you’ll have to run a full
indexing when needed)
About the automatic initial reindexing: it is not recommended to skip it, unless you will manually reindex at a
later time, or verify that a reindexing is not necessary. Forgetting to reindex your site after an upgrade may result
in unexpected errors or instabilities
Step 4: FINALIZE
Put your current production environment in read-only mode before performing the final deployment
Alert the partners responsible for integrated systems that the system is frozen
Extract your data from your frozen repository
Re-run the steps that you successfully ran during the first test migration: even small differences may lead to unexpected issues
Run the UAT books: if everything goes smoothly, make the final switch into production
DOs
Alert: give notice to your partners that they can resume ordinary activities on their 3rd party systems
Move: move your handle server to your new environment
Enable: enable all of your crontab jobs
Update: update ALL of your URLs to match the ones in production
More pitfalls and solutions we adopted with experience
…from DSpace 5, DSpace 6, EPrints, Digital Commons, OPUS, Invenio…
UATs, the world where the obvious is certainly not: guidelines
A plan should be prepared and followed methodically to test and verify consistency between the old system and the new one. A few examples:
1. How many items were visible in the old system? How many in the new one?
2. How many items were present in the users' workspace? How many in the new system?
3. Same for workflows: how many items are in the various steps, how many are assigned to the various users?
4. Are any items restricted or embargoed? Are restrictions migrated correctly and working?
5. Are all protocols used by 3rd party systems enabled (SWORD? Legacy REST…)?
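The counting questions above lend themselves to an automated comparison. A minimal sketch, assuming the counts have already been collected from both systems (the metric names and numbers are invented sample values):

```python
def compare_counts(old_counts, new_counts):
    """Return {metric: (old, new)} for every metric whose values differ or are missing."""
    differences = {}
    for metric in sorted(set(old_counts) | set(new_counts)):
        old_value = old_counts.get(metric)
        new_value = new_counts.get(metric)
        if old_value != new_value:
            differences[metric] = (old_value, new_value)
    return differences


old_system = {"visible_items": 12000, "workspace_items": 35,
              "workflow_items": 8, "embargoed_items": 210}
new_system = {"visible_items": 12004, "workspace_items": 35,
              "workflow_items": 8, "embargoed_items": 210}
print(compare_counts(old_system, new_system))
```

A difference is not necessarily a migration bug; it can also reveal an inconsistency that already existed in the old system, so each one should be investigated rather than forced to match.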
Time spent in UATs is very well spent
Through these cross-checks we had the opportunity to discover inconsistencies between the database and UI of older versions of DSpace:
• even fixing the problem in the new version did not always coincide with the user's desires (e.g., items previously not visible by mistake becoming visible in the new version and vice versa).
Fun facts and unapparent trivia
Thumbnails in the new DSpace 7 are now larger than in the old versions. We learned that importing the old ones would compromise the layout.
This resulted in the discovery of the century: all thumbnails had to be…regenerated.
4Science contributed the fix for the regeneration of the thumbnails ☺
The moral: consider every possible interaction!
Fun facts and unapparent trivia
Most viewed item? OH YES PLEASE.
…but the item in the new version turned out to be different from the item in the old version. Why?
Because slightly different rules had simply been applied, which led to a different result.
One can never be too cautious: watch out for inconsistencies and rule changes, even minimal ones.
What about DSpace 8?
• DSpace 8 is expected to go live in the spring/summer of 2024
• It will not be a major change like DSpace 7 was
Should I upgrade to DSpace 7 or wait for DSpace 8 to be released?
• We suggest cautiously migrating/upgrading to the most stable version at the moment of the release, assessing what is better for your institution
• The upgrade from DSpace 7 to DSpace 8 will not require such a big effort compared to the upgrade from DSpace 5 / 6 to 7
• Institutions upgrading from DSpace 7 to DSpace 8 will enjoy features already implemented in DSpace-CRIS 7, e.g. Notify protocol (contributed by 4Science + Harvard), Correction service to enhance data quality (4Science), Duplicate detection (ported by TLC from our implementation in DSpace-CRIS)
Be sure to check every minimal step and take careful note of it.
Time spent in analysis and double-checks is really well spent
We, at 4Science, would love to put our expertise at your service on behalf of the entire community.
Contact us at: info@4Science.com
Visit our website: www.4science.com
Follow us on social media!
4Science International 4ScienceDSpace
4ScienceIT
4Science
Join the 4Science
newsletter to keep up to
date with news about
our contributions to
DSpace and much more!

 
The EOSC DIH "ELD Advance" project
The EOSC DIH "ELD Advance" projectThe EOSC DIH "ELD Advance" project
The EOSC DIH "ELD Advance" project
 
DSpace implementation of the COAR Notify Project - status update
DSpace implementation of the COAR Notify Project - status updateDSpace implementation of the COAR Notify Project - status update
DSpace implementation of the COAR Notify Project - status update
 
Convegno Stelline 2020 - 4Science -16 settembre _ pubbliche
Convegno Stelline 2020 - 4Science -16 settembre _ pubblicheConvegno Stelline 2020 - 4Science -16 settembre _ pubbliche
Convegno Stelline 2020 - 4Science -16 settembre _ pubbliche
 
Convegno Stelline 2020 - 4Science -16 settembre _ accademiche
Convegno Stelline 2020 - 4Science -16 settembre _ accademicheConvegno Stelline 2020 - 4Science -16 settembre _ accademiche
Convegno Stelline 2020 - 4Science -16 settembre _ accademiche
 
Convegno Stelline 2020 - 4Science
Convegno Stelline 2020 - 4Science Convegno Stelline 2020 - 4Science
Convegno Stelline 2020 - 4Science
 
DSpace-CRIS 7: What is Coming? OR2020
DSpace-CRIS 7: What is Coming? OR2020DSpace-CRIS 7: What is Coming? OR2020
DSpace-CRIS 7: What is Coming? OR2020
 
News about DSpace-CRIS Anwendertreffen 2020
News about DSpace-CRIS Anwendertreffen 2020News about DSpace-CRIS Anwendertreffen 2020
News about DSpace-CRIS Anwendertreffen 2020
 
Digital library: riflessioni su scelte e obiettivi. Visibilità delle collezio...
Digital library: riflessioni su scelte e obiettivi. Visibilità delle collezio...Digital library: riflessioni su scelte e obiettivi. Visibilità delle collezio...
Digital library: riflessioni su scelte e obiettivi. Visibilità delle collezio...
 
How to enhance your DSpace repository: use cases for DSpace-CRIS, DSpace-RDM,...
How to enhance your DSpace repository: use cases for DSpace-CRIS, DSpace-RDM,...How to enhance your DSpace repository: use cases for DSpace-CRIS, DSpace-RDM,...
How to enhance your DSpace repository: use cases for DSpace-CRIS, DSpace-RDM,...
 
DSpace-CRIS ORCID Integration
DSpace-CRIS ORCID IntegrationDSpace-CRIS ORCID Integration
DSpace-CRIS ORCID Integration
 
“Adoption DSpace 7 and 8 Challenges and Solutions from Real Migration Experiences”.pdf

  • 1. Adopting DSpace 7 and 8: Challenges and Solutions from Real Migration Experiences
  • 2. AGENDA: 4Science, who we are; It is not just an update, it is always a migration; A couple of hints about your data model; There are more data that need to be migrated than you expect; Plan, Do, Check, Finalize; Common pitfalls & solution strategies; Takeaways
  • 3. Today's speaker Susanna Mornati, Chief Operating Officer at 4Science susanna.mornati@4science.com
  • 4. Who we are. OUR AIM: to enable implementation of the transnationally important policies of Open Research, Research Impact and Digital Preservation. DSpace (CRIS/GLAM), OJS, Dataverse. Our solutions support compliance with key international standards: ✓ OpenAIRE ✓ ORCID ✓ CERIF ✓ IIIF. Certified Platinum Provider and leading contributor to DSpace 7. We provide solutions for research information & data management and for cultural heritage. Our services: • Installation • Configuration • Hosting and maintenance • System integration, customization and consultancy
  • 5. What we believe in. Security is not a matter of compromise: our solutions are secure by design; openness without security would be counterproductive; security without openness would be unproductive. Certification: ISO/IEC 27001:2013, 27017:2015, 27018:2019, and ISO/IEC 9001:2015. Our solutions support the key defining transnational policies, Open Research and open digital cultural heritage, and are based on: open-source software, open standards, interoperability, preservation, collaboration, innovation.
  • 6. A fast-growing organization Over 100 clients in 5 continents worldwide https://www.4science.com/we-work-for/
  • 7. The context in which we operate since 2016. We are driven by serving the open knowledge ecosystem. Proprietary products often come with expensive licenses and pricing fluctuations, can become obsolescent and can result in vendor lock-in. Our open solutions (open standards, open protocols, open source) are designed to support open science. Open knowledge helps to solve, by collaboration, the world’s very pressing problems, and creates new opportunities, especially when cross-disciplinary.
  • 8. 4Science role in the Open Science and DSpace community. Certified Platinum Provider and leading contributor of DSpace. Our goal is to anticipate the future, making it more accessible. 2023 DSpace worldwide community leaders for hours donated for DSpace development. Experts in the field and enablers that can help with any situation. At 4Science we are driven by serving the open knowledge ecosystem. Open knowledge: empowering open access, supporting open science, advancing open scholarly communication. FAIR data: our solutions enable your data to be Findable, Accessible, Interoperable and Reusable. Interoperable solutions: ORCID and DataCite Certified Service Provider, CERIF and IIIF enabler. Compliance & Quality: COAR-NGR, OpenAIRE, Certified Platinum Provider of DSpace, ISO 9001:2015. Security: battle-tested solutions, secure by design; Trusted Providers of the Cloud Security Alliance.
  • 9. «Migration» or «update»? Not so different? In this session we will look at some insights from best practices that we have learned moving from DSpace 5, DSpace 6, EPrints, Digital Commons, OPUS or even custom solutions, but the first thing we would like to share is… Even when you are about to upgrade from an old version of DSpace to the new one, keep in mind that it has been completely reengineered from previous ones: any update to a major release should therefore be understood (and planned with the appropriate timing) as if it were a migration to an entirely new platform, in addition to the integrations with systems already in your ecosystem. Consider it as if it were a migration to a completely different system, although the main paradigms and approaches are preserved.
  • 10. Entities are the foundation of the new data model. An effective data model should also be flexible. Entities are a pivotal part of defining a whole data model and contribute to its design: they enable flexibility to reflect your data in a more granular way. Your data model should be as close as possible to international standards to enhance interoperability. The current design of DSpace 7 provides the foundation for flexibility, ensuring that it can be tailored to your requirements. Relations complete the definition of your data model: authors, publications, organizations and more can be interconnected.
  • 11. Entities should reflect your data model, enabling relations and exploring connections. Entities are a way of representing data and their relations in a structured manner. Entities are constituted by records that can be described, identified, and related to other records in a repository. Entities are used to represent real-world objects such as people, organizations and publications, as well as abstract concepts such as sustainability goals, research lines and thematic collections. Entities are used to provide context, connections, and relationships between objects in the repository, supporting discovery and comprehension of the context.
  • 12. But with a correct balance: when you’re about to migrate… • You could have processes that you would like to drop • Customizations that affect your maintenance costs • Metadata representing information that is no longer useful • And processes…you’d like to add, or change • New features that can substitute your old customizations • Opportunity to add new information to your repository
  • 13. How to enable entities during the upgrade: pt. 1
  • 14. How to enable entities during the upgrade: pt. 2. This step/job may be slow!
  • 15. How to enable entities during the “migration” from other platforms. Follow the DSpace documentation, YES, but... how do you import all the metadata, relationships and files? • The SAF import could be an option (single records), BUT... you cannot set a relationship with not-yet-created entities: it is preferable to create all entities individually first, making sure to store a local.legacyid value for each • Use the CSV Bulk edit (manually or automatically updated) to create the relationship(s). Warning: CSV Bulk edit cannot manage ordering between entities and simple strings (e.g. ordering of Authors when only a few of them have a profile)
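As a rough illustration of the two-step approach in slide 15 (create the entities first, then link them with a bulk-edit CSV), the Python sketch below turns a legacy-ID mapping into relationship rows. The `relation.isAuthorOfPublication` column name and the `||` value separator are assumed to match the conventions of your DSpace CSV bulk edit configuration; the function name, the input structure and the `legacy_to_uuid` mapping (built from the stored `local.legacyid` values) are hypothetical.

```python
import csv
import io


def build_relationship_csv(publications, legacy_to_uuid):
    """Build CSV text linking publications to already-created Person entities.

    publications: list of dicts with the new item "uuid" and the legacy IDs
    of its authors (hypothetical structure, adapt to your export).
    legacy_to_uuid: maps each legacy author ID to the UUID of the entity
    created for it in the new repository.
    Returns (csv_text, missing), where missing lists unresolved legacy IDs.
    """
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    # Assumed column names; verify them against your own data model.
    writer.writerow(["id", "relation.isAuthorOfPublication"])
    missing = []
    for pub in publications:
        author_uuids = []
        for legacy_id in pub["author_legacy_ids"]:
            if legacy_id in legacy_to_uuid:
                author_uuids.append(legacy_to_uuid[legacy_id])
            else:
                # Entity not created yet: report it instead of guessing.
                missing.append(legacy_id)
        writer.writerow([pub["uuid"], "||".join(author_uuids)])
    return out.getvalue(), missing


if __name__ == "__main__":
    pubs = [{"uuid": "pub-1", "author_legacy_ids": ["a1", "a2", "zz"]}]
    text, missing = build_relationship_csv(pubs, {"a1": "u1", "a2": "u2"})
    print(text)
    print("unresolved legacy IDs:", missing)
```

Reporting unresolved legacy IDs separately makes it easy to see which entities still have to be created before the relationships can be imported.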
  • 16. All of that is easier in DSpace-CRIS thanks to the possibility to use… • Denormalized tables where you can prepare your data for import (like the CSV but on the database) → easier! • Enhanced Bulk import from Excel instead of CSV (yes, it is a non-standard format, but it is easier to work with and accessible to non-technical people → new lines can be created) • Promise for future reference that will be resolved once the target item is created (i.e. you can say will be referenced:ORCID:XXXXX to create a relation with the item AUTHOR using person.identifier.orcid = XXXXX) • You can manage files directly by providing a remote URL (no SAF process needed) • Ordering between Entities and strings is supported (column with the specific relationship.type can be ordered by value/promise)
  • 17. Not enough said, but… Do not customize your DSpace database tables/structure, nor backport any feature that changes it. Why? Because it could cause your automated database upgrade process to fail. Create new tables (instead of modifying existing ones). ALREADY DID? Consider replacing your additional contents (tables) → new entities enabled by DSpace 7
  • 18. Yes, your institution has a lot of data …and not all of them are visible in plain sight (as metadata of your items) There will be more data emerging that you did not imagine
  • 19. So…please keep this in mind. OAI Identifiers should be preserved. This is currently not supported without code change (we plan to generalize the solution and open a PR → DSpace 8). OAI URLs should be preserved as well: redirection is (almost) good but you should check it at least with your known harvesters → Easy to do in Apache or nginx (light web server). Statistics can be migrated. Upgrade procedures, if followed, will result in a full migration of the data... not-really-deleted items / bitstreams are lost. When you migrate from another platform you can bulk import your statistics data directly in SOLR via CSV. Data need to be prepared, so a local.legacyid metadata value will be crucial to translate your legacy ID into the new one
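The statistics preparation mentioned in slide 19 could be sketched as follows: before loading a statistics CSV into Solr, each legacy item ID is translated into the new item's UUID using the mapping derived from `local.legacyid`. This is only a sketch under stated assumptions: the column name `id`, the sample fields and the function name are hypothetical, and the actual Solr statistics fields depend on your DSpace version.

```python
import csv
import io


def translate_statistics(rows_csv, legacy_to_uuid, id_column="id"):
    """Rewrite the ID column of a statistics CSV from legacy IDs to new UUIDs.

    rows_csv: CSV text exported from the legacy platform (hypothetical layout).
    legacy_to_uuid: mapping built from the local.legacyid metadata values.
    Returns (translated_csv_text, skipped), where skipped lists the legacy IDs
    that have no counterpart in the new repository.
    """
    reader = csv.DictReader(io.StringIO(rows_csv))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames,
                            lineterminator="\n")
    writer.writeheader()
    skipped = []
    for row in reader:
        new_id = legacy_to_uuid.get(row[id_column])
        if new_id is None:
            # No matching item was migrated: keep track of it for review.
            skipped.append(row[id_column])
            continue
        row[id_column] = new_id
        writer.writerow(row)
    return out.getvalue(), skipped


if __name__ == "__main__":
    legacy = "id,type,time\n12,2,2020-01-01T00:00:00Z\n99,2,2020-01-02T00:00:00Z\n"
    translated, skipped = translate_statistics(legacy, {"12": "uuid-12"})
    print(translated)
    print("skipped legacy IDs:", skipped)
```

Rows whose legacy ID cannot be resolved are skipped and reported rather than imported with a stale identifier, so they can be reviewed during the CHECK step.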
  • 20. Step 1: PLAN - ask yourself all relevant questions. Make sure to sync your activities and preparatory/interdependent tasks... Prepare a new, separate environment for DSpace 7. Do you use the Handle Server? Do you mint DOIs? Integration: what application extracts data from DSpace? What application pushes data into DSpace? Using which technology: SWORD, REST API? How much time will 3rd parties need to switch from the old integration to the new one? Plan to put your repository in READ ONLY mode for enough time to perform the final migration. Prepare your UATs, which should take into account your customizations, configurations and top-priority functionalities. You need to run the migration at least twice, and usually you cannot afford to have your current repository locked down for a long period. This means that the two runs will use slightly different data! Even if the repository is in READ ONLY mode, some data keep changing... Statistics will grow!
  • 21. Step 2: DO. Do: Do your first test migration. Note: Remember to keep track of all of your steps (you’ll have to repeat them exactly for the final migration). Verify: Verify the timing for execution/import/indexing during this phase: you’ll benefit from it for the final migration.
  • 22. Step 3: CHECK. Perform UATs to validate and flag possible issues (and the related fixes you applied). If you notice something wrong that was not covered by UATs, you should not ignore it: UATs should be amended to reflect the path. Verify that the timing of the first migration allows you to meet the deadlines you were expecting. Verify which tasks could be optimized/reviewed. Check data integrity: run the checksum checker (fixed by 4Science in 7.6). Temporarily disable indexing during intermediate milestones/steps to save some time… (…but be careful of the interdependencies in further steps and keep in mind that you’ll have to run a full indexing when needed). About the automatic initial reindexing: it is not recommended to skip it, unless you will manually reindex at a later time, or verify that a reindexing is not necessary. Forgetting to reindex your site after an upgrade may result in unexpected errors or instabilities.
  • 23. Step 4: FINALIZE. Put your current production environment in read-only mode before performing the final deployment. Alert your partners of integrated systems that the system is frozen. Extract your data from your current frozen repository. Re-run the steps that you successfully ran during the first test migration: even small differences may lead to unexpected issues. Run the UAT books: if everything goes smoothly, make the final switch into production.
  • 24. DOs. Alert: Give notice to your partners that they can resume ordinary activities on their 3rd party systems. Move: Move your handle server to your new environment. Enable: Enable all of your crontab jobs. Update: Update ALL of your URLs to match the ones in production.
  • 25. More pitfalls and solutions we adopted with experience …from DSpace 5, DSpace 6, EPrints, Digital Commons, OPUS, Invenio…
  • 26. UATs, the world where the obvious is certainly not – guidelines. A plan should be prepared and followed methodically to test and verify consistency between the old system and the new one. A few examples: 1. How many items were visible in the old system? How many in the new one? 2. How many items were present in the users' workspace? How many in the new system? 3. Same for workflows: how many in the various steps, how many in charge of the various users? 4. Are any items restricted or embargoed? Are restrictions migrated correctly and working? 5. Are all protocols used by 3rd party systems enabled (SWORD? Legacy REST…)?
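The count-based cross-checks listed in slide 26 lend themselves to a small automated comparison. The Python sketch below assumes the counts have already been collected from the old and new systems by whatever means are available (SQL queries, REST calls, exports); the check names and the function are hypothetical and only illustrate the idea of flagging every mismatch for review.

```python
def compare_counts(old_counts, new_counts):
    """Return (check, old_value, new_value) for every check that differs.

    A check present on only one side is reported with None for the other
    side, so a forgotten measurement is flagged rather than silently passed.
    """
    mismatches = []
    for check in sorted(set(old_counts) | set(new_counts)):
        old_value = old_counts.get(check)
        new_value = new_counts.get(check)
        if old_value != new_value:
            mismatches.append((check, old_value, new_value))
    return mismatches


if __name__ == "__main__":
    # Hypothetical figures collected during a test migration.
    old = {"visible_items": 1200, "workspace_items": 35, "embargoed_items": 12}
    new = {"visible_items": 1203, "workspace_items": 35}
    for check, before, after in compare_counts(old, new):
        print(f"{check}: old={before} new={after}")
```

A mismatch is not necessarily a migration error: as the following slides point out, it may reveal an inconsistency in the old system, so each difference needs to be reviewed rather than automatically "fixed".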
  • 27. Time spent in UATs is very well spent. Through these cross-checks we had the opportunity to discover inconsistencies between the database and UI of older versions of DSpace: even fixing the problem in the new version did not always coincide with the user's desires (e.g., items previously not visible by mistake becoming visible in the new version and vice versa).
  • 28. Fun facts and unapparent trivia. Thumbnails in the new DSpace 7 are now larger than in the old versions. We learned that importing the old ones would compromise the layout. This resulted in the discovery of the century: all thumbnails had to be…regenerated. 4Science contributed the fix for the regeneration of the thumbnails ☺ The moral: consider every possible interaction!
  • 29. Fun facts and unapparent trivia. Most viewed item? OH YES PLEASE. …but the item in the new version turned out to be different from the item in the old version. Why? Because slightly different rules had simply been applied, which led to a different result. One can never be too cautious: watch out for inconsistencies and rule changes, even minimal ones.
  • 30. What about DSpace 8? • DSpace 8 is expected to go live in the spring/summer of 2024 • It will not be a major change like DSpace 7 was. Should I upgrade to DSpace 7 or wait for DSpace 8 to be released? • We suggest cautiously migrating/upgrading to the most stable version at the moment of the release, assessing what is better for your institution • The upgrade from DSpace 7 to DSpace 8 will not require as big an effort as the upgrade from DSpace 5 / 6 to 7 • Institutions upgrading from DSpace 7 to DSpace 8 will enjoy features already implemented in DSpace-CRIS 7, e.g. Notify protocol (contributed by 4Science + Harvard), Correction service to enhance data quality (4Science), Duplicate detection (ported by TLC from our implementation in DSpace-CRIS)
  • 31. Be sure to check every minimal step and take careful note of it. Time spent in analysis and double-checks is really well spent. We, at 4Science, would love to put our expertise at your service on behalf of the entire community. Contact us at: info@4Science.com. Visit our website: www.4science.com. Follow us on social media! 4Science International, 4ScienceDSpace, 4ScienceIT, 4Science. Join the 4Science newsletter to keep up to date with news about our contributions to DSpace and much more!