@dataverseorg
Mercè Crosas, Director of Data Science, IQSS, Harvard University
@mercecrosas
Harvard Purdue Data Management Symposium, June 16-17, 2015
New!
About Dataverse
● Gives credit and control to data authors and distributors
● Follows best practices, standards for data management and archiving
● Dataverse development started in 2006 at Harvard’s IQSS
● Now widely used, with a vibrant development and user community
● Helped instigate and is at the center of a cultural change toward open and reproducible research
Science requires community access to data
An open source software project for sharing, citing and archiving data
Technology Solution
In 2015 ...
Dataverse 4.0
A full rewrite that improves usability, defines a rigorous and standardized
data publishing workflow, and leverages the latest technologies.
Rich Set of Features
● Standard, persistent data citation
● Branding for each dataverse
● Standard, extensible metadata:
○ citation metadata
○ domain-specific metadata
○ file-level metadata
● Faceted search for all metadata
● Multiple levels of access control
○ CC0 / terms of use / restricted
● Multiple roles and permissions
● Re-formatting of tabular data files
● Extraction of file metadata
● Versioning
● APIs for search, deposit, access
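As a rough illustration of the search API listed above, the sketch below only builds a request URL and does not contact any server. It assumes a search endpoint at `/api/search` that accepts `q`, `type`, and `per_page` query parameters, as I understand the Dataverse API guide to describe; the helper name `build_search_url` is hypothetical, and the endpoint and parameters should be checked against the documentation for the installed version.

```python
from urllib.parse import urlencode


def build_search_url(base_url, query, item_type=None, per_page=10):
    """Build a search request URL for a Dataverse installation.

    Assumption: the installation exposes /api/search with the query
    parameters q, type, and per_page. This helper is illustrative only
    and is not part of any official client library.
    """
    params = [("q", query), ("per_page", per_page)]
    if item_type is not None:
        # Assumed values: "dataverse", "dataset", or "file".
        params.append(("type", item_type))
    return f"{base_url.rstrip('/')}/api/search?{urlencode(params)}"


if __name__ == "__main__":
    # The demo host is the one named on the slides.
    print(build_search_url("https://dataverse-demo.iq.harvard.edu",
                           "open data", item_type="dataset", per_page=5))
```

The resulting URL could then be fetched with any HTTP client. Some installations may also require an API token, which this sketch deliberately leaves out.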
Upgraded Technology
● UI improved by usability testing
● Built with open source solutions
● Enhanced UI framework
○ PrimeFaces and Bootstrap
● Widely used, community-driven enterprise software platform
○ Java EE 7 and GlassFish
● Reliable, scalable search platform
○ Solr
● Web standard programmatic interfaces
○ RESTful APIs
● Standards for archiving and interoperability
○ OAI-PMH, LOCKSS
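To make the OAI-PMH item above a little more concrete, here is a minimal sketch of how a harvester might form requests. The six verbs and the `metadataPrefix` argument come from the OAI-PMH 2.0 specification; the base URL `https://example.org/oai` and the helper name `build_oai_request` are placeholders, since the slides do not say where an installation exposes its OAI endpoint.

```python
from urllib.parse import urlencode

# The six request verbs defined by OAI-PMH 2.0.
OAI_VERBS = {
    "Identify",
    "ListMetadataFormats",
    "ListSets",
    "ListIdentifiers",
    "ListRecords",
    "GetRecord",
}


def build_oai_request(base_url, verb, **arguments):
    """Return an OAI-PMH request URL for the given verb and arguments.

    Hypothetical helper for illustration: it only assembles the URL and
    does not validate which arguments each verb allows.
    """
    if verb not in OAI_VERBS:
        raise ValueError(f"unknown OAI-PMH verb: {verb}")
    # Sort the arguments so the generated URL is deterministic.
    params = [("verb", verb)] + sorted(arguments.items())
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    # Placeholder endpoint; substitute the installation's real OAI URL.
    print(build_oai_request("https://example.org/oai", "ListRecords",
                            metadataPrefix="oai_dc"))
```

A harvester would fetch such a URL, parse the XML response, and follow any `resumptionToken` it contains to page through the remaining records.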
Dataverse Installations worldwide
Dataverse software installations around the world serve as public data repositories (such as the Harvard and
Odum Dataverses) or as institutional research data repositories.
Harvard Dataverse
● A collaboration between the Harvard Library and IQSS
● Open to research data worldwide:
○ > 1,000 dataverses
○ > 58,000 datasets
○ > 270,000 files
○ > 1.3 million downloads
○ > 11,000 registered users
● Includes dataverses for:
○ individual researchers
○ research teams
○ journals
○ institutions or organizations
● Rate of data deposit has increased by a factor of 30 since last year
Dataverse is now more than a software project and a data repository
Dataverse
● Development of Protocols and Software
● Collaboration with the Library
● User Support, Curation and Training
● User Community Input
● Grant Partnerships
● Collaboration with Broader Data Community
● Interns Program
● Outreach (meetings, papers)
● Journals Integration
● External Tools Integration
● Technical Support for other Installations
A vibrant community
Contributions:
● Internationalization, translation into Chinese (Fudan University)
● Integration with Archivematica (University of Toronto)
● Integration with iRODS (UNC)
● Integration with Shibboleth (Netherlands Dataverse, DANS)
What’s next
● Support new data types
○ Sensitive information
○ Large-scale data (terabytes to petabytes)
○ Streaming data (e.g., sensors, cell phones)
● Make datasets shared in Dataverse more reusable
● Extend APIs to build community of app contributors
● Build integrated tools (e.g., data visualization, analysis)
Dataverse demo
https://dataverse-demo.iq.harvard.edu
Thanks
@mercecrosas

IBEF report on the Insurance market in India
 
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis model
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis modelDecoding Movie Sentiments: Analyzing Reviews with Data Analysis model
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis model
 
Bank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis ProjectBank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis Project
 
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
 
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdfEnglish-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
 
Statistics For Management by Richard I. Levin 8ed.pdf
Statistics For Management by Richard I. Levin 8ed.pdfStatistics For Management by Richard I. Levin 8ed.pdf
Statistics For Management by Richard I. Levin 8ed.pdf
 
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
 
Data Analysis Project: Stroke Prediction
Data Analysis Project: Stroke PredictionData Analysis Project: Stroke Prediction
Data Analysis Project: Stroke Prediction
 
Presentation of project of business person who are success
Presentation of project of business person who are successPresentation of project of business person who are success
Presentation of project of business person who are success
 

Dataverse hpdm symposium

  • 1. @dataverseorg Mercè Crosas, Director of Data Science, IQSS, Harvard University @mercecrosas Harvard Purdue Data Management Symposium, June 16-17, 2015 New!
  • 2. About Dataverse ● Gives credit and control to data authors and distributors ● Follows best practices, standards for data management and archiving ● Dataverse development started in 2006 at Harvard’s IQSS ● Now widely used, with a vibrant development and user community ● Helped instigate and is at the center of a cultural change toward open and reproducible research Science requires community access to data An open source software project for sharing, citing and archiving data Technology Solution
  • 4. Dataverse 4.0: A full rewrite that improves usability, defines a rigorous and standardized data publishing workflow, and leverages the latest technologies.
  • 5. Rich Set of Features: ● Standard, persistent data citation ● Branding for each dataverse ● Standard, extensible metadata (citation metadata, domain-specific metadata, file-level metadata) ● Faceted search for all metadata ● Multiple levels of access control (CC0 / terms of use / restricted) ● Multiple roles and permissions ● Re-formatting of tabular data files ● Extraction of file metadata ● Versioning ● APIs for search, deposit, access. Upgraded Technology: ● UI improved by usability testing ● Built with open source solutions ● Enhanced UI framework (PrimeFaces and Bootstrap) ● Widely used, community-driven enterprise software platform (Java EE7 and Glassfish) ● Reliable, scalable search platform (Solr) ● Web standard programmatic interfaces (RESTful APIs) ● Standards for archiving and interoperability (OAI-PMH, LOCKSS)
  • 6. Dataverse Installations worldwide Dataverse software installations around the world serve as public data repositories (Harvard and ODUM Dataverses) or institutional research data repositories.
  • 7. Harvard Dataverse ● A collaboration between the Harvard Library and IQSS ● Open to research data worldwide: ○ > 1000 dataverses ○ > 58,000 datasets ○ > 270,000 files ○ > 1.3 million downloads ○ > 11,000 registered users ● Includes dataverses for: ○ individual researchers ○ research teams ○ journals ○ institutions or organizations ● Rate of data deposit has increased by a factor of 30 since last year
  • 10. Dataverse is now more than a software project and a data repository. Dataverse: ● Development of Protocols and Software ● Collaboration with the Library ● User Support, Curation and Training ● User Community Input ● Grant Partnerships ● Collaboration with Broader Data Community ● Interns Program ● Outreach (meetings, papers) ● Journals Integration ● External Tools Integration ● Technical Support for other Installations
  • 11. A vibrant community. Contributions: ● Internationalization, translation to Chinese (Fudan University) ● Integration with Archivematica (University of Toronto) ● Integration with iRODS (UNC) ● Integration with Shibboleth (Netherlands Dataverse, DANS)
  • 12. What’s next ● Support new data types: ○ Sensitive information ○ Large-scale data (terabytes to petabytes) ○ Streaming data (e.g., sensors, cell phones) ● Make datasets shared in Dataverse more reusable ● Extend APIs to build community of app contributors ● Build integrated tools (e.g., data visualization, analysis)
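
The slides list RESTful "APIs for search, deposit, access" without showing a request. As a rough illustration only, the sketch below builds a search request URL. It assumes the `/api/search` endpoint and the `q`, `type`, and `per_page` parameters described in the public Dataverse Search API documentation; the slides do not specify them. The host `dataverse.example.edu` and the helper name `build_search_url` are hypothetical. No network call is made.

```python
from urllib.parse import urlencode


def build_search_url(base_url, query, item_type="dataset", per_page=10):
    """Build a Search API URL for a Dataverse installation.

    Assumption: the installation exposes /api/search with the
    q, type and per_page parameters (not stated in the slides).
    """
    params = {"q": query, "type": item_type, "per_page": per_page}
    # urlencode escapes the query text (spaces become '+').
    return f"{base_url.rstrip('/')}/api/search?{urlencode(params)}"


# Hypothetical installation host, used only for illustration.
url = build_search_url("https://dataverse.example.edu", "climate data")
print(url)
```

Sending this URL with any HTTP client should return JSON search results on an installation that supports the endpoint; response fields vary by version, so they are not shown here.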