SlideShare ist ein Scribd-Unternehmen logo
1 von 18
THE DATAVERSE COMMONS
Mercè Crosas, Ph.D.
Chief Data Science and Technology Officer
Institute for Quantitative Social Science
Harvard University
@mercecrosas
The Future of the Commons, November 18, 2015,
Today’s Scholarly Publication Landscape
article
analysis
Digital Data
Most scientific studies now involve large
amounts of digital data & software for analysis.
Software Publishing
Data Publishing
Article
Publishing
Data Repositories vs
Repository Software
Domain-
specific
repositories
GenBank
WW Protein Data
Bank
SBGrid Data Bank
…
General-
purpose
repositories
Harvard
Dataverse
DataDryad
Figshare
…
Repository
Software
Dataverse
Software
Dspace
Fedora
…
Data
Publishing
A formal data
citation
•Reference with
attribution
•Access with a persistent
identifier
Information about
the data (metadata)
•Discovery
•Data reuse
A trusted data
repository
•Access to data and
metadata (long-term
archival)
Dataverse follows best practices to
support Data Publishing
dataverse.org
Open-source software developed at Harvard’s IQSS since 2006
Installed in 12 sites world wide
Serving 100s of universities and organizations
Harvard Dataverse: dataverse.harvard.edu
Started as a community data repository for Social Science
Now open to all research fields and all researchers
More than 1300 dataverses
More than 59,000 datasets
More than 1,400,000 downloads
Dataverses are containers for Datasets
Each Dataverse can be for a researcher, a research project, a department, a
journal, or a larger organization.
Dataverse offers a rich feature set
to publish research data
Credit and Visibility
• Standard,
persistent data
citation
• Branding for each
dataverse
• Widgets to
embed in your
own website
Discovery
• Faceted search
for all metadata
• Standard
metadata:
• citation
• scientific
domain
• file-level
Access Control &
Roles
• CCO waiver for
public datasets
• Tiered access:
• terms of use
• guestbook
• restricted data
• Publishing
workflow
• Multiple roles:
• contribute
• curate, review
• administrate
Data Features
• Versioning
• Conversion of
tabular data files
to standard
format
• Automatic
extraction of file
metadata (R,
STATA, SPSS,
XSLX, FITS)
Journal Systems (Open Journal System, ScholarOne); Open Science Framework
Data Analysis (TwoRavens); Spatial Viz (WorldMap); Preservation systems (Archivematica)
Interoperability through APIs
Impact on the Social Science research
community and on the World
Antislavery petitions data Election Data Archive
Boston Area Research InitiativeProject TIER
Antislavery Petition Data
Daniel Carpenter, Garth Griffin (Harvard University)
3,500 antislavery and
antisegregation petitions
sent to Massachusetts from
1600s to 1870
Election Data Archive
Steve Ansholabehere (Harvard), Jonathan
Rodden (Stanford)
A collaborative archive to
share election results,
voting behavior, and
electoral politics.
Alaska electoral data:
1,500 data downloads
Project TIER
Richard Ball, Norm Medeiros (Haverford College)
Teaching empirical research
with reproducibility in mind
to future scholars
Provides a protocol to document
all steps in data management
and analysis:
• Data Files
• Metadata Files
• Computing Command Files
• Readme File
Boston Area Research Initiative
Daniel O’Brien (Northeastern), Robert Sampson, Christopher
Winship (Harvard)
Scholars, policymakers, practitioners
and civic leaders collaborating on
social science research and public
policy
• Dataset of Bicycle Collisions in
Boston (in collaboration with
Boston Police, Harvard School of
Public Health, and Cyclists Union)
• Data visualization with WorldMap
Future impact on other research
communities: Biomedical and Astronomy
OME-TIFF
Files FITS Files
• Data archival
• Conversion to standard formats
• Extraction of file-level metadata
R Data
Frames
Structural Biology Data Tuberculosis Genomics Data Astronomy Data
World Wide Telescope
Current Collaborations: Addressing the
Next Challenges in Data Sharing
Structural Biology Grid Data
Repository (Sliz, HMS, Crosas,
IQSS)
Social Science Big Data (King,
Crosas, IQSS, CGA)
Data Provenance (Seltzer,
SEAS, Crosas, King, IQSS)
Privacy Tools to share sensitive
data (SEAS, Berkman Center,
Privacy Lab, IQSS, MIT)
Sharing Sensitive Data with Confidence:
DataTags System
DataTag: A set of security features and access requirements for file handling
Sweeney, Crosas, Bar-Sinai, 2015, Technology Science
Data Sharing Workflow
for Sensitive Data
Sensitive
Dataset
Sensitive
Dataset
Direct
Access
Privacy
Preserving
Access
http://datatags.org
http://privacytools.seas.harvard.edu
Authorized
Signed DUA
THANKS
@mercecrosas
mcrosas@iq.harvard.edu
http://scholar.harvard.edu/mercecrosas

Weitere ähnliche Inhalte

Was ist angesagt?

Lightning Talk, Konkiel: Bootstrapping Library Data Management Services for E...
Lightning Talk, Konkiel: Bootstrapping Library Data Management Services for E...Lightning Talk, Konkiel: Bootstrapping Library Data Management Services for E...
Lightning Talk, Konkiel: Bootstrapping Library Data Management Services for E...
ASIS&T
 
NPG Scientific Data; SSP, Boston, May 2014: http://www.sspnet.org/events/annu...
NPG Scientific Data; SSP, Boston, May 2014: http://www.sspnet.org/events/annu...NPG Scientific Data; SSP, Boston, May 2014: http://www.sspnet.org/events/annu...
NPG Scientific Data; SSP, Boston, May 2014: http://www.sspnet.org/events/annu...
Susanna-Assunta Sansone
 
NPG Scientific Data - Metabolomics Society meeting, Tsuruola, Japan, 2014
NPG Scientific Data - Metabolomics Society meeting, Tsuruola, Japan, 2014NPG Scientific Data - Metabolomics Society meeting, Tsuruola, Japan, 2014
NPG Scientific Data - Metabolomics Society meeting, Tsuruola, Japan, 2014
Susanna-Assunta Sansone
 

Was ist angesagt? (20)

Data Citation Implementation at Dataverse
Data Citation Implementation at DataverseData Citation Implementation at Dataverse
Data Citation Implementation at Dataverse
 
Dataverse on the MOC
Dataverse on the MOCDataverse on the MOC
Dataverse on the MOC
 
Coping with Data for WHOI JP Students
Coping with Data for WHOI JP StudentsCoping with Data for WHOI JP Students
Coping with Data for WHOI JP Students
 
Research Data Management at USU
Research Data Management at USUResearch Data Management at USU
Research Data Management at USU
 
Mitigating the Risk: identifying Strategic University Partnerships for Compli...
Mitigating the Risk: identifying Strategic University Partnerships for Compli...Mitigating the Risk: identifying Strategic University Partnerships for Compli...
Mitigating the Risk: identifying Strategic University Partnerships for Compli...
 
NIH BD2K DataMed metadata model - Force11, 2016
NIH BD2K DataMed metadata model - Force11, 2016NIH BD2K DataMed metadata model - Force11, 2016
NIH BD2K DataMed metadata model - Force11, 2016
 
Real-World Data Challenges: Moving Towards Richer Data Ecosystems
Real-World Data Challenges: Moving Towards Richer Data EcosystemsReal-World Data Challenges: Moving Towards Richer Data Ecosystems
Real-World Data Challenges: Moving Towards Richer Data Ecosystems
 
Preparing Data for Sharing: The FAIR Principles
Preparing Data for Sharing: The FAIR PrinciplesPreparing Data for Sharing: The FAIR Principles
Preparing Data for Sharing: The FAIR Principles
 
Dataverse, Cloud Dataverse, and DataTags
Dataverse, Cloud Dataverse, and DataTagsDataverse, Cloud Dataverse, and DataTags
Dataverse, Cloud Dataverse, and DataTags
 
Managing and sharing confidential data in Australian social science
Managing and sharing confidential data	in Australian social scienceManaging and sharing confidential data	in Australian social science
Managing and sharing confidential data in Australian social science
 
Leveraging publication metadata to help overcome the data ingest bottleneck
Leveraging publication metadata to help overcome the data ingest bottleneck Leveraging publication metadata to help overcome the data ingest bottleneck
Leveraging publication metadata to help overcome the data ingest bottleneck
 
Lightning Talk, Konkiel: Bootstrapping Library Data Management Services for E...
Lightning Talk, Konkiel: Bootstrapping Library Data Management Services for E...Lightning Talk, Konkiel: Bootstrapping Library Data Management Services for E...
Lightning Talk, Konkiel: Bootstrapping Library Data Management Services for E...
 
Data Repositories: Recommendation, Certification and Models for Cost Recovery
Data Repositories: Recommendation, Certification and Models for Cost RecoveryData Repositories: Recommendation, Certification and Models for Cost Recovery
Data Repositories: Recommendation, Certification and Models for Cost Recovery
 
Levine - Data Curation; Ethics and Legal Considerations
Levine - Data Curation; Ethics and Legal ConsiderationsLevine - Data Curation; Ethics and Legal Considerations
Levine - Data Curation; Ethics and Legal Considerations
 
Presentation1
Presentation1Presentation1
Presentation1
 
The expanding dataverse
The expanding dataverseThe expanding dataverse
The expanding dataverse
 
dkNET Poster ENDO 2016
dkNET Poster ENDO 2016 dkNET Poster ENDO 2016
dkNET Poster ENDO 2016
 
NPG Scientific Data; SSP, Boston, May 2014: http://www.sspnet.org/events/annu...
NPG Scientific Data; SSP, Boston, May 2014: http://www.sspnet.org/events/annu...NPG Scientific Data; SSP, Boston, May 2014: http://www.sspnet.org/events/annu...
NPG Scientific Data; SSP, Boston, May 2014: http://www.sspnet.org/events/annu...
 
NPG Scientific Data - Metabolomics Society meeting, Tsuruola, Japan, 2014
NPG Scientific Data - Metabolomics Society meeting, Tsuruola, Japan, 2014NPG Scientific Data - Metabolomics Society meeting, Tsuruola, Japan, 2014
NPG Scientific Data - Metabolomics Society meeting, Tsuruola, Japan, 2014
 
Trustworthy AI and Open Science
Trustworthy AI and Open ScienceTrustworthy AI and Open Science
Trustworthy AI and Open Science
 

Ähnlich wie The Dataverse Commons

Ähnlich wie The Dataverse Commons (20)

Practical Implementation of research data policies: Solutions with Dataverse
Practical Implementation of research data policies: Solutions with DataversePractical Implementation of research data policies: Solutions with Dataverse
Practical Implementation of research data policies: Solutions with Dataverse
 
Next generation data services at the Marriott Library
Next generation data services at the Marriott LibraryNext generation data services at the Marriott Library
Next generation data services at the Marriott Library
 
Data Citation, The Dataverse Network ®, and Contributor Identifiers
Data Citation, The Dataverse Network ®, and Contributor IdentifiersData Citation, The Dataverse Network ®, and Contributor Identifiers
Data Citation, The Dataverse Network ®, and Contributor Identifiers
 
Lowenberg Making Data Count
Lowenberg Making Data CountLowenberg Making Data Count
Lowenberg Making Data Count
 
ODIN Final Event - The Care and Feeding of Scientific Data
ODIN Final Event - The Care and Feeding of Scientific DataODIN Final Event - The Care and Feeding of Scientific Data
ODIN Final Event - The Care and Feeding of Scientific Data
 
Managing, Sharing and Curating Your Research Data in a Digital Environment
Managing, Sharing and Curating Your Research Data in a Digital EnvironmentManaging, Sharing and Curating Your Research Data in a Digital Environment
Managing, Sharing and Curating Your Research Data in a Digital Environment
 
Data sharing in the age of the Social Machine
Data sharing in the age of the Social MachineData sharing in the age of the Social Machine
Data sharing in the age of the Social Machine
 
SEAD Prototype: Data Curation and Preservation for Sustainability Science
SEAD Prototype: Data Curation and Preservation for Sustainability ScienceSEAD Prototype: Data Curation and Preservation for Sustainability Science
SEAD Prototype: Data Curation and Preservation for Sustainability Science
 
Public engagement while you sleep
Public engagement while you sleepPublic engagement while you sleep
Public engagement while you sleep
 
FAIR for the future: embracing all things data
FAIR for the future: embracing all things dataFAIR for the future: embracing all things data
FAIR for the future: embracing all things data
 
Publishing perspectives on data management & future directions
Publishing perspectives on data management & future directionsPublishing perspectives on data management & future directions
Publishing perspectives on data management & future directions
 
Dataset Sources Repositories.pptx
Dataset Sources Repositories.pptxDataset Sources Repositories.pptx
Dataset Sources Repositories.pptx
 
Rda nitrd 2015 berman - final
Rda nitrd 2015 berman  - finalRda nitrd 2015 berman  - final
Rda nitrd 2015 berman - final
 
2013 DataCite Summer Meeting - Closing Keynote: Building Community Engagement...
2013 DataCite Summer Meeting - Closing Keynote: Building Community Engagement...2013 DataCite Summer Meeting - Closing Keynote: Building Community Engagement...
2013 DataCite Summer Meeting - Closing Keynote: Building Community Engagement...
 
bioCADDIE Webinar: The NIDDK Information Network (dkNET) - A Community Resear...
bioCADDIE Webinar: The NIDDK Information Network (dkNET) - A Community Resear...bioCADDIE Webinar: The NIDDK Information Network (dkNET) - A Community Resear...
bioCADDIE Webinar: The NIDDK Information Network (dkNET) - A Community Resear...
 
Gobinda Chowdhury
Gobinda ChowdhuryGobinda Chowdhury
Gobinda Chowdhury
 
Open Science: Where Theory Meets Practice
Open Science: Where Theory Meets PracticeOpen Science: Where Theory Meets Practice
Open Science: Where Theory Meets Practice
 
Gaining credit for sharing research data
Gaining credit for sharing research dataGaining credit for sharing research data
Gaining credit for sharing research data
 
Open Access Week - Oxford, 20-24 Oct 2014
Open Access Week - Oxford, 20-24 Oct 2014Open Access Week - Oxford, 20-24 Oct 2014
Open Access Week - Oxford, 20-24 Oct 2014
 
Dataset Sources Repositories.pptx
Dataset Sources Repositories.pptxDataset Sources Repositories.pptx
Dataset Sources Repositories.pptx
 

Mehr von Merce Crosas

Mehr von Merce Crosas (15)

Research Data Management @Harvard
Research Data Management @HarvardResearch Data Management @Harvard
Research Data Management @Harvard
 
Cloud Dataverse: A Data repository platform for an OpenStack Cloud
Cloud Dataverse: A Data repository platform for an OpenStack CloudCloud Dataverse: A Data repository platform for an OpenStack Cloud
Cloud Dataverse: A Data repository platform for an OpenStack Cloud
 
Can data access combat fake news?
Can data access combat fake news?Can data access combat fake news?
Can data access combat fake news?
 
FAIR Data Management and FAIR Data Sharing
FAIR Data Management and FAIR Data SharingFAIR Data Management and FAIR Data Sharing
FAIR Data Management and FAIR Data Sharing
 
The Data Lifecycle (Harvard DataFest)
The Data Lifecycle (Harvard DataFest)The Data Lifecycle (Harvard DataFest)
The Data Lifecycle (Harvard DataFest)
 
Cloud Dataverse
Cloud DataverseCloud Dataverse
Cloud Dataverse
 
Making Data Accessible
Making Data AccessibleMaking Data Accessible
Making Data Accessible
 
Abcd iqs ssoftware-projects-mercecrosas
Abcd iqs ssoftware-projects-mercecrosasAbcd iqs ssoftware-projects-mercecrosas
Abcd iqs ssoftware-projects-mercecrosas
 
The Rise of Data Publishing in the Digital World (and how Dataverse and DataT...
The Rise of Data Publishing in the Digital World (and how Dataverse and DataT...The Rise of Data Publishing in the Digital World (and how Dataverse and DataT...
The Rise of Data Publishing in the Digital World (and how Dataverse and DataT...
 
A very Brief History of Communicating Science
A very Brief History of Communicating ScienceA very Brief History of Communicating Science
A very Brief History of Communicating Science
 
Dataverse hpdm symposium
Dataverse   hpdm symposiumDataverse   hpdm symposium
Dataverse hpdm symposium
 
Collaboration in science and technology it summit
Collaboration in science and technology   it summitCollaboration in science and technology   it summit
Collaboration in science and technology it summit
 
Dataverse for Journals
Dataverse for JournalsDataverse for Journals
Dataverse for Journals
 
Collaboration in science and technology
Collaboration in science and technologyCollaboration in science and technology
Collaboration in science and technology
 
Force11 jddcp intro
Force11  jddcp introForce11  jddcp intro
Force11 jddcp intro
 

Kürzlich hochgeladen

Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
gajnagarg
 
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
vexqp
 
Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...
Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...
Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...
HyderabadDolls
 
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
nirzagarg
 
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Klinik kandungan
 
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
ZurliaSoop
 
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
nirzagarg
 
Top profile Call Girls In Indore [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Indore [ 7014168258 ] Call Me For Genuine Models We...Top profile Call Girls In Indore [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Indore [ 7014168258 ] Call Me For Genuine Models We...
gajnagarg
 
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样
wsppdmt
 

Kürzlich hochgeladen (20)

Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
 
Dubai Call Girls Peeing O525547819 Call Girls Dubai
Dubai Call Girls Peeing O525547819 Call Girls DubaiDubai Call Girls Peeing O525547819 Call Girls Dubai
Dubai Call Girls Peeing O525547819 Call Girls Dubai
 
Ranking and Scoring Exercises for Research
Ranking and Scoring Exercises for ResearchRanking and Scoring Exercises for Research
Ranking and Scoring Exercises for Research
 
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
 
Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...
Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...
Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...
 
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
 
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
 
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
 
Gulbai Tekra * Cheap Call Girls In Ahmedabad Phone No 8005736733 Elite Escort...
Gulbai Tekra * Cheap Call Girls In Ahmedabad Phone No 8005736733 Elite Escort...Gulbai Tekra * Cheap Call Girls In Ahmedabad Phone No 8005736733 Elite Escort...
Gulbai Tekra * Cheap Call Girls In Ahmedabad Phone No 8005736733 Elite Escort...
 
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
 
Charbagh + Female Escorts Service in Lucknow | Starting ₹,5K To @25k with A/C...
Charbagh + Female Escorts Service in Lucknow | Starting ₹,5K To @25k with A/C...Charbagh + Female Escorts Service in Lucknow | Starting ₹,5K To @25k with A/C...
Charbagh + Female Escorts Service in Lucknow | Starting ₹,5K To @25k with A/C...
 
Top Call Girls in Balaghat 9332606886Call Girls Advance Cash On Delivery Ser...
Top Call Girls in Balaghat  9332606886Call Girls Advance Cash On Delivery Ser...Top Call Girls in Balaghat  9332606886Call Girls Advance Cash On Delivery Ser...
Top Call Girls in Balaghat 9332606886Call Girls Advance Cash On Delivery Ser...
 
7. Epi of Chronic respiratory diseases.ppt
7. Epi of Chronic respiratory diseases.ppt7. Epi of Chronic respiratory diseases.ppt
7. Epi of Chronic respiratory diseases.ppt
 
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
 
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
 
Top profile Call Girls In Indore [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Indore [ 7014168258 ] Call Me For Genuine Models We...Top profile Call Girls In Indore [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Indore [ 7014168258 ] Call Me For Genuine Models We...
 
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样
 
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24  Building Real-Time Pipelines With FLaNKDATA SUMMIT 24  Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
 
Kings of Saudi Arabia, information about them
Kings of Saudi Arabia, information about themKings of Saudi Arabia, information about them
Kings of Saudi Arabia, information about them
 
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
 

The Dataverse Commons

  • 1. THE DATAVERSE COMMONS Mercè Crosas, Ph.D. Chief Data Science and Technology Officer Institute for Quantitative Social Science Harvard University @mercecrosas The Future of the Commons, November 18, 2015,
  • 2. Today’s Scholarly Publication Landscape article analysis Digital Data Most scientific studies now involve large amounts of digital data & software for analysis. Software Publishing Data Publishing Article Publishing
  • 3. Data Repositories vs Repository Software Domain- specific repositories GenBank WW Protein Data Bank SBGrid Data Bank … General- purpose repositories Harvard Dataverse DataDryad Figshare … Repository Software Dataverse Software Dspace Fedora …
  • 4. Data Publishing A formal data citation •Reference with attribution •Access with a persistent identifier Information about the data (metadata) •Discovery •Data reuse A trusted data repository •Access to data and metadata (long-term archival) Dataverse follows best practices to support Data Publishing
  • 5. dataverse.org Open-source software developed at Harvard’s IQSS since 2006 Installed in 12 sites world wide Serving 100s of universities and organizations
  • 6. Harvard Dataverse: dataverse.harvard.edu Started as a community data repository for Social Science Now open to all research fields and all researchers More than 1300 dataverses More than 59,000 datasets More than 1,400,000 downloads
  • 7. Dataverses are containers for Datasets Each Dataverse can be for a researcher, a research project, a department, a journal, or a larger organization.
  • 8. Dataverse offers a rich feature set to publish research data Credit and Visibility • Standard, persistent data citation • Branding for each dataverse • Widgets to embed in your own website Discovery • Faceted search for all metadata • Standard metadata: • citation • scientific domain • file-level Access Control & Roles • CCO waiver for public datasets • Tiered access: • terms of use • guestbook • restricted data • Publishing workflow • Multiple roles: • contribute • curate, review • administrate Data Features • Versioning • Conversion of tabular data files to standard format • Automatic extraction of file metadata (R, STATA, SPSS, XSLX, FITS) Journal Systems (Open Journal System, ScholarOne); Open Science Framework Data Analysis (TwoRavens); Spatial Viz (WorldMap); Preservation systems (Archivematica) Interoperability through APIs
  • 9. Impact on the Social Science research community and on the World Antislavery petitions data Election Data Archive Boston Area Research InitiativeProject TIER
  • 10. Antislavery Petition Data Daniel Carpenter, Garth Griffin (Harvard University) 3,500 antislavery and antisegregation petitions sent to Massachusetts from 1600s to 1870
  • 11. Election Data Archive Steve Ansholabehere (Harvard), Jonathan Rodden (Stanford) A collaborative archive to share election results, voting behavior, and electoral politics. Alaska electoral data: 1,500 data downloads
  • 12. Project TIER Richard Ball, Norm Medeiros (Haverford College) Teaching empirical research with reproducibility in mind to future scholars Provides a protocol to document all steps in data management and analysis: • Data Files • Metadata Files • Computing Command Files • Readme File
  • 13. Boston Area Research Initiative Daniel O’Brien (Northeastern), Robert Sampson, Christopher Winship (Harvard) Scholars, policymakers, practitioners and civic leaders collaborating on social science research and public policy • Dataset of Bicycle Collisions in Boston (in collaboration with Boston Police, Harvard School of Public Health, and Cyclists Union) • Data visualization with WorldMap
  • 14. Future impact on other research communities: Biomedical and Astronomy OME-TIFF Files FITS Files • Data archival • Conversion to standard formats • Extraction of file-level metadata R Data Frames Structural Biology Data Tuberculosis Genomics Data Astronomy Data World Wide Telescope
  • 15. Current Collaborations: Addressing the Next Challenges in Data Sharing Structural Biology Grid Data Repository (Sliz, HMS, Crosas, IQSS) Social Science Big Data (King, Crosas, IQSS, CGA) Data Provenance (Seltzer, SEAS, Crosas, King, IQSS) Privacy Tools to share sensitive data (SEAS, Berkman Center, Privacy Lab, IQSS, MIT)
  • 16. Sharing Sensitive Data with Confidence: DataTags System DataTag: A set of security features and access requirements for file handling Sweeney, Crosas, Bar-Sinai, 2015, Technology Science
  • 17. Data Sharing Workflow for Sensitive Data Sensitive Dataset Sensitive Dataset Direct Access Privacy Preserving Access http://datatags.org http://privacytools.seas.harvard.edu Authorized Signed DUA