SlideShare ist ein Scribd-Unternehmen logo
1 von 29
Data Carpentry:
Enabling Researchers to
Work More Effectively
with Data
Tracy K. Teal, PhD
Data Carpentry Project Lead
Assistant Professor, BEACON, Michigan State University
@datacarpentry
http://datacarpentry.org
Training is a missing piece between
data collection & data-driven discovery
Training
http://widerplanet.org
Large scale data is being
generated in all domains
As well as in the
non-academic sector
Data potential
Data is not
information
Training is a missing piece between
data collection & data-driven discovery
Training
Biggest Bioinformatics Difficulty Most useful thing BRAEMBL could
do
Survey by Bioinformatics Resource Australia – EMBL
Researchers view the major limiting factor in
research progress as a lack of expertise in how
to handle and analyze data
http://braembl.org.au/news/braembl-community-survey-report-2013
Data Carpentry is filling
that training gap
Our mission is to provide researchers high-quality,
domain-specific training covering the full lifecycle
of data-driven research.
We’re here to help
(the logo is a saw)
• Training focused on data - teaching how to manage
and analyze data in an effective and reproducible
way.
• Domain specific by design – currently have lessons
in ecology and are developing lessons for genomics,
geosciences and social sciences.
• Initial focus is on novices - there are no
prerequisites, and no prior knowledge
computational experience is assumed. We plan to
expand to more advanced topics.
Grassroots training effort
- Developed by practitioners for practitioners
- Identify skill needs in data management and
analysis in given domains
- Collaboratively and iteratively developed openly
licensed (CC-BY) training materials
- Organize and deliver two-day, intensive hands-on
workshops in fundamental data analysis skills
using a pool of volunteer helpers and instructors
Grassroots training effort
- Developed by practitioners for practitioners
- Identify skill needs in data management and
analysis in given domains
- Collaboratively and iteratively developed openly
licensed (CC-BY) training materials
- Organize and deliver two-day, intensive hands-on
workshops in fundamental data analysis skills
using a pool of volunteer helpers and instructors
Grassroots training effort
- Developed by practitioners for practitioners
- Identify skill needs in data management and
analysis in given domains
- Collaboratively and iteratively developed openly
licensed (CC-BY) training materials
- Organize and deliver two-day, intensive hands-on
workshops in fundamental data analysis skills
using a pool of volunteer helpers and instructors
Software Carpentry
Since January 2013
With the help of the
Mozilla Science Lab
scaled to teach
North American Workshops 2012-
2014
Greg Wilson founded in 1998
• Over 270 two-day workshops
• For over 8300 learners
• Taught by over 200 volunteers
• In over 20 countries
Now its own non-profit the Software Carpentry Foundation
Data Carpentry workshops
Goals:
We can’t teach everything in two days, but the goal is
to teach foundational skills to reduce the activation
energy for getting started and know what’s possible
Curriculum: The data lifecycle from data organization
to analysis and visualization
Format
- Two days
- Hands on
- Qualified instructors
- Helpers
- Sticky notes!
Data Carpentry workshops
Demand is high
Workshops internationally
Started in November, 2014; since Jan 2015
have taught 10 workshops and have more
than 24 scheduled for this year
Interest from broad domains – biology,
genomics, social science, digital humanities,
libraries, geosciences
Curriculum for ecology
The data lifecycle from data organization to
analysis and visualization
• Data organization in spreadsheets
• OpenRefine for data cleaning
• R for data analysis and visualization
• SQL
Workshop at NDIC
https://github.com/datacarpentry/2015-05-03-NDIC/wiki
People are learning things!
0
5
10
15
Very
Low
Low Neither High Very
High
Before the workshop
Freq
Level of data management
and analysis skills prior to
the workshop
0
5
10
15
About
the same
Somewhat
higher
Higher Much
higher
After the workshop
Freq
Rate your level of data
management and analysis skills
following the workshop
People are learning things!
Compared to before the workshop, I have a better understanding of how to
They feel the workshop
was worthwhile
How much practical knowledge did you gain from this workshop?
This workshop was worth my time
Thoughts on data best practices
Please rate your level of agreement with the following statements
Hackathons to develop lessons
- Genomics
project organization, command line, cloud
computing, using bioinformatics tools, data analysis and
visualization
- CSHL, iPlant, SESYNC, iDigBio
- Geospatial data
Working with geospatial data
Hackathon at NEON – Sept/Oct (Leah Wasser)
- Social sciences
Working with data from social sciences
Hackathon at Berkeley – July (Dav Clark)
What Students Learn
• Capstone lesson
Guiding Data Carpentry
Steering Committee:
Karen Cranston (NESCent / OpenTree of Life)
Hilmar Lapp (NESCent / Duke)
Aleksandra Pawlik (Software Sustainability Institute)
Karthik Ram (rOpenSci / Berkeley Institute of Data Science Fellow)
Tracy Teal (Data Carpentry / Michigan State)
Ethan White (University of Florida / Moore DDD Investigator)
Greg Wilson (Software Carpentry)
Volunteer instructors and materials developers
Mike Smorul (SESYNC), Mary Shelly (SESYNC), Jason Williams
(iPlant), Leah Wasser (NEON), Deb Paul (iDigBio), Francois (iDigBio),
Ben Marwick (University of Washington), Dav Clark (Berkeley)
Data Carpentry support

Weitere ähnliche Inhalte

Was ist angesagt?

Driving Data and Cognitive Sciences Curriculum at the Nexus of Society, Polic...
Driving Data and Cognitive Sciences Curriculum at the Nexus of Society, Polic...Driving Data and Cognitive Sciences Curriculum at the Nexus of Society, Polic...
Driving Data and Cognitive Sciences Curriculum at the Nexus of Society, Polic...Steven Miller
 
Internship PPT- Aastha,274187
Internship PPT- Aastha,274187Internship PPT- Aastha,274187
Internship PPT- Aastha,274187Aastha Bhatia
 
Knowledge Graphs - Journey to the Connected Enterprise - Data Strategy and An...
Knowledge Graphs - Journey to the Connected Enterprise - Data Strategy and An...Knowledge Graphs - Journey to the Connected Enterprise - Data Strategy and An...
Knowledge Graphs - Journey to the Connected Enterprise - Data Strategy and An...Benjamin Nussbaum
 
NSF Data Management Plan - Implications for Librarians
NSF Data Management Plan - Implications for LibrariansNSF Data Management Plan - Implications for Librarians
NSF Data Management Plan - Implications for LibrariansAndrew Sallans
 
Better Insights from Your Master Data - Graph Database LA Meetup
Better Insights from Your Master Data - Graph Database LA MeetupBetter Insights from Your Master Data - Graph Database LA Meetup
Better Insights from Your Master Data - Graph Database LA MeetupBenjamin Nussbaum
 
Collaborative Solutions eHealth Event - Applied Information Sciences
Collaborative Solutions eHealth Event - Applied Information SciencesCollaborative Solutions eHealth Event - Applied Information Sciences
Collaborative Solutions eHealth Event - Applied Information SciencesCollaborative Solutions
 
X api chinese cop monthly meeting - march 2016
X api chinese cop monthly meeting - march 2016X api chinese cop monthly meeting - march 2016
X api chinese cop monthly meeting - march 2016Jessie Chuang
 
X api chinese cop monthly meeting feb.2016
X api chinese cop monthly meeting   feb.2016X api chinese cop monthly meeting   feb.2016
X api chinese cop monthly meeting feb.2016Jessie Chuang
 
Blended learning and flipped classrooms for data science at Dallas Startup Week
Blended learning and flipped classrooms for data science at Dallas Startup WeekBlended learning and flipped classrooms for data science at Dallas Startup Week
Blended learning and flipped classrooms for data science at Dallas Startup WeekStartupWeekDallas
 
Research Data Management: Approaches to Institutional Policy
Research Data Management: Approaches to Institutional PolicyResearch Data Management: Approaches to Institutional Policy
Research Data Management: Approaches to Institutional PolicyRobin Rice
 
Dataiku - data driven nyc - april 2016 - the solitude of the data team m...
Dataiku  -  data driven nyc  - april  2016 - the  solitude of the data team m...Dataiku  -  data driven nyc  - april  2016 - the  solitude of the data team m...
Dataiku - data driven nyc - april 2016 - the solitude of the data team m...Dataiku
 

Was ist angesagt? (17)

Beyond WIOA
Beyond WIOABeyond WIOA
Beyond WIOA
 
Driving Data and Cognitive Sciences Curriculum at the Nexus of Society, Polic...
Driving Data and Cognitive Sciences Curriculum at the Nexus of Society, Polic...Driving Data and Cognitive Sciences Curriculum at the Nexus of Society, Polic...
Driving Data and Cognitive Sciences Curriculum at the Nexus of Society, Polic...
 
Preparing Your Research Material for the Future - 2016-11-16 - Humanities Div...
Preparing Your Research Material for the Future - 2016-11-16 - Humanities Div...Preparing Your Research Material for the Future - 2016-11-16 - Humanities Div...
Preparing Your Research Material for the Future - 2016-11-16 - Humanities Div...
 
Internship PPT- Aastha,274187
Internship PPT- Aastha,274187Internship PPT- Aastha,274187
Internship PPT- Aastha,274187
 
Knowledge Graphs - Journey to the Connected Enterprise - Data Strategy and An...
Knowledge Graphs - Journey to the Connected Enterprise - Data Strategy and An...Knowledge Graphs - Journey to the Connected Enterprise - Data Strategy and An...
Knowledge Graphs - Journey to the Connected Enterprise - Data Strategy and An...
 
Preparing Your Research Material for the Future - 2017-02-22 - Humanities Div...
Preparing Your Research Material for the Future - 2017-02-22 - Humanities Div...Preparing Your Research Material for the Future - 2017-02-22 - Humanities Div...
Preparing Your Research Material for the Future - 2017-02-22 - Humanities Div...
 
Introduction to Research Data Management - 2017-02-15 - MPLS Division, Univer...
Introduction to Research Data Management - 2017-02-15 - MPLS Division, Univer...Introduction to Research Data Management - 2017-02-15 - MPLS Division, Univer...
Introduction to Research Data Management - 2017-02-15 - MPLS Division, Univer...
 
Research Data Management Plan: How to Write One - 2017-02-01 - University of ...
Research Data Management Plan: How to Write One - 2017-02-01 - University of ...Research Data Management Plan: How to Write One - 2017-02-01 - University of ...
Research Data Management Plan: How to Write One - 2017-02-01 - University of ...
 
NSF Data Management Plan - Implications for Librarians
NSF Data Management Plan - Implications for LibrariansNSF Data Management Plan - Implications for Librarians
NSF Data Management Plan - Implications for Librarians
 
Preparing Your Research Data for the Future - 2015-03-02 - University of Oxfo...
Preparing Your Research Data for the Future - 2015-03-02 - University of Oxfo...Preparing Your Research Data for the Future - 2015-03-02 - University of Oxfo...
Preparing Your Research Data for the Future - 2015-03-02 - University of Oxfo...
 
Better Insights from Your Master Data - Graph Database LA Meetup
Better Insights from Your Master Data - Graph Database LA MeetupBetter Insights from Your Master Data - Graph Database LA Meetup
Better Insights from Your Master Data - Graph Database LA Meetup
 
Collaborative Solutions eHealth Event - Applied Information Sciences
Collaborative Solutions eHealth Event - Applied Information SciencesCollaborative Solutions eHealth Event - Applied Information Sciences
Collaborative Solutions eHealth Event - Applied Information Sciences
 
X api chinese cop monthly meeting - march 2016
X api chinese cop monthly meeting - march 2016X api chinese cop monthly meeting - march 2016
X api chinese cop monthly meeting - march 2016
 
X api chinese cop monthly meeting feb.2016
X api chinese cop monthly meeting   feb.2016X api chinese cop monthly meeting   feb.2016
X api chinese cop monthly meeting feb.2016
 
Blended learning and flipped classrooms for data science at Dallas Startup Week
Blended learning and flipped classrooms for data science at Dallas Startup WeekBlended learning and flipped classrooms for data science at Dallas Startup Week
Blended learning and flipped classrooms for data science at Dallas Startup Week
 
Research Data Management: Approaches to Institutional Policy
Research Data Management: Approaches to Institutional PolicyResearch Data Management: Approaches to Institutional Policy
Research Data Management: Approaches to Institutional Policy
 
Dataiku - data driven nyc - april 2016 - the solitude of the data team m...
Dataiku  -  data driven nyc  - april  2016 - the  solitude of the data team m...Dataiku  -  data driven nyc  - april  2016 - the  solitude of the data team m...
Dataiku - data driven nyc - april 2016 - the solitude of the data team m...
 

Ähnlich wie Data carpentry ndic-2015-05-05

Data carpentry replicathon_2017-03-24
Data carpentry replicathon_2017-03-24Data carpentry replicathon_2017-03-24
Data carpentry replicathon_2017-03-24tracykteal
 
Using MS Power BI to create full, interactive reports using Brightspace Data ...
Using MS Power BI to create full, interactive reports using Brightspace Data ...Using MS Power BI to create full, interactive reports using Brightspace Data ...
Using MS Power BI to create full, interactive reports using Brightspace Data ...D2L Barry
 
Data and Software Carpentry Science Gateways webinar 2017-05-10
Data and Software Carpentry Science Gateways webinar 2017-05-10Data and Software Carpentry Science Gateways webinar 2017-05-10
Data and Software Carpentry Science Gateways webinar 2017-05-10tracykteal
 
Data Carpentry NSBE Informational Webinar
Data Carpentry NSBE Informational WebinarData Carpentry NSBE Informational Webinar
Data Carpentry NSBE Informational Webinartracykteal
 
Data scientist enablement dse 400 week 8 roadmap
Data scientist enablement   dse 400   week 8 roadmap Data scientist enablement   dse 400   week 8 roadmap
Data scientist enablement dse 400 week 8 roadmap Dr. Mohan K. Bavirisetty
 
Crafting a Compelling Data Science Resume
Crafting a Compelling Data Science ResumeCrafting a Compelling Data Science Resume
Crafting a Compelling Data Science ResumeArushi Prakash, Ph.D.
 
fINAL Lesson_1_Course_Introduction_v1.pptx
fINAL Lesson_1_Course_Introduction_v1.pptxfINAL Lesson_1_Course_Introduction_v1.pptx
fINAL Lesson_1_Course_Introduction_v1.pptxdataKarthik
 
Best Selenium certification course
Best Selenium certification courseBest Selenium certification course
Best Selenium certification courseKumarNaik21
 
Jisc learning analytics service overview Aug 2018
Jisc learning analytics service overview Aug 2018Jisc learning analytics service overview Aug 2018
Jisc learning analytics service overview Aug 2018Paul Bailey
 
Cse443 Project Report - LPU (Modern Big Data Analysis with SQL Specialization)
Cse443 Project Report - LPU (Modern Big Data Analysis with SQL Specialization)Cse443 Project Report - LPU (Modern Big Data Analysis with SQL Specialization)
Cse443 Project Report - LPU (Modern Big Data Analysis with SQL Specialization)Qazi Maaz Arshad
 
Bioinformatic core facilities discussion
Bioinformatic core facilities discussionBioinformatic core facilities discussion
Bioinformatic core facilities discussionJennifer Shelton
 
Data Science Training and Placement
Data Science Training and PlacementData Science Training and Placement
Data Science Training and PlacementAkhilGGM
 
20221121_OABE_DAFWB_JBiernaux.pptx
20221121_OABE_DAFWB_JBiernaux.pptx20221121_OABE_DAFWB_JBiernaux.pptx
20221121_OABE_DAFWB_JBiernaux.pptxOpenAccessBelgium
 
The DCC: Helping you curate your reputation
The DCC: Helping you curate your reputationThe DCC: Helping you curate your reputation
The DCC: Helping you curate your reputationJisc
 
Advanced Project Data Analytics for Improved Project Delivery
Advanced Project Data Analytics for Improved Project DeliveryAdvanced Project Data Analytics for Improved Project Delivery
Advanced Project Data Analytics for Improved Project DeliveryMark Constable
 
Which institute is best for data science?
Which institute is best for data science?Which institute is best for data science?
Which institute is best for data science?DIGITALSAI1
 
Best Selenium certification course
Best Selenium certification courseBest Selenium certification course
Best Selenium certification courseKumarNaik21
 

Ähnlich wie Data carpentry ndic-2015-05-05 (20)

Data carpentry replicathon_2017-03-24
Data carpentry replicathon_2017-03-24Data carpentry replicathon_2017-03-24
Data carpentry replicathon_2017-03-24
 
Using MS Power BI to create full, interactive reports using Brightspace Data ...
Using MS Power BI to create full, interactive reports using Brightspace Data ...Using MS Power BI to create full, interactive reports using Brightspace Data ...
Using MS Power BI to create full, interactive reports using Brightspace Data ...
 
Data and Software Carpentry Science Gateways webinar 2017-05-10
Data and Software Carpentry Science Gateways webinar 2017-05-10Data and Software Carpentry Science Gateways webinar 2017-05-10
Data and Software Carpentry Science Gateways webinar 2017-05-10
 
Data Carpentry NSBE Informational Webinar
Data Carpentry NSBE Informational WebinarData Carpentry NSBE Informational Webinar
Data Carpentry NSBE Informational Webinar
 
Data scientist enablement dse 400 week 8 roadmap
Data scientist enablement   dse 400   week 8 roadmap Data scientist enablement   dse 400   week 8 roadmap
Data scientist enablement dse 400 week 8 roadmap
 
Crafting a Compelling Data Science Resume
Crafting a Compelling Data Science ResumeCrafting a Compelling Data Science Resume
Crafting a Compelling Data Science Resume
 
fINAL Lesson_1_Course_Introduction_v1.pptx
fINAL Lesson_1_Course_Introduction_v1.pptxfINAL Lesson_1_Course_Introduction_v1.pptx
fINAL Lesson_1_Course_Introduction_v1.pptx
 
Best Selenium certification course
Best Selenium certification courseBest Selenium certification course
Best Selenium certification course
 
Jisc learning analytics service overview Aug 2018
Jisc learning analytics service overview Aug 2018Jisc learning analytics service overview Aug 2018
Jisc learning analytics service overview Aug 2018
 
Cse443 Project Report - LPU (Modern Big Data Analysis with SQL Specialization)
Cse443 Project Report - LPU (Modern Big Data Analysis with SQL Specialization)Cse443 Project Report - LPU (Modern Big Data Analysis with SQL Specialization)
Cse443 Project Report - LPU (Modern Big Data Analysis with SQL Specialization)
 
Bioinformatic core facilities discussion
Bioinformatic core facilities discussionBioinformatic core facilities discussion
Bioinformatic core facilities discussion
 
Data Science Training and Placement
Data Science Training and PlacementData Science Training and Placement
Data Science Training and Placement
 
20221121_OABE_DAFWB_JBiernaux.pptx
20221121_OABE_DAFWB_JBiernaux.pptx20221121_OABE_DAFWB_JBiernaux.pptx
20221121_OABE_DAFWB_JBiernaux.pptx
 
The DCC: Helping you curate your reputation
The DCC: Helping you curate your reputationThe DCC: Helping you curate your reputation
The DCC: Helping you curate your reputation
 
Data Analytics: From Basic Skills to Executive Decision-Making
Data Analytics: From Basic Skills to Executive Decision-MakingData Analytics: From Basic Skills to Executive Decision-Making
Data Analytics: From Basic Skills to Executive Decision-Making
 
Data Science.pptx
Data Science.pptxData Science.pptx
Data Science.pptx
 
Eportfolio Feasability Project
Eportfolio Feasability ProjectEportfolio Feasability Project
Eportfolio Feasability Project
 
Advanced Project Data Analytics for Improved Project Delivery
Advanced Project Data Analytics for Improved Project DeliveryAdvanced Project Data Analytics for Improved Project Delivery
Advanced Project Data Analytics for Improved Project Delivery
 
Which institute is best for data science?
Which institute is best for data science?Which institute is best for data science?
Which institute is best for data science?
 
Best Selenium certification course
Best Selenium certification courseBest Selenium certification course
Best Selenium certification course
 

Kürzlich hochgeladen

BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort ServiceBDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort ServiceDelhi Call girls
 
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...ZurliaSoop
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxolyaivanovalion
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxolyaivanovalion
 
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfAccredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfadriantubila
 
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...amitlee9823
 
Capstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics ProgramCapstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics ProgramMoniSankarHazra
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxolyaivanovalion
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz1
 
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% SecurePooja Nehwal
 
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...amitlee9823
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxolyaivanovalion
 
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service BangaloreCall Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangaloreamitlee9823
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAroojKhan71
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxolyaivanovalion
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...amitlee9823
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxolyaivanovalion
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxolyaivanovalion
 

Kürzlich hochgeladen (20)

BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort ServiceBDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
 
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFx
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptx
 
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfAccredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
 
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
 
Capstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics ProgramCapstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics Program
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFx
 
Predicting Loan Approval: A Data Science Project
Predicting Loan Approval: A Data Science ProjectPredicting Loan Approval: A Data Science Project
Predicting Loan Approval: A Data Science Project
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signals
 
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
 
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptx
 
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service BangaloreCall Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
 
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptx
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptx
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptx
 

Data carpentry ndic-2015-05-05

  • 1. Data Carpentry: Enabling Researchers to Work More Effectively with Data Tracy K. Teal, PhD Data Carpentry Project Lead Assistant Professor, BEACON, Michigan State University @datacarpentry http://datacarpentry.org
  • 2. Training is a missing piece between data collection & data-driven discovery Training
  • 4. Large scale data is being generated in all domains
  • 5. As well as in the non-academic sector
  • 8. Training is a missing piece between data collection & data-driven discovery Training
  • 9. Biggest Bioinformatics Difficulty Most useful thing BRAEMBL could do Survey by Bioinformatics Resource Australia – EMBL Researchers view the major limiting factor in research progress as a lack of expertise in how to handle and analyze data http://braembl.org.au/news/braembl-community-survey-report-2013
  • 10. Data Carpentry is filling that training gap Our mission is to provide researchers high-quality, domain-specific training covering the full lifecycle of data-driven research.
  • 11. We’re here to help (the logo is a saw)
  • 12. • Training focused on data - teaching how to manage and analyze data in an effective and reproducible way. • Domain specific by design – currently have lessons in ecology and are developing lessons for genomics, geosciences and social sciences. • Initial focus is on novices - there are no prerequisites, and no prior knowledge computational experience is assumed. We plan to expand to more advanced topics.
  • 13. Grassroots training effort - Developed by practitioners for practitioners - Identify skill needs in data management and analysis in given domains - Collaboratively and iteratively developed openly licensed (CC-BY) training materials - Organize and deliver two-day, intensive hands-on workshops in fundamental data analysis skills using a pool of volunteer helpers and instructors
  • 14. Grassroots training effort - Developed by practitioners for practitioners - Identify skill needs in data management and analysis in given domains - Collaboratively and iteratively developed openly licensed (CC-BY) training materials - Organize and deliver two-day, intensive hands-on workshops in fundamental data analysis skills using a pool of volunteer helpers and instructors
  • 15. Grassroots training effort - Developed by practitioners for practitioners - Identify skill needs in data management and analysis in given domains - Collaboratively and iteratively developed openly licensed (CC-BY) training materials - Organize and deliver two-day, intensive hands-on workshops in fundamental data analysis skills using a pool of volunteer helpers and instructors
  • 16. Software Carpentry Since January 2013 With the help of the Mozilla Science Lab scaled to teach North American Workshops 2012- 2014 Greg Wilson founded in 1998 • Over 270 two-day workshops • For over 8300 learners • Taught by over 200 volunteers • In over 20 countries Now its own non-profit the Software Carpentry Foundation
  • 17. Data Carpentry workshops Goals: We can’t teach everything in two days, but the goal is to teach foundational skills to reduce the activation energy for getting started and know what’s possible Curriculum: The data lifecycle from data organization to analysis and visualization
  • 18. Format - Two days - Hands on - Qualified instructors - Helpers - Sticky notes! Data Carpentry workshops
  • 19. Demand is high Workshops internationally Started in November, 2014; since Jan 2015 have taught 10 workshops and have more than 24 scheduled for this year Interest from broad domains – biology, genomics, social science, digital humanities, libraries, geosciences
  • 20. Curriculum for ecology The data lifecycle from data organization to analysis and visualization • Data organization in spreadsheets • OpenRefine for data cleaning • R for data analysis and visualization • SQL
  • 22. People are learning things! 0 5 10 15 Very Low Low Neither High Very High Before the workshop Freq Level of data management and analysis skills prior to the workshop 0 5 10 15 About the same Somewhat higher Higher Much higher After the workshop Freq Rate your level of data management and analysis skills following the workshop
  • 23. People are learning things! Compared to before the workshop, I have a better understanding of how to
  • 24. They feel the workshop was worthwhile How much practical knowledge did you gain from this workshop? This workshop was worth my time
  • 25. Thoughts on data best practices Please rate your level of agreement with the following statements
  • 26. Hackathons to develop lessons - Genomics project organization, command line, cloud computing, using bioinformatics tools, data analysis and visualization - CSHL, iPlant, SESYNC, iDigBio - Geospatial data Working with geospatial data Hackathon at NEON – Sept/Oct (Leah Wasser) - Social sciences Working with data from social sciences Hackathon at Berkeley – July (Dav Clark)
  • 27. What Students Learn • Capstone lesson
  • 28. Guiding Data Carpentry Steering Committee: Karen Cranston (NESCent / OpenTree of Life) Hilmar Lapp (NESCent / Duke) Aleksandra Pawlik (Software Sustainability Institute) Karthik Ram (rOpenSci / Berkeley Institute of Data Science Fellow) Tracy Teal (Data Carpentry / Michigan State) Ethan White (University of Florida / Moore DDD Investigator) Greg Wilson (Software Carpentry) Volunteer instructors and materials developers Mike Smorul (SESYNC), Mary Shelly (SESYNC), Jason Williams (iPlant), Leah Wasser (NEON), Deb Paul (iDigBio), Francois (iDigBio), Ben Marwick (University of Washington), Dav Clark (Berkeley)

Hinweis der Redaktion

  1. The missing step between data collection and research progress is a lack of training for scientists in crucial skills for effectively managing and analyzing large amounts of data.
  2. This isn’t an audience I need to convince of the current and increasing velocity, volume and variety of data, and we’re all here to discuss the challenges and opportunities this data generation is presenting. Why are we all here though, why is now different than say when microscopes first were invented or telescopes had greater capacity. Deception -> Disruption (Phil Bourne)
  3. We’re all here, and these conversations are getting broader and there’s more interest and investment in this issue, is because what’s different about now is that this data is being generated in all domains of research Informing every scholarly endeavor
  4. As well as in the non-academic sector. This lends itself to broad alignment on these issues in a way that we haven’t seen before, and also adds an urgency to the challenges. Industry might see how to turn this data in to value for the company, biomedical researchers see opportunities to save lives, geoscientists to address climate change. People think this data has the potential to address important issues affecting our lives and society.
  5. Now what we have is a lot of data potential. The data available has the potential to answer new research questions, suggest solutions to societal challenges or save lifes Now we have two types of researchers – those who are excited about this data and are generating it themselves or engaging with public data efforts, and those who don’t yet see a use for this data. But that slice of the pie who won’t be able to derive value or discovery from these digital data collections is starting to become smaller. .
  6. However data is not information or scientific insight on its own and we need to unlock the potential of that data – some of that is tools and software, databases and infrastructure. But the key to that is the researchers themselves and their ability to conduct this research.
  7. Researchers need to be trained in how to effectively manage and use this data. Researchers don’t know how to address the questions they want to ask or aren’t even aware of the types of questions they can ask with this data. Story about genomics researcher It’s no longer the case where we’re saying ‘researchers should’. Researcher themselves want this training, feel limited by their own current data analysis skills. You likely have anecdotal evidence of this in your own community or domain
  8. , but for instance … Biomedical researchers in the UK had similar sentiments in a survey https://storify.com/biocrusoe/elixir-uk-node-meeting-2014-10-14 And an internal study at PLOS showed a lack of data management expertise has a dramatic impact on the ability of researchers to share their data
  9. We need more Meridith’s Not everyone can be a Meridith (15 year old girl)
  10. Deliverables are lessons and workshops
  11. Deliverables are lessons and workshops
  12. Deliverables are lessons and workshops, but another very positive outcome has been the pool of instructors
  13. Materials and workshop model founded on successful SWC model Workshops fill gaps
  14. More than a focus on particular analysis techniques, workshops are focused on data management and what might be called data wrangling, gettting your data in to and out of analysis programs, formatting the data for analysis, conducting visualization We want to teach not only skills but affect attitudes – emphasizing importance and value of these approaches for effective and reproducible research
  15. Truly collaborative effort