SlideShare ist ein Scribd-Unternehmen logo
1 von 11
Downloaden Sie, um offline zu lesen
Building communities around open-
source scientic software
Karen Cranston
National Evolutionary Synthesis Center (NESCent)
@kcranstn
http://www.slideshare.net/kcranstn
NESCent
National Evolutionary Synthesis Center
www.nescent.org
eldwork
labwork
method development
meta-analysisdata synthesis
Species A (mm^2) F (mm^2/
mm^2)
N (mm^-2) S (mm^4)
Abelia biflora
Abelia dielsii
Abelia integrifolia
Abelia mosanensis
Abelia serrata
Abelia spathulata
Abutilon fruticosum
Abutilon pannosum
Acacia albida
Acacia ataxacantha
Acacia borleae
Acacia burkei
Acacia caffra
0.002375829 0.924197654 389.0 6.11E-06
0.00115375 0.357418211 331.0 3.49E-06
0.001134115 0.240432369 212.0 5.35E-06
0.000855299 0.632065665 739.0 1.16E-06
0.000706858 0.206402637 292.0 2.42E-06
0.000804248 0.230819095 287.0 2.80E-06
0.001452201 0.137959114 95.0 1.53E-05
0.003117245 0.124689812 40.0 7.79E-05
0.012271846 0.049087385 4.0 0.003067962
0.013069811 0.169907541 13.0 0.00100537
0.004071504 0.061072561 15.0 0.000271434
0.008992024 0.053952141 6.0 0.001498671
0.010207035 0.214347725 21.0 0.000486049
+
trait data about species evolutionary trees
Outcomes: Community
Brian O'Meara, Michael Alfaro, Charles Bell, Ben Bolker, Marguerite Butler, Peter Cowan, Damien de Vienne, Richard
Desper, Joe Felsenstein, Luke Harmon, Christoph Heibl, Andrew Hipp, Gene Hunt, Thibaut Jombart, Steve Kembel, Hilmar
Lapp, Scott Loarie, Wayne Maddison, Peter Midford, David Orme, Emmanuel Paradis, Sam Price, Dan Rabosky, Brian
Sidlauskas, Stacey Smith, Dave Swofford, Todd Vision, Peter Waddell, Amy Zanne, Derrick Zwickl [bold indicates organizer]
Comparative methods in hackathon
Rationale Work at hackathon (Dec. 10-14, 2007)
The R statistical analysis package has emerged as a popular platform for
implementation of powerful comparative phylogenetic methods to understand the
evolution of organismal traits and diversication. It includes methods such as
independent contrasts, ancestral state estimation, various models of continuous
and discrete trait evolution, lineage through time plots, diversication tests,
generalized estimating equations, tree plotting, and more. This event was designed to bring
together active R developers as well as end-users working on the integration of comparative
phylogenetic methods within R to actively address issues of data exchange standards, code
interoperability, usability, documentation quality, and the breadth of functionality for comparative
methods available within R. The idea originated from a whitepaper submitted by NESCent
postdocs Amy Zanne and Sam Price.
•30 developers and users worked on
programming & writing documentation
•Split into subgroups on diversification,
divergence times, documentation, class
design, Mesquite-R interaction, input/
output, and trait evolution
•Package source code stored on shared repository hosted at R-forge (“PhyloConductor”)
Hackathon participants (red were flown to NESCent, purple participated remotely). Map from Google Maps
•Designed and began implementing a new S4 class for data and trees
•Ran “bootcamps” for developers on numerical optimization and S4 coding
•Used the Nexus Class Library (Lewis & Holder) and RCpp (Samperi) for reading and
interpreting Nexus tree and data les
•Began work on R tutorials
•Tested existing methods in R, identifying errors
•Developed ways for R to call Mesquite and Mesquite to call R
0
150
300
12/10 12/11 12/12 12/13 12/14 12/15
Commits
•R-Phylo Wiki (http://www.r-phylo.org): Tutorials and overview of
available analyses and packages from the hackathon
have been placed on a public website for all to use
and improve. It’s had >7,000 page visits from >30
countries and >600 edits since it went live in March
2008.
•R-sig-phylo mailing list (https://stat.ethz.ch/mailman/listinfo/r-sig-
phylo): A mailing list for users of R for comparative methods and
phylogenetics. Over 100 messages in its rst four months.
•Comparative methods in R user tutorials planned for 2009 Society
for Integrative and Comparative Biology and Evolution meetings.
•Addition of R track to NESCent summer course in
phyloinformatics, featuring software developed at hackathon and
taught by hackathon participant Marguerite Butler.
•Proposal to NSF for summer course in R for phyloinformatics.
•Ongoing collaborations between hackathon participants.
•Two Google Summer of Code projects to sponsor student
NESCent informatics
Incompatible tree
formats are used in
different R packages
Package Function
geiger1.0-9.1 sim.char
ouch1.2-4 brown.dev
picante evolve.brownian
ape2.01 evolve.phylo
Redundancy (at least four functions to
evolve traits up the tree using simple
Brownian motion)
Can be intimidating to beginners
Coding at hackathon
The US National Evolutionary Synthesis
Center (http://www.nescent.org)
encourages synthetic, interdisciplinary,
and transformative research in evolutionary
biology. NESCent, a collaborative effort of
Duke, NC State University and UNC Chapel
Hill, is located in Durham NC and is supported
by the National Science Foundation
(EF-0423641).
A major goal of NESCent's Informatics branch
is to promote community-driven, collaborative
open-source software development. This is
achieved through hackathons, internships
(such as the Google Summer of Code), summer
courses, conference workshops, and by
externally funded collaborations for
the development and support of
Outcomes: Software
•Phylobase (http://r-forge.r-project.org/projects/phylobase/): New
package for phylogenetic trees and data. Can load trees and data
from Nexus les, output to other tree formats, coordinate pruning of
taxa from data and tree, traverse tree, handle DNA, morphological,
and continuous data types. Work is ongoing (below) to enhance tree
plotting and other functions. As with all hackathon products, new
developers are welcome to join to further improve the code (one
already has).
URL: http://hackathon.nescent.org/R_Hackathon_1 email: hackathon2@nescent.org
0
50
100
150
200
12/16 12/30 1/13 1/27 2/10 2/24 3/9 3/23 4/6 4/20 5/4 5/18 6/1
Commits
•Movement of existing packages to source code repositories
allowing more collaborative development (i.e., Picante package has
new Google Summer of Code 2008 developer Matthew Helmus)
•R-Mesquite interaction: Code written to allow Mesquite (Maddison
& Maddison, 2007) to call R packages (such as OUCH (Butler &
Coding for PhyloBase
NaturePrecedings:doi:10.1038/npre.2008.2126.1:Posted28Jul2008
O’Meara et al. Nature Preceedings. 2008 http://dx.doi.org/10.1038/npre.2008.2126.1
R-sig-phylo mailing list
32 R packages for comparative biology;
maintained by a hackathon participant
Informatics
team
Evolutionary
biologists
computational skills
domain knowledge
NESCent
National Evolutionary Synthesis Center
www.nescent.orgwww.nescent.org
short bootcamps teaching
computational skills to
domain scientists
bringing students into open-
source programming
communities
A grassroots approach to software sustainability. Karen Cranston, Todd Vision, Brian
O'Meara, Hilmar Lapp. http://dx.doi.org/10.6084/m9.gshare.790739

Weitere ähnliche Inhalte

Ähnlich wie Building communities around open-source scientific software

Making your data work for you: Scratchpads, publishing & the biodiversity dat...
Making your data work for you: Scratchpads, publishing & the biodiversity dat...Making your data work for you: Scratchpads, publishing & the biodiversity dat...
Making your data work for you: Scratchpads, publishing & the biodiversity dat...Vince Smith
 
NeCTAR Presentation
NeCTAR PresentationNeCTAR Presentation
NeCTAR PresentationCybera Inc.
 
Neurosciences Information Framework (NIF): An example of community Cyberinfra...
Neurosciences Information Framework (NIF): An example of community Cyberinfra...Neurosciences Information Framework (NIF): An example of community Cyberinfra...
Neurosciences Information Framework (NIF): An example of community Cyberinfra...Neuroscience Information Framework
 
How to expand the Galaxy from genes to Earth in six simple steps (and live sm...
How to expand the Galaxy from genes to Earth in six simple steps (and live sm...How to expand the Galaxy from genes to Earth in six simple steps (and live sm...
How to expand the Galaxy from genes to Earth in six simple steps (and live sm...Raffaele Montella
 
A Knowledge Discovery Framework for Planetary Defense
A Knowledge Discovery Framework for Planetary DefenseA Knowledge Discovery Framework for Planetary Defense
A Knowledge Discovery Framework for Planetary DefenseYongyao Jiang
 
Tds — big science dec 2021
Tds — big science dec 2021Tds — big science dec 2021
Tds — big science dec 2021Gérard Dupont
 
Software Sustainability: Better Software Better Science
Software Sustainability: Better Software Better ScienceSoftware Sustainability: Better Software Better Science
Software Sustainability: Better Software Better ScienceCarole Goble
 
Demystifying Data Science & Analytics - 757ColorCoded 2019
Demystifying Data Science & Analytics - 757ColorCoded 2019Demystifying Data Science & Analytics - 757ColorCoded 2019
Demystifying Data Science & Analytics - 757ColorCoded 2019Guillermo A. Fisher
 
Humanities Networked Infrastructure (HuNI)
Humanities Networked Infrastructure (HuNI)Humanities Networked Infrastructure (HuNI)
Humanities Networked Infrastructure (HuNI)Deb Verhoeven
 
Working towards Sustainable Software for Science (an NSF and community view)
Working towards Sustainable Software for Science (an NSF and community view)Working towards Sustainable Software for Science (an NSF and community view)
Working towards Sustainable Software for Science (an NSF and community view)Daniel S. Katz
 
A Workflow-Driven Discovery and Training Ecosystem for Distributed Analysis o...
A Workflow-Driven Discovery and Training Ecosystem for Distributed Analysis o...A Workflow-Driven Discovery and Training Ecosystem for Distributed Analysis o...
A Workflow-Driven Discovery and Training Ecosystem for Distributed Analysis o...Ilkay Altintas, Ph.D.
 
Scientific Software: Sustainability, Skills & Sociology
Scientific Software: Sustainability, Skills & SociologyScientific Software: Sustainability, Skills & Sociology
Scientific Software: Sustainability, Skills & SociologyNeil Chue Hong
 
160606 data lifecycle project outline
160606 data lifecycle project outline160606 data lifecycle project outline
160606 data lifecycle project outlineIan Duncan
 
Venice Juanillas at #ICG13: Rice Galaxy: an open resource for plant science
Venice Juanillas at #ICG13: Rice Galaxy: an open resource for plant scienceVenice Juanillas at #ICG13: Rice Galaxy: an open resource for plant science
Venice Juanillas at #ICG13: Rice Galaxy: an open resource for plant scienceGigaScience, BGI Hong Kong
 
Open source ai_technical_trend
Open source ai_technical_trendOpen source ai_technical_trend
Open source ai_technical_trendMario Cho
 
Semi-Supervised Classification with Graph Convolutional Networks @ICLR2017読み会
Semi-Supervised Classification with Graph Convolutional Networks @ICLR2017読み会Semi-Supervised Classification with Graph Convolutional Networks @ICLR2017読み会
Semi-Supervised Classification with Graph Convolutional Networks @ICLR2017読み会Eiji Sekiya
 
Cloud Standards in the Real World: Cloud Standards Testing for Developers
Cloud Standards in the Real World: Cloud Standards Testing for DevelopersCloud Standards in the Real World: Cloud Standards Testing for Developers
Cloud Standards in the Real World: Cloud Standards Testing for DevelopersAlan Sill
 
Data coffee - Support vector machine usage with complex data
Data coffee - Support vector machine usage with complex dataData coffee - Support vector machine usage with complex data
Data coffee - Support vector machine usage with complex dataDr. Branislav MajernĂ­k
 

Ähnlich wie Building communities around open-source scientific software (20)

Making your data work for you: Scratchpads, publishing & the biodiversity dat...
Making your data work for you: Scratchpads, publishing & the biodiversity dat...Making your data work for you: Scratchpads, publishing & the biodiversity dat...
Making your data work for you: Scratchpads, publishing & the biodiversity dat...
 
NeCTAR Presentation
NeCTAR PresentationNeCTAR Presentation
NeCTAR Presentation
 
Neurosciences Information Framework (NIF): An example of community Cyberinfra...
Neurosciences Information Framework (NIF): An example of community Cyberinfra...Neurosciences Information Framework (NIF): An example of community Cyberinfra...
Neurosciences Information Framework (NIF): An example of community Cyberinfra...
 
How to expand the Galaxy from genes to Earth in six simple steps (and live sm...
How to expand the Galaxy from genes to Earth in six simple steps (and live sm...How to expand the Galaxy from genes to Earth in six simple steps (and live sm...
How to expand the Galaxy from genes to Earth in six simple steps (and live sm...
 
A Knowledge Discovery Framework for Planetary Defense
A Knowledge Discovery Framework for Planetary DefenseA Knowledge Discovery Framework for Planetary Defense
A Knowledge Discovery Framework for Planetary Defense
 
Tds — big science dec 2021
Tds — big science dec 2021Tds — big science dec 2021
Tds — big science dec 2021
 
Software Sustainability: Better Software Better Science
Software Sustainability: Better Software Better ScienceSoftware Sustainability: Better Software Better Science
Software Sustainability: Better Software Better Science
 
Demystifying Data Science & Analytics - 757ColorCoded 2019
Demystifying Data Science & Analytics - 757ColorCoded 2019Demystifying Data Science & Analytics - 757ColorCoded 2019
Demystifying Data Science & Analytics - 757ColorCoded 2019
 
Humanities Networked Infrastructure (HuNI)
Humanities Networked Infrastructure (HuNI)Humanities Networked Infrastructure (HuNI)
Humanities Networked Infrastructure (HuNI)
 
Working towards Sustainable Software for Science (an NSF and community view)
Working towards Sustainable Software for Science (an NSF and community view)Working towards Sustainable Software for Science (an NSF and community view)
Working towards Sustainable Software for Science (an NSF and community view)
 
A Workflow-Driven Discovery and Training Ecosystem for Distributed Analysis o...
A Workflow-Driven Discovery and Training Ecosystem for Distributed Analysis o...A Workflow-Driven Discovery and Training Ecosystem for Distributed Analysis o...
A Workflow-Driven Discovery and Training Ecosystem for Distributed Analysis o...
 
Data mining weka
Data mining wekaData mining weka
Data mining weka
 
Scientific Software: Sustainability, Skills & Sociology
Scientific Software: Sustainability, Skills & SociologyScientific Software: Sustainability, Skills & Sociology
Scientific Software: Sustainability, Skills & Sociology
 
160606 data lifecycle project outline
160606 data lifecycle project outline160606 data lifecycle project outline
160606 data lifecycle project outline
 
Venice Juanillas at #ICG13: Rice Galaxy: an open resource for plant science
Venice Juanillas at #ICG13: Rice Galaxy: an open resource for plant scienceVenice Juanillas at #ICG13: Rice Galaxy: an open resource for plant science
Venice Juanillas at #ICG13: Rice Galaxy: an open resource for plant science
 
Open source ai_technical_trend
Open source ai_technical_trendOpen source ai_technical_trend
Open source ai_technical_trend
 
Semi-Supervised Classification with Graph Convolutional Networks @ICLR2017読み会
Semi-Supervised Classification with Graph Convolutional Networks @ICLR2017読み会Semi-Supervised Classification with Graph Convolutional Networks @ICLR2017読み会
Semi-Supervised Classification with Graph Convolutional Networks @ICLR2017読み会
 
Ji cv6n1
Ji cv6n1Ji cv6n1
Ji cv6n1
 
Cloud Standards in the Real World: Cloud Standards Testing for Developers
Cloud Standards in the Real World: Cloud Standards Testing for DevelopersCloud Standards in the Real World: Cloud Standards Testing for Developers
Cloud Standards in the Real World: Cloud Standards Testing for Developers
 
Data coffee - Support vector machine usage with complex data
Data coffee - Support vector machine usage with complex dataData coffee - Support vector machine usage with complex data
Data coffee - Support vector machine usage with complex data
 

Mehr von Karen Cranston

WSSSPE: Building communities
WSSSPE: Building communitiesWSSSPE: Building communities
WSSSPE: Building communitiesKaren Cranston
 
Cranston Evolution 2013
Cranston Evolution 2013Cranston Evolution 2013
Cranston Evolution 2013Karen Cranston
 
Open Tree of Life @NSF
Open Tree of Life @NSFOpen Tree of Life @NSF
Open Tree of Life @NSFKaren Cranston
 
Freeing scientific data using CC0
Freeing scientific data using CC0Freeing scientific data using CC0
Freeing scientific data using CC0Karen Cranston
 
If this is the future, where is my tree of life?
If this is the future, where is my tree of life?If this is the future, where is my tree of life?
If this is the future, where is my tree of life?Karen Cranston
 
Phylotastic @iEvoBio
Phylotastic @iEvoBioPhylotastic @iEvoBio
Phylotastic @iEvoBioKaren Cranston
 
Open Tree of Life @Evolution 2012
Open Tree of Life @Evolution 2012Open Tree of Life @Evolution 2012
Open Tree of Life @Evolution 2012Karen Cranston
 
OpenTree at NESCent Academy 2012
OpenTree at NESCent Academy 2012OpenTree at NESCent Academy 2012
OpenTree at NESCent Academy 2012Karen Cranston
 

Mehr von Karen Cranston (8)

WSSSPE: Building communities
WSSSPE: Building communitiesWSSSPE: Building communities
WSSSPE: Building communities
 
Cranston Evolution 2013
Cranston Evolution 2013Cranston Evolution 2013
Cranston Evolution 2013
 
Open Tree of Life @NSF
Open Tree of Life @NSFOpen Tree of Life @NSF
Open Tree of Life @NSF
 
Freeing scientific data using CC0
Freeing scientific data using CC0Freeing scientific data using CC0
Freeing scientific data using CC0
 
If this is the future, where is my tree of life?
If this is the future, where is my tree of life?If this is the future, where is my tree of life?
If this is the future, where is my tree of life?
 
Phylotastic @iEvoBio
Phylotastic @iEvoBioPhylotastic @iEvoBio
Phylotastic @iEvoBio
 
Open Tree of Life @Evolution 2012
Open Tree of Life @Evolution 2012Open Tree of Life @Evolution 2012
Open Tree of Life @Evolution 2012
 
OpenTree at NESCent Academy 2012
OpenTree at NESCent Academy 2012OpenTree at NESCent Academy 2012
OpenTree at NESCent Academy 2012
 

KĂźrzlich hochgeladen

DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamUiPathCommunity
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Victor Rentea
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDropbox
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusZilliz
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024The Digital Insurer
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Zilliz
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Angeliki Cooney
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024The Digital Insurer
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfOverkill Security
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...Zilliz
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...apidays
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Orbitshub
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfOrbitshub
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 

KĂźrzlich hochgeladen (20)

DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with Milvus
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdf
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 

Building communities around open-source scientific software

  • 1. Building communities around open- source scientic software Karen Cranston National Evolutionary Synthesis Center (NESCent) @kcranstn http://www.slideshare.net/kcranstn
  • 2. NESCent National Evolutionary Synthesis Center www.nescent.org eldwork labwork method development meta-analysisdata synthesis
  • 3.
  • 4. Species A (mm^2) F (mm^2/ mm^2) N (mm^-2) S (mm^4) Abelia biflora Abelia dielsii Abelia integrifolia Abelia mosanensis Abelia serrata Abelia spathulata Abutilon fruticosum Abutilon pannosum Acacia albida Acacia ataxacantha Acacia borleae Acacia burkei Acacia caffra 0.002375829 0.924197654 389.0 6.11E-06 0.00115375 0.357418211 331.0 3.49E-06 0.001134115 0.240432369 212.0 5.35E-06 0.000855299 0.632065665 739.0 1.16E-06 0.000706858 0.206402637 292.0 2.42E-06 0.000804248 0.230819095 287.0 2.80E-06 0.001452201 0.137959114 95.0 1.53E-05 0.003117245 0.124689812 40.0 7.79E-05 0.012271846 0.049087385 4.0 0.003067962 0.013069811 0.169907541 13.0 0.00100537 0.004071504 0.061072561 15.0 0.000271434 0.008992024 0.053952141 6.0 0.001498671 0.010207035 0.214347725 21.0 0.000486049 + trait data about species evolutionary trees
  • 5. Outcomes: Community Brian O'Meara, Michael Alfaro, Charles Bell, Ben Bolker, Marguerite Butler, Peter Cowan, Damien de Vienne, Richard Desper, Joe Felsenstein, Luke Harmon, Christoph Heibl, Andrew Hipp, Gene Hunt, Thibaut Jombart, Steve Kembel, Hilmar Lapp, Scott Loarie, Wayne Maddison, Peter Midford, David Orme, Emmanuel Paradis, Sam Price, Dan Rabosky, Brian Sidlauskas, Stacey Smith, Dave Swofford, Todd Vision, Peter Waddell, Amy Zanne, Derrick Zwickl [bold indicates organizer] Comparative methods in hackathon Rationale Work at hackathon (Dec. 10-14, 2007) The R statistical analysis package has emerged as a popular platform for implementation of powerful comparative phylogenetic methods to understand the evolution of organismal traits and diversication. It includes methods such as independent contrasts, ancestral state estimation, various models of continuous and discrete trait evolution, lineage through time plots, diversication tests, generalized estimating equations, tree plotting, and more. This event was designed to bring together active R developers as well as end-users working on the integration of comparative phylogenetic methods within R to actively address issues of data exchange standards, code interoperability, usability, documentation quality, and the breadth of functionality for comparative methods available within R. The idea originated from a whitepaper submitted by NESCent postdocs Amy Zanne and Sam Price. •30 developers and users worked on programming & writing documentation •Split into subgroups on diversication, divergence times, documentation, class design, Mesquite-R interaction, input/ output, and trait evolution •Package source code stored on shared repository hosted at R-forge (“PhyloConductor”) Hackathon participants (red were flown to NESCent, purple participated remotely). Map from Google Maps •Designed and began implementing a new S4 class for data and trees •Ran “bootcamps” for developers on numerical optimization and S4 coding •Used the Nexus Class Library (Lewis & Holder) and RCpp (Samperi) for reading and interpreting Nexus tree and data les •Began work on R tutorials •Tested existing methods in R, identifying errors •Developed ways for R to call Mesquite and Mesquite to call R 0 150 300 12/10 12/11 12/12 12/13 12/14 12/15 Commits •R-Phylo Wiki (http://www.r-phylo.org): Tutorials and overview of available analyses and packages from the hackathon have been placed on a public website for all to use and improve. It’s had >7,000 page visits from >30 countries and >600 edits since it went live in March 2008. •R-sig-phylo mailing list (https://stat.ethz.ch/mailman/listinfo/r-sig- phylo): A mailing list for users of R for comparative methods and phylogenetics. Over 100 messages in its rst four months. •Comparative methods in R user tutorials planned for 2009 Society for Integrative and Comparative Biology and Evolution meetings. •Addition of R track to NESCent summer course in phyloinformatics, featuring software developed at hackathon and taught by hackathon participant Marguerite Butler. •Proposal to NSF for summer course in R for phyloinformatics. •Ongoing collaborations between hackathon participants. •Two Google Summer of Code projects to sponsor student NESCent informatics Incompatible tree formats are used in different R packages Package Function geiger1.0-9.1 sim.char ouch1.2-4 brown.dev picante evolve.brownian ape2.01 evolve.phylo Redundancy (at least four functions to evolve traits up the tree using simple Brownian motion) Can be intimidating to beginners Coding at hackathon The US National Evolutionary Synthesis Center (http://www.nescent.org) encourages synthetic, interdisciplinary, and transformative research in evolutionary biology. NESCent, a collaborative effort of Duke, NC State University and UNC Chapel Hill, is located in Durham NC and is supported by the National Science Foundation (EF-0423641). A major goal of NESCent's Informatics branch is to promote community-driven, collaborative open-source software development. This is achieved through hackathons, internships (such as the Google Summer of Code), summer courses, conference workshops, and by externally funded collaborations for the development and support of Outcomes: Software •Phylobase (http://r-forge.r-project.org/projects/phylobase/): New package for phylogenetic trees and data. Can load trees and data from Nexus les, output to other tree formats, coordinate pruning of taxa from data and tree, traverse tree, handle DNA, morphological, and continuous data types. Work is ongoing (below) to enhance tree plotting and other functions. As with all hackathon products, new developers are welcome to join to further improve the code (one already has). URL: http://hackathon.nescent.org/R_Hackathon_1 email: hackathon2@nescent.org 0 50 100 150 200 12/16 12/30 1/13 1/27 2/10 2/24 3/9 3/23 4/6 4/20 5/4 5/18 6/1 Commits •Movement of existing packages to source code repositories allowing more collaborative development (i.e., Picante package has new Google Summer of Code 2008 developer Matthew Helmus) •R-Mesquite interaction: Code written to allow Mesquite (Maddison & Maddison, 2007) to call R packages (such as OUCH (Butler & Coding for PhyloBase NaturePrecedings:doi:10.1038/npre.2008.2126.1:Posted28Jul2008 O’Meara et al. Nature Preceedings. 2008 http://dx.doi.org/10.1038/npre.2008.2126.1
  • 7. 32 R packages for comparative biology; maintained by a hackathon participant
  • 8. Informatics team Evolutionary biologists computational skills domain knowledge NESCent National Evolutionary Synthesis Center www.nescent.orgwww.nescent.org
  • 9. short bootcamps teaching computational skills to domain scientists bringing students into open- source programming communities
  • 10.
  • 11. A grassroots approach to software sustainability. Karen Cranston, Todd Vision, Brian O'Meara, Hilmar Lapp. http://dx.doi.org/10.6084/m9.gshare.790739