Andrea de Souza
Director, Informatics, Data Analysis & Finance
Center for the Science of Therapeutics
May 29, 2013
BioAssay Research Database
Direct Contributors
NIH Molecular Libraries – Glenn McFadden, Ajay Pillai
NIH Chemical Genomics Center – Chris Austin (PI), John Braisted, Marc Ferrer, Rajarshi Guha, Ajit Jadhav, Dac-Trung Nguyen, Tyler Peryea, Noel Southall, Henrike Veith
Broad Institute – Benjamin Alexander, Jacob Asiedu, Kay Aubrey, Joshua Bittker, Steve Brudz, Simon Chatwin, Paul Clemons, Vlado Dancik, Siva Dandapani, Andrea de Souza, Dan Durkin, David Lahr, Jeri Levine, Judy McGloughlin, Phil Montgomery, Jose Perez, Stuart Schreiber (PI), Gil Walzer, Xiaorong Xiang
University of New Mexico – Cristian Bologa, Steve Mathias, Tudor Oprea, Larry Sklar (PI), Oleg Ursu, Anna Waller, Jeremy Yang
University of Miami – Saminda Abeyruwan, Hande Küküc, Vance Lemmon, Ahsan Mir, Magdalena Przydzial, Kunie Sakurai, Stephan Schürer, Uma Vempati, Ubbo Visser
Vanderbilt University – Eric Dawson, Bill Graham, Craig Lindsley (PI), Shaun Stauffer
Sanford-Burnham Medical Research Institute – “T.C.” Chung, Jena Diwan, Michael Hedrick, Gavin Magnuson, Siobhan Malany, Ian Pass, Anthony Pinkerton, Derek Stonich, John Reed (PI)
Scripps Research Institute – Yasel Cruz, Mark Southern, Hugh Rosen (PI)
BARD: BioAssay Research Database
Mission: Enable biomedical researchers and cheminformatic scientists to effectively use MLP data to generate new hypotheses
• Unique collaboration amongst 7 NIH & academic centers
• Develop and adopt an Assay Definition Standard (ADS)
• Provide tools for assay registration, querying & visualization
o Deploy predictive models
o Foster new methods to interpret chemical biology data
o Enable private data sharing
• Developed as an open-source, industrial-strength platform to support public translational research
• Team Science
• Research Data Management
• Technology
• Predictive Models
The BARD platform will support public translational research
Research Data Management
The Value of Context
PubChem BioAssay and BARD: structure the data
PubChem | BARD
Missing or fuzzy assay definitions, experiments and project concepts | Introduce assay definitions, experiments and projects
‘Column header’ centric with concentration details embedded | Result types and concentrations as experimental variables
Extensive use of unstructured text | Transition to structured use of common language
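The contrast between a ‘column header’ centric layout and result types with concentrations as experimental variables can be illustrated with a minimal sketch. The header pattern ("Inhibition at 10 uM") and the names `ResultColumn` and `parse_header` are hypothetical and chosen only for illustration; they are not taken from PubChem or BARD, and real deposited headers vary far more than this pattern allows.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical header shape: "<result type> at <number> <unit>", e.g. "Inhibition at 10 uM".
HEADER_RE = re.compile(
    r"^(?P<result_type>.+?)\s+at\s+(?P<value>\d+(?:\.\d+)?)\s*(?P<unit>[A-Za-zµ]+)$"
)


@dataclass(frozen=True)
class ResultColumn:
    """A result type with its concentration held as an explicit variable."""
    result_type: str
    concentration: Optional[float]
    unit: Optional[str]


def parse_header(header: str) -> ResultColumn:
    """Split a free-text column header into result type, concentration and unit.

    Headers that do not match the assumed pattern keep their full text as the
    result type and carry no concentration.
    """
    text = header.strip()
    match = HEADER_RE.match(text)
    if match is None:
        return ResultColumn(text, None, None)
    return ResultColumn(
        match.group("result_type"),
        float(match.group("value")),
        match.group("unit"),
    )


if __name__ == "__main__":
    for header in ["Inhibition at 10 uM", "Percent activity at 2.5 uM", "IC50"]:
        print(parse_header(header))
```

Once the concentration is a field rather than part of a label, results from different assays can be filtered or grouped by concentration without string matching on headers.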
[Diagram: structuring PubChem MLP BioAssay data. Sources shown: Entrez, UniProt, Gene Ontology, Disease Ontology, BioAssay Ontology, Unit Ontology, Chemical Ontology. Also labeled: BARD Dictionary & Term Hierarchy, BARD Assay Definition Hierarchy]
Structuring the Data
• Annotate all assays to a minimum standard
• Integrate and extend ontologies
• Enable assay registration
• Represent assays, results, experiments using ADS
• Exchange information in ADS via ADF
BARD Technology Components
Define & Register Assays
• Data Dictionary – standard terms
• Catalog of Assay Protocols
High Quality Data & Result Deposition
• Calculations & Results
• Project-experiment association
Query & Interpret Information
• Intuitive Guided Queries
• Cross Assay & SAR centric views
• Advanced applications
Enable Hypothesis Generation (Novice to Expert)
Web Client
• Google-like searching of: 4,000+ assays, 35M+ compounds, 300+ projects
• Filter on annotations, such as detection method type
• Amazon-like Query Cart: save items of interest for further analysis
Web Client - Project Specific Views
Web Client – Probe Development Workflow
Sunburst Visualization
• Molecular activity against target classes
• Target classifications from PantherDB
Huaiyu Mi, Anushya Muruganujan and Paul D. Thomas. PANTHER in 2013: modeling the evolution of gene function, and other gene attributes, in the context of phylogenetic trees. Nucl. Acids Res. (2012) doi: 10.1093/nar/gks1118
As open source as possible
Components: Web Query & Desktop Clients; Data Warehouse & REST API; Catalog of Assay Protocols
[Technology labels shown on slide: Jersey, D3.js, JGoodies; "Commercial License" marker]
MySQL support for CAP coming soon
Chemaxon Usage in BARD
UNM Promiscuity Plugin
• JChem for scaffold decomposition
REST API & Warehouse
• JChem for rendering structures and molecule fingerprint generation
http://bard.nih.gov/api/latest/compounds/6915727/image?s=200
http://bard.nih.gov/api/latest/compounds/?filter=n1cccc2ccccc12%5Bstructure%5D&type=sim&cutoff=0.9&expand=true
http://bard.nih.gov/api/latest/plugins/badapple/prom/cid/6915727?expand=true
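The first two example URLs can be assembled programmatically, which also shows where the `%5B` and `%5D` escapes come from: they are the URL-encoded brackets of the `[structure]` filter suffix. The following is a minimal sketch using only the Python standard library. The function names are hypothetical, the reading of `s` as an image size is an assumption, and the code only builds the URL strings; it does not contact the service, which may no longer be available.

```python
from urllib.parse import urlencode

# Base URL exactly as it appears on the slide.
BARD_API = "http://bard.nih.gov/api/latest"


def compound_image_url(cid: int, size: int = 200) -> str:
    """Build the structure image URL for a compound ID.

    The `s` parameter is assumed to be the image size.
    """
    return f"{BARD_API}/compounds/{cid}/image?{urlencode({'s': size})}"


def similarity_search_url(smiles: str, cutoff: float = 0.9, expand: bool = True) -> str:
    """Build a similarity search URL for a query structure given as SMILES."""
    query = {
        "filter": f"{smiles}[structure]",  # brackets are percent-encoded by urlencode
        "type": "sim",
        "cutoff": cutoff,
        "expand": str(expand).lower(),     # the slide uses lowercase "true"
    }
    return f"{BARD_API}/compounds/?{urlencode(query)}"


if __name__ == "__main__":
    print(compound_image_url(6915727))
    print(similarity_search_url("n1cccc2ccccc12"))
```

Building the query through `urlencode` rather than string concatenation keeps SMILES characters such as `[`, `]`, `=` and `#` from corrupting the query string.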
Chemaxon Usage in BARD
Web Query Client
• JChem for rendering structures
Desktop Client
• JChem for rendering structures, molecule import & export
• Marvin for drawing query structures
• BioActivity Data Associative Promiscuity Pattern Learning Engine
• Associations via scaffolds for chemical space navigation
Example URI* | Description
<base>/badapple/prom/cid/752424 | For compound with specified ID, return scaffold IDs and scores.
<base>/badapple/prom/cid/752424?expand=true | Additional statistics, scaffold SMILES, and inDrug flag.
<base>/badapple/prom/scafid/233 | For scaffold with specified ID, return statistics and SMILES.
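The three promiscuity plugin routes can be sketched as small URL builders. This assumes that `<base>` corresponds to the `http://bard.nih.gov/api/latest/plugins` prefix seen in the example URL on the Chemaxon usage slide; the slide's footnote for `<base>` is not preserved, so that mapping is a guess, and the function names are hypothetical. The code only constructs strings and makes no request.

```python
# Assumed value of <base>, inferred from the plugin URL shown earlier in the deck.
PLUGIN_BASE = "http://bard.nih.gov/api/latest/plugins"


def badapple_compound_url(cid: int, expand: bool = False, base: str = PLUGIN_BASE) -> str:
    """Route returning scaffold IDs and scores for a compound ID.

    With expand=True the slide lists additional statistics, scaffold SMILES
    and an inDrug flag in the response.
    """
    url = f"{base}/badapple/prom/cid/{cid}"
    return f"{url}?expand=true" if expand else url


def badapple_scaffold_url(scaffold_id: int, base: str = PLUGIN_BASE) -> str:
    """Route returning statistics and SMILES for a scaffold ID."""
    return f"{base}/badapple/prom/scafid/{scaffold_id}"


if __name__ == "__main__":
    print(badapple_compound_url(752424))
    print(badapple_compound_url(752424, expand=True))
    print(badapple_scaffold_url(233))
```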
Predictive Models
• Predicts sites of metabolism by CYP450 isoforms from 2D structures
• Patrik Rydberg et al.
• Released under LGPL
• BARD plugin
– Summary HTML view
– Data view
Navigating the Maze
Long-Term Path Forward
[Diagram: "as a Platform" overview]
Datasets: MLP, NCI-60, TBD, TBD
Tools: CAP, Web Query, Desktop, APIs
Methods: BADAPPLE, CYP450, TBD, TBD
Data Analysis: Workflow 1, Workflow 2, Workflow 3
Sustained Community Engagement
ADS

EUGM 2013 - Andrea de Souza (Broad Institute): Setting the stage for the “SD” file for bioassay definitions and data: Building the BioAssay Research Database

  • 1. Andrea de Souza Director, Informatics, Data Analysis & Finance Center for the Science of Therapeutics May 29, 2013 BioAssay Research Database
  • 2. Direct Contributors
      • NIH Molecular Libraries – Glenn McFadden, Ajay Pillai
      • NIH Chemical Genomics Center – Chris Austin (PI), John Braisted, Marc Ferrer, Rajarshi Guha, Ajit Jadhav, Dac-Trung Nguyen, Tyler Peryea, Noel Southall, Henrike Veith
      • Broad Institute – Benjamin Alexander, Jacob Asiedu, Kay Aubrey, Joshua Bittker, Steve Brudz, Simon Chatwin, Paul Clemons, Vlado Dancik, Siva Dandapani, Andrea de Souza, Dan Durkin, David Lahr, Jeri Levine, Judy McGloughlin, Phil Montgomery, Jose Perez, Stuart Schreiber (PI), Gil Walzer, Xiaorong Xiang
      • University of New Mexico – Cristian Bologa, Steve Mathias, Tudor Oprea, Larry Sklar (PI), Oleg Ursu, Anna Waller, Jeremy Yang
      • University of Miami – Saminda Abeyruwan, Hande Küküc, Vance Lemmon, Ahsan Mir, Magdalena Przydzial, Kunie Sakurai, Stephan Schürer, Uma Vempati, Ubbo Visser
      • Vanderbilt University – Eric Dawson, Bill Graham, Craig Lindsley (PI), Shaun Stauffer
      • Sanford-Burnham Medical Research Institute – “T.C.” Chung, Jena Diwan, Michael Hedrick, Gavin Magnuson, Siobhan Malany, Ian Pass, Anthony Pinkerton, Derek Stonich, John Reed (PI)
      • Scripps Research Institute – Yasel Cruz, Mark Southern, Hugh Rosen (PI)
  • 3. BARD: BioAssay Research Database
    Mission: Enable biomedical researchers and cheminformatic scientists to effectively use MLP data to generate new hypotheses
      • Unique collaboration amongst 7 NIH & academic centers
      • Develop and adopt an Assay Definition Standard (ADS)
      • Provide tools for assay registration, querying & visualization
          o Deploy predictive models
          o Foster new methods to interpret chemical biology data
          o Enable private data sharing
      • Developed as an open-source, industrial-strength platform to support public translational research
  • 4. BARD: BioAssay Research Database
    Mission: Enable biomedical researchers and cheminformatic scientists to effectively use MLP data to generate new hypotheses
      • Provide tools for assay registration and data querying & visualization
          o Deploy predictive models
          o Foster new methods to interpret chemical biology data
          o Enable private data sharing
      • Developed as an open-source, industrial-strength platform to support public translational research
    Diagram labels: Team Science, Research Data Management, Technology, Predictive Models
    The BARD platform will support public translational research
  • 5. Research Data Management The Value of Context
  • 6. The Value of Context Research Data Management
  • 8. PubChem BioAssay and BARD structure the data
  • 9. PubChem BioAssay and BARD
      • PubChem: Missing or fuzzy assay definitions, experiments and project concepts
        BARD: Introduce assay definitions, experiments and projects
      • PubChem: ‘Column header’ centric with concentration details embedded
        BARD: Result types and concentrations as experimental variables
      • PubChem: Extensive use of unstructured text
        BARD: Transition to structured use of common language
    Diagram labels: PubChem, MLP-BioAssay, structure the data
  • 10. Structuring the Data
    BARD Assay Definition Hierarchy; BARD Dictionary & Term Hierarchy
    Sources shown in the diagram: Entrez, Uniprot, Gene Ontology, Disease Ontology, BioAssay Ontology, Unit Ontology, Chemical Ontology
      • Annotate all assays to a minimum standard
      • Integrate and extend ontologies
      • Enable assay registration
      • Represent assays, results, experiments using ADS
      • Exchange information in ADS via ADF
  • 11. BARD Technology Components
      • Define & Register Assays
          o Data Dictionary – std terms
          o Catalog of Assay Protocols
      • High Quality Data & Result Deposition
          o Calculations & Results
          o Project-experiment association
      • Query & Interpret Information
          o Intuitive Guided Queries
          o Cross Assay & SAR centric views
          o Advanced applications
    Diagram labels: Enable Hypothesis Generation; Novice to Expert
  • 12. BARD Technology Components
  • 13. Web Client
      • Google-like searching of: 4,000+ assays, 35M+ compounds, 300+ projects
      • Filter on annotations, such as detection method type
      • Save items of interest for further analysis
      • Amazon-like Query Cart
  • 14. Web Client - Project Specific Views
  • 15. Web Client – Probe Development Workflow
  • 16. Sunburst Visualization
      • Molecular activity against target classes
      • Target classifications from PantherDB
    Reference: Huaiyu Mi, Anushya Muruganujan and Paul D. Thomas. PANTHER in 2013: modeling the evolution of gene function, and other gene attributes, in the context of phylogenetic trees. Nucl. Acids Res. (2012) doi: 10.1093/nar/gks1118
  • 17. As open source as possible
      • Components: Catalog of Assay Protocols; Data Warehouse & REST API; Web Query & Desktop Clients
      • Technologies shown: Jersey, D3.js, JGoodies
      • Commercial License
      • MySQL support for CAP coming soon
  • 18. Chemaxon Usage in BARD
      • UNM Promiscuity Plugin: JChem for scaffold decomposition
      • REST API & Warehouse: JChem for rendering structures and molecule fingerprint generation
      • Example URLs:
          o http://bard.nih.gov/api/latest/compounds/6915727/image?s=200
          o http://bard.nih.gov/api/latest/compounds/?filter=n1cccc2ccccc12%5Bstructure%5D&type=sim&cutoff=0.9&expand=true
          o http://bard.nih.gov/api/latest/plugins/badapple/prom/cid/6915727?expand=true
  • 19. Chemaxon Usage in BARD
      • Web Query Client: JChem for rendering structures
      • Desktop Client: JChem for rendering structures, molecule import & export; Marvin for drawing query structures
  • 20. Predictive Models
      • BioActivity Data Associative Promiscuity Pattern Learning Engine
      • Associations via scaffolds for chemical space navigation
    Example URI* and description:
      • <base>/badapple/prom/cid/752424: For compound with specified ID, return scaffold IDs and scores.
      • <base>/badapple/prom/cid/752424?expand=true: Additional statistics, scaffold smiles, and inDrug flag.
      • <base>/badapple/prom/scafid/233: For scaffold with specified ID, return statistics and smiles.
  • 21. Predictive Models
      • Predicts sites of metabolism for CYP450 isoforms from 2D structures
      • Patrik Rydberg et al.
      • Released under LGPL
      • BARD plugin
          o Summary HTML view
          o Data view
  • 23. Long-Term Path Forward
      • Datasets: MLP, NCI-60, TBD, TBD
      • Tools: CAP, Web Query, Desktop, APIs
      • Methods: BADApple, CYP450, TBD, TBD
      • Data Analysis: Workflow 1, Workflow 2, Workflow 3
      • ADS
      • Sustained Community Engagement
    Diagram label: as a Platform
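
The example URLs on slides 18 and 20 follow a regular pattern, so they can be assembled programmatically. The following is a minimal Python sketch that only builds the URL strings and makes no network calls. The function names are hypothetical and are not part of BARD. It also assumes that the "<base>" shown on slide 20 corresponds to the "/plugins" prefix seen in the slide 18 BADAPPLE URL, and that the host printed on the slides may no longer be reachable.

```python
from urllib.parse import urlencode

# Base URL exactly as printed on slide 18; it may no longer be served.
BASE = "http://bard.nih.gov/api/latest"
# Assumption: "<base>" on slide 20 is the plugin prefix seen on slide 18.
PLUGIN_BASE = BASE + "/plugins"


def compound_image_url(cid: int, size: int = 200) -> str:
    """URL of a rendered structure image for a compound ID (hypothetical helper)."""
    return f"{BASE}/compounds/{cid}/image?{urlencode({'s': size})}"


def similarity_search_url(smiles: str, cutoff: float = 0.9, expand: bool = True) -> str:
    """URL of a structure similarity search, mirroring the slide 18 example."""
    params = {"filter": f"{smiles}[structure]", "type": "sim", "cutoff": cutoff}
    if expand:
        params["expand"] = "true"
    # urlencode percent-encodes the square brackets as %5B and %5D.
    return f"{BASE}/compounds/?{urlencode(params)}"


def badapple_compound_url(cid: int, expand: bool = False) -> str:
    """URL of the BADAPPLE promiscuity lookup for a compound ID."""
    url = f"{PLUGIN_BASE}/badapple/prom/cid/{cid}"
    return url + "?expand=true" if expand else url


def badapple_scaffold_url(scafid: int) -> str:
    """URL of the BADAPPLE lookup for a scaffold ID."""
    return f"{PLUGIN_BASE}/badapple/prom/scafid/{scafid}"


if __name__ == "__main__":
    print(compound_image_url(6915727))
    print(similarity_search_url("n1cccc2ccccc12"))
    print(badapple_compound_url(6915727, expand=True))
    print(badapple_scaffold_url(233))
```

Building the query string with urlencode rather than by hand keeps characters such as the square brackets in the "[structure]" filter suffix correctly escaped.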