SlideShare a Scribd company logo
1 of 58
Computing on the shoulders
of giants:
how existing knowledge is represented
and applied in bioinformatics
Benjamin Good
bgood@scripps.edu
Assistant Professor of the Department of
Molecular and Experimental Medicine
Specialty: artificial intelligence, crowdsourcing
The more you can ‘know’ the
better a scientist you can become
(Derived from Alex Pico, WikiPathways)
ideas Knowledge
data
Too much to know
• PubMed lists > 1 million
articles published each year
(more than 2 per minute)
• Your capacity to read and
comprehend is limiting
Knowledge
Knowledge representation
ideas Knowledge
data
?!
Goals for representing knowledge (outline)
• Make things (articles, genes,
antibodies, etc.) easier to find
• Answer questions
• Generate hypotheses
Controlled vocabularies (MeSH)
Ontologies (Gene Ontology)
knowledge graphs on the Web:
the SPARQL query language
knowledge plus computation =
inference, the ABC model
Part 1:
Medical Subject Headings
(MeSH)
Finding what to read:
controlled vocabularies for indexing PubMed
• What happens when you search PubMed?
MeSH controlled vocabulary (AKA ‘thesaurus’)
• Descriptor Unique ID: D013575
• Definition: A transient loss of
consciousness and postural tone
caused by diminished blood flow
to the brain…
• Entry Terms: Syncopes, Fainting,
Syncopal Vertigo, Presyncope,
Drop Attack, Carotid Sinus
Syncope,…
• Relations to other terms
Syncope
Neurocognitive disorders
Consciousness
disorders
Mental disorders
Narrower
Vasovagal
Syncope
Broader
39,186
199,545
1,030,165 articles
1616
11,287
MeSH: medical subject headings
• >27,000 descriptors
• >87,000 entry terms
• 16 hierarchical trees
• Constantly being revised
Demo and play time
• View and explore the MeSH trees:
• https://www.nlm.nih.gov/mesh/2016/mesh_browser/MeSHtree.html
• Use MeSH to query PubMed
• Go to: http://www.ncbi.nlm.nih.gov/mesh
• Search for the term ’fainting’
• click ‘Add to search builder’
• click search PubMed
• click back, search for other things..
Query demos
• Query expansion
• Hand Bones [Mesh]
• Hand Bones [Mesh:NoExp]
• Boolean operators
• cardiac hypertrophy and use rodents besides mice and rats in their experiments
• ("Cardiomegaly"[Mesh])
• AND "Rodentia"[Mesh]
• NOT "Mice"[Mesh] NOT "Rats"[Mesh]
• Article type filter
• Review papers about cardiac hypertrophy
• Cardiomegaly [MeSH] AND Review[ptyp]
• Try with http://www.ncbi.nlm.nih.gov/pubmed/advanced
Questions about MeSH ?
• Good 3 minute tutorial video on practical use:
http://www.youtube.com/watch?v=uyF8uQY9wys
Part 2: Ontology
“Ontology”
• The word comes from philosophy:
• “the branch of metaphysics dealing with the nature of being”
• In practice they are:
• A set of concepts, definitions and inter-relationships.
• (The dividing line between “controlled vocabulary”, “thesaurus”, “ontology” is
hazy and not terribly important for practical purposes.)
• We have hundreds of ontologies in biology, e.g. see:
• http://www.obofoundry.org (100+)
• http://bioportal.bioontology.org (500+)
The Gene Ontology
Ashburner et al., Nat Genet. 2000 May;25(1):25-9.
Started in 1999
As a collaboration between 3 Model Organism Databases
Slide credit: Mélanie Courtot, Ph.D.
• A way to capture biological
knowledge for individual gene
products in a computable form
• A set of concepts and their
relationships to each other
arranged as a hierarchy
http://www.ebi.ac.uk/QuickGO
Less specific concepts
More specific concepts
The Gene Ontology
Slide credit: Mélanie Courtot, Ph.D.
1. Molecular Function
An elemental activity or task or job
• protein kinase activity
• insulin receptor activity
3. Cellular Component
Where a gene product is located
• mitochondrion
• mitochondrial matrix
• mitochondrial inner membrane
2. Biological Process
A commonly recognized series of events
• cell division
The GO branches
Slide credit: Mélanie Courtot, Ph.D.
Building the GO
(now covering more than 40,000 terms)
• GO editorial team based at the European Bioinformatics Institute
• Submission via GitHub, https://github.com/geneontology/
• Submissions via TermGenie, http://go.termgenie.org
• In principal, anyone can suggest a change to the ontology, but the GO
editors make the decisions about what goes in.
Slide credit: Mélanie Courtot, Ph.D.
Using the GO to describe gene products
gene ->
GO term
associated
genes
GO
Database
genome and protein
databases
Slide credit: Mélanie Courtot, Ph.D.
Contributors
Slide credit: Mélanie Courtot, Ph.D.
…a statement that a gene product;
P00505
Accession Name GO ID GO term name Reference Evidence
code
IDAPMID:2731362aspartate transaminase activityGO:0004069GOT2
A GO annotation is …
Slide credit: Mélanie Courtot, Ph.D.
…a statement that a gene product;
P00505
Accession Name GO ID GO term name Reference Evidence
code
IDAPMID:2731362aspartate transaminase activityGO:0004069GOT2
A GO annotation is …
has a particular molecular function
or is involved in a particular biological process
or is located within a certain cellular component
1.
Slide credit: Mélanie Courtot, Ph.D.
…a statement that a gene product;
P00505
Accession Name GO ID GO term name Reference Evidence
code
IDAPMID:2731362aspartate transaminase activityGO:0004069GOT2
A GO annotation is …
has a particular molecular function
or is involved in a particular biological process
or is located within a certain cellular component
1.
2. as described in a particular reference
Slide credit: Mélanie Courtot, Ph.D.
…a statement that a gene product;
P00505
Accession Name GO ID GO term name Reference Evidence
code
IDAPMID:2731362aspartate transaminase activityGO:0004069GOT2
A GO annotation is …
has a particular molecular function
or is involved in a particular biological process
or is located within a certain cellular component
1.
2. as described in a particular reference
3. according to a particular method
Slide credit: Mélanie Courtot, Ph.D.
Experimental
data
Computational
analysis
Author statements/
curator inference
Kinds of evidence for GO annotations by curators
Slide credit: Mélanie Courtot, Ph.D. http://geneontology.org/page/guide-go-evidence-codes
Inferred from Experiment (EXP)
Inferred from Direct Assay (IDA)
Inferred from Physical Interaction (IPI)
Inferred from Mutant Phenotype (IMP)
Inferred from Genetic Interaction (IGI)
Inferred from Expression Pattern (IEP)
Inferred from Sequence or structural Similarity (ISS)
Inferred from Sequence Orthology (ISO)
Inferred from Sequence Alignment (ISA)
Inferred from Sequence Model (ISM)
Inferred from Genomic Context (IGC)
Inferred from Biological aspect of Ancestor (IBA)
Inferred from Biological aspect of Descendant (IBD)
Inferred from Key Residues (IKR)
Inferred from Rapid Divergence(IRD)
Inferred from Reviewed Computational Analysis (RCA)
Traceable Author Statement (TAS)
Non-traceable Author Statement (NAS)
Inferred by Curator (IC)
No biological Data available (ND) evidence code
Inferred from Electronic Annotation (IEA)
http://geneontology.org/page/guide-go-evidence-codes
The one evidence code used for completely
automated annotation
Manual annotations
• Time-consuming process
producing lower numbers of
annotations (~2,800 taxons
covered)
• More specific GO terms
• Manual annotation is
essential for creating
predictions
Aleksandra
Shypitsyna
Elena
Speretta
Alex
Holmes
Tony
Sawford
Slide credit: Mélanie Courtot, Ph.D.
Electronic Annotations (IEA)
• Quick way of producing large numbers of
annotations
• Annotations use less-specific GO terms
• Only source of annotation for ~438,000 non-model
organism species
orthology taxon
constraints
Slide credit: Mélanie Courtot, Ph.D.
* Includes manual annotations integrated from
external model organism and specialist groups
2,752,604Manual annotations*
269,207,317Electronic annotations
A public resource of data and tools
Number of annotations in UniProt-GOA database (March
2016)
http://www.ebi.ac.uk/GOA
https://www.ebi.ac.uk/QuickGO/
Slide credit: Mélanie Courtot, Ph.D.
Slide credit: Mélanie Courtot, Ph.D. & IKEA
Using AMIGO2:
http://amigo.geneontology.org
• Find the Gene Ontology term for Nucleus
• Find its child term Pronucleus
• Find a C. Elegans gene associated with this term and find the PubMed
id of the reference supporting the annotation
• Repeat for a human gene, what is the evidence for the annotation?
http://geneontology.org/page/guide-go-evidence-codes
Gene Set Enrichment Analysis
(previously covered)
Questions about GO or other ontologies?
Part 3: Knowledge graphs
Knowledge Graphs
• Also called “knowledge bases”
to distinguish them from
databases.
• An integrated collection of
assertions or claims represented
in something that can be
visualized as a graph and is
technically very much like a
database.
RNASeq reads
Gene X is expressed
Drug A caused Gene
X to be expressed
Knowing what to do with
Drug A..
Example knowledge graphs
• Wikidata: The structured equivalent of Wikipedia
• http://wikidata.org
• UniProt Knowledge Base: Manually curated Protein
knowledge base
• http://www.uniprot.org/uniprot/
• Microsoft Knowledge Graph (“Satori”)
• Google Knowledge Graph
Example: “Google Knowledge Graph” (GKG)
Vemurafenib
405,000 results
1 infobox
1 node in GKG
https://googleblog.blogspot.com/2012/05/introducing-knowledge-graph-things-not.html
Why Knowledge Graphs?
?• Answer explicit questions
• Uncover implicit relations
Implicit relations for hypothesis generation
ABC model
Swanson (1986) Fish oil, Raynaud’s syndrome and undiscovered public knowledge
http://muse.jhu.edu/article/403510/pdf
BA C
Raynaud’s
Syndrome
Dietary fish oil• platelet inhibition
• vasodilation
• lower blood viscosity
Co-occurs in an
article with
Co-occurs in an
article with
?
Open Discovery and Closed Discovery
• Open, you don’t know what C or B is (e.g. disease -> ?drug)
• Closed, you know what C is and are looking for B (e.g. disease – why? – drug)
BA C
?
Example question: drug repurposing
• For a given drug, what
diseases might it be
used to treat?
http://www.ncbi.nlm.nih.gov/pubmed/27189611
'RE:fine drugs': an interactive dashboard to access drug repurposing opportunities.
Implicit relations for hypothesis generation
ABC model for drug repurposing
BA C
drug diseasegenes
Physical
interaction
genetic
association
?
Questions on ABC model ?
Querying a knowledge graph with SPARQL
• “SPARQL protocol and RDF query
language”
• RDF: Resource Description
Framework (common standard for
storing knowledge graphs)
• A SPARQL query = a partially
completed graph
• ?’s show what you are looking for
• rest constrains the search
http://www.w3.org/TR/rdf-sparql-query/
?disease
Asking for
Constraints
Metformin
treats
Result:
Metformin
Type 2
diabetes
treats
=
https://query.wikidata.org/
Metformin’s unique id: Q19484
Treats property id: P2175
Metformin ?disease
treats
Metformin’s unique id: Q19484
Treats property id: P2175
Metformin ?disease
treats
http://tinyurl.com/gwd6pep
Example question: drug repurposing
“For a given drug, what
diseases might it be
used to treat?”
?drug
?disease
interacts
with
protein
geneencoded by
genetic
association
treats??
Example question: repurposing Metformin
http://tinyurl.com/zem3oxz
Metformin
?disease
interacts
with
protein
SLC22A3encoded by genetic
association
treats??
Solute carrier
family 22
member 3
SLC22A3
prostate
cancer
Aside
• “Validating drug repurposing signals using electronic health records: a
case study of metformin associated with reduced cancer mortality”
• https://jamia.oxfordjournals.org/content/22/1/179
Example question: repurposing all drugs
http://tinyurl.com/hwm9388
?drug
?disease
interacts
with
protein
geneencoded by
genetic
association
treats??
Adding constraints
• Find drugs that may treat disease
• according to the drug->gene->disease model
• constrained to focus on cancers
• ?disease wdt:P279* wd:Q12078 .
• limited to genes related to cell proliferation
• ?gene_product wdt:P682 ?biological_process
• ?biological_process wdt:P279* wd:Q14818032
• http://tinyurl.com/j222k6g
Other patterns?
drug disease 2
disease 1
gene 1
biological
process
Is there a connecting path in the knowledge graph?
Is it meaningful?
gene 2
treats
has function
gwas
has function
gwas
treats ?
http://tinyurl.com/gpfr9kj
Beta result viewer,
http://jonaskress.github.io/ http://tinyurl.com/jmoczaq
SPARQL endpoints of interest
• Wikidata http://query.wikidata.org
• UniProt http://sparql.uniprot.org
• MeSH https://id.nlm.nih.gov/mesh/query
• EBI https://www.ebi.ac.uk/rdf/documentation/sparql-
endpoints
• Bio2RDF https://github.com/bio2rdf/bio2rdf-
scripts/wiki/Query-repository
2 problems with knowledge graphs
Not enough knowledge in the graph
text and data mining
crowdsourcing ?
http://i9606.blogspot.com/2010/05/gene-wiki-hairball-1.html
Too much knowledge in the graph
sorting algorithms
visualizations
Plan for Thursday / Homework
• Implement and apply an ABC Model style hypothesis generating
program
• Assignment: write the program, explain its logic, explain how you
used it to generate a hypothesis, explain the hypothesis
• A Jupyter notebook with Python code will be provided to get you
started
• If you do not want to program, there will be another option using
online tools.
Suggested Reading
• Ontology
• Biomedical Ontologies in Action: Role in Knowledge Management, Data Integration and
Decision Support
• http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2592252/
• Gene Ontology: tool for the unification of biology
• http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3037419/
• Knowledge-based hypothesis generation
• Fish oil, Raynaud’s syndrome and undiscovered public knowledge
• http://muse.jhu.edu/article/403510/pdf
• Knowledge discovery by automated identification and ranking of implicit relationships
• http://bioinformatics.oxfordjournals.org/content/20/3/389.full.pdf
• Text mining
• Literature mining for the biologist: from information retrieval to biological discovery
• http://www.nature.com/nrg/journal/v7/n2/full/nrg1768.html

More Related Content

What's hot

Representation of kidney structures in Uberon
Representation of kidney structures in UberonRepresentation of kidney structures in Uberon
Representation of kidney structures in UberonChris Mungall
 
The Gene Ontology & Gene Ontology Annotation resources
The Gene Ontology & Gene Ontology Annotation resourcesThe Gene Ontology & Gene Ontology Annotation resources
The Gene Ontology & Gene Ontology Annotation resourcesMelanie Courtot
 
Increasing Reproducibility and Reliability of Novel Object Tests Through Stan...
Increasing Reproducibility and Reliability of Novel Object Tests Through Stan...Increasing Reproducibility and Reliability of Novel Object Tests Through Stan...
Increasing Reproducibility and Reliability of Novel Object Tests Through Stan...InsideScientific
 
Results Vary: The Pragmatics of Reproducibility and Research Object Frameworks
Results Vary: The Pragmatics of Reproducibility and Research Object FrameworksResults Vary: The Pragmatics of Reproducibility and Research Object Frameworks
Results Vary: The Pragmatics of Reproducibility and Research Object FrameworksCarole Goble
 
Mungall keynote-biocurator-2017
Mungall keynote-biocurator-2017Mungall keynote-biocurator-2017
Mungall keynote-biocurator-2017Chris Mungall
 
Collaboratively Creating the Knowledge Graph of Life
Collaboratively Creating the Knowledge Graph of LifeCollaboratively Creating the Knowledge Graph of Life
Collaboratively Creating the Knowledge Graph of LifeChris Mungall
 
E-Utilities
E-UtilitiesE-Utilities
E-Utilitiesmkim8
 
ContentMine Presentation for WHO Health Data Seminar
ContentMine Presentation for WHO Health Data SeminarContentMine Presentation for WHO Health Data Seminar
ContentMine Presentation for WHO Health Data SeminarJenny Molloy
 
2019 03 05_biological_databases_part4_v_upload
2019 03 05_biological_databases_part4_v_upload2019 03 05_biological_databases_part4_v_upload
2019 03 05_biological_databases_part4_v_uploadProf. Wim Van Criekinge
 
Scott Edmunds: Data publication in the data deluge
Scott Edmunds: Data publication in the data delugeScott Edmunds: Data publication in the data deluge
Scott Edmunds: Data publication in the data delugeGigaScience, BGI Hong Kong
 
RARE and FAIR Science: Reproducibility and Research Objects
RARE and FAIR Science: Reproducibility and Research ObjectsRARE and FAIR Science: Reproducibility and Research Objects
RARE and FAIR Science: Reproducibility and Research ObjectsCarole Goble
 
Gene Ontology WormBase Workshop International Worm Meeting 2015
Gene Ontology WormBase Workshop International Worm Meeting 2015Gene Ontology WormBase Workshop International Worm Meeting 2015
Gene Ontology WormBase Workshop International Worm Meeting 2015raymond91105
 
Research Objects: more than the sum of the parts
Research Objects: more than the sum of the partsResearch Objects: more than the sum of the parts
Research Objects: more than the sum of the partsCarole Goble
 
US2TS presentation on Gene Ontology
US2TS presentation on Gene OntologyUS2TS presentation on Gene Ontology
US2TS presentation on Gene OntologyChris Mungall
 
Ontology Development Kit: Bio-Ontologies 2019
Ontology Development Kit: Bio-Ontologies 2019Ontology Development Kit: Bio-Ontologies 2019
Ontology Development Kit: Bio-Ontologies 2019Chris Mungall
 

What's hot (20)

A biologist in e-Science
A biologist in e-ScienceA biologist in e-Science
A biologist in e-Science
 
The Crop Ontology - Harmonizing Semantics for Agricultural Field Data, by Eli...
The Crop Ontology - Harmonizing Semantics for Agricultural Field Data, by Eli...The Crop Ontology - Harmonizing Semantics for Agricultural Field Data, by Eli...
The Crop Ontology - Harmonizing Semantics for Agricultural Field Data, by Eli...
 
Representation of kidney structures in Uberon
Representation of kidney structures in UberonRepresentation of kidney structures in Uberon
Representation of kidney structures in Uberon
 
The Gene Ontology & Gene Ontology Annotation resources
The Gene Ontology & Gene Ontology Annotation resourcesThe Gene Ontology & Gene Ontology Annotation resources
The Gene Ontology & Gene Ontology Annotation resources
 
Ibn Sina
Ibn SinaIbn Sina
Ibn Sina
 
Increasing Reproducibility and Reliability of Novel Object Tests Through Stan...
Increasing Reproducibility and Reliability of Novel Object Tests Through Stan...Increasing Reproducibility and Reliability of Novel Object Tests Through Stan...
Increasing Reproducibility and Reliability of Novel Object Tests Through Stan...
 
Results Vary: The Pragmatics of Reproducibility and Research Object Frameworks
Results Vary: The Pragmatics of Reproducibility and Research Object FrameworksResults Vary: The Pragmatics of Reproducibility and Research Object Frameworks
Results Vary: The Pragmatics of Reproducibility and Research Object Frameworks
 
Mungall keynote-biocurator-2017
Mungall keynote-biocurator-2017Mungall keynote-biocurator-2017
Mungall keynote-biocurator-2017
 
Collaboratively Creating the Knowledge Graph of Life
Collaboratively Creating the Knowledge Graph of LifeCollaboratively Creating the Knowledge Graph of Life
Collaboratively Creating the Knowledge Graph of Life
 
E-Utilities
E-UtilitiesE-Utilities
E-Utilities
 
ContentMine Presentation for WHO Health Data Seminar
ContentMine Presentation for WHO Health Data SeminarContentMine Presentation for WHO Health Data Seminar
ContentMine Presentation for WHO Health Data Seminar
 
2019 03 05_biological_databases_part4_v_upload
2019 03 05_biological_databases_part4_v_upload2019 03 05_biological_databases_part4_v_upload
2019 03 05_biological_databases_part4_v_upload
 
Scott Edmunds: Data publication in the data deluge
Scott Edmunds: Data publication in the data delugeScott Edmunds: Data publication in the data deluge
Scott Edmunds: Data publication in the data deluge
 
RARE and FAIR Science: Reproducibility and Research Objects
RARE and FAIR Science: Reproducibility and Research ObjectsRARE and FAIR Science: Reproducibility and Research Objects
RARE and FAIR Science: Reproducibility and Research Objects
 
Gene Ontology WormBase Workshop International Worm Meeting 2015
Gene Ontology WormBase Workshop International Worm Meeting 2015Gene Ontology WormBase Workshop International Worm Meeting 2015
Gene Ontology WormBase Workshop International Worm Meeting 2015
 
Genetic algorithm
Genetic algorithmGenetic algorithm
Genetic algorithm
 
Research Objects: more than the sum of the parts
Research Objects: more than the sum of the partsResearch Objects: more than the sum of the parts
Research Objects: more than the sum of the parts
 
US2TS presentation on Gene Ontology
US2TS presentation on Gene OntologyUS2TS presentation on Gene Ontology
US2TS presentation on Gene Ontology
 
FAIRy Stories
FAIRy StoriesFAIRy Stories
FAIRy Stories
 
Ontology Development Kit: Bio-Ontologies 2019
Ontology Development Kit: Bio-Ontologies 2019Ontology Development Kit: Bio-Ontologies 2019
Ontology Development Kit: Bio-Ontologies 2019
 

Viewers also liked

Scripps bioinformatics seminar_day_2
Scripps bioinformatics seminar_day_2Scripps bioinformatics seminar_day_2
Scripps bioinformatics seminar_day_2Benjamin Good
 
Building a Biomedical Knowledge Garden
Building a Biomedical Knowledge Garden Building a Biomedical Knowledge Garden
Building a Biomedical Knowledge Garden Benjamin Good
 
Mark Hopper Product And Marketing Exec 2010
Mark Hopper Product And Marketing Exec 2010Mark Hopper Product And Marketing Exec 2010
Mark Hopper Product And Marketing Exec 2010Mark Hopper
 
The National Society For The Protection Of Hmmm
The National Society For The Protection Of HmmmThe National Society For The Protection Of Hmmm
The National Society For The Protection Of Hmmmguest0233e9d0
 
Short update on The Cure game first week
Short update on The Cure game first weekShort update on The Cure game first week
Short update on The Cure game first weekBenjamin Good
 
2015 6 bd2k_biobranch_knowbio
2015 6 bd2k_biobranch_knowbio2015 6 bd2k_biobranch_knowbio
2015 6 bd2k_biobranch_knowbioBenjamin Good
 
Eishi Company Profile 修改好的
Eishi Company Profile 修改好的Eishi Company Profile 修改好的
Eishi Company Profile 修改好的eishimachinery
 
2016 bd2k bgood_wikidata
2016 bd2k bgood_wikidata2016 bd2k bgood_wikidata
2016 bd2k bgood_wikidataBenjamin Good
 
Gene Wiki and Mark2Cure update for BD2K
Gene Wiki and Mark2Cure update for BD2KGene Wiki and Mark2Cure update for BD2K
Gene Wiki and Mark2Cure update for BD2KBenjamin Good
 
First oslo solr community meetup lightning talk janhoy
First oslo solr community meetup lightning talk janhoyFirst oslo solr community meetup lightning talk janhoy
First oslo solr community meetup lightning talk janhoyCominvent AS
 
Resume 2009 Compatible V2 1
Resume 2009 Compatible V2 1 Resume 2009 Compatible V2 1
Resume 2009 Compatible V2 1 schelby
 
Microtask crowdsourcing for disease mention annotation in PubMed abstracts
Microtask crowdsourcing for disease mention annotation in PubMed abstractsMicrotask crowdsourcing for disease mention annotation in PubMed abstracts
Microtask crowdsourcing for disease mention annotation in PubMed abstractsBenjamin Good
 
Human Guided Forests (HGF)
Human Guided Forests (HGF)Human Guided Forests (HGF)
Human Guided Forests (HGF)Benjamin Good
 
Dagens Næringslivs overgang til Lucene/Solr søk
Dagens Næringslivs overgang til Lucene/Solr søkDagens Næringslivs overgang til Lucene/Solr søk
Dagens Næringslivs overgang til Lucene/Solr søkCominvent AS
 
The Cure: A Game with the Purpose of Gene Selection for Breast Cancer Surviva...
The Cure: A Game with the Purpose of Gene Selection for Breast Cancer Surviva...The Cure: A Game with the Purpose of Gene Selection for Breast Cancer Surviva...
The Cure: A Game with the Purpose of Gene Selection for Breast Cancer Surviva...Benjamin Good
 
Bio Logical Mass Collaboration3
Bio Logical Mass Collaboration3Bio Logical Mass Collaboration3
Bio Logical Mass Collaboration3Benjamin Good
 

Viewers also liked (20)

Scripps bioinformatics seminar_day_2
Scripps bioinformatics seminar_day_2Scripps bioinformatics seminar_day_2
Scripps bioinformatics seminar_day_2
 
Building a Biomedical Knowledge Garden
Building a Biomedical Knowledge Garden Building a Biomedical Knowledge Garden
Building a Biomedical Knowledge Garden
 
Gene wiki jamboree
Gene wiki jamboreeGene wiki jamboree
Gene wiki jamboree
 
Mark Hopper Product And Marketing Exec 2010
Mark Hopper Product And Marketing Exec 2010Mark Hopper Product And Marketing Exec 2010
Mark Hopper Product And Marketing Exec 2010
 
The National Society For The Protection Of Hmmm
The National Society For The Protection Of HmmmThe National Society For The Protection Of Hmmm
The National Society For The Protection Of Hmmm
 
Short update on The Cure game first week
Short update on The Cure game first weekShort update on The Cure game first week
Short update on The Cure game first week
 
2015 6 bd2k_biobranch_knowbio
2015 6 bd2k_biobranch_knowbio2015 6 bd2k_biobranch_knowbio
2015 6 bd2k_biobranch_knowbio
 
Eishi Company Profile 修改好的
Eishi Company Profile 修改好的Eishi Company Profile 修改好的
Eishi Company Profile 修改好的
 
2to3
2to32to3
2to3
 
2016 bd2k bgood_wikidata
2016 bd2k bgood_wikidata2016 bd2k bgood_wikidata
2016 bd2k bgood_wikidata
 
Gene Wiki and Mark2Cure update for BD2K
Gene Wiki and Mark2Cure update for BD2KGene Wiki and Mark2Cure update for BD2K
Gene Wiki and Mark2Cure update for BD2K
 
First oslo solr community meetup lightning talk janhoy
First oslo solr community meetup lightning talk janhoyFirst oslo solr community meetup lightning talk janhoy
First oslo solr community meetup lightning talk janhoy
 
Resume 2009 Compatible V2 1
Resume 2009 Compatible V2 1 Resume 2009 Compatible V2 1
Resume 2009 Compatible V2 1
 
Microtask crowdsourcing for disease mention annotation in PubMed abstracts
Microtask crowdsourcing for disease mention annotation in PubMed abstractsMicrotask crowdsourcing for disease mention annotation in PubMed abstracts
Microtask crowdsourcing for disease mention annotation in PubMed abstracts
 
Human Guided Forests (HGF)
Human Guided Forests (HGF)Human Guided Forests (HGF)
Human Guided Forests (HGF)
 
genegames.org
genegames.orggenegames.org
genegames.org
 
Dagens Næringslivs overgang til Lucene/Solr søk
Dagens Næringslivs overgang til Lucene/Solr søkDagens Næringslivs overgang til Lucene/Solr søk
Dagens Næringslivs overgang til Lucene/Solr søk
 
2016 mem good
2016 mem good2016 mem good
2016 mem good
 
The Cure: A Game with the Purpose of Gene Selection for Breast Cancer Surviva...
The Cure: A Game with the Purpose of Gene Selection for Breast Cancer Surviva...The Cure: A Game with the Purpose of Gene Selection for Breast Cancer Surviva...
The Cure: A Game with the Purpose of Gene Selection for Breast Cancer Surviva...
 
Bio Logical Mass Collaboration3
Bio Logical Mass Collaboration3Bio Logical Mass Collaboration3
Bio Logical Mass Collaboration3
 

Similar to Too much to know: representing knowledge in bioinformatics

Data analysis & integration challenges in genomics
Data analysis & integration challenges in genomicsData analysis & integration challenges in genomics
Data analysis & integration challenges in genomicsmikaelhuss
 
Semantics for Bioinformatics: What, Why and How of Search, Integration and An...
Semantics for Bioinformatics: What, Why and How of Search, Integration and An...Semantics for Bioinformatics: What, Why and How of Search, Integration and An...
Semantics for Bioinformatics: What, Why and How of Search, Integration and An...Amit Sheth
 
Plant Pathogen Genome Data: My Life In Sequences
Plant Pathogen Genome Data: My Life In SequencesPlant Pathogen Genome Data: My Life In Sequences
Plant Pathogen Genome Data: My Life In SequencesLeighton Pritchard
 
Workshop Systematic Reviews Werkgroep Sociaal Wetenschappelijke Informatie
Workshop Systematic Reviews Werkgroep Sociaal Wetenschappelijke InformatieWorkshop Systematic Reviews Werkgroep Sociaal Wetenschappelijke Informatie
Workshop Systematic Reviews Werkgroep Sociaal Wetenschappelijke Informatiejstaaks
 
Workshop on Systematic Searching (Oslo)
Workshop on Systematic Searching (Oslo)Workshop on Systematic Searching (Oslo)
Workshop on Systematic Searching (Oslo)jstaaks
 
NCBO haendel talk 2013
NCBO haendel talk 2013NCBO haendel talk 2013
NCBO haendel talk 2013mhaendel
 
schema.org and biomedical ontologies
schema.org and biomedical ontologies schema.org and biomedical ontologies
schema.org and biomedical ontologies Simon Jupp
 
The seven-deadly-sins-of-bioinformatics3960
The seven-deadly-sins-of-bioinformatics3960The seven-deadly-sins-of-bioinformatics3960
The seven-deadly-sins-of-bioinformatics3960mare34
 
The Seven Deadly Sins of Bioinformatics
The Seven Deadly Sins of BioinformaticsThe Seven Deadly Sins of Bioinformatics
The Seven Deadly Sins of BioinformaticsDuncan Hull
 
Presentation to the J. Craig Venter Institute, Dec. 2014
Presentation to the J. Craig Venter Institute, Dec. 2014Presentation to the J. Craig Venter Institute, Dec. 2014
Presentation to the J. Craig Venter Institute, Dec. 2014Mark Wilkinson
 
TAIR -Using biological ontologies to accelerate progress in plant biology res...
TAIR -Using biological ontologies to accelerate progress in plant biology res...TAIR -Using biological ontologies to accelerate progress in plant biology res...
TAIR -Using biological ontologies to accelerate progress in plant biology res...Phoenix Bioinformatics
 
Ontology - and Reloaded and Revolutions
Ontology - and Reloaded and RevolutionsOntology - and Reloaded and Revolutions
Ontology - and Reloaded and RevolutionsJie Bao
 
A Centralized Model Organism Database (CMOD) for the Long Tail of Sequenced G...
A Centralized Model Organism Database (CMOD) for the Long Tail of Sequenced G...A Centralized Model Organism Database (CMOD) for the Long Tail of Sequenced G...
A Centralized Model Organism Database (CMOD) for the Long Tail of Sequenced G...Andrew Su
 
BioCuration 2019 - Evidence and Conclusion Ontology 2019 Update
BioCuration 2019 - Evidence and Conclusion Ontology 2019 UpdateBioCuration 2019 - Evidence and Conclusion Ontology 2019 Update
BioCuration 2019 - Evidence and Conclusion Ontology 2019 Updatedolleyj
 
ICAR2016 TAIR talk
ICAR2016 TAIR talkICAR2016 TAIR talk
ICAR2016 TAIR talkDonghui Li
 

Similar to Too much to know: representing knowledge in bioinformatics (20)

Data analysis & integration challenges in genomics
Data analysis & integration challenges in genomicsData analysis & integration challenges in genomics
Data analysis & integration challenges in genomics
 
Semantics for Bioinformatics: What, Why and How of Search, Integration and An...
Semantics for Bioinformatics: What, Why and How of Search, Integration and An...Semantics for Bioinformatics: What, Why and How of Search, Integration and An...
Semantics for Bioinformatics: What, Why and How of Search, Integration and An...
 
Chibucos annot go_final
Chibucos annot go_finalChibucos annot go_final
Chibucos annot go_final
 
Plant Pathogen Genome Data: My Life In Sequences
Plant Pathogen Genome Data: My Life In SequencesPlant Pathogen Genome Data: My Life In Sequences
Plant Pathogen Genome Data: My Life In Sequences
 
Workshop Systematic Reviews Werkgroep Sociaal Wetenschappelijke Informatie
Workshop Systematic Reviews Werkgroep Sociaal Wetenschappelijke InformatieWorkshop Systematic Reviews Werkgroep Sociaal Wetenschappelijke Informatie
Workshop Systematic Reviews Werkgroep Sociaal Wetenschappelijke Informatie
 
A01-Openness in knowledge-based systems
A01-Openness in knowledge-based systemsA01-Openness in knowledge-based systems
A01-Openness in knowledge-based systems
 
Workshop on Systematic Searching (Oslo)
Workshop on Systematic Searching (Oslo)Workshop on Systematic Searching (Oslo)
Workshop on Systematic Searching (Oslo)
 
NCBO haendel talk 2013
NCBO haendel talk 2013NCBO haendel talk 2013
NCBO haendel talk 2013
 
schema.org and biomedical ontologies
schema.org and biomedical ontologies schema.org and biomedical ontologies
schema.org and biomedical ontologies
 
The seven-deadly-sins-of-bioinformatics3960
The seven-deadly-sins-of-bioinformatics3960The seven-deadly-sins-of-bioinformatics3960
The seven-deadly-sins-of-bioinformatics3960
 
The Seven Deadly Sins of Bioinformatics
The Seven Deadly Sins of BioinformaticsThe Seven Deadly Sins of Bioinformatics
The Seven Deadly Sins of Bioinformatics
 
Presentation to the J. Craig Venter Institute, Dec. 2014
Presentation to the J. Craig Venter Institute, Dec. 2014Presentation to the J. Craig Venter Institute, Dec. 2014
Presentation to the J. Craig Venter Institute, Dec. 2014
 
TAIR -Using biological ontologies to accelerate progress in plant biology res...
TAIR -Using biological ontologies to accelerate progress in plant biology res...TAIR -Using biological ontologies to accelerate progress in plant biology res...
TAIR -Using biological ontologies to accelerate progress in plant biology res...
 
Ontology - and Reloaded and Revolutions
Ontology - and Reloaded and RevolutionsOntology - and Reloaded and Revolutions
Ontology - and Reloaded and Revolutions
 
A Centralized Model Organism Database (CMOD) for the Long Tail of Sequenced G...
A Centralized Model Organism Database (CMOD) for the Long Tail of Sequenced G...A Centralized Model Organism Database (CMOD) for the Long Tail of Sequenced G...
A Centralized Model Organism Database (CMOD) for the Long Tail of Sequenced G...
 
BioCuration 2019 - Evidence and Conclusion Ontology 2019 Update
BioCuration 2019 - Evidence and Conclusion Ontology 2019 UpdateBioCuration 2019 - Evidence and Conclusion Ontology 2019 Update
BioCuration 2019 - Evidence and Conclusion Ontology 2019 Update
 
ICAR2016 TAIR talk
ICAR2016 TAIR talkICAR2016 TAIR talk
ICAR2016 TAIR talk
 
PhDc exam presentation
PhDc exam presentationPhDc exam presentation
PhDc exam presentation
 
2014 bangkok-talk
2014 bangkok-talk2014 bangkok-talk
2014 bangkok-talk
 
Prosdocimi ucb cdao
Prosdocimi ucb cdaoProsdocimi ucb cdao
Prosdocimi ucb cdao
 

More from Benjamin Good

Representing and reasoning with biological knowledge
Representing and reasoning with biological knowledgeRepresenting and reasoning with biological knowledge
Representing and reasoning with biological knowledgeBenjamin Good
 
Integrating Pathway Databases with Gene Ontology Causal Activity Models
Integrating Pathway Databases with Gene Ontology Causal Activity ModelsIntegrating Pathway Databases with Gene Ontology Causal Activity Models
Integrating Pathway Databases with Gene Ontology Causal Activity ModelsBenjamin Good
 
Pathways2GO: Converting BioPax pathways to GO-CAMs
Pathways2GO: Converting BioPax pathways to GO-CAMsPathways2GO: Converting BioPax pathways to GO-CAMs
Pathways2GO: Converting BioPax pathways to GO-CAMsBenjamin Good
 
Wikidata and the Semantic Web of Food
Wikidata and the  Semantic Web of FoodWikidata and the  Semantic Web of Food
Wikidata and the Semantic Web of FoodBenjamin Good
 
Gene Wiki and Wikimedia Foundation SPARQL workshop
Gene Wiki and Wikimedia Foundation SPARQL workshopGene Wiki and Wikimedia Foundation SPARQL workshop
Gene Wiki and Wikimedia Foundation SPARQL workshopBenjamin Good
 
Opportunities and challenges presented by Wikidata in the context of biocuration
Opportunities and challenges presented by Wikidata in the context of biocurationOpportunities and challenges presented by Wikidata in the context of biocuration
Opportunities and challenges presented by Wikidata in the context of biocurationBenjamin Good
 
Wikidata workshop for ISB Biocuration 2016
Wikidata workshop for ISB Biocuration 2016Wikidata workshop for ISB Biocuration 2016
Wikidata workshop for ISB Biocuration 2016Benjamin Good
 
Channeling Collaborative Spirit
Channeling Collaborative SpiritChanneling Collaborative Spirit
Channeling Collaborative SpiritBenjamin Good
 
(Poster) Knowledge.Bio: an Interactive Tool for Literature-based Discovery
(Poster) Knowledge.Bio: an Interactive Tool for Literature-based Discovery (Poster) Knowledge.Bio: an Interactive Tool for Literature-based Discovery
(Poster) Knowledge.Bio: an Interactive Tool for Literature-based Discovery Benjamin Good
 
Citizen sciencepanel2015 pdf
Citizen sciencepanel2015 pdfCitizen sciencepanel2015 pdf
Citizen sciencepanel2015 pdfBenjamin Good
 
Building a massive biomedical knowledge graph with citizen science
Building a massive biomedical knowledge graph with citizen scienceBuilding a massive biomedical knowledge graph with citizen science
Building a massive biomedical knowledge graph with citizen scienceBenjamin Good
 
Branch: An interactive, web-based tool for building decision tree classifiers
Branch: An interactive, web-based tool for building decision tree classifiersBranch: An interactive, web-based tool for building decision tree classifiers
Branch: An interactive, web-based tool for building decision tree classifiersBenjamin Good
 
Serious games for bioinformatics education. ISMB 2014 education workshop
Serious games for bioinformatics education.  ISMB 2014 education workshopSerious games for bioinformatics education.  ISMB 2014 education workshop
Serious games for bioinformatics education. ISMB 2014 education workshopBenjamin Good
 
The Cure: Making a game of gene selection for breast cancer survival prediction
The Cure: Making a game of gene selection for breast cancer survival predictionThe Cure: Making a game of gene selection for breast cancer survival prediction
The Cure: Making a game of gene selection for breast cancer survival predictionBenjamin Good
 
Poster: Microtask crowdsourcing for disease mention annotation in PubMed abst...
Poster: Microtask crowdsourcing for disease mention annotation in PubMed abst...Poster: Microtask crowdsourcing for disease mention annotation in PubMed abst...
Poster: Microtask crowdsourcing for disease mention annotation in PubMed abst...Benjamin Good
 
Mark2Cure: a crowdsourcing platform for biomedical literature annotation
Mark2Cure: a crowdsourcing platform for biomedical literature annotationMark2Cure: a crowdsourcing platform for biomedical literature annotation
Mark2Cure: a crowdsourcing platform for biomedical literature annotationBenjamin Good
 
Gene Wiki at Phenotype RCN annual meeting
Gene Wiki at Phenotype RCN annual meetingGene Wiki at Phenotype RCN annual meeting
Gene Wiki at Phenotype RCN annual meetingBenjamin Good
 

More from Benjamin Good (20)

Representing and reasoning with biological knowledge
Representing and reasoning with biological knowledgeRepresenting and reasoning with biological knowledge
Representing and reasoning with biological knowledge
 
Integrating Pathway Databases with Gene Ontology Causal Activity Models
Integrating Pathway Databases with Gene Ontology Causal Activity ModelsIntegrating Pathway Databases with Gene Ontology Causal Activity Models
Integrating Pathway Databases with Gene Ontology Causal Activity Models
 
Pathways2GO: Converting BioPax pathways to GO-CAMs
Pathways2GO: Converting BioPax pathways to GO-CAMsPathways2GO: Converting BioPax pathways to GO-CAMs
Pathways2GO: Converting BioPax pathways to GO-CAMs
 
Knowledge Beacons
Knowledge BeaconsKnowledge Beacons
Knowledge Beacons
 
Science Game Lab
Science Game LabScience Game Lab
Science Game Lab
 
Wikidata and the Semantic Web of Food
Wikidata and the  Semantic Web of FoodWikidata and the  Semantic Web of Food
Wikidata and the Semantic Web of Food
 
Gene Wiki and Wikimedia Foundation SPARQL workshop
Gene Wiki and Wikimedia Foundation SPARQL workshopGene Wiki and Wikimedia Foundation SPARQL workshop
Gene Wiki and Wikimedia Foundation SPARQL workshop
 
Opportunities and challenges presented by Wikidata in the context of biocuration
Opportunities and challenges presented by Wikidata in the context of biocurationOpportunities and challenges presented by Wikidata in the context of biocuration
Opportunities and challenges presented by Wikidata in the context of biocuration
 
Wikidata workshop for ISB Biocuration 2016
Wikidata workshop for ISB Biocuration 2016Wikidata workshop for ISB Biocuration 2016
Wikidata workshop for ISB Biocuration 2016
 
Channeling Collaborative Spirit
Channeling Collaborative SpiritChanneling Collaborative Spirit
Channeling Collaborative Spirit
 
(Poster) Knowledge.Bio: an Interactive Tool for Literature-based Discovery
(Poster) Knowledge.Bio: an Interactive Tool for Literature-based Discovery (Poster) Knowledge.Bio: an Interactive Tool for Literature-based Discovery
(Poster) Knowledge.Bio: an Interactive Tool for Literature-based Discovery
 
(Bio)Hackathons
(Bio)Hackathons(Bio)Hackathons
(Bio)Hackathons
 
Citizen sciencepanel2015 pdf
Citizen sciencepanel2015 pdfCitizen sciencepanel2015 pdf
Citizen sciencepanel2015 pdf
 
Building a massive biomedical knowledge graph with citizen science
Building a massive biomedical knowledge graph with citizen scienceBuilding a massive biomedical knowledge graph with citizen science
Building a massive biomedical knowledge graph with citizen science
 
Branch: An interactive, web-based tool for building decision tree classifiers
Branch: An interactive, web-based tool for building decision tree classifiersBranch: An interactive, web-based tool for building decision tree classifiers
Branch: An interactive, web-based tool for building decision tree classifiers
 
Serious games for bioinformatics education. ISMB 2014 education workshop
Serious games for bioinformatics education.  ISMB 2014 education workshopSerious games for bioinformatics education.  ISMB 2014 education workshop
Serious games for bioinformatics education. ISMB 2014 education workshop
 
The Cure: Making a game of gene selection for breast cancer survival prediction
The Cure: Making a game of gene selection for breast cancer survival predictionThe Cure: Making a game of gene selection for breast cancer survival prediction
The Cure: Making a game of gene selection for breast cancer survival prediction
 
Poster: Microtask crowdsourcing for disease mention annotation in PubMed abst...
Poster: Microtask crowdsourcing for disease mention annotation in PubMed abst...Poster: Microtask crowdsourcing for disease mention annotation in PubMed abst...
Poster: Microtask crowdsourcing for disease mention annotation in PubMed abst...
 
Mark2Cure: a crowdsourcing platform for biomedical literature annotation
Mark2Cure: a crowdsourcing platform for biomedical literature annotationMark2Cure: a crowdsourcing platform for biomedical literature annotation
Mark2Cure: a crowdsourcing platform for biomedical literature annotation
 
Gene Wiki at Phenotype RCN annual meeting
Gene Wiki at Phenotype RCN annual meetingGene Wiki at Phenotype RCN annual meeting
Gene Wiki at Phenotype RCN annual meeting
 

Recently uploaded

Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Sérgio Sacani
 
Botany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdfBotany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdfSumit Kumar yadav
 
Zoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdfZoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdfSumit Kumar yadav
 
Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )aarthirajkumar25
 
A relative description on Sonoporation.pdf
A relative description on Sonoporation.pdfA relative description on Sonoporation.pdf
A relative description on Sonoporation.pdfnehabiju2046
 
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral AnalysisRaman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral AnalysisDiwakar Mishra
 
Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?Patrick Diehl
 
Formation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksFormation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksSérgio Sacani
 
Botany krishna series 2nd semester Only Mcq type questions
Botany krishna series 2nd semester Only Mcq type questionsBotany krishna series 2nd semester Only Mcq type questions
Botany krishna series 2nd semester Only Mcq type questionsSumit Kumar yadav
 
Nanoparticles synthesis and characterization​ ​
Nanoparticles synthesis and characterization​  ​Nanoparticles synthesis and characterization​  ​
Nanoparticles synthesis and characterization​ ​kaibalyasahoo82800
 
Presentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptxPresentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptxgindu3009
 
Animal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptxAnimal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptxUmerFayaz5
 
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...ssifa0344
 
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...Sérgio Sacani
 
Chemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdfChemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdfSumit Kumar yadav
 
Grafana in space: Monitoring Japan's SLIM moon lander in real time
Grafana in space: Monitoring Japan's SLIM moon lander  in real timeGrafana in space: Monitoring Japan's SLIM moon lander  in real time
Grafana in space: Monitoring Japan's SLIM moon lander in real timeSatoshi NAKAHIRA
 
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCESTERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCEPRINCE C P
 

Recently uploaded (20)

9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service
9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service
9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service
 
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
 
Botany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdfBotany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdf
 
Zoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdfZoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdf
 
CELL -Structural and Functional unit of life.pdf
CELL -Structural and Functional unit of life.pdfCELL -Structural and Functional unit of life.pdf
CELL -Structural and Functional unit of life.pdf
 
Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )
 
A relative description on Sonoporation.pdf
A relative description on Sonoporation.pdfA relative description on Sonoporation.pdf
A relative description on Sonoporation.pdf
 
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral AnalysisRaman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
 
Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?
 
Formation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksFormation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disks
 
Botany krishna series 2nd semester Only Mcq type questions
Botany krishna series 2nd semester Only Mcq type questionsBotany krishna series 2nd semester Only Mcq type questions
Botany krishna series 2nd semester Only Mcq type questions
 
Nanoparticles synthesis and characterization​ ​
Nanoparticles synthesis and characterization​  ​Nanoparticles synthesis and characterization​  ​
Nanoparticles synthesis and characterization​ ​
 
Presentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptxPresentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptx
 
Animal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptxAnimal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptx
 
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
 
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
 
Chemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdfChemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdf
 
Grafana in space: Monitoring Japan's SLIM moon lander in real time
Grafana in space: Monitoring Japan's SLIM moon lander  in real timeGrafana in space: Monitoring Japan's SLIM moon lander  in real time
Grafana in space: Monitoring Japan's SLIM moon lander in real time
 
Engler and Prantl system of classification in plant taxonomy
Engler and Prantl system of classification in plant taxonomyEngler and Prantl system of classification in plant taxonomy
Engler and Prantl system of classification in plant taxonomy
 
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCESTERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
 

Too much to know: representing knowledge in bioinformatics

  • 1. Computing on the shoulders of giants: how existing knowledge is represented and applied in bioinformatics Benjamin Good bgood@scripps.edu Assistant Professor of the Department of Molecular and Experimental Medicine Specialty: artificial intelligence, crowdsourcing
  • 2. The more you can ‘know’ the better a scientist you can become (Derived from Alex Pico, WikiPathways) ideas Knowledge data
  • 3. Too much to know • PubMed lists > 1 million articles published each year (more than 2 per minute) • Your capacity to read and comprehend is limiting Knowledge
  • 5. Goals for representing knowledge (outline) • Make things (articles, genes, antibodies, etc.) easier to find • Answer questions • Generate hypotheses Controlled vocabularies (MeSH) Ontologies (Gene Ontology) knowledge graphs on the Web: the SPARQL query language knowledge plus computation = inference, the ABC model
  • 6. Part 1: Medical Subject Headings (MeSH)
  • 7. Finding what to read: controlled vocabularies for indexing PubMed • What happens when you search PubMed?
  • 8.
  • 9. MeSH controlled vocabulary (AKA ‘thesaurus’) • Descriptor Unique ID: D013575 • Definition: A transient loss of consciousness and postural tone caused by diminished blood flow to the brain… • Entry Terms: Syncopes, Fainting, Syncopal Vertigo, Presyncope, Drop Attack, Carotid Sinus Syncope,… • Relations to other terms Syncope Neurocognitive disorders Consciousness disorders Mental disorders Narrower Vasovagal Syncope Broader 39,186 199,545 1,030,165 articles 1616 11,287
  • 10. MeSH: medical subject headings • >27,000 descriptors • >87,000 entry terms • 16 hierarchical trees • Constantly being revised
  • 11. Demo and play time • View and explore the MeSH trees: • https://www.nlm.nih.gov/mesh/2016/mesh_browser/MeSHtree.html • Use MeSH to query PubMed • Go to: http://www.ncbi.nlm.nih.gov/mesh • Search for the term ’fainting’ • click ‘Add to search builder’ • click search PubMed • click back, search for other things..
  • 12. Query demos • Query expansion • Hand Bones [Mesh] • Hand Bones [Mesh:NoExp] • Boolean operators • cardiac hypertrophy and use rodents besides mice and rats in their experiments • ("Cardiomegaly"[Mesh]) • AND "Rodentia"[Mesh] • NOT "Mice"[Mesh] NOT "Rats"[Mesh] • Article type filter • Review papers about cardiac hypertrophy • Cardiomegaly [MeSH] AND Review[ptyp] • Try with http://www.ncbi.nlm.nih.gov/pubmed/advanced
  • 13. Questions about MeSH ? • Good 3 minute tutorial video on practical use: http://www.youtube.com/watch?v=uyF8uQY9wys
  • 15. “Ontology” • The word comes from philosophy: • “the branch of metaphysics dealing with the nature of being” • In practice they are: • A set of concepts, definitions and inter-relationships. • (The dividing line between “controlled vocabulary”, “thesaurus”, “ontology” is hazy and not terribly important for practical purposes.) • We have hundreds of ontologies in biology, e.g. see: • http://www.obofoundry.org (100+) • http://bioportal.bioontology.org (500+)
  • 16. The Gene Ontology Ashburner et al., Nat Genet. 2000 May;25(1):25-9. Started in 1999 As a collaboration between 3 Model Organism Databases Slide credit: Mélanie Courtot, Ph.D.
  • 17. • A way to capture biological knowledge for individual gene products in a computable form • A set of concepts and their relationships to each other arranged as a hierarchy http://www.ebi.ac.uk/QuickGO Less specific concepts More specific concepts The Gene Ontology Slide credit: Mélanie Courtot, Ph.D.
  • 18. 1. Molecular Function An elemental activity or task or job • protein kinase activity • insulin receptor activity 3. Cellular Component Where a gene product is located • mitochondrion • mitochondrial matrix • mitochondrial inner membrane 2. Biological Process A commonly recognized series of events • cell division The GO branches Slide credit: Mélanie Courtot, Ph.D.
  • 19. Building the GO (now covering more than 40,000 terms) • GO editorial team based at the European Bioinformatics Institute • Submission via GitHub, https://github.com/geneontology/ • Submissions via TermGenie, http://go.termgenie.org • In principal, anyone can suggest a change to the ontology, but the GO editors make the decisions about what goes in. Slide credit: Mélanie Courtot, Ph.D.
  • 20. Using the GO to describe gene products gene -> GO term associated genes GO Database genome and protein databases Slide credit: Mélanie Courtot, Ph.D.
  • 22. …a statement that a gene product; P00505 Accession Name GO ID GO term name Reference Evidence code IDAPMID:2731362aspartate transaminase activityGO:0004069GOT2 A GO annotation is … Slide credit: Mélanie Courtot, Ph.D.
  • 23. …a statement that a gene product; P00505 Accession Name GO ID GO term name Reference Evidence code IDAPMID:2731362aspartate transaminase activityGO:0004069GOT2 A GO annotation is … has a particular molecular function or is involved in a particular biological process or is located within a certain cellular component 1. Slide credit: Mélanie Courtot, Ph.D.
  • 24. …a statement that a gene product; P00505 Accession Name GO ID GO term name Reference Evidence code IDAPMID:2731362aspartate transaminase activityGO:0004069GOT2 A GO annotation is … has a particular molecular function or is involved in a particular biological process or is located within a certain cellular component 1. 2. as described in a particular reference Slide credit: Mélanie Courtot, Ph.D.
  • 25. …a statement that a gene product; P00505 Accession Name GO ID GO term name Reference Evidence code IDAPMID:2731362aspartate transaminase activityGO:0004069GOT2 A GO annotation is … has a particular molecular function or is involved in a particular biological process or is located within a certain cellular component 1. 2. as described in a particular reference 3. according to a particular method Slide credit: Mélanie Courtot, Ph.D.
  • 26. Experimental data Computational analysis Author statements/ curator inference Kinds of evidence for GO annotations by curators Slide credit: Mélanie Courtot, Ph.D. http://geneontology.org/page/guide-go-evidence-codes Inferred from Experiment (EXP) Inferred from Direct Assay (IDA) Inferred from Physical Interaction (IPI) Inferred from Mutant Phenotype (IMP) Inferred from Genetic Interaction (IGI) Inferred from Expression Pattern (IEP) Inferred from Sequence or structural Similarity (ISS) Inferred from Sequence Orthology (ISO) Inferred from Sequence Alignment (ISA) Inferred from Sequence Model (ISM) Inferred from Genomic Context (IGC) Inferred from Biological aspect of Ancestor (IBA) Inferred from Biological aspect of Descendant (IBD) Inferred from Key Residues (IKR) Inferred from Rapid Divergence(IRD) Inferred from Reviewed Computational Analysis (RCA) Traceable Author Statement (TAS) Non-traceable Author Statement (NAS) Inferred by Curator (IC) No biological Data available (ND) evidence code
  • 27. Inferred from Electronic Annotation (IEA) http://geneontology.org/page/guide-go-evidence-codes The one evidence code used for completely automated annotation
  • 28. Manual annotations • Time-consuming process producing lower numbers of annotations (~2,800 taxons covered) • More specific GO terms • Manual annotation is essential for creating predictions Aleksandra Shypitsyna Elena Speretta Alex Holmes Tony Sawford Slide credit: Mélanie Courtot, Ph.D.
  • 29. Electronic Annotations (IEA) • Quick way of producing large numbers of annotations • Annotations use less-specific GO terms • Only source of annotation for ~438,000 non-model organism species orthology taxon constraints Slide credit: Mélanie Courtot, Ph.D.
  • 30. * Includes manual annotations integrated from external model organism and specialist groups 2,752,604Manual annotations* 269,207,317Electronic annotations A public resource of data and tools Number of annotations in UniProt-GOA database (March 2016) http://www.ebi.ac.uk/GOA https://www.ebi.ac.uk/QuickGO/ Slide credit: Mélanie Courtot, Ph.D.
  • 31. Slide credit: Mélanie Courtot, Ph.D. & IKEA
  • 32. Using AMIGO2: http://amigo.geneontology.org • Find the Gene Ontology term for Nucleus • Find its child term Pronucleus • Find a C. Elegans gene associated with this term and find the PubMed id of the reference supporting the annotation • Repeat for a human gene, what is the evidence for the annotation? http://geneontology.org/page/guide-go-evidence-codes
  • 33. Gene Set Enrichment Analysis (previously covered)
  • 34. Questions about GO or other ontologies?
  • 36. Knowledge Graphs • Also called “knowledge bases” to distinguish them from databases. • An integrated collection of assertions or claims represented in something that can be visualized as a graph and is technically very much like a database. RNASeq reads Gene X is expressed Drug A caused Gene X to be expressed Knowing what to do with Drug A..
  • 37. Example knowledge graphs • Wikidata: The structured equivalent of Wikipedia • http://wikidata.org • UniProt Knowledge Base: Manually curated Protein knowledge base • http://www.uniprot.org/uniprot/ • Microsoft Knowledge Graph (“Satori”) • Google Knowledge Graph
  • 38. Example: “Google Knowledge Graph” (GKG) Vemurafenib 405,000 results 1 infobox 1 node in GKG https://googleblog.blogspot.com/2012/05/introducing-knowledge-graph-things-not.html
  • 39. Why Knowledge Graphs? ?• Answer explicit questions • Uncover implicit relations
  • 40. Implicit relations for hypothesis generation ABC model Swanson (1986) Fish oil, Raynaud’s syndrome and undiscovered public knowledge http://muse.jhu.edu/article/403510/pdf BA C Raynaud’s Syndrome Dietary fish oil• platelet inhibition • vasodilation • lower blood viscosity Co-occurs in an article with Co-occurs in an article with ?
  • 41. Open Discovery and Closed Discovery • Open, you don’t know what C or B is (e.g. disease -> ?drug) • Closed, you know what C is and are looking for B (e.g. disease – why? – drug) BA C ?
  • 42. Example question: drug repurposing • For a given drug, what diseases might it be used to treat? http://www.ncbi.nlm.nih.gov/pubmed/27189611 'RE:fine drugs': an interactive dashboard to access drug repurposing opportunities.
  • 43. Implicit relations for hypothesis generation ABC model for drug repurposing BA C drug diseasegenes Physical interaction genetic association ?
  • 44. Questions on ABC model ?
  • 45. Querying a knowledge graph with SPARQL • “SPARQL protocol and RDF query language” • RDF: Resource Description Framework (common standard for storing knowledge graphs) • A SPARQL query = a partially completed graph • ?’s show what you are looking for • rest constrains the search http://www.w3.org/TR/rdf-sparql-query/ ?disease Asking for Constraints Metformin treats Result: Metformin Type 2 diabetes treats
  • 46. = https://query.wikidata.org/ Metformin’s unique id: Q19484 Treats property id: P2175 Metformin ?disease treats
  • 47. Metformin’s unique id: Q19484 Treats property id: P2175 Metformin ?disease treats http://tinyurl.com/gwd6pep
  • 48. Example question: drug repurposing “For a given drug, what diseases might it be used to treat?” ?drug ?disease interacts with protein geneencoded by genetic association treats??
  • 49. Example question: repurposing Metformin http://tinyurl.com/zem3oxz Metformin ?disease interacts with protein SLC22A3encoded by genetic association treats?? Solute carrier family 22 member 3 SLC22A3 prostate cancer
  • 50. Aside • “Validating drug repurposing signals using electronic health records: a case study of metformin associated with reduced cancer mortality” • https://jamia.oxfordjournals.org/content/22/1/179
  • 51. Example question: repurposing all drugs http://tinyurl.com/hwm9388 ?drug ?disease interacts with protein geneencoded by genetic association treats??
  • 52. Adding constraints • Find drugs that may treat disease • according to the drug->gene->disease model • constrained to focus on cancers • ?disease wdt:P279* wd:Q12078 . • limited to genes related to cell proliferation • ?gene_product wdt:P682 ?biological_process • ?biological_process wdt:P279* wd:Q14818032 • http://tinyurl.com/j222k6g
  • 53. Other patterns? drug disease 2 disease 1 gene 1 biological process Is there a connecting path in the knowledge graph? Is it meaningful? gene 2 treats has function gwas has function gwas treats ? http://tinyurl.com/gpfr9kj
  • 55. SPARQL endpoints of interest • Wikidata http://query.wikidata.org • UniProt http://sparql.uniprot.org • MeSH https://id.nlm.nih.gov/mesh/query • EBI https://www.ebi.ac.uk/rdf/documentation/sparql- endpoints • Bio2RDF https://github.com/bio2rdf/bio2rdf- scripts/wiki/Query-repository
  • 56. 2 problems with knowledge graphs Not enough knowledge in the graph text and data mining crowdsourcing ? http://i9606.blogspot.com/2010/05/gene-wiki-hairball-1.html Too much knowledge in the graph sorting algorithms visualizations
  • 57. Plan for Thursday / Homework • Implement and apply an ABC Model style hypothesis generating program • Assignment: write the program, explain its logic, explain how you used it to generate a hypothesis, explain the hypothesis • A Jupyter notebook with Python code will be provided to get you started • If you do not want to program, there will be another option using online tools.
  • 58. Suggested Reading • Ontology • Biomedical Ontologies in Action: Role in Knowledge Management, Data Integration and Decision Support • http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2592252/ • Gene Ontology: tool for the unification of biology • http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3037419/ • Knowledge-based hypothesis generation • Fish oil, Raynaud’s syndrome and undiscovered public knowledge • http://muse.jhu.edu/article/403510/pdf • Knowledge discovery by automated identification and ranking of implicit relationships • http://bioinformatics.oxfordjournals.org/content/20/3/389.full.pdf • Text mining • Literature mining for the biologist: from information retrieval to biological discovery • http://www.nature.com/nrg/journal/v7/n2/full/nrg1768.html

Editor's Notes

  1. This picture is derived from Greek mythology: the blind giant Orion carried his servant Cedalion on his shoulders to act as the giant's eyes.
  2. If I have seen further, it is by standing on the shoulders of giants. Isaac Newton 1676* Bernard of Chartres used to say that we [the Moderns] are like dwarves perched on the shoulders of giants [the Ancients], and thus we are able to see more and farther than the latter. And this is not at all because of the acuteness of our sight or the stature of our body, but because we are carried aloft and elevated by the magnitude of the giants. https://en.wikipedia.org/wiki/Bernard_of_Chartres#cite_note-5 around 1100
  3. You can’t hope to keep even a small fraction of what could inform your work in your mind. You need to be able to find relevant information rapidly, aggregate it from many sources, and use it all to build an integrated view of what is known or at least presumed.
  4. Using a controlled vocabulary for finding articles.
  5. Examples: fainting, dizziness, cardiac hypertrophy fainting http://www.ncbi.nlm.nih.gov/mesh/68013575
  6. http://www.ncbi.nlm.nih.gov/mesh/68013575
  7. Try with http://www.ncbi.nlm.nih.gov/pubmed/advanced
  8. Provides automatic prediction of for uncharacterized sequence Predicts membership of protein families and presence of domains and features manually curated mapping of families and domains to GO terms High-level – has to be true for all (or most) members of a family – can add downstream filters Incorrect annotations when spotted and fed back to improve the mapping – changing the mapping or adding QC filtering downstream e.g. taxon constraints Mapping of domains has recently been improved – used to be to the whole protein that contained the domain, now just to the domain itself. Much more accurate.
  9. will show the link to Wormbase and PubMed Will show that its basically all transferred from some orthology computation
  10. A is implicitly related to C if A is explicitly related to B and B is explicitly related to C
  11. A is implicitly related to C if A is explicitly related to B and B is explicitly related to C