Chunlei Wu, Ph.D.
cwu@scripps.edu
@chunleiwu
Associate Professor of Molecular Medicine
Dept. of Molecular Experimental Medicine
The Scripps Research Institute
La Jolla, CA, USA
01/22/2016
From MyGene.info and MyVariant.info towards BioThings API
MyGene.info and MyVariant.info recap
(Aggregated) Gene / Variant Annotations
as a (high-performance) (real-time) Web Service
So many variant annotation resources
dbNSFP
The Exome Aggregation
Consortium (ExAC)
Annotations centered around bio-entities
Gene
G
Variant
V
Pathway
P
D
Metabolite
M
Disease
Simple JSON-based Aggregation mechanism
{
  "_id": "chr1:g.196659237C>T",
  "cadd": { … },
  "clinvar": { … },
  "cosmic": { … },
  "dbsnp": { … },
  "dbnsfp": { … },
  "evs": { … },
  "emv": { … },
  "mutdb": { … },
  "gwassnp": { … },
  "snpedia": { … },
  "wellderly": { … }
}
{
  "_id": "chr1:g.196659237C>T",
  "dbsnp": {
    "snpclass": "single",
    "rsid": "rs1061170",
    "func": "missense"
  }
}
{
  "_id": "chr1:g.196659237C>T",
  "cosmic": {
    "tumor_site": "breast",
    "mut_freq": 0.49
  }
}
{
  "_id": "chr1:g.196659237C>T",
  "dbnsfp": {
    "sift": {
      "breast": "tolerated",
      "val": 1
    }
  }
}
"cadd" "clinvar" "evs" "mutdb"
…
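The aggregation idea on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name aggregate_by_id is hypothetical, and it assumes each source contributes a document carrying the shared "_id" plus exactly one top-level key of its own, as in the examples above. It does not describe the actual MyVariant.info implementation.

```python
from typing import Any, Dict, Iterable


def aggregate_by_id(source_docs: Iterable[Dict[str, Any]]) -> Dict[str, Dict[str, Any]]:
    """Merge per-source documents that share an _id into one document per variant."""
    merged: Dict[str, Dict[str, Any]] = {}
    for doc in source_docs:
        variant_id = doc["_id"]
        # Create the aggregated document the first time this _id is seen.
        target = merged.setdefault(variant_id, {"_id": variant_id})
        for key, value in doc.items():
            if key != "_id":
                # Assumption: each source writes only under its own top-level key,
                # so sources never overwrite each other.
                target[key] = value
    return merged


docs = [
    {"_id": "chr1:g.196659237C>T",
     "dbsnp": {"snpclass": "single", "rsid": "rs1061170", "func": "missense"}},
    {"_id": "chr1:g.196659237C>T",
     "cosmic": {"tumor_site": "breast", "mut_freq": 0.49}},
]
aggregated = aggregate_by_id(docs)
```

Because every source lives under its own key, one source can be refreshed by replacing that key alone, which fits the per-source update schedule described on the next slide.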
Keep data always up-to-date
Each data source is updated individually. Colors
indicate their different updating schedules.
Schematic view of MyVariant.info architecture
High-performance web service APIs
Schematic view of MyVariant.info architecture
MyVariant.info for the end users:
http://MyVariant.info
(currently v1 API, two endpoints)
http://MyVariant.info/v1/query?q=<query>
any query term(s)
matching variant hits
http://MyVariant.info/v1/variant/<variantid>
hgvs id(s)
matching variant object(s)
Both endpoints support batch mode via POST
Simple API. No sign-up. No API key.
Try our live API and documentation
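As a rough illustration of the two endpoints, the Python sketch below only builds the request URLs; it does not contact the service. The helper names query_url and variant_url are hypothetical, and percent-encoding the ">" in the HGVS id is an assumption about how a client would normally form the URL.

```python
from urllib.parse import quote, urlencode

# Base URL and version taken from the slide.
BASE_URL = "http://MyVariant.info/v1"


def query_url(q: str) -> str:
    """URL for the query endpoint: any query term(s) in, matching variant hits out."""
    return f"{BASE_URL}/query?{urlencode({'q': q})}"


def variant_url(hgvs_id: str) -> str:
    """URL for the variant endpoint: one HGVS id in, one variant object out."""
    # Keep ':' readable; characters such as '>' are percent-encoded.
    return f"{BASE_URL}/variant/{quote(hgvs_id, safe=':')}"


example_query = query_url("rs1061170")
example_variant = variant_url("chr1:g.196659237C>T")
```

Passing either URL to urllib.request.urlopen and decoding the body with json.load would be one way to fetch the JSON response.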
MyGene.info for the end users:
http://MyGene.info
(currently v2 API, two endpoints)
http://MyGene.info/v2/query?q=<query>
any query term(s)
matching gene hits
http://MyGene.info/v2/gene/<geneid>
gene id(s)
matching gene object(s)
Both endpoints support batch mode via POST
Simple API. No sign-up. No API key.
Try our live API and documentation
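Batch mode via POST could look roughly like the Python sketch below, which builds a request object without sending it. The form field name "ids", the comma-separated encoding, and the helper name batch_gene_request are assumptions made for illustration; the service documentation is the authority on the exact parameters.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Base URL and version taken from the slide.
BASE_URL = "http://MyGene.info/v2"


def batch_gene_request(gene_ids) -> Request:
    """Build (but do not send) a POST request asking for several gene ids at once."""
    # Assumption: ids travel as one comma-separated form field named "ids".
    body = urlencode({"ids": ",".join(str(g) for g in gene_ids)}).encode("ascii")
    return Request(
        f"{BASE_URL}/gene",
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# 3077 is the HFE gene id used later in the slides; 1017 is an arbitrary second id.
request = batch_gene_request([3077, 1017])
```

Sending the request with urllib.request.urlopen(request) would return one result per requested id in a single round trip.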
MyGene.info usage updates
[Chart: monthly hits in millions, last year vs. this year; labels 2M and 3M]
Usage spikes (5M hits/day) during X-Mas 2014
Increased client adoption
Requests by MyGene.info clients
[Pie chart; segment labels: 30%, 9%, 35%, 26%]
Highlights:
• mygene Python client usage now surpasses BioGPS usage
• mygene R client usage has increased to 9% from <1%
10/07/2015-01/05/2016
Increased client adoption
mygene Python client hosted in PyPI
mygene R client hosted in Bioconductor
MyVariant.info updates
Over 334 million annotated variants in total
New additions: The Exome Aggregation Consortium (ExAC)
Updated: dbNSFP
MyVariant.info updates
1 million requests in 3 months (10/07/2015-01/05/2016)
[Pie chart; segment labels: 30%, 68%, 2%]
MyVariant.info official Python/R Clients
myvariant Python client hosted in PyPI
(initial release in Aug 2015)
myvariant R client hosted in Bioconductor
(initial release in Oct 2015)
A Node.js client made by a passionate user
Next?
MyVariant.info
MyGene.info
Make our APIs serve Linked Data
via
Why Linked Data?
Gene
G
Variant
V
Pathway
P
D
Metabolite
M
Disease
Linked Data for data aggregation
MyVariant.info
V
Another Variant API
V
V
Linked Data for data aggregation
MyVariant.info Another Variant API
{
  "_id": "chr1:g.196659237C>T",
  "cosmic": {
    "tumor_site": "breast",
    "mut_freq": 0.49
  },
  "clinvar": {…},
  "dbsnp": {…},
  …
}
{
  "pop": "GWD",
  "nobs": 226,
  "freq": 0.371681415929,
  …
}
{
  "_id": "chr1:g.196659237C>T",
  "cosmic": {
    "tumor_site": "breast",
    "mut_freq": 0.49
  },
  "clinvar": {…},
  "dbsnp": {…},
  "new_src": {
    "pop": "GWD",
    "nobs": 226,
    "freq": 0.371681415929
  },
  …
}
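The merge shown in the three objects above can be sketched in Python as follows. The function name attach_source is hypothetical, and the sketch assumes the record from the other API has already been matched to the same variant; establishing that match is the part Linked Data is meant to help with and is not shown here.

```python
import copy
from typing import Any, Dict


def attach_source(variant_doc: Dict[str, Any], source_name: str,
                  source_record: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of variant_doc with source_record nested under source_name."""
    if source_name in variant_doc:
        # Refuse to silently overwrite an existing data source.
        raise KeyError(f"{source_name!r} is already present in the document")
    merged = copy.deepcopy(variant_doc)
    merged[source_name] = copy.deepcopy(source_record)
    return merged


myvariant_doc = {
    "_id": "chr1:g.196659237C>T",
    "cosmic": {"tumor_site": "breast", "mut_freq": 0.49},
}
other_api_record = {"pop": "GWD", "nobs": 226, "freq": 0.371681415929}
combined = attach_source(myvariant_doc, "new_src", other_api_record)
```

The inputs are copied rather than modified, so the original MyVariant.info object stays usable after the merge.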
JSON + context = JSON-LD
{
  "@context": {
    "clinvar": "http://schema.myvariant.info/datasource/clinvar",
    "rcv": "http://schema.myvariant.info/datanode/rcv",
    "gene": "http://schema.myvariant.info/datanode/gene",
    "_id": "@id"
  },
  "_id": "chr6:g.26093141G>A",
  "clinvar": {
    "@context": {
      "uniprot": "http://identifiers.org/uniprot/",
      "omim": "http://identifiers.org/omim/"
    },
    "chrom": "6",
    "alt": "A",
    "ref": "G",
    "allele_id": 15048,
    "rsid": "rs1800562",
    "rcv": {
      "@context": {
        "accession": "http://identifer.org/clinvar"
      },
      "accession": "RCV000000020",
      "origin": "germline",
      "clinical_significance": "risk factor"
    },
    "gene": {
      "@context": {
        "symbol": "http://identifiers.org/hgnc.symbol/"
      },
      "id": "3077",
      "symbol": "HFE"
    },
    "omim": "613609.0001",
    "variant_id": 9
  }
}
Processed JSON-LD
JSON-LD N-Quads output:
<chr6:g.26093141G>A> <http://schema.myvariant.info/datasource/clinvar> _:b0 .
_:b0 <http://identifiers.org/omim/> "613609.0001" .
_:b0 <http://schema.myvariant.info/datanode/gene> _:b1 .
_:b0 <http://schema.myvariant.info/datanode/rcv> _:b2 .
_:b1 <http://identifiers.org/hgnc.symbol/> "HFE" .
_:b2 <http://identifer.org/clinvar> "RCV000000020" .
JSON-LD compacted output:
{
  "@id": "chr6:g.26093141G>A",
  "http://schema.myvariant.info/datasource/clinvar": {
    "http://identifiers.org/omim/": "613609.0001",
    "http://schema.myvariant.info/datanode/gene": {
      "http://identifiers.org/hgnc.symbol/": "HFE"
    },
    "http://schema.myvariant.info/datanode/rcv": {
      "http://identifer.org/clinvar": "RCV000000020"
    }
  }
}
In a nutshell, what does a JSON-LD context do?
It maps values in a JSON object to defined URIs:
"http://identifer.org/clinvar" → clinvar.rcv.accession
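The mapping can be illustrated with a short Python sketch that rewrites each key to the URI its context assigns and drops keys with no mapping. This is a deliberately simplified stand-in, not a JSON-LD processor: the function name expand_keys is hypothetical, and it ignores most of the JSON-LD specification (value objects, arrays, typed literals, remote contexts). The input is a trimmed version of the ClinVar example above, with URIs spelled as on the slides.

```python
from typing import Any, Dict, Optional


def expand_keys(node: Dict[str, Any],
                inherited: Optional[Dict[str, str]] = None) -> Dict[str, Any]:
    """Replace each mapped key with its URI; drop keys the context does not define."""
    # A nested "@context" adds to the mappings inherited from enclosing objects.
    active = dict(inherited or {})
    active.update(node.get("@context", {}))
    expanded: Dict[str, Any] = {}
    for key, value in node.items():
        if key == "@context" or key not in active:
            continue  # unmapped keys carry no linked-data meaning and are dropped
        target = active[key]
        expanded[target] = expand_keys(value, active) if isinstance(value, dict) else value
    return expanded


doc = {
    "@context": {
        "clinvar": "http://schema.myvariant.info/datasource/clinvar",
        "rcv": "http://schema.myvariant.info/datanode/rcv",
        "_id": "@id",
    },
    "_id": "chr6:g.26093141G>A",
    "clinvar": {
        "@context": {"omim": "http://identifiers.org/omim/"},
        "chrom": "6",  # no mapping in any context, so it is dropped
        "omim": "613609.0001",
        "rcv": {
            "@context": {"accession": "http://identifer.org/clinvar"},
            "accession": "RCV000000020",
            "origin": "germline",  # also unmapped
        },
    },
}
expanded = expand_keys(doc)
```

For real data, a full JSON-LD library is the safer choice, since it also handles the parts of the specification this sketch skips.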
JSON-LD context makes your data
"Linkable"
"Linked"
Downstream
processing libraries
A Python library for processing JSON-LD data
In [1]: fetch_value_source_for_variant("chr6:g.26093141G>A","http://identifiers.org/dbsnp/")
Out[1]:
['rs1800562 http://schema.myvarint.info/datasource/dbnsfp',
'rs1800562 http://schema.myvarint.info/datasource/clinvar',
'rs1800562 http://schema.myvarint.info/datasource/dbsnp',
'rs1800562 http://schema.myvarint.info/datasource/evs',
'rs1800562 http://schema.myvarint.info/datasource/gwassnps',
'rs1800562 http://schema.myvarint.info/datasource/mutdb']
By Kevin Xin
Need to define an API spec to enable data exchange via JSON-LD
• Output as a JSON object with a defined _id.
• "jsonld=true/false" toggle for the inclusion of the JSON-LD context.
• Support the retrieval of a single entity via GET
(use case: individual data aggregation on the fly)
• Support the retrieval of a list of entities via POST
(use case: routine data aggregation in batches)
• Output should indicate whether each entity exists:
GET /variant/<unknown_id> → 404
POST /variant/ id1, <unknown_id>, id3 →
[id1: {…},
<unknown_id>: "notfound",
id3: {…}]
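The behaviour these bullets ask for can be sketched with plain Python functions standing in for the HTTP handlers. Everything here is hypothetical: the in-memory STORE, the CONTEXT value, and the function names are made up for illustration, and returning a dict keyed by id is only one possible reading of the bracketed POST output on the slide.

```python
from typing import Any, Dict, List, Optional, Tuple

# Hypothetical in-memory store standing in for the real backend.
STORE: Dict[str, Dict[str, Any]] = {
    "id1": {"_id": "id1"},
    "id3": {"_id": "id3"},
}
# Hypothetical JSON-LD context attached when jsonld=true.
CONTEXT: Dict[str, str] = {"_id": "@id"}


def get_entity(entity_id: str, jsonld: bool = False) -> Tuple[int, Optional[Dict[str, Any]]]:
    """GET /variant/<id>: return (status, document); unknown ids give 404."""
    doc = STORE.get(entity_id)
    if doc is None:
        return 404, None
    result = dict(doc)
    if jsonld:
        result["@context"] = CONTEXT
    return 200, result


def post_entities(ids: List[str], jsonld: bool = False) -> Dict[str, Any]:
    """POST /variant/: one entry per requested id, with "notfound" for unknown ids."""
    results: Dict[str, Any] = {}
    for entity_id in ids:
        status, doc = get_entity(entity_id, jsonld)
        results[entity_id] = doc if status == 200 else "notfound"
    return results


status, single = get_entity("unknown_id")
batch = post_entities(["id1", "unknown_id", "id3"], jsonld=True)
```

Keeping a "notfound" entry for every unknown id lets a batch client line results up with its input list without guessing which ids were skipped.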
BioThings
API
MyVariant.info
MyGene.info
By Cyrus Afrasiabi
BioThings API
MyVariant.info
MyGene.info
JSON data
aggregation
mechanism
High-
performance
query engine
Well-designed
REST API
pattern
JSON-LD
enabled
Linked Data
Data-updating scheduler
Python/R clients
…
Data-sharing via Web API is trending
Making a single web service is trivial,
but making a sustainable/scalable
web API is non-trivial.
We would like to help other groups to
create their own hosted web API for
sharing their data.
Action item 1: BioThings API whitepaper
Also an action item from the last BD2K CA
consortium meeting and from the API working
group at last year's NIH BD2K AHM
Action item 2: BioThings API framework
NIH commons
Infrastructure as a Service:
Software as a Service:
BioThings API
Action item 3: expansion to other "BioThings"
D
Disease
D
Drugs
MyDrug.info MyDisease.info
need an alt. name here
Acknowledgement
Funding and Support
U54GM114833
U01HG008473
Washington U:
Ben Ainscough
Obi Griffith
TSRI:
Andrew Su
Jiwen Xin
Cyrus Afrasiabi
Ginger Tsueng
Adam Mark
Greg Stupp
Tim Putman
STSI:
Eric Topol
Ali Torkamani
Galina Erikson
U. Washington:
Sean Mooney
Moritz Juchler
Nikhil Gopal
OICR:
Robin Haw
UC Berkeley:
Chris Mungall
UCSD:
Trish Whetzel
MyVariant.info MyGene.info

Weitere ähnliche Inhalte

Andere mochten auch

The Open Patent Chemistry “Big Bang”: Implications, Opportunities and Caveats
The Open Patent Chemistry “Big Bang”: Implications, Opportunities and CaveatsThe Open Patent Chemistry “Big Bang”: Implications, Opportunities and Caveats
The Open Patent Chemistry “Big Bang”: Implications, Opportunities and CaveatsChris Southan
 
20120717 ismb2012
20120717 ismb201220120717 ismb2012
20120717 ismb2012anewgene
 
2016 03 25_group_meeting MyVariant.info
2016 03 25_group_meeting MyVariant.info2016 03 25_group_meeting MyVariant.info
2016 03 25_group_meeting MyVariant.infoJiwen Xin
 
UCSD / DBMI seminar 2015-02-6
UCSD / DBMI seminar 2015-02-6UCSD / DBMI seminar 2015-02-6
UCSD / DBMI seminar 2015-02-6Andrew Su
 
Open biomedical knowledge using crowdsourcing and citizen science
Open biomedical knowledge using crowdsourcing and citizen scienceOpen biomedical knowledge using crowdsourcing and citizen science
Open biomedical knowledge using crowdsourcing and citizen scienceAndrew Su
 
MyGene.info learn-more
MyGene.info learn-moreMyGene.info learn-more
MyGene.info learn-moreanewgene
 
MyGene.info talk at ISMB/BOSC 2013
MyGene.info talk at ISMB/BOSC 2013MyGene.info talk at ISMB/BOSC 2013
MyGene.info talk at ISMB/BOSC 2013anewgene
 
High-performance web services for gene and variant annotations
High-performance web services for gene and variant annotationsHigh-performance web services for gene and variant annotations
High-performance web services for gene and variant annotationsChunlei Wu
 
Chunlei wu heart_bd2k_201602_ebi
Chunlei wu heart_bd2k_201602_ebiChunlei wu heart_bd2k_201602_ebi
Chunlei wu heart_bd2k_201602_ebiChunlei Wu
 

Andere mochten auch (10)

The Open Patent Chemistry “Big Bang”: Implications, Opportunities and Caveats
The Open Patent Chemistry “Big Bang”: Implications, Opportunities and CaveatsThe Open Patent Chemistry “Big Bang”: Implications, Opportunities and Caveats
The Open Patent Chemistry “Big Bang”: Implications, Opportunities and Caveats
 
20120717 ismb2012
20120717 ismb201220120717 ismb2012
20120717 ismb2012
 
2016 03 25_group_meeting MyVariant.info
2016 03 25_group_meeting MyVariant.info2016 03 25_group_meeting MyVariant.info
2016 03 25_group_meeting MyVariant.info
 
UCSD / DBMI seminar 2015-02-6
UCSD / DBMI seminar 2015-02-6UCSD / DBMI seminar 2015-02-6
UCSD / DBMI seminar 2015-02-6
 
Open biomedical knowledge using crowdsourcing and citizen science
Open biomedical knowledge using crowdsourcing and citizen scienceOpen biomedical knowledge using crowdsourcing and citizen science
Open biomedical knowledge using crowdsourcing and citizen science
 
MyGene.info learn-more
MyGene.info learn-moreMyGene.info learn-more
MyGene.info learn-more
 
MyGene.info talk at ISMB/BOSC 2013
MyGene.info talk at ISMB/BOSC 2013MyGene.info talk at ISMB/BOSC 2013
MyGene.info talk at ISMB/BOSC 2013
 
F01-Cloud-Mygene.info
F01-Cloud-Mygene.infoF01-Cloud-Mygene.info
F01-Cloud-Mygene.info
 
High-performance web services for gene and variant annotations
High-performance web services for gene and variant annotationsHigh-performance web services for gene and variant annotations
High-performance web services for gene and variant annotations
 
Chunlei wu heart_bd2k_201602_ebi
Chunlei wu heart_bd2k_201602_ebiChunlei wu heart_bd2k_201602_ebi
Chunlei wu heart_bd2k_201602_ebi
 

Ähnlich wie Chunlei Wu BD2K 201601 MyGene.info and MyVariant.info

BioThings API: Building a FAIR API Ecosystem for Biomedical Knowledge
BioThings API: Building a FAIR API Ecosystem for Biomedical KnowledgeBioThings API: Building a FAIR API Ecosystem for Biomedical Knowledge
BioThings API: Building a FAIR API Ecosystem for Biomedical KnowledgeChunlei Wu
 
MyVariant.info: Variant Annotation as a Service
MyVariant.info: Variant Annotation as a ServiceMyVariant.info: Variant Annotation as a Service
MyVariant.info: Variant Annotation as a ServiceChunlei Wu
 
Biothings APIs: high-performance bioentity-centric web services
Biothings APIs: high-performance bioentity-centric web servicesBiothings APIs: high-performance bioentity-centric web services
Biothings APIs: high-performance bioentity-centric web servicesChunlei Wu
 
BioIT Europe 2010 - BioCatalogue
BioIT Europe 2010 - BioCatalogueBioIT Europe 2010 - BioCatalogue
BioIT Europe 2010 - BioCatalogueBioCatalogue
 
BioThings API: Building a FAIR API Ecosystem for Biomedical Knowledge
BioThings API: Building a FAIR API Ecosystem for Biomedical KnowledgeBioThings API: Building a FAIR API Ecosystem for Biomedical Knowledge
BioThings API: Building a FAIR API Ecosystem for Biomedical KnowledgeChunlei Wu
 
BioThings SDK: a toolkit for building high-performance data APIs in biology
BioThings SDK: a toolkit for building high-performance data APIs in biologyBioThings SDK: a toolkit for building high-performance data APIs in biology
BioThings SDK: a toolkit for building high-performance data APIs in biologyChunlei Wu
 
BioThings and SmartAPI: building an ecosystem of interoperable biological kno...
BioThings and SmartAPI: building an ecosystem of interoperable biological kno...BioThings and SmartAPI: building an ecosystem of interoperable biological kno...
BioThings and SmartAPI: building an ecosystem of interoperable biological kno...Chunlei Wu
 
Arabidopsis Information Portal, Developer Workshop 2014, Introduction
Arabidopsis Information Portal, Developer Workshop 2014, IntroductionArabidopsis Information Portal, Developer Workshop 2014, Introduction
Arabidopsis Information Portal, Developer Workshop 2014, IntroductionJasonRafeMiller
 
Reproducible Workflow with Cytoscape and Jupyter Notebook
Reproducible Workflow with Cytoscape and Jupyter NotebookReproducible Workflow with Cytoscape and Jupyter Notebook
Reproducible Workflow with Cytoscape and Jupyter NotebookKeiichiro Ono
 
WuXi NextCODE Scales up Genomic Sequencing on AWS (ANT210-S) - AWS re:Invent ...
WuXi NextCODE Scales up Genomic Sequencing on AWS (ANT210-S) - AWS re:Invent ...WuXi NextCODE Scales up Genomic Sequencing on AWS (ANT210-S) - AWS re:Invent ...
WuXi NextCODE Scales up Genomic Sequencing on AWS (ANT210-S) - AWS re:Invent ...Amazon Web Services
 
The Materials Project Ecosystem - A Complete Software and Data Platform for M...
The Materials Project Ecosystem - A Complete Software and Data Platform for M...The Materials Project Ecosystem - A Complete Software and Data Platform for M...
The Materials Project Ecosystem - A Complete Software and Data Platform for M...University of California, San Diego
 
Fostering Serendipity through Big Linked Data
Fostering Serendipity through Big Linked DataFostering Serendipity through Big Linked Data
Fostering Serendipity through Big Linked DataMuhammad Saleem
 
Cool Informatics Tools and Services for Biomedical Research
Cool Informatics Tools and Services for Biomedical ResearchCool Informatics Tools and Services for Biomedical Research
Cool Informatics Tools and Services for Biomedical ResearchDavid Ruau
 
HyQue: Evaluating scientific Hypotheses using semantic web technologies
HyQue: Evaluating scientific Hypotheses using semantic web technologiesHyQue: Evaluating scientific Hypotheses using semantic web technologies
HyQue: Evaluating scientific Hypotheses using semantic web technologiesMichel Dumontier
 
Mining SQL Injection and Cross Site Scripting Vulnerabilities using Hybrid Pr...
Mining SQL Injection and Cross Site Scripting Vulnerabilities using Hybrid Pr...Mining SQL Injection and Cross Site Scripting Vulnerabilities using Hybrid Pr...
Mining SQL Injection and Cross Site Scripting Vulnerabilities using Hybrid Pr...Lionel Briand
 
B Chapman - Toolkit for variation comparison and analysis
B Chapman - Toolkit for variation comparison and analysisB Chapman - Toolkit for variation comparison and analysis
B Chapman - Toolkit for variation comparison and analysisJan Aerts
 
Tag.bio aws public jun 08 2021
Tag.bio aws public jun 08 2021 Tag.bio aws public jun 08 2021
Tag.bio aws public jun 08 2021 Sanjay Padhi, Ph.D
 
Efficient Re-computation of Big Data Analytics Processes in the Presence of C...
Efficient Re-computation of Big Data Analytics Processes in the Presence of C...Efficient Re-computation of Big Data Analytics Processes in the Presence of C...
Efficient Re-computation of Big Data Analytics Processes in the Presence of C...Paolo Missier
 

Ähnlich wie Chunlei Wu BD2K 201601 MyGene.info and MyVariant.info (20)

BioThings API: Building a FAIR API Ecosystem for Biomedical Knowledge
BioThings API: Building a FAIR API Ecosystem for Biomedical KnowledgeBioThings API: Building a FAIR API Ecosystem for Biomedical Knowledge
BioThings API: Building a FAIR API Ecosystem for Biomedical Knowledge
 
MyVariant.info: Variant Annotation as a Service
MyVariant.info: Variant Annotation as a ServiceMyVariant.info: Variant Annotation as a Service
MyVariant.info: Variant Annotation as a Service
 
Biothings APIs: high-performance bioentity-centric web services
Biothings APIs: high-performance bioentity-centric web servicesBiothings APIs: high-performance bioentity-centric web services
Biothings APIs: high-performance bioentity-centric web services
 
BioIT Europe 2010 - BioCatalogue
BioIT Europe 2010 - BioCatalogueBioIT Europe 2010 - BioCatalogue
BioIT Europe 2010 - BioCatalogue
 
BioThings API: Building a FAIR API Ecosystem for Biomedical Knowledge
BioThings API: Building a FAIR API Ecosystem for Biomedical KnowledgeBioThings API: Building a FAIR API Ecosystem for Biomedical Knowledge
BioThings API: Building a FAIR API Ecosystem for Biomedical Knowledge
 
BioThings SDK: a toolkit for building high-performance data APIs in biology
BioThings SDK: a toolkit for building high-performance data APIs in biologyBioThings SDK: a toolkit for building high-performance data APIs in biology
BioThings SDK: a toolkit for building high-performance data APIs in biology
 
BioThings and SmartAPI: building an ecosystem of interoperable biological kno...
BioThings and SmartAPI: building an ecosystem of interoperable biological kno...BioThings and SmartAPI: building an ecosystem of interoperable biological kno...
BioThings and SmartAPI: building an ecosystem of interoperable biological kno...
 
Arabidopsis Information Portal, Developer Workshop 2014, Introduction
Arabidopsis Information Portal, Developer Workshop 2014, IntroductionArabidopsis Information Portal, Developer Workshop 2014, Introduction
Arabidopsis Information Portal, Developer Workshop 2014, Introduction
 
Reproducible Workflow with Cytoscape and Jupyter Notebook
Reproducible Workflow with Cytoscape and Jupyter NotebookReproducible Workflow with Cytoscape and Jupyter Notebook
Reproducible Workflow with Cytoscape and Jupyter Notebook
 
WuXi NextCODE Scales up Genomic Sequencing on AWS (ANT210-S) - AWS re:Invent ...
WuXi NextCODE Scales up Genomic Sequencing on AWS (ANT210-S) - AWS re:Invent ...WuXi NextCODE Scales up Genomic Sequencing on AWS (ANT210-S) - AWS re:Invent ...
WuXi NextCODE Scales up Genomic Sequencing on AWS (ANT210-S) - AWS re:Invent ...
 
The Materials Project Ecosystem - A Complete Software and Data Platform for M...
The Materials Project Ecosystem - A Complete Software and Data Platform for M...The Materials Project Ecosystem - A Complete Software and Data Platform for M...
The Materials Project Ecosystem - A Complete Software and Data Platform for M...
 
Fostering Serendipity through Big Linked Data
Fostering Serendipity through Big Linked DataFostering Serendipity through Big Linked Data
Fostering Serendipity through Big Linked Data
 
Cool Informatics Tools and Services for Biomedical Research
Cool Informatics Tools and Services for Biomedical ResearchCool Informatics Tools and Services for Biomedical Research
Cool Informatics Tools and Services for Biomedical Research
 
HyQue: Evaluating scientific Hypotheses using semantic web technologies
HyQue: Evaluating scientific Hypotheses using semantic web technologiesHyQue: Evaluating scientific Hypotheses using semantic web technologies
HyQue: Evaluating scientific Hypotheses using semantic web technologies
 
Mining SQL Injection and Cross Site Scripting Vulnerabilities using Hybrid Pr...
Mining SQL Injection and Cross Site Scripting Vulnerabilities using Hybrid Pr...Mining SQL Injection and Cross Site Scripting Vulnerabilities using Hybrid Pr...
Mining SQL Injection and Cross Site Scripting Vulnerabilities using Hybrid Pr...
 
Variant Query Tool
Variant Query ToolVariant Query Tool
Variant Query Tool
 
B Chapman - Toolkit for variation comparison and analysis
B Chapman - Toolkit for variation comparison and analysisB Chapman - Toolkit for variation comparison and analysis
B Chapman - Toolkit for variation comparison and analysis
 
Tag.bio aws public jun 08 2021
Tag.bio aws public jun 08 2021 Tag.bio aws public jun 08 2021
Tag.bio aws public jun 08 2021
 
Harvester I
Harvester IHarvester I
Harvester I
 
Efficient Re-computation of Big Data Analytics Processes in the Presence of C...
Efficient Re-computation of Big Data Analytics Processes in the Presence of C...Efficient Re-computation of Big Data Analytics Processes in the Presence of C...
Efficient Re-computation of Big Data Analytics Processes in the Presence of C...
 

Kürzlich hochgeladen

(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)
(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)
(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)riyaescorts54
 
OECD bibliometric indicators: Selected highlights, April 2024
OECD bibliometric indicators: Selected highlights, April 2024OECD bibliometric indicators: Selected highlights, April 2024
OECD bibliometric indicators: Selected highlights, April 2024innovationoecd
 
CHROMATOGRAPHY PALLAVI RAWAT.pptx
CHROMATOGRAPHY  PALLAVI RAWAT.pptxCHROMATOGRAPHY  PALLAVI RAWAT.pptx
CHROMATOGRAPHY PALLAVI RAWAT.pptxpallavirawat456
 
LESSON PLAN IN SCIENCE GRADE 4 WEEK 1 DAY 2
LESSON PLAN IN SCIENCE GRADE 4 WEEK 1 DAY 2LESSON PLAN IN SCIENCE GRADE 4 WEEK 1 DAY 2
LESSON PLAN IN SCIENCE GRADE 4 WEEK 1 DAY 2AuEnriquezLontok
 
GENERAL PHYSICS 2 REFRACTION OF LIGHT SENIOR HIGH SCHOOL GENPHYS2.pptx
GENERAL PHYSICS 2 REFRACTION OF LIGHT SENIOR HIGH SCHOOL GENPHYS2.pptxGENERAL PHYSICS 2 REFRACTION OF LIGHT SENIOR HIGH SCHOOL GENPHYS2.pptx
GENERAL PHYSICS 2 REFRACTION OF LIGHT SENIOR HIGH SCHOOL GENPHYS2.pptxRitchAndruAgustin
 
User Guide: Orion™ Weather Station (Columbia Weather Systems)
User Guide: Orion™ Weather Station (Columbia Weather Systems)User Guide: Orion™ Weather Station (Columbia Weather Systems)
User Guide: Orion™ Weather Station (Columbia Weather Systems)Columbia Weather Systems
 
User Guide: Magellan MX™ Weather Station
User Guide: Magellan MX™ Weather StationUser Guide: Magellan MX™ Weather Station
User Guide: Magellan MX™ Weather StationColumbia Weather Systems
 
《Queensland毕业文凭-昆士兰大学毕业证成绩单》
《Queensland毕业文凭-昆士兰大学毕业证成绩单》《Queensland毕业文凭-昆士兰大学毕业证成绩单》
《Queensland毕业文凭-昆士兰大学毕业证成绩单》rnrncn29
 
Oxo-Acids of Halogens and their Salts.pptx
Oxo-Acids of Halogens and their Salts.pptxOxo-Acids of Halogens and their Salts.pptx
Oxo-Acids of Halogens and their Salts.pptxfarhanvvdk
 
User Guide: Pulsar™ Weather Station (Columbia Weather Systems)
User Guide: Pulsar™ Weather Station (Columbia Weather Systems)User Guide: Pulsar™ Weather Station (Columbia Weather Systems)
User Guide: Pulsar™ Weather Station (Columbia Weather Systems)Columbia Weather Systems
 
Pests of jatropha_Bionomics_identification_Dr.UPR.pdf
Pests of jatropha_Bionomics_identification_Dr.UPR.pdfPests of jatropha_Bionomics_identification_Dr.UPR.pdf
Pests of jatropha_Bionomics_identification_Dr.UPR.pdfPirithiRaju
 
GLYCOSIDES Classification Of GLYCOSIDES Chemical Tests Glycosides
GLYCOSIDES Classification Of GLYCOSIDES  Chemical Tests GlycosidesGLYCOSIDES Classification Of GLYCOSIDES  Chemical Tests Glycosides
GLYCOSIDES Classification Of GLYCOSIDES Chemical Tests GlycosidesNandakishor Bhaurao Deshmukh
 
Davis plaque method.pptx recombinant DNA technology
Davis plaque method.pptx recombinant DNA technologyDavis plaque method.pptx recombinant DNA technology
Davis plaque method.pptx recombinant DNA technologycaarthichand2003
 
Servosystem Theory / Cybernetic Theory by Petrovic
Servosystem Theory / Cybernetic Theory by PetrovicServosystem Theory / Cybernetic Theory by Petrovic
Servosystem Theory / Cybernetic Theory by PetrovicAditi Jain
 
Base editing, prime editing, Cas13 & RNA editing and organelle base editing
Base editing, prime editing, Cas13 & RNA editing and organelle base editingBase editing, prime editing, Cas13 & RNA editing and organelle base editing
Base editing, prime editing, Cas13 & RNA editing and organelle base editingNetHelix
 
FREE NURSING BUNDLE FOR NURSES.PDF by na
FREE NURSING BUNDLE FOR NURSES.PDF by naFREE NURSING BUNDLE FOR NURSES.PDF by na
FREE NURSING BUNDLE FOR NURSES.PDF by naJASISJULIANOELYNV
 
Forensic limnology of diatoms by Sanjai.pptx
Forensic limnology of diatoms by Sanjai.pptxForensic limnology of diatoms by Sanjai.pptx
Forensic limnology of diatoms by Sanjai.pptxkumarsanjai28051
 
Four Spheres of the Earth Presentation.ppt
Four Spheres of the Earth Presentation.pptFour Spheres of the Earth Presentation.ppt
Four Spheres of the Earth Presentation.pptJoemSTuliba
 
Organic farming with special reference to vermiculture
Organic farming with special reference to vermicultureOrganic farming with special reference to vermiculture
Organic farming with special reference to vermicultureTakeleZike1
 

Kürzlich hochgeladen (20)

(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)
(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)
(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)
 
OECD bibliometric indicators: Selected highlights, April 2024
OECD bibliometric indicators: Selected highlights, April 2024OECD bibliometric indicators: Selected highlights, April 2024
OECD bibliometric indicators: Selected highlights, April 2024
 
CHROMATOGRAPHY PALLAVI RAWAT.pptx
CHROMATOGRAPHY  PALLAVI RAWAT.pptxCHROMATOGRAPHY  PALLAVI RAWAT.pptx
CHROMATOGRAPHY PALLAVI RAWAT.pptx
 
LESSON PLAN IN SCIENCE GRADE 4 WEEK 1 DAY 2
LESSON PLAN IN SCIENCE GRADE 4 WEEK 1 DAY 2LESSON PLAN IN SCIENCE GRADE 4 WEEK 1 DAY 2
LESSON PLAN IN SCIENCE GRADE 4 WEEK 1 DAY 2
 
Interferons.pptx.
Interferons.pptx.Interferons.pptx.
Interferons.pptx.
 
GENERAL PHYSICS 2 REFRACTION OF LIGHT SENIOR HIGH SCHOOL GENPHYS2.pptx
GENERAL PHYSICS 2 REFRACTION OF LIGHT SENIOR HIGH SCHOOL GENPHYS2.pptxGENERAL PHYSICS 2 REFRACTION OF LIGHT SENIOR HIGH SCHOOL GENPHYS2.pptx
GENERAL PHYSICS 2 REFRACTION OF LIGHT SENIOR HIGH SCHOOL GENPHYS2.pptx
 
User Guide: Orion™ Weather Station (Columbia Weather Systems)
User Guide: Orion™ Weather Station (Columbia Weather Systems)User Guide: Orion™ Weather Station (Columbia Weather Systems)
User Guide: Orion™ Weather Station (Columbia Weather Systems)
 
User Guide: Magellan MX™ Weather Station
User Guide: Magellan MX™ Weather StationUser Guide: Magellan MX™ Weather Station
User Guide: Magellan MX™ Weather Station
 
《Queensland毕业文凭-昆士兰大学毕业证成绩单》
《Queensland毕业文凭-昆士兰大学毕业证成绩单》《Queensland毕业文凭-昆士兰大学毕业证成绩单》
《Queensland毕业文凭-昆士兰大学毕业证成绩单》
 
Oxo-Acids of Halogens and their Salts.pptx
Oxo-Acids of Halogens and their Salts.pptxOxo-Acids of Halogens and their Salts.pptx
Oxo-Acids of Halogens and their Salts.pptx
 
User Guide: Pulsar™ Weather Station (Columbia Weather Systems)
User Guide: Pulsar™ Weather Station (Columbia Weather Systems)User Guide: Pulsar™ Weather Station (Columbia Weather Systems)
User Guide: Pulsar™ Weather Station (Columbia Weather Systems)
 
Pests of jatropha_Bionomics_identification_Dr.UPR.pdf
Pests of jatropha_Bionomics_identification_Dr.UPR.pdfPests of jatropha_Bionomics_identification_Dr.UPR.pdf
Pests of jatropha_Bionomics_identification_Dr.UPR.pdf
 
GLYCOSIDES Classification Of GLYCOSIDES Chemical Tests Glycosides
GLYCOSIDES Classification Of GLYCOSIDES  Chemical Tests GlycosidesGLYCOSIDES Classification Of GLYCOSIDES  Chemical Tests Glycosides
GLYCOSIDES Classification Of GLYCOSIDES Chemical Tests Glycosides
 
Davis plaque method.pptx recombinant DNA technology
Davis plaque method.pptx recombinant DNA technologyDavis plaque method.pptx recombinant DNA technology
Davis plaque method.pptx recombinant DNA technology
 
Servosystem Theory / Cybernetic Theory by Petrovic
Servosystem Theory / Cybernetic Theory by PetrovicServosystem Theory / Cybernetic Theory by Petrovic
Servosystem Theory / Cybernetic Theory by Petrovic
 
Base editing, prime editing, Cas13 & RNA editing and organelle base editing
Base editing, prime editing, Cas13 & RNA editing and organelle base editingBase editing, prime editing, Cas13 & RNA editing and organelle base editing
Base editing, prime editing, Cas13 & RNA editing and organelle base editing
 
FREE NURSING BUNDLE FOR NURSES.PDF by na
FREE NURSING BUNDLE FOR NURSES.PDF by naFREE NURSING BUNDLE FOR NURSES.PDF by na
FREE NURSING BUNDLE FOR NURSES.PDF by na
 
Forensic limnology of diatoms by Sanjai.pptx
Forensic limnology of diatoms by Sanjai.pptxForensic limnology of diatoms by Sanjai.pptx
Forensic limnology of diatoms by Sanjai.pptx
 
Four Spheres of the Earth Presentation.ppt
Four Spheres of the Earth Presentation.pptFour Spheres of the Earth Presentation.ppt
Four Spheres of the Earth Presentation.ppt
 
Organic farming with special reference to vermiculture
Organic farming with special reference to vermicultureOrganic farming with special reference to vermiculture
Organic farming with special reference to vermiculture
 

Chunlei Wu BD2K 201601 MyGene.info and MyVariant.info

  • 1. Chunlei Wu, Ph.D. cwu@scripps.edu @chunleiwu Associate Professor of Molecular Medicine Dept. of Molecular Experimental Medicine The Scripps Research Institute La Jolla, CA, USA 01/22/2016 From MyGene.info and MyVariant.info towards BioThings API
  • 2. As a MyGene.info and MyVariant.info recap Annotations Gene Variant (Aggregated) (high-performance) (real-time) Web Service
  • 3. So many variant annotation resources dbNSFP The Exome Aggregation Consortium (ExAC)
  • 4. Annotations centered around bio-entities Gene G Variant V Pathway P D Metabolite M Disease
  • 5. Simple JSON-based Aggregation mechanism { "_id": "chr1:g.196659237C>T", "cadd": { … }, "clinvar": { … }, "cosmic": { … }, "dbsnp": { … }, "dbnsfp": { … }, "evs": { … }, "emv": { … }, "mutdb": { … }, "gwassnp": { … }, "snpedia": { … }, "wellderly": { … } } { "_id": "chr1:g.196659237C>T", “dbsnp": { "snpclass": "single", "rsid": "rs1061170", "func": "missense" } } { "_id": "chr1:g.196659237C>T", “cosmic": { "tumor_site": "breast", "mut_freq": 0.49, } } { "_id": "chr1:g.196659237C>T", “dbnsfp": { “sift": { "breast“: “tolerated”, “val”: 1 } } } “cadd” “clinvar” “evs” “mutdb” …
  • 6. Keep data always up-to-date Each data source is updated individually. Colors indicate their different updating schedules. Schematic view of MyVariant.info architecture
  • 7. High-performance web service APIs Schematic view of MyVariant.info architecture
  • 8. MyVariant.info for the end users: http://MyVariant.info (currently v1 API, two endpoints) http://MyVariant.info/v1/query?q=<query> any query term(s) matching variant hits http://MyVariant.info/v1/variant/<variantid> hgvs id(s) matching variant object(s) Both supports batch-mode via POST Simple API. No sign-up. No API key. Try our live API , and documentations
  • 9. MyGene.info for the end users: http://MyGene.info (currently v2 API, two endpoints) http://MyGene.info/v2/query?q=<query> any query term(s) matching gene hits http://MyGene.info/v2/gene/<geneid> gene id(s) matching gene object(s) Both supports batch-mode via POST Simple API. No sign-up. No API key. Try our live API , and documentations
  • 11. Usage spikes (5M hits/day) during X-Mas 2014
  • 12. 30%9% 35% 26% Increased clients adoption Requests by MyGene.info clients Highlights: • mygene Python client usage now surpasses BioGPS usage • mygene R client usage now increased to 9% from <1% 10/07/2015-01/05/2016
  • 13. 30%9% 35% 26% Increased clients adoption mygene Python client hosted in PyPI mygene R client hosted in Bioconductor
  • 14. MyVariant.info updates Total over 334 Millions of annotated variants The Exome Aggregation Consortium (ExAC) New additions: dbNSFP Updated:
  • 16. MyVariant.info official Python/R Clients myvariant Python client hosted in PyPI (initial release in Aug 2015) myvariant R client hosted in Bioconductor (initial release in Oct 2015)
  • 17. A Node.js client made by a user with passion
  • 19. Make our APIs serve Linked Data via
  • 21. Linked Data for data aggregation MyVariant.info V Another Variant API V V
  • 22. Linked Data for data aggregation MyVariant.info Another Variant API { "_id": "chr1:g.196659237C>T", "cosmic": { "tumor_site": "breast", "mut_freq": 0.49 }, "clinvar": {…}, "dbsnp": {…}, … } { "pop": "GWD", "nobs": 226, "freq": 0.371681415929, … } { "_id": "chr1:g.196659237C>T", "cosmic": { "tumor_site": "breast", "mut_freq": 0.49 }, "clinvar": {…}, "dbsnp": {…}, "new_src": { "pop": "GWD", "nobs": 226, "freq": 0.371681415929 }, … }
  • 23. JSON + context = JSON-LD { "@context": { "clinvar": "http://schema.myvariant.info/datasource/clinvar", "rcv": "http://schema.myvariant.info/datanode/rcv", "gene": "http://schema.myvariant.info/datanode/gene", "_id": "@id" }, "_id": "chr6:g.26093141G>A", "clinvar": { "@context": { "uniprot": "http://identifiers.org/uniprot/", "omim": "http://identifiers.org/omim/" }, "chrom": "6", "alt": "A", "ref": "G", "allele_id": 15048, "rsid": "rs1800562", "rcv": { "@context": { "accession": "http://identifer.org/clinvar" }, "accession": "RCV000000020", "origin": "germline", "clinical_significance": "risk factor" }, "gene": { "@context": { "symbol": "http://identifiers.org/hgnc.symbol/" }, "id": "3077", "symbol": "HFE" }, "omim": "613609.0001", "variant_id": 9 } }
  • 24. Processed JSON-LD. JSON-LD N-Quads output: <chr6:g.26093141G>A> <http://schema.myvariant.info/datasource/clinvar> _:b0 . _:b0 <http://identifiers.org/omim/> "613609.0001" . _:b0 <http://schema.myvariant.info/datanode/gene> _:b1 . _:b0 <http://schema.myvariant.info/datanode/rcv> _:b2 . _:b1 <http://identifiers.org/hgnc.symbol/> "HFE" . _:b2 <http://identifer.org/clinvar> "RCV000000020" . JSON-LD compacted output: { "@id": "chr6:g.26093141G>A", "http://schema.myvariant.info/datasource/clinvar": { "http://identifiers.org/omim/": "613609.0001", "http://schema.myvariant.info/datanode/gene": { "http://identifiers.org/hgnc.symbol/": "HFE" }, "http://schema.myvariant.info/datanode/rcv": { "http://identifer.org/clinvar": "RCV000000020" } } }
  • 25. In a nutshell, what does a JSON-LD context do? It maps values in a JSON object to defined URIs: "http://identifer.org/clinvar" → clinvar.rcv.accession
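The field-to-URI mapping can be made concrete with a small recursive walk. This is a simplified sketch, not a conforming JSON-LD processor: the function name `map_paths_to_uris` is hypothetical, it handles only nested dictionaries with plain string context entries, and the sample document is a trimmed copy of the one on slide 23, with URIs kept exactly as written there.

```python
def map_paths_to_uris(node, inherited=None, prefix=""):
    """Return {dotted.path: URI} for every key the active context defines."""
    # A local @context extends the context inherited from the parent node.
    context = dict(inherited or {})
    context.update(node.get("@context", {}))
    mapping = {}
    for key, value in node.items():
        if key == "@context":
            continue
        path = f"{prefix}.{key}" if prefix else key
        uri = context.get(key)
        # Skip JSON-LD keywords such as "@id"; keep only real URIs.
        if isinstance(uri, str) and not uri.startswith("@"):
            mapping[path] = uri
        if isinstance(value, dict):
            mapping.update(map_paths_to_uris(value, context, path))
    return mapping


doc = {
    "@context": {
        "clinvar": "http://schema.myvariant.info/datasource/clinvar",
        "rcv": "http://schema.myvariant.info/datanode/rcv",
        "_id": "@id",
    },
    "_id": "chr6:g.26093141G>A",
    "clinvar": {
        "rsid": "rs1800562",
        "rcv": {
            "@context": {"accession": "http://identifer.org/clinvar"},
            "accession": "RCV000000020",
        },
    },
}
paths = map_paths_to_uris(doc)
```

Keys without a context entry, such as `rsid` in this trimmed document, are simply left out of the mapping, which mirrors how they are absent from the processed output on slide 24.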
  • 26. JSON-LD context makes your data "Linkable" "Linked" Downstream processing libraries
  • 27. A Python library for processing JSON-LD data In [1]: fetch_value_source_for_variant("chr6:g.26093141G>A","http://identifiers.org/dbsnp/") Out[1]: ['rs1800562 http://schema.myvarint.info/datasource/dbnsfp', 'rs1800562 http://schema.myvarint.info/datasource/clinvar', 'rs1800562 http://schema.myvarint.info/datasource/dbsnp', 'rs1800562 http://schema.myvarint.info/datasource/evs', 'rs1800562 http://schema.myvarint.info/datasource/gwassnps', 'rs1800562 http://schema.myvarint.info/datasource/mutdb'] By Kevin Xin
  • 28. Need to define an API spec to enable data exchange via JSON-LD • Output as a JSON object with a defined _id. • "jsonld=true/false" toggle for the inclusion of the JSON-LD context. • Support the retrieval of a single entity via GET (use case: individual data aggregation on the fly) • Support the retrieval of a list of entities via POST (use case: routine data aggregation in batches) • Output should indicate whether the entity exists: GET /variant/<unknown_id> → 404; POST /variant/ id1, <unknown_id>, id3 → [id1: {…}, <unknown_id>: "notfound", id3: {…}]
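The behaviour this spec asks for can be sketched without any web framework. Everything below is illustrative: the function names, the in-memory `STORE`, and the choice to return a `(status, body)` tuple are assumptions made for the example, not part of the proposed spec.

```python
def render(doc, context, jsonld):
    """Return a copy of the document, optionally with the JSON-LD context attached."""
    return {"@context": context, **doc} if jsonld else dict(doc)


def get_entity(store, context, entity_id, jsonld=False):
    """Single-entity GET: 404 when the id is unknown."""
    doc = store.get(entity_id)
    if doc is None:
        return 404, None
    return 200, render(doc, context, jsonld)


def post_entities(store, context, ids, jsonld=False):
    """Batch POST: always 200, with unknown ids marked "notfound" in place."""
    results = {}
    for entity_id in ids:
        doc = store.get(entity_id)
        results[entity_id] = "notfound" if doc is None else render(doc, context, jsonld)
    return 200, results


# Hypothetical data store and context used only for this example.
CONTEXT = {"_id": "@id"}
STORE = {"id1": {"_id": "id1"}, "id3": {"_id": "id3"}}

status_missing, _ = get_entity(STORE, CONTEXT, "unknown_id")
status_found, found = get_entity(STORE, CONTEXT, "id1", jsonld=True)
_, batch_result = post_entities(STORE, CONTEXT, ["id1", "unknown_id", "id3"])
```

Keeping the missing id inside the batch response, rather than failing the whole request, lets a client aggregate whatever is available and handle the gaps afterwards.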
  • 30. BioThings API MyVariant.info MyGene.info JSON data aggregation mechanism High-performance query engine Well-designed REST API pattern JSON-LD enabled Linked Data Data-updating scheduler Python/R clients …
  • 31. Data-sharing via Web API is trending Making a single web service is trivial, but making a sustainable/scalable web API is non-trivial. We would like to help other groups to create their own hosted web API for sharing their data.
  • 32. Action item 1: BioThings API whitepaper Also the action item from last BD2K CA consortium meeting and the API working group from last year's NIH BD2K AHM
  • 33. Action item 2: BioThings API framework NIH commons Infrastructure as a Service: Software as a Service: BioThings API
  • 34. Action item 3: expansion to other "BioThings" D Disease D Drugs MyDrug.info MyDisease.info need an alt. name here
  • 35. Acknowledgement Funding and Support U54GM114833 U01HG008473 Washington U: Ben Ainscough Obi Griffith TSRI: Andrew Su Jiwen Xin Cyrus Afrasiabi Ginger Tsueng Adam Mark Greg Stupp Tim Putman STSI: Eric Topol Ali Torkamani Galina Erikson U. Washington: Sean Mooney Moritz Juchler Nikhil Gopal OICR: Robin Haw UC Berkeley: Chris Mungall UCSD: Trish Whetzel MyVariant.info MyGene.info

Editor's notes

  1. A high-performance query engine for aggregated variant annotations.
  2. A high-performance query engine for aggregated variant annotations.
  3. Annotation data are fundamental. Gene annotations: no slide is needed to explain them; everyone needs them. Variant annotations: relatively new, and increasingly in demand due to the boom in NGS.