The ENCODE metadata standard to integrate diverse experimental data sets
Eurie L. Hong, Ph.D. (@elhong)
Project Manager, ENCODE DCC
Department of Genetics • Stanford University School of Medicine
  
Outline
• Intro to the DCC
• Metadata definition
• Using ontologies
• Accessing metadata
  
ENCODE DCC
Galt Barber, Morgan Maddren, Nikhil Podduturi, Greg Roe, Kate Rosenbloom, Laurence Rowe
Esther Chan, Venkat Malladi, Cricket Sloan, Seth Strattan
Eurie Hong, Mike Cherry (PI), Jim Kent (co-PI), Ben Hitz
Brian Lee, Stuart Miyasato, Matt Simison, Zhenhua Wang
Not pictured: Tim Dreszer, Jorge Garcia, Donna Karolchik, Katrina Learned, Forrest Tanaka, Marcus Ho
Team roles: data wranglers; software engineers; QA, sysadmins, admin
@encodedcc  encode-help@lists.stanford.edu
https://github.com/ENCODE-DCC/encoded
  
Role of the Data Coordination Center
[Figure labels: Production labs, Analysis groups; Data files, Metadata; DCC; Integrative websites (Genome Browser, ENCODE portal (DCC)); Scientific community]

Role:   Data generation    | Data organization            | Data access
Tasks:  Perform assays     | Data processing & validation | Web-based searches
        Perform analyses   | Data file storage            | Data downloads
        Validate data      | Metadata curation            |
        Submit data files  |                              |
        Submit metadata    |                              |
Challenge: How do you define a metadata standard for diverse assays in multiple species?
[Figure modified from PLoS Biol 9:e1001046, 2011 (M. Pazin)]
  
Principles driving metadata definition
• Provide transparency about how experiments were performed
• Capture data provenance during analyses
• Communicate key experimental variables of an experiment
• Communicate quality metrics about the data
• Help analyze and interpret the data
• Help organize and find the data
  
Capture the experimental design
[Figure: an Experiment contains Biological replicates 1 and 2, each with Technical replicates 1 and 2, alongside Control 1 and Control 2. A technical replicate is linked to data files and a results file.]
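The replicate structure on this slide can be sketched as a small set of nested records. This is only an illustration under assumptions: the class names, field names and file names below are hypothetical and are not taken from the ENCODE schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TechnicalReplicate:
    number: int
    data_files: List[str] = field(default_factory=list)


@dataclass
class BiologicalReplicate:
    number: int
    technical_replicates: List[TechnicalReplicate] = field(default_factory=list)


@dataclass
class Experiment:
    name: str
    biological_replicates: List[BiologicalReplicate] = field(default_factory=list)
    controls: List["Experiment"] = field(default_factory=list)
    results_files: List[str] = field(default_factory=list)

    def all_data_files(self) -> List[str]:
        """Collect the data files of every technical replicate, in order."""
        return [
            f
            for bio in self.biological_replicates
            for tech in bio.technical_replicates
            for f in tech.data_files
        ]


def build_example() -> Experiment:
    """Two biological replicates, each with two technical replicates, plus two controls."""
    bio_reps = [
        BiologicalReplicate(
            number=b,
            technical_replicates=[
                # File names are made up for the example.
                TechnicalReplicate(number=t, data_files=[f"bio{b}_tech{t}.fastq"])
                for t in (1, 2)
            ],
        )
        for b in (1, 2)
    ]
    controls = [Experiment(name=f"Control {i}") for i in (1, 2)]
    return Experiment("Example experiment", bio_reps, controls, ["results.bed"])


example = build_example()
print(example.all_data_files())
```

Keeping controls as experiments in their own right is one possible design choice; the slide only shows that controls sit next to the replicates.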
  
Identify reusable experimental variables
(selected subset of all metadata)

Biosamples
• Type (e.g. tissue, cell line)
• Ontology term name
• Source, product id, lot id
• Treatments
• Knockdown
• Fusion construct information
• Donor or strain information
• Dates (e.g. growth, harvest, procurement)
• Passage number
• Starting amount
• Lab assigned IDs

Antibodies
• Source, product id, lot id
• Isotype
• Antigen
• Host
• Purification method
• Validation status
• NHGRI approval status
• Target
• Species
• Dbxrefs

Libraries
• Library preparation protocol
• Strand specificity
• Size selection method
• Validation document
• Lysis method
• Sonication method
• Extraction method
• Nucleic acid type
• Nucleic acid size range

Files (peak calls, alignment)
• Reference genome version
• Alignment software
• Software parameters
• Software version
• Quality metrics (e.g. NRF, FRiP)

Experiment with replicates
  
Accession them
(Same biosample, antibody, library and file variables as on the previous slide; each object now carries an accession.)
• Experiment with replicates: ENCSR000DRY
• Biosample: ENCBS095DKV
• Donor: ENCDO826IFN
• Antibody: ENCAB964IAU
• Library: ENCLB239KAN
• File: ENCFF254TDA
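The six accessions on this slide share a visible shape: the prefix ENC, a two-letter object code, three digits and three letters. As a rough sketch, assuming that shape holds in general (it is inferred here from the slide's examples only, not from a format specification), an accession can be classified like this. The slide itself labels only the experiment, biosample and donor; the antibody, library and file labels are read from the prefixes.

```python
import re
from typing import Optional

# Two-letter codes seen in the slide's examples, with the object type they appear to denote.
ACCESSION_TYPES = {
    "SR": "experiment",
    "BS": "biosample",
    "DO": "donor",
    "AB": "antibody",
    "LB": "library",
    "FF": "file",
}

# Pattern inferred from the examples: ENC + 2 letters + 3 digits + 3 letters.
ACCESSION_RE = re.compile(r"^ENC([A-Z]{2})(\d{3})([A-Z]{3})$")


def accession_type(accession: str) -> Optional[str]:
    """Return the object type for an accession, or None if it does not fit the pattern."""
    match = ACCESSION_RE.match(accession)
    if match is None:
        return None
    return ACCESSION_TYPES.get(match.group(1))


for acc in ["ENCSR000DRY", "ENCBS095DKV", "ENCDO826IFN",
            "ENCAB964IAU", "ENCLB239KAN", "ENCFF254TDA"]:
    print(acc, accession_type(acc))
```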
  
Define their relationship to each other
[Figure: Experiment, Replicate, Biosample, Donor, Antibodies, Libraries and Files connected to one another by "has" links.]
  
Challenge: Find common biosamples from data generated by two consortia
Projects are internally consistent...
• 356 terms: http://encodeproject.org/ENCODE/cellTypes.html
• 314 terms: GEO characteristics (common_name, tissue_type, cell_type, lines)
... but only 3 biosample names match exactly between projects
• 360 terms: Cell type
• 314 terms: GEO
• Exact matches: IMR90, PBMC, Th17
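The comparison behind this slide amounts to intersecting two lists of free-text names. A minimal sketch, assuming short illustrative lists: the names come from these slides, but which project uses which name is an assumption made for the example, and the real term lists are not reproduced here.

```python
from typing import Iterable, List


def exact_matches(names_a: Iterable[str], names_b: Iterable[str]) -> List[str]:
    """Names that appear with identical spelling in both lists."""
    return sorted(set(names_a) & set(names_b))


def unmatched(names_a: Iterable[str], names_b: Iterable[str]) -> List[str]:
    """Names from the first list with no identically spelled partner in the second."""
    return sorted(set(names_a) - set(names_b))


# Illustrative subsets only; the project assignment is hypothetical.
project_a = ["IMR90", "PBMC", "Th17", "HCM", "HCF", "Heart_OC"]
project_b = ["IMR90", "PBMC", "Th17", "Heart", "Fetal Heart", "Right Atrium"]

print(exact_matches(project_a, project_b))
print(unmatched(project_a, project_b))
```

Exact string comparison cannot tell that "Heart_OC" and "Heart" refer to related material, which is the gap the ontology annotations are meant to close.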
  
Challenge: Find all heart-related tissues?
• One project: Heart_OC, HCF, HCFaa, HCM, others?
• Other project: Fetal Heart, Heart, Right Atrium, Right Ventricle, others?
  
Project integration using ontologies
• OBI (for assays): http://obi-ontology.org
• EFO (for cell lines): http://www.ebi.ac.uk/efo/
• UBERON (for tissues): http://uberon.org/
• CL (for primary cells): http://cellontology.org/
[Figure labels: DCC; ENCODE portal (DCC); Other projects]
  
Ontology-driven searches
http://www.encodedcc.org/
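An ontology-driven search can be pictured as follows: each biosample is annotated with an ontology term, and a query for "heart" also returns biosamples whose term sits below "heart" in the hierarchy. The sketch below is a toy under assumptions. The term labels, the parent links and the biosample annotations are all hypothetical stand-ins, not actual UBERON or CL content and not the portal's implementation.

```python
from typing import Dict, List, Set

# Hypothetical child -> parents links (is_a / part_of collapsed into one relation).
PARENTS: Dict[str, List[str]] = {
    "right atrium": ["heart"],
    "right ventricle": ["heart"],
    "cardiac muscle cell": ["heart"],
    "cardiac fibroblast": ["heart"],
    "lung fibroblast": ["lung"],
}

# Hypothetical annotations: lab-assigned biosample name -> ontology term label.
ANNOTATIONS: Dict[str, str] = {
    "Heart": "heart",
    "Fetal Heart": "heart",
    "Right Atrium": "right atrium",
    "Right Ventricle": "right ventricle",
    "HCM": "cardiac muscle cell",
    "HCF": "cardiac fibroblast",
    "IMR90": "lung fibroblast",
}


def ancestors(term: str, parents: Dict[str, List[str]]) -> Set[str]:
    """All terms reachable by following parent links upward from `term`."""
    seen: Set[str] = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def search(query_term: str,
           annotations: Dict[str, str],
           parents: Dict[str, List[str]]) -> List[str]:
    """Biosamples annotated with the query term or with any of its descendants."""
    return sorted(
        name
        for name, term in annotations.items()
        if term == query_term or query_term in ancestors(term, parents)
    )


print(search("heart", ANNOTATIONS, PARENTS))
```

With annotations of this kind, differently named biosamples from separate projects can be retrieved by one query, provided both projects annotate against the same ontologies.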
  
Challenge: Provide user-friendly *AND* programmatic access to the data
• Metadata database (DCC)
• Metadata in JSON-LD
• Metadata viewed as web page
• Scripts: query using REST API commands (GET, PATCH, POST)
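To make the REST idea concrete, the sketch below builds a GET and a PATCH request with Python's standard library, without sending them. The base URL is the one shown on the slides. The resource path ("experiments/ENCSR000DRY") and the patched field ("description") are assumptions made for illustration, not documented endpoints or schema fields, and write requests would presumably need credentials that are not shown.

```python
import json
import urllib.request

BASE_URL = "http://www.encodedcc.org"  # URL as given on the slides


def build_get(path: str) -> urllib.request.Request:
    """Prepare a GET request asking for the JSON form of a metadata object."""
    url = f"{BASE_URL}/{path.strip('/')}/"
    return urllib.request.Request(
        url, headers={"Accept": "application/json"}, method="GET"
    )


def build_patch(path: str, changes: dict) -> urllib.request.Request:
    """Prepare a PATCH request carrying only the fields to change, as JSON."""
    url = f"{BASE_URL}/{path.strip('/')}/"
    body = json.dumps(changes).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Accept": "application/json", "Content-Type": "application/json"},
        method="PATCH",
    )


# Hypothetical path and field name, used only to show the request shape.
get_req = build_get("experiments/ENCSR000DRY")
patch_req = build_patch("experiments/ENCSR000DRY", {"description": "updated text"})

print(get_req.get_method(), get_req.full_url)
print(patch_req.get_method(), patch_req.data.decode("utf-8"))
```

Passing a prepared request to `urllib.request.urlopen` would send it. The same object served as a web page to a browser and as JSON-LD to a script is what lets one application cover both kinds of access.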
  	
  
Integration with other resources
• Genome Browser
http://www.encodedcc.org/
  
Future directions
• Metadata definition: Finalize software and file provenance
• Ontology-based searches: Implement searches for ChIP-seq targets using GO annotations
• Programmatic access: Implement additional validations upon data submission
  
Conclusions
• Metadata definition: We developed a single data model that reflects the experimental process to store the 30+ assays done by the ENCODE production labs
• Using ontologies: Using ontologies to annotate metadata provides instant interoperability with other datasets & search functionality
• Accessing metadata: Application built on a REST API & JSON-LD supports programmatic querying across other scientific resources
  
Acknowledgements
Brian Lee, Nikhil Podduturi, Greg Roe, Laurence Rowe
Esther Chan, Venkat Malladi, Cricket Sloan, Seth Strattan
Eurie Hong, Mike Cherry (PI), Jim Kent (co-PI), Ben Hitz
@encodedcc  encode-help@lists.stanford.edu
  

Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
 
Botany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdfBotany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdf
 
Forensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdfForensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdf
 
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
 
Nanoparticles synthesis and characterization​ ​
Nanoparticles synthesis and characterization​  ​Nanoparticles synthesis and characterization​  ​
Nanoparticles synthesis and characterization​ ​
 

ENCODE-DCC-metadata-standard-Biocurator 2014

  • 1. The ENCODE metadata standard to integrate diverse experimental data sets
      Eurie L. Hong, Ph.D. (@elhong), Project Manager, ENCODE DCC
      Department of Genetics, Stanford University School of Medicine
      Outline: Intro to the DCC; Metadata definition; Using ontologies; Accessing metadata
  • 2. ENCODE DCC
      Team: Galt Barber, Morgan Maddren, Nikhil Podduturi, Greg Roe, Kate Rosenbloom, Laurence Rowe, Esther Chan, Venkat Malladi, Cricket Sloan, Seth Strattan, Eurie Hong, Mike Cherry (PI), Jim Kent (co-PI), Ben Hitz, Brian Lee, Stuart Miyasato, Matt Simison, Zhenhua Wang
      Not pictured: Tim Dreszer, Jorge Garcia, Donna Karolchik, Katrina Learned, Forrest Tanaka, Marcus Ho
      Roles: data wranglers; software engineers; QA, sysadmins, admin
      Contact: @encodedcc, encode-help@lists.stanford.edu
      Code: https://github.com/ENCODE-DCC/encoded
  • 3. Role of the Data Coordination Center
      Production labs and analysis groups. Role: data generation. Tasks: perform assays, perform analyses, validate data, submit data files, submit metadata.
      DCC. Role: data organization. Tasks: data processing & validation, data file storage, metadata curation.
      Integrative websites (Genome Browser, ENCODE portal (DCC)) for the scientific community. Role: data access. Tasks: web-based searches, data downloads.
      Flow: data files and metadata go to the DCC, then on to the integrative websites and the scientific community.
  • 4. Challenge: How do you define a metadata standard for diverse assays in multiple species?
      [Figure modified from PLoS Biol 9-e1001046, 2011 (M. Pazin)]
  • 5. Principles driving metadata definition
      • Provide transparency about how experiments were performed
      • Capture data provenance during analyses
      • Communicate key experimental variables of an experiment
      • Communicate quality metrics about the data
      • Help analyze and interpret the data
      • Help organize and find the data
  • 6. Capture the experimental design
      [Diagram: an experiment contains biological replicates 1 and 2, each with technical replicates 1 and 2, alongside controls 1 and 2. Each technical replicate produces a data file, and data files lead to a results file.]
  • 7. Identify reusable experimental variables (selected subset of all metadata)
      Experiment with replicates
      Biosamples
        • Type (e.g. tissue, cell line)
        • Ontology term name
        • Source, product id, lot id
        • Treatments
        • Knockdown
        • Fusion construct information
        • Donor or strain information
        • Dates (e.g. growth, harvest, procurement)
        • Passage number
        • Starting amount
        • Lab assigned IDs
      Antibodies
        • Source, product id, lot id
        • Isotype
        • Antigen
        • Host
        • Purification method
        • Validation status
        • NHGRI approval status
        • Target
        • Species
        • Dbxrefs
      Libraries
        • Library preparation protocol
        • Strand specificity
        • Size selection method
        • Validation document
        • Lysis method
        • Sonication method
        • Extraction method
        • Nucleic acid type
        • Nucleic acid size range
      Files (alignment, peak calls)
        • Reference genome version
        • Alignment software
        • Software parameters
        • Software version
        • Quality metrics (e.g. NRF, FRiP)
  • 8. Accession them (selected subset of all metadata)
      Experiment with replicates (ENCSR000DRY)
      Biosamples
        • Type (e.g. tissue, cell line)
        • Ontology term name
        • Source, product id, lot id
        • Treatments
        • Knockdown
        • Fusion construct information
        • Donor or strain information
        • Dates (e.g. growth, harvest, procurement)
        • Passage number
        • Starting amount
        • Lab assigned IDs
      Antibodies
        • Source, product id, lot id
        • Isotype
        • Antigen
        • Host
        • Purification method
        • Validation status
        • NHGRI approval status
        • Target
        • Species
        • DBxrefs
      Libraries
        • Library preparation protocol
        • Strand specificity
        • Size selection method
        • Validation document
        • Lysis method
        • Sonication method
        • Extraction method
        • Nucleic acid type
        • Nucleic acid size range
      Files (alignment, peak calls)
        • Reference genome version
        • Alignment software
        • Software parameters
        • Software version
        • Quality metrics (e.g. NRF, FRiP)
      Example accessions: ENCBS095DKV (biosample), ENCDO826IFN (donors), ENCAB964IAU, ENCLB239KAN, ENCFF254TDA
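
The accessions on this slide share a visible shape, which can be checked mechanically. The following is a minimal Python sketch, not the DCC's own validator: the pattern ("ENC", a two-letter type code, three digits, three uppercase letters) is inferred only from the six examples shown, and the slide labels just ENCSR, ENCBS and ENCDO explicitly. The meanings given to AB, LB and FF, and the function name `classify_accession`, are assumptions for illustration.

```python
import re

# Type codes. Only SR, BS and DO are labeled on the slide; the rest are
# assumptions based on the object categories listed next to them.
ACCESSION_TYPES = {
    "SR": "experiment",
    "BS": "biosample",
    "DO": "donor",
    "AB": "antibody",  # assumed
    "LB": "library",   # assumed
    "FF": "file",      # assumed
}

# Pattern inferred from the slide's examples, e.g. ENCSR000DRY.
ACCESSION_RE = re.compile(r"^ENC([A-Z]{2})(\d{3})([A-Z]{3})$")


def classify_accession(accession: str) -> str:
    """Return the object type for an accession, or raise ValueError."""
    match = ACCESSION_RE.match(accession)
    if match is None:
        raise ValueError(f"not shaped like an ENCODE accession: {accession!r}")
    code = match.group(1)
    if code not in ACCESSION_TYPES:
        raise ValueError(f"unknown type code {code!r} in {accession!r}")
    return ACCESSION_TYPES[code]


if __name__ == "__main__":
    for acc in ("ENCSR000DRY", "ENCBS095DKV", "ENCDO826IFN"):
        print(acc, "->", classify_accession(acc))
```

Encoding the object type in the identifier lets a script decide which kind of record it is looking at before fetching anything.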
  • 9. Define their relationship to each other
      [Diagram: Experiment, Replicate, Biosample, Donor, Antibodies, Libraries and Files, linked to one another by "has" relationships.]
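
One way to picture the "has" relationships is as nested records. The sketch below uses Python dataclasses and is an illustration only: the flattened slide does not show which object owns which, so the direction of every link here (experiment has replicates, replicate has a library, an antibody and files, library has a biosample, biosample has a donor) is an assumption, as are all class and field names. Pairing the unlabeled accessions with antibody, library and file is likewise assumed.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Donor:
    accession: str


@dataclass
class Biosample:
    accession: str
    donor: Optional[Donor] = None  # assumed: biosample "has" donor


@dataclass
class Library:
    accession: str
    biosample: Biosample  # assumed: library "has" biosample


@dataclass
class Antibody:
    accession: str


@dataclass
class File:
    accession: str


@dataclass
class Replicate:
    biological_replicate: int
    technical_replicate: int
    library: Library
    antibody: Optional[Antibody] = None
    files: List[File] = field(default_factory=list)


@dataclass
class Experiment:
    accession: str
    replicates: List[Replicate] = field(default_factory=list)

    def biosamples(self) -> List[Biosample]:
        """Distinct biosamples reachable through the replicates."""
        seen = {}
        for rep in self.replicates:
            sample = rep.library.biosample
            seen.setdefault(sample.accession, sample)
        return list(seen.values())


# Example wiring that reuses the accessions from the previous slide.
donor = Donor("ENCDO826IFN")
sample = Biosample("ENCBS095DKV", donor)
library = Library("ENCLB239KAN", sample)
experiment = Experiment(
    "ENCSR000DRY",
    [
        Replicate(1, 1, library, Antibody("ENCAB964IAU"), [File("ENCFF254TDA")]),
        Replicate(1, 2, library, Antibody("ENCAB964IAU")),
    ],
)
```

Because each reusable object is stored once and referenced by accession, two technical replicates can point at the same library and biosample without duplicating their metadata.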
  • 10. Challenge: Find common biosamples from data generated by two consortia
      Projects are internally consistent…
      356 terms: http://encodeproject.org/ENCODE/cellTypes.html
      314 terms: GEO characteristics (common_name, tissue_type, cell_type, lines)
  • 11. … but only 3 biosample names match exactly between projects
      360 terms (Cell type) vs. 314 terms (GEO)
      Exact matches: IMR90, PBMC, Th17
  • 12. Challenge: Find all heart-related tissues?
      One project: Heart_OC, HCF, HCFaa, HCM, others?
      Other project: Fetal Heart, Heart, Right Atrium, Right Ventricle, others?
  • 13. Project integration using ontologies
      • OBI (for assays): http://obi-ontology.org
      • EFO (for cell lines): http://www.ebi.ac.uk/efo/
      • UBERON (for tissues): http://uberon.org/
      • CL (for primary cells): http://cellontology.org/
      [Diagram: the DCC / ENCODE portal (DCC) and other projects connected through shared ontologies.]
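
The heart-tissue question from the earlier slide shows why ontology annotation helps: once each free-text label points at a term, "everything under heart" becomes a graph walk instead of a string match. The Python sketch below is a toy under stated assumptions. The term IDs (`EX:...`) are made-up placeholders rather than real UBERON, CL or EFO identifiers, the parent links are a hand-written stand-in for an ontology's is_a / part_of hierarchy, and the label-to-term assignments are illustrative, not curated annotations.

```python
from typing import Dict, List, Set

# child term -> parent terms (hypothetical IDs, hand-written hierarchy)
PARENTS: Dict[str, List[str]] = {
    "EX:heart": [],
    "EX:right_atrium": ["EX:heart"],
    "EX:right_ventricle": ["EX:heart"],
    "EX:cardiac_fibroblast": ["EX:heart"],
    "EX:lung": [],
}

# free-text biosample label -> ontology term (illustrative assignments)
LABEL_TO_TERM: Dict[str, str] = {
    "Heart": "EX:heart",
    "Fetal Heart": "EX:heart",
    "Right Atrium": "EX:right_atrium",
    "Right Ventricle": "EX:right_ventricle",
    "HCF": "EX:cardiac_fibroblast",
    "IMR90": "EX:lung",
}


def ancestors(term: str) -> Set[str]:
    """All terms reachable by following parent links from `term`."""
    seen: Set[str] = set()
    stack = list(PARENTS.get(term, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS.get(current, []))
    return seen


def labels_under(root: str) -> List[str]:
    """Labels annotated with `root` itself or with any term below it."""
    return sorted(
        label
        for label, term in LABEL_TO_TERM.items()
        if term == root or root in ancestors(term)
    )


if __name__ == "__main__":
    print(labels_under("EX:heart"))
```

With these toy annotations, `labels_under("EX:heart")` returns labels from both naming schemes, which is the cross-project search the slides are asking for.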
  • 15. Challenge: Provide user-friendly *AND* programmatic access to the data
      • Metadata database (DCC)
      • Metadata in JSON-LD
      • Metadata viewed as web page
      • Scripts: query using REST API commands GET, PATCH, POST
      • Genome Browser
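
A script-side view of the GET, PATCH and POST commands might look like the following Python sketch, which only builds the requests and does not send them. It is not the portal's documented client. The base URL is taken from the "Integration with other resources" slide and is assumed here to serve the API; the path `/experiments/ENCSR000DRY/`, the `description` field in the PATCH body, and the helper name `build_request` are hypothetical.

```python
import json
from typing import Optional
from urllib.parse import urljoin
from urllib.request import Request

BASE_URL = "http://www.encodedcc.org/"  # assumed API host, from the slides
ALLOWED_METHODS = {"GET", "PATCH", "POST"}  # the verbs named on the slide


def build_request(method: str, path: str, payload: Optional[dict] = None) -> Request:
    """Build (but do not send) a JSON request for a metadata object."""
    if method not in ALLOWED_METHODS:
        raise ValueError(f"unsupported method: {method}")
    headers = {"Accept": "application/json"}
    data = None
    if payload is not None:
        # PATCH and POST carry the metadata change as a JSON body.
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return Request(urljoin(BASE_URL, path), data=data, headers=headers, method=method)


# Read one experiment's metadata (hypothetical path).
get_req = build_request("GET", "/experiments/ENCSR000DRY/")

# Update a single field on the same object (hypothetical field name).
patch_req = build_request(
    "PATCH", "/experiments/ENCSR000DRY/", {"description": "example"}
)
```

Passing either object to `urllib.request.urlopen` would perform the call. Serving the same record as JSON to scripts and as a web page to people is what lets one database back both kinds of access.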
  • 16. Integration with other resources
      http://www.encodedcc.org/
  • 17. Future directions
      • Metadata definition: Finalize software and file provenance
      • Ontology-based searches: Implement searches for ChIP-seq targets using GO annotations
      • Programmatic access: Implement additional validations upon data submission
  • 18. Conclusions
      • We developed a single data model that reflects the experimental process to store the 30+ assays done by the ENCODE production labs
      • Using ontologies to annotate metadata provides instant interoperability with other datasets & search functionality
      • Application built on a REST API & JSON-LD supports programmatic querying across other scientific resources
  • 19. Acknowledgements
      Brian Lee, Nikhil Podduturi, Greg Roe, Laurence Rowe, Esther Chan, Venkat Malladi, Cricket Sloan, Seth Strattan, Eurie Hong, Mike Cherry (PI), Jim Kent (co-PI), Ben Hitz
      @encodedcc, encode-help@lists.stanford.edu