Short description of, and updates on, the DeepBlue Epigenomic Data Server that I presented at the last BLUEPRINT (http://www.blueprint-epigenome.eu/) Jamboree in Madrid (June 2016)
DeepBlue epigenomic data server: programmatic data retrieval and analysis of epigenome region sets
1. DeepBlue epigenomic data server
programmatic data retrieval and analysis of epigenome region sets.
Felipe Albrecht, Markus List
Max Planck Institute for Informatics
June 23, 2016
2. Problems with the epigenomic data deluge
Data access
Data analysis
Scalability
3. • More than a simple data archive
• Organizes the data with a defined
vocabulary and ontologies (defined by
IHEC)
• Full-text search
• Web interface
• Detailed documentation
• Data operations
– filter by region attributes (location,
score)
– data intersection
– flank and extend regions
– aggregate/summarize
• Download only the relevant data
• Use your favorite language:
– R, Python, JavaScript, etc.
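The data operations listed above can be pictured with a small local sketch. This is only an illustration of what each operation means, run on in-memory data; it is not DeepBlue's implementation or its API. The `Region` record, all function names, and the sample coordinates are hypothetical, and strand is ignored.

```python
from collections import namedtuple

# Hypothetical in-memory region record (DeepBlue keeps regions server-side).
Region = namedtuple("Region", ["chrom", "start", "end", "score"])


def filter_by_score(regions, min_score):
    """Keep regions whose score is at least min_score."""
    return [r for r in regions if r.score >= min_score]


def filter_by_location(regions, chrom, start, end):
    """Keep regions on chrom that overlap the half-open interval [start, end)."""
    return [r for r in regions
            if r.chrom == chrom and r.start < end and r.end > start]


def intersect(regions, others):
    """Keep regions that overlap at least one region in others."""
    return [r for r in regions
            if any(r.chrom == o.chrom and r.start < o.end and r.end > o.start
                   for o in others)]


def extend(regions, length):
    """Grow each region by length bases on both sides (start clamped at 0)."""
    return [r._replace(start=max(0, r.start - length), end=r.end + length)
            for r in regions]


def flank_upstream(regions, length):
    """Return the length bases immediately before each region (strand ignored)."""
    return [r._replace(start=max(0, r.start - length), end=r.start)
            for r in regions]


def aggregate_mean(regions, windows):
    """Mean score of the regions overlapping each window (None if none overlap)."""
    result = []
    for w in windows:
        scores = [r.score for r in intersect(regions, [w])]
        result.append(sum(scores) / len(scores) if scores else None)
    return result


# Made-up sample data for illustration only.
peaks = [
    Region("chr1", 100, 200, 5.0),
    Region("chr1", 500, 600, 1.0),
    Region("chr2", 100, 200, 9.0),
]
promoters = [Region("chr1", 150, 300, 0.0)]

strong = filter_by_score(peaks, 2.0)         # filter by a region attribute
in_promoters = intersect(strong, promoters)  # data intersection
```

Chaining the operations, as in the last two lines, mirrors the idea of downloading only the relevant data: filtering and intersecting happen first, and only the small final result is retrieved.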
4. Signal, peak and methylation
– Histone marks
– DNA Methylation - WGBS, RRBS
– RNA-seq (mRNA, shRNA knockdown)
– Chromatin accessibility (DNase, NOMe)
– Transcription factor binding sites
– Gene annotation sets (GENCODE)
Module that periodically updates DeepBlue with new data
DeepBlue data
Data from the following Epigenome Mapping Consortia
• BLUEPRINT Epigenome
• DEEP (for DEEP members)
• ENCODE
• Roadmap Epigenomics
More than 56,000 experiments and 18 TB of data
5. • Besides the experiment name,
all experiments have 5 mandatory metadata fields that are
part of controlled vocabularies:
– Genome assembly
– BioSources (cell lines, cell types, tissues, organs) -
CL, EFO, and UBERON ontology
– Epigenetic mark
– Technique
– Project
• It is possible to include key-value strings with extra information
about the epigenomic experiment
Experiment metadata
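A minimal sketch of how such a metadata record could be checked: five mandatory fields validated against controlled vocabularies, plus free key-value strings. The field names, the tiny vocabularies, and the `validate_experiment` helper are assumptions made for illustration; they are not DeepBlue's actual schema or code.

```python
# Hypothetical names for the five mandatory fields.
MANDATORY_FIELDS = ("genome", "biosource", "epigenetic_mark", "technique", "project")

# Hypothetical, deliberately tiny controlled vocabularies (illustration only).
VOCABULARY = {
    "genome": {"hg19", "GRCh38"},
    "biosource": {"monocyte", "liver"},
    "epigenetic_mark": {"H3K27ac", "DNA Methylation"},
    "technique": {"ChIP-seq", "WGBS"},
    "project": {"BLUEPRINT Epigenome", "ENCODE"},
}


def validate_experiment(name, metadata, extra=None):
    """Return a list of problems; an empty list means the record is accepted."""
    problems = []
    if not name:
        problems.append("missing experiment name")
    for field in MANDATORY_FIELDS:
        value = metadata.get(field)
        if value is None:
            problems.append(f"missing mandatory field: {field}")
        elif value not in VOCABULARY[field]:
            problems.append(f"{field}: '{value}' is not in the controlled vocabulary")
    # Extra information is free-form, but restricted to string key-value pairs.
    for key, value in (extra or {}).items():
        if not isinstance(key, str) or not isinstance(value, str):
            problems.append(f"extra metadata must be string pairs: {key!r}")
    return problems


record = {
    "genome": "GRCh38",
    "biosource": "monocyte",
    "epigenetic_mark": "H3K27ac",
    "technique": "ChIP-seq",
    "project": "BLUEPRINT Epigenome",
}
problems = validate_experiment("example_experiment", record, {"donor_age": "45"})
```

Restricting the mandatory fields to known terms is what makes experiments from different consortia comparable and searchable; the extra key-value strings stay unconstrained.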
8. http://deepblue.mpi-inf.mpg.de/R
• Intuitive access for R users
• Connects to other R/Bioconductor packages, such as GenomicRanges and Gviz,
to facilitate downstream analysis
• Documentation (examples and vignette)
• Submitted to Bioconductor
9. Summarize and plot the average DNA
methylation level across multiple files
More examples at
http://deepblue.mpi-inf.mpg.de/R
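The summarization step can be sketched locally as follows. This is not the R example from the slide: it assumes the per-region methylation levels (fractions between 0 and 1) have already been retrieved for each file, the file names and values are made up, and plotting is left out.

```python
from statistics import mean


def summarize_methylation(files):
    """Map each file name to its mean methylation level, skipping empty files."""
    return {name: mean(levels) for name, levels in files.items() if levels}


# Made-up methylation levels per file (illustration only).
files = {
    "sample_a": [0.25, 0.75],
    "sample_b": [0.5, 1.0, 0.0],
    "empty_file": [],
}

per_file = summarize_methylation(files)
overall = mean(per_file.values())  # average of the per-file means
```

The `per_file` dictionary is the kind of small summary table that a plotting library would then turn into a bar chart or box plot.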
10. Acknowledgements
Thomas Lengauer
Christoph Bock
Joachim Büch
Georg Friedrich
Markus List
Peter Ebert
Fabian Müller
Obaro Odiete
Albrecht, F., List, M., Bock, C. and Lengauer, T. (2016) DeepBlue
epigenomic data server: programmatic data retrieval and
analysis of epigenome region sets. Nucleic Acids Research,
doi:10.1093/nar/gkw211