2. Background
• physics diploma, University of Heidelberg
• diploma thesis in radiation dosimetry at DKFZ
• measurements at HIT
2 24.05.2012 Felix Klein
3. Why bioinformatics?
• interdisciplinary
• programmed in R
• worked on data analysis
6. Investigation of chromatin 3D structure
• role of chromatin 3D structure in gene regulation
• 4C to investigate detailed interactions of cis-regulatory modules (CRMs)
• global chromatin interactome using Hi-C
9. What was important for me?
• bioinformatics group with members of diverse backgrounds
• PI who successfully trained bioinformaticians
• well established group in bioinformatics
10. What might be interesting for you
• turn data into biology
• interaction with people from biology groups
• communication skills !!!
• workload divides mainly into:
  • programming (50 %)
  • reports, meetings, email
11. Acknowledgements
Wolfgang Huber
Simon Anders
Joseph Barry
Bernd Fischer
Julian Gehring
Aleksandra Pekowska
Paul Theodor Pyl
Alejandro Reyes
Maria Secrier
Collaborators:
Michael Boutros
Christian Volz
Eileen Furlong
Yad Ghavi Helm
12. Data production rates
LHC: 1.8 GB/s at peak capacity (i.e. while the LHC's four main
experiments, ATLAS, ALICE, CMS, and LHCb, are actively taking data).
These experiments will take roughly a decade to complete, and
each of them is expected to produce over 1 PB of data per year.
One Illumina HiSeq: up to 600 Gb/run, i.e. ~600 GB per 10 days ≈
18 TB/year (not including derived data, e.g. BAM files)
One Digital Embryo (2008): 3.5 TB (2048 x 2048 x 370 x 1226)
EMBL-EBI: in September 2011, data storage capacity was 14 PB
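
The HiSeq and Digital Embryo figures above can be checked with a quick back-of-the-envelope calculation. The sketch below is not from the slides and rests on several assumptions: about 300 sequencing days per year (which is what makes 600 GB per 10 days come out at 18 TB/year), the four Digital Embryo numbers read as x, y, z and time points, 2 bytes (16 bit) per voxel, and binary units (1 TiB = 2**40 bytes). The function names are hypothetical.

```python
# Back-of-the-envelope check of two figures from the slide.
# Assumptions (NOT stated on the slide): 300 sequencing days per year,
# 2 bytes (16 bit) per voxel, binary units (1 TiB = 2**40 bytes).
from math import prod


def yearly_volume_tb(gb_per_run, days_per_run, operating_days):
    """Yearly raw output in TB (decimal) for back-to-back runs."""
    runs_per_year = operating_days / days_per_run
    return gb_per_run * runs_per_year / 1000


def voxel_count(dims):
    """Total number of voxels for the given dimensions."""
    return prod(dims)


def to_tib(n_bytes):
    """Convert a byte count to TiB (2**40 bytes)."""
    return n_bytes / 2**40


# HiSeq: ~600 GB per 10-day run
hiseq_300_days = yearly_volume_tb(600, 10, 300)  # assumed 300 operating days
hiseq_365_days = yearly_volume_tb(600, 10, 365)  # running all year round

# Digital Embryo: 2048 x 2048 x 370 x 1226, assumed 2 bytes per voxel
embryo_voxels = voxel_count((2048, 2048, 370, 1226))
embryo_tib = to_tib(embryo_voxels * 2)

print(f"HiSeq, 300 days/year: {hiseq_300_days:.1f} TB")  # 18.0 TB
print(f"HiSeq, 365 days/year: {hiseq_365_days:.1f} TB")  # 21.9 TB
print(f"Digital Embryo: {embryo_tib:.2f} TiB")           # 3.46 TiB
```

Under these assumptions the slide's numbers are reproduced: 18 TB/year for one HiSeq and roughly 3.5 TB for one Digital Embryo. Running the sequencer without any downtime would give about 22 TB/year instead.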