Introducing Bioinformatics
Bioinformatics in the Big Data Era
How to get into Bioinformatics?
How to learn and practice Bioinformatics?
Bioinformatics Careers and Salaries Worldwide
Applications of Bioinformatics
Take-Home Messages
1. Bioinformatics: What, Why and Where?
Mohamed El-Hadidi
Assistant Professor of Bioinformatics
Biomedical Informatics Program Director
School of Information Technology and Computer Science
Nile University
2. Where Is DNA Located in Our Body?
6/3/2020 Bioinformatics: What, Why and Where? 2
3. From Human Body to DNA Sequences
DNA Sequencers
Sequence Files
How many cells are in the human body?
Roughly 10 trillion cells! (Estimates vary; some run to several tens of trillions.)
4. From Human Body to DNA Sequences
How many chromosomes are in one cell?
46 chromosomes!
5. From Human Body to DNA Sequences
What is the total length of all chromosomes in one cell?
About 2 m in one cell!
Across all cells, that is roughly 50,000 times the distance from Earth to the Moon (2 m × 10 trillion cells)
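A back-of-envelope check of this scale can be sketched in a few lines. It uses the slides' own figures (2 m of DNA per cell, 10 trillion cells) and an assumed mean Earth-Moon distance of 384,400 km, which is not stated in the slides. The result is very sensitive to the assumed cell count and ignores cells that have no nucleus, so treat any single figure as a rough illustration only.

```python
# Back-of-envelope estimate of the total DNA length in one human body.
# The per-cell length and cell count come from the slides; the Earth-Moon
# distance is an assumed mean value added for this illustration.
DNA_PER_CELL_M = 2.0        # metres of DNA per cell (from the slides)
CELLS_IN_BODY = 10e12       # 10 trillion cells (from the slides)
EARTH_MOON_M = 384_400e3    # assumed mean Earth-Moon distance, in metres

total_dna_m = DNA_PER_CELL_M * CELLS_IN_BODY
earth_moon_trips = total_dna_m / EARTH_MOON_M

print(f"Total DNA length: {total_dna_m:.1e} m")
print(f"One-way Earth-Moon distances: {earth_moon_trips:,.0f}")
```

With these inputs the total comes out in the tens of thousands of Earth-Moon distances; changing the cell count scales the answer proportionally.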
6. From Human Body to DNA Sequences
What is in these files?
GAATTTGGGCAAGAATCCAGGCATTGGAACTTATTCAAATAACTAGTTTGCCTGTAATTTTCACTTTTTC
AGAGTCATCTGATAAAGCTTTCTTGCTACACATTTAGATAGATACACTCAATCCAGTTGTCTAGAAAGTT
CCCTGAGCCAGCTGGGAGCAGGAGGGGTAGTTGGGGCCAGGAATATTGGGGGTGTGTTTACTGAGCCCCT
AGAAAGTAAGTGCTAGATTTGACATTTCAATCCCTGAAGGCCCTGAAGTTCAGTATCAAATGACTGGTCC
TGTGGACTGAGCATCTGTGAATTGCATATGCTTAGAGTAAATTTTACTCCTACCAGTTTCAGCAGCTTGC
TTTAGCAAGCAGTATGGAAACACTAACATGGGGGAGTAGAATTTCTCTCTCTGATCCAAGTTTTATCTCA
TTCTGGTGGGTTTTCAAGGAGAGACTCGGAGTCCAAGTGTCCTTTCTGAATATATCTGGAACTTCTCATT
AACAAAAGACTCAAGTTATAATTTAGGGGACAAGGCACCCAATGAGAATGCCTTGCAGGCAGCCCTAAGT
ACACCTGCAATTACACCATTACTAGCGCGGCAGCACACATGGCCCTGACTTAGTTTAAATAATTACGTAA
GTCAACCATGATTGTTTGCCCTTTGCATAGAAGGGCAAGTATTGGTACCTGTTACAACTTAGGCTTTTTT
TTCTTTATGTTTGAGCCATGATGAGTGATTTACACTGTTGCATCCATATGTTGAGATGTAAGAATAAATT
AGACTTGGTAATTGCCCTTAAGTGTCTGGAAGTCAACTGGGGAAAGAGAGCTAGAGATAATAAGTGTGAA
ACAATGTCACAGAATCAATGACGGAACTCTTCCCAGGACAAAGGATGACTTTTGAGTTCAGTCTTTGCCT
TTAATTCTACATGGGGAGGAGAGCACGTTTAGCCACAAATGGAAGGGATTACTCATTTGAGCTATTTGGT
TATATGATTATTTCCCCAGAGAATAGGATGTGCAGGGCATTACACAAGCAGTGCCAATAGCAGCAAAGTT
CTTGAGAGTGCTAGTAATTCAAATGGCAGGAAGAGAAGGAATAAATGGTAAGGCTACCTACAGTTCACAG
AGAGCTCCATCCTCACTGTGGCTTTGGATTTTGTCCTGTGTGAAAGAGAAGTGACTGTGAACTGACATGC
TGTGTTTGGTGTTTTAGAAAGATGGCTGCAGCAGCGGTTTGGGGAATGGACTGCAGGAGTGGCATTGGAA
ACAGGAAGGTTCATGACTATTGCCAGAGACAGAGGATGAAGCAGGAGCAAGGAAGATTCAGGACAGGGGA
CTCCGGGGCTGATCAGGAGGCAGAACTGGTTGATAAGTATATGTAGCAGCATAAGAAAGAAAGAATCCCA
GATTGACACCCAGGCTTCTCACTTGGAAGCCTGGATAGATACTGAATGCAATCACAAAGGCTGGGAAGTC
AATGGGACTGCAGGGAAGGGAAGGGAAGGGAGGAGAAGAGGAAGGGCAGGAGGGTCCAATATCAATATTC
AGCTTTTAGATGTGTTGAGCTTGAAGTGCTCAGATGGAGAAGTCCAGGAGGCAGTAGAATACGGTGGTCC
AGAGCACAGGAGAGCAATGTGGCTTGAGTTGTCATTTGCTCACATATTTCCGTGTCAGTTACTTGTCTTA
GATCACAGAACAAGTTCTCCTCTCACAGTTTCCTGGCTCCACCTGTCTCATGCTCACCGTCAGCATCGAA
ATTGAGCCACACCAGGGGTTCTGGATACCAGCTTCTCTCTAGGTGAGGCTGCTATAGTCAGCAGCTGATT
AGTTGCAGTTATCAGCAACTGGTAATATAATATATTGTGCATATAAGTGTACCAGAAGTCATGTTTATAT
ATTGCTGCAAATACTCGGAATGGGGATCTCTTGTTCCCTGCTTAAGACCACATCACATTACTTGGTTTTG
TACGCTAGTGGCTGAACCAAAAAAAGTAGGAGATGATTTTTTTTCTTTTTTCTTAAAGCAGTAGCTTTTG
AACCTTGACCATGCTTTCTAACCAGCTGAGGGGCTTTTGAAAAAGAGGGTGCCTTACTGTGCCCCAGACC
AGGACAATCAGTATTTCTGGGGAATGGAGCCTGGCACACACACATTTCTTAAAGCTCCCTTGGCAATTCT
GAGGAGTGGATTACATGTTGTATGTAGCTCGTAACGAAAGAAATCTTGTCTTTGCTCTCAGACCCCCATT
TCTTACTCATCTCATGAGCTCCTTCGAGATCCAGAAACAGTTGCATATTTCATTAGTAAATCAGTTCCAG
AGTCACATTTTATTTCACAAGTTAGTCCATTAAAAGTTTCCTGCAGTGAGGAAATAGCCAGAAAGAACAC
TCCACCCCTCCTCCTTTTTATAACTATAGGGTCTGGCTCGACAGAGCAGGAGCATCGCCATCTTGGACAA
8. From Human Body to DNA Sequences
How many nucleotides are in the human genome?
About 3 billion nucleotides!
9. From Human Body to DNA Sequences
What is the size of the data?
About 150 GB per person
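One way to see where a figure of this size can come from is a rough estimate of raw sequencing output. The sketch below assumes that each genome position is read about 30 times and that each sequenced base costs roughly two bytes of uncompressed text (one character for the base, one for its quality score). Both values are assumptions for illustration and are not stated in the slides; the result lands in the same range as 150 GB without reproducing it exactly.

```python
# Rough estimate of uncompressed raw read data for one human genome.
GENOME_BASES = 3_000_000_000  # about 3 billion nucleotides
COVERAGE = 30                 # assumed sequencing depth; varies by project
BYTES_PER_BASE = 2            # assumed: base character plus quality character

raw_bytes = GENOME_BASES * COVERAGE * BYTES_PER_BASE
raw_gb = raw_bytes / 1e9      # decimal gigabytes

print(f"Estimated raw read data: {raw_gb:.0f} GB")
```

Lower coverage or compression reduces the figure; read headers and intermediate analysis files increase it.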
10. How Were These Files Generated?
12. Bioinformatics Data Is Increasing Rapidly!
• Speed of sequencing? From 10,000 bp/day/machine to billions of bp/day/machine.
• Computing cost and time? Sequencing cost is falling about 5X faster than computing cost.
• Price per genome? Dropped to about $1000.
• Storage cost? About 150 GB per genome.
15. How to Make Sense of This BIG DATA?
Through Bioinformatics!
What is Bioinformatics?
16. What Do You Need to Learn Bioinformatics?
[Venn diagram of Statistics, Computer Science, and Biology; the overlaps are labeled Data Science, Biostatistics, and Computational Biology, with Bioinformatics at the center.]
20. What is Bioinformatics?
Use existing tools to build analysis workflows
• Linux
• Command line
• Scripting
Develop your own tools
• Programming
• Algorithm design
• Machine learning
21. What is Bioinformatics?
Example: nucleotide counts in a sequence
A = 1765, C = 2677, G = 3561, T = 1121
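A per-base tally like this one is a typical first "own tool". The following is a minimal sketch, assuming the sequence is already available as a plain string. The function name `count_bases` is made up for this example, and the input is only the first line of the sequence shown in the slides, so its counts will not match the totals above.

```python
from collections import Counter


def count_bases(sequence: str) -> dict[str, int]:
    """Return counts of A, C, G and T in a DNA string (case-insensitive)."""
    counts = Counter(sequence.upper())
    # Report the four standard bases explicitly so absent ones show as 0;
    # any other characters (for example N) are ignored.
    return {base: counts.get(base, 0) for base in "ACGT"}


# First line of the sequence shown in the slides.
example = "GAATTTGGGCAAGAATCCAGGCATTGGAACTTATTCAAATAACTAGTTTGCCTGTAATTTTCACTTTTTC"
base_counts = count_bases(example)
print(", ".join(f"{base} = {n}" for base, n in base_counts.items()))
```

For whole-genome files, the same idea would be applied line by line rather than loading everything into memory at once.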
23. Who Can Be a Bioinformatician?
Biologist (biology background): uses existing bioinformatics tools
• Basic user: Windows OS; web-based tools; GUI standalone tools; no programming skills
• Advanced user: Linux OS; command-line standalone tools; basic programming skills
Computer scientist (CS background): develops bioinformatics tools
• Developer: basic biology knowledge; advanced programming skills; advanced mathematics; advanced statistics
24. How Can I Learn Bioinformatics?
Tons of free courses are available online!
A web search returns more than 26 million results (searching without a comma)!
25. How Can I Learn Bioinformatics?
Tons of free courses are available online!
A web search returns more than 46 million results (searching without a comma)!
26. Examples of Free Online Bioinformatics MOOC Websites
28. Milestones of Bioinformatics
• OMICS sciences
• Programming and data structures
• Algorithm design
• Linux
• Statistics
• Basic mathematics
• AI and data science
• Data visualization
• Results interpretation
38. Sequencers in Egypt (Sample)
Institute/Company | Department | Sequencer
American University in Cairo (AUC) | Biology | Ion S5
American University in Cairo (AUC) | Global Health and Human Ecology | MiSeq
National Research Center (NRC) | Genetics | MiSeq
Zewail City of Science and Technology | Center for Genomics | MiSeq and NextSeq 500
Kasr Alainy School of Medicine | Clinical Oncology | 3 MiSeq
CCHE 57357 | Genomics program | MiSeq and NextSeq 500
Ahram Canadian University | Central Research Lab | Agilent Bioanalyzer 2100
National Research Center (NRC) | Genetics | Ion Torrent
National Research Center (NRC) | Environmental department | Ion Torrent PGM
MASRI, Ain Shams University | Center | Ion S5 and Ion Chef
Air Forces Specialised Hospital | Labs | MiSeq
Maadi Military Hospital | Labs | Ion S5
Mansoura University | Stem cells center | Ion Torrent
National Cancer Institute (NCI) | Molecular biology | Ion S5
Abo Alraish Hospital | Microbiology Labs | MiSeq
Alexandria Regional Center for Women's Health and Development | - | Ion S5
Tanta University - Faculty of Medicine - Center of Excellence | Genomic Signature Center | MiSeq
Magdi Yacoub Foundation | - | MiSeq and NextSeq
Generations | Genetics Labs | MiSeq
Source: Prof. Ahmed Moustafa, AUC.
39. What Can Bioinformatics Do for Life Sciences?
41. Gene Prediction
• Gene structure
• Open reading frames (ORFs)
• Start and stop of the gene
• Locations of exons and introns
• Splice variants
• Gene prediction is one of the first and most important steps in understanding any genome after it has been sequenced.
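The ORF idea in the list above can be illustrated with a simplified scan for open reading frames. This is a minimal sketch, not a real gene predictor: it looks only at the forward strand, treats ATG as the only start codon, and ignores introns and splice variants. The function name `find_orfs` and its `min_codons` parameter are made up for this example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(sequence: str, min_codons: int = 30) -> list[tuple[int, int, int]]:
    """Return (frame, start, end) for ORFs on the forward strand.

    start is the index of the ATG and end is one past the stop codon,
    so sequence[start:end] is the full ORF including the stop.
    """
    seq = sequence.upper()
    orfs = []
    for frame in range(3):
        start = None
        # Walk the sequence codon by codon in this reading frame.
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                # Keep the ORF only if it is long enough (stop included).
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((frame, start, i + 3))
                start = None
    return orfs


# Toy input: a short made-up sequence with one small ORF.
print(find_orfs("CCATGAAATTTTAGGG", min_codons=3))
```

Real gene finders add the reverse strand, codon usage statistics, and splice-site models on top of this basic scan.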
42. Sequence Comparison
• Compare unknown gene or protein sequences against known sequences to identify their origin or function.
• Find signatures that can be used in diagnostics.
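Comparing an unknown sequence against known ones can be sketched with a classic global alignment score (Needleman-Wunsch, score only). This is an illustration under assumed scoring values (match +1, mismatch -1, gap -2), and the names `global_alignment_score`, `best_match`, `geneA`, and `geneB` are made up for the example. Production searches rely on much faster heuristic tools and tuned scoring matrices.

```python
def global_alignment_score(a: str, b: str, match: int = 1,
                           mismatch: int = -1, gap: int = -2) -> int:
    """Needleman-Wunsch global alignment score (no traceback)."""
    # prev holds the previous row of the dynamic-programming table.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def best_match(query: str, known: dict[str, str]) -> str:
    """Return the name of the known sequence with the highest score."""
    return max(known, key=lambda name: global_alignment_score(query, known[name]))


# Hypothetical reference set; names and sequences are made up.
known_sequences = {"geneA": "ACGTACGT", "geneB": "TTTTGGGG"}
print(best_match("ACGTACGA", known_sequences))
```

Keeping only two rows of the table makes the memory use proportional to the shorter dimension, at the cost of not recovering the alignment itself.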
43. Phylogenetic Analyses
• Evolutionary relationships among a group of related molecules or organisms
• Track gene flow based on sequence similarity
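Sequence similarity is commonly turned into pairwise distances before a tree is built. The sketch below computes a simple p-distance (the fraction of differing positions) for sequences that are assumed to be already aligned to the same length, and picks the closest pair, which is the first step of many clustering-based tree methods. The function names and the toy alignment are made up for this illustration.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Fraction of positions that differ between two aligned sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def closest_pair(aligned: dict[str, str]) -> tuple[str, str]:
    """Return the two names with the smallest pairwise p-distance."""
    return min(combinations(aligned, 2),
               key=lambda pair: p_distance(aligned[pair[0]], aligned[pair[1]]))


# Hypothetical toy alignment; names and sequences are made up.
toy_alignment = {
    "taxon_A": "ACGTACGTAC",
    "taxon_B": "ACGTACGTTC",
    "taxon_C": "TCGAACCTTG",
}
print(closest_pair(toy_alignment))
```

Tree-building methods repeat this kind of step, merging the closest groups and updating the distances, or use model-based corrections instead of the raw p-distance.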
44. Understand the Functions of Genes (Pathway Analysis)
45. Predicting Protein Structure and Function
• Predict a protein's 3D structure
• Understand how biomolecules interact with other molecules
• Predict functions based on interactions
46. Drug Design
• Analyzing molecules on a computer is faster than experimental approaches.
• Helps identify drug targets more easily.
• Drug effects can be simulated on computers.
48. Applications of Bioinformatics in Medicine
• The Human Genome Project (HGP) helps scientists search for genes directly associated with diseases and understand the molecular basis of those diseases.
• This new information supports a better understanding of disease mechanisms and hence the development of better treatments and preventive methods.
49. Applications of Bioinformatics in Pharmacy
• Identifying and validating new drugs through computer-aided drug design (CADD).
• Helps develop more specific drugs with fewer side effects.
50. Applications of Bioinformatics in Food Security
• A large amount of genomic data is available from plants and animals.
• Bioinformatic analysis of plant and animal genomes will help scientists improve crops and livestock:
  • Crops resistant to drought
  • Crops resistant to insects and pests
  • Crops with more nutritional value
  • Animals with higher meat quality and productivity
51. Applications of Bioinformatics in the Environment
• Sequencing and analysis of microbial genomes to search for genes expressing enzymes for:
  • Bioremediation and biodegradation
  • Climate change studies (microbes that use CO2 as their sole carbon source)
  • Alternative energy sources (energy from light)
  • Microbes with industrial benefits
  • Generation of biogas
53. Take-Home Messages
• Understand the biological background first (in detail)!
  • For writing software
  • For using software
• Which tool/software to use?
  • Understand the algorithm behind each tool
  • Test different parameters
  • Select the best tool
• Free software is everywhere
  • Read benchmarking studies first
• Before writing your own software
  • Check whether it already exists (don't start from scratch)
  • Modify existing tools
54. Biologists and Computer Scientists Should Communicate!