Genome exploration in A-T G-C space: introducing Icarus, a DNA walking program
Jonathan Blakes, MSc Biotechnology and Computation, Department of Biosciences, Faculty of Science, Technology and Medical Studies
Problem: too much information!
Genome browsers: Ensembl, UCSC
Hypothesis: Can DNA sequences be plotted in such a way that long sequences can be easily interpreted by humans without a priori knowledge?

“It seems that the simplest method of visualizing some properties of genomes is to send a virtual walker for a genomic walk, ask "it" to talk about what it has seen and note its observations. If our walker doesn't move with a Brownian-like motion, it is possible to extract from its walk a lot of information.” (Stanislaw Cebrat, the principal Polish proponent of DNA walks)

Assigning a cardinal direction (north, south, east or west) to each of the four nucleotide bases (A, T, G, C) and taking a step in that direction as each base is read will produce a ‘walk’ of the sequence, in which repetitive DNA elements will be seen as repetitive 2-dimensional ‘structures’.
DNA walking: DNA walks are plots of DNA or RNA sequences in which each of the four nucleotide bases is assigned a direction and a distance. The sequence is read one nucleotide at a time, and for each nucleotide the virtual walker takes a step in the designated direction, creating a 'walk' of the sequence that reveals elements of structure in the nucleotide composition. (From the Comparative Genometrics website, Université de Lausanne)
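The definition above can be sketched in a few lines of Python. This is a minimal illustration, not the Icarus implementation: the function name dna_walk, the unit step length and the particular orientation (A north, T south, G east, C west) are assumptions made for the example.

```python
# Assumed orientation for illustration: A north, T south, G east, C west.
# This is one member of the "A-T G-C" family (A opposite T, G opposite C).
STEPS = {"A": (0, 1), "T": (0, -1), "G": (1, 0), "C": (-1, 0)}


def dna_walk(sequence, steps=STEPS):
    """Return the list of (x, y) positions visited, starting at the origin."""
    x, y = 0, 0
    path = [(x, y)]
    for base in sequence.upper():
        if base == "U":  # treat RNA uracil like thymine
            base = "T"
        if base not in steps:  # skip ambiguity codes such as N
            continue
        dx, dy = steps[base]
        x, y = x + dx, y + dy
        path.append((x, y))
    return path


if __name__ == "__main__":
    for position in dna_walk("GATTACA"):
        print(position)
```

Plotting the returned positions as a connected line gives the 2-dimensional picture of the sequence.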
Icarus live demonstration: could someone please suggest a mammalian gene to walk?
Mapping: there are 24 possible combinations of cardinal vectors: 4 rotations of each of the 3 mappings (A-T G-C, A-G C-T, A-C G-T), and 4 rotations of each of their reflections about the x or y axis. Choosing which 3 ‘unique’ mappings of those 24 to use is a matter of parsimony.
A-T G-C
A-G C-T
A-C G-T
A-T G-C
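The count of 24 combinations falling into 3 families can be checked by enumeration. The sketch below assumes that a family is identified by which base points opposite to A (rotations and reflections never change which bases are opposite each other); the names classify_mappings and family_name are hypothetical.

```python
from itertools import permutations

DIRECTIONS = [(0, 1), (1, 0), (0, -1), (-1, 0)]  # north, east, south, west

# Family names as used on the slides, keyed by the base opposite to A.
NAMES = {"T": "A-T G-C", "G": "A-G C-T", "C": "A-C G-T"}


def family_name(mapping):
    """Name a base-to-direction mapping by the base that points opposite to A."""
    ax, ay = mapping["A"]
    partner = next(b for b, d in mapping.items() if d == (-ax, -ay))
    return NAMES[partner]


def classify_mappings(bases="ATGC"):
    """Group all assignments of the four directions to the four bases by family."""
    families = {}
    for directions in permutations(DIRECTIONS):
        mapping = dict(zip(bases, directions))
        families.setdefault(family_name(mapping), []).append(mapping)
    return families


if __name__ == "__main__":
    for name, members in classify_mappings().items():
        print(name, len(members))
```

Each family contains 8 mappings: the 4 rotations of one layout plus the 4 rotations of its mirror image.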
A-T G-C is consistently smallest. Smaller pictures can contain more information in less space and are therefore more amenable to publication, hence "Genome Exploration in A-T G-C space".
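One way to compare picture sizes is to measure the bounding box of the same sequence walked under each mapping. The sketch below is an assumption about how "smallest" might be measured, not a description of what Icarus does; the orientations chosen for each family and the function names are illustrative.

```python
# One representative orientation per family (chosen arbitrarily for illustration).
MAPPINGS = {
    "A-T G-C": {"A": (0, 1), "T": (0, -1), "G": (1, 0), "C": (-1, 0)},
    "A-G C-T": {"A": (0, 1), "G": (0, -1), "C": (1, 0), "T": (-1, 0)},
    "A-C G-T": {"A": (0, 1), "C": (0, -1), "G": (1, 0), "T": (-1, 0)},
}


def bounding_box(sequence, steps):
    """Return (width, height) of the area covered by a walk, in unit steps."""
    x = y = 0
    min_x = max_x = min_y = max_y = 0
    for base in sequence.upper():
        dx, dy = steps.get(base, (0, 0))  # unknown symbols: no step
        x, y = x + dx, y + dy
        min_x, max_x = min(min_x, x), max(max_x, x)
        min_y, max_y = min(min_y, y), max(max_y, y)
    return max_x - min_x, max_y - min_y


def compare_mappings(sequence):
    """Bounding box of the same sequence under each of the three families."""
    return {name: bounding_box(sequence, steps) for name, steps in MAPPINGS.items()}


if __name__ == "__main__":
    print(compare_mappings("AATT"))
```

For the toy sequence AATT, the A-T G-C walk retraces its own steps, while the other two mappings spread out in both dimensions.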
Duplications (exons, introns): a 7-fold contiguous duplication in the male Y chromosome. Members of the TSPY (testis-specific Y-encoded protein) family were identified by Skaletsky et al. [1] using a combination of a whole-chromosome dotplot with a 2-kb window and a custom Perl script running BLAST alignments of all 5-kb sequence segments, in 2-kb steps, of the entire MSY (male-specific Y). In contrast, I stumbled upon this purely by accident.

1. Skaletsky et al. Nature 2003 423.
DNA walks for phylogenetics

Imagine a 1-dimensional textual DNA sequence. The distance from the first base to the last is simply the number of bases in the sequence. A comparison of aligned sequences on the basis of spatial distance (a much simpler measure than the Jukes-Cantor definition of evolutionary distance) will be unable to discriminate between them, since aligned sequences all have the same length.

Seven previously aligned, 1798-nucleotide-long small ribosomal subunit sequences of Candida and Saccharomyces species, as detailed in Gilfillan et al. [1], were walked and their total Euclidean distances used to produce a phylogeny, which was compared to Gilfillan's.

1. Gilfillan GD, et al. Microbiology. 1998. 144: 829-838.
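The slide does not spell out how the "total Euclidean distance" between two walks is computed. The sketch below assumes one plausible reading: walk both aligned sequences in step and sum the straight-line distance between the two walkers after every position. The function names and the treatment of gaps (no step) are assumptions.

```python
import math

# Assumed orientation of the A-T G-C mapping, for illustration only.
STEPS = {"A": (0, 1), "T": (0, -1), "G": (1, 0), "C": (-1, 0)}


def walk_positions(sequence, steps=STEPS):
    """Position of the walker after each symbol of an aligned sequence."""
    x = y = 0
    positions = []
    for base in sequence.upper():
        dx, dy = steps.get(base, (0, 0))  # gaps and ambiguity codes: stay put
        x, y = x + dx, y + dy
        positions.append((x, y))
    return positions


def total_euclidean_distance(seq_a, seq_b):
    """Sum of walker-to-walker distances over all alignment columns."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = zip(walk_positions(seq_a), walk_positions(seq_b))
    return sum(math.dist(p, q) for p, q in pairs)


def distance_matrix(sequences):
    """Pairwise distances for a dict of name -> aligned sequence."""
    names = list(sequences)
    return {
        a: {b: total_euclidean_distance(sequences[a], sequences[b]) for b in names}
        for a in names
    }


if __name__ == "__main__":
    print(distance_matrix({"s1": "AAAA", "s2": "AAAT", "s3": "GGGG"}))
```

Under this reading, an early difference between two sequences contributes to every later column, so the measure is sensitive to where substitutions occur as well as how many there are.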
Phylogeny algorithms: neighbour joining; Icarus' UPGMA; distance matrix
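For readers unfamiliar with UPGMA, here is a minimal textbook-style sketch that turns a distance matrix into a Newick string. It is not the code used in Icarus, and neighbour joining is not shown; the function name upgma and the input format (a dict keyed by frozenset pairs of names) are assumptions for the example.

```python
from itertools import combinations


def upgma(names, distances):
    """Build a UPGMA tree and return it as a Newick string.

    distances maps frozenset({name_a, name_b}) to a non-negative distance.
    """
    # Each cluster id maps to (Newick text, number of leaves, node height).
    clusters = {name: (name, 1, 0.0) for name in names}
    d = dict(distances)
    while len(clusters) > 1:
        # Merge the closest pair of current clusters.
        a, b = min(combinations(sorted(clusters), 2), key=lambda p: d[frozenset(p)])
        text_a, n_a, h_a = clusters.pop(a)
        text_b, n_b, h_b = clusters.pop(b)
        height = d[frozenset((a, b))] / 2
        merged = a + "+" + b
        text = f"({text_a}:{height - h_a:.3f},{text_b}:{height - h_b:.3f})"
        # Distance to every other cluster is the size-weighted average.
        for other in clusters:
            d[frozenset((merged, other))] = (
                n_a * d[frozenset((a, other))] + n_b * d[frozenset((b, other))]
            ) / (n_a + n_b)
        clusters[merged] = (text, n_a + n_b, height)
    text, _, _ = next(iter(clusters.values()))
    return text + ";"


if __name__ == "__main__":
    example = {
        frozenset(("A", "B")): 2.0,
        frozenset(("A", "C")): 6.0,
        frozenset(("B", "C")): 6.0,
    }
    print(upgma(["A", "B", "C"], example))
```

UPGMA assumes a constant rate of change along all branches, so every leaf ends up at the same distance from the root; neighbour joining does not make that assumption.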
Phylogeny Demonstration
Newick format: distance matrix output

Newick format string representation of a tree:

(Bovine:0.69395, (Gibbon:0.36079, (Orang:0.33636, (Gorilla:0.17147, (Chimp:0.19268, Human:0.11927):0.08386):0.06124):0.15057):0.54939, Mouse:1.21460);
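To show how the Newick string is structured, the short sketch below pulls out each leaf name with its branch length. It is a regular-expression illustration that only handles simple strings like the one on this slide, not a full Newick parser; the function name leaf_branch_lengths is hypothetical.

```python
import re

NEWICK = (
    "(Bovine:0.69395, (Gibbon:0.36079, (Orang:0.33636, (Gorilla:0.17147, "
    "(Chimp:0.19268, Human:0.11927):0.08386):0.06124):0.15057):0.54939, "
    "Mouse:1.21460);"
)

# A leaf is a name that directly follows "(" or "," and is followed by
# ":length". Internal branches follow ")" and are therefore not matched.
LEAF = re.compile(r"[(,]\s*([^\s(),:;]+)\s*:\s*([0-9.]+)")


def leaf_branch_lengths(newick):
    """Return a dict of leaf name -> branch length for a simple Newick string."""
    return {name: float(length) for name, length in LEAF.findall(newick)}


if __name__ == "__main__":
    for name, length in leaf_branch_lengths(NEWICK).items():
        print(f"{name}\t{length}")
```

Nested parentheses encode the tree shape: each closing bracket followed by ":length" is an internal branch joining the group inside it to the rest of the tree.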
Phylogenies with DNA walks
Does summing distances from 3 mappings eliminate bias and produce a better phylogeny? NO. A better distance measure is needed.
Conclusion
Acknowledgements: I would like to thank Dr. Gary Robinson, Dr. Colin Johnson, Dr. Anthony Baines, and everyone I have met during the Biotechnology and Computation MSc.
