Short introduction to Bioinformatics
What are Probabilistic Models?
Sequence Alignment
Pairwise Alignment
Multiple Sequence Alignment Models
What is Phylogenetics?
Building Phylogenetic Trees
Other Models
Contact Us




Introduction to Probabilistic Models for Bioinformatics

              Igor Bogicevic (igor.bogicevic@sbgenomics.com)




                                          July 3, 2011




              Seven Bridges Genomics, LLC




Short introduction to Bioinformatics




       Bioinformatics is the application of statistics and computer science to the field of
       molecular biology.
       Major research efforts in the field include sequence alignment, gene finding,
       genome assembly, drug design, drug discovery, protein structure alignment,
       protein structure prediction, prediction of gene expression and protein-protein
       interactions, genome-wide association studies and the modeling of evolution.
       Given the enormous volume of sequence data now available, one of the biggest
       challenges today is not producing the data but understanding it.




What are Probabilistic Models?

       There are two basic definitions:
       1. A statistical analysis tool that estimates, on the basis of past (historical)
          data, the probability of an event occurring again.
       2. A system that simulates the object under consideration and produces
          different outcomes with different probabilities.
       A simple example is rolling a die.
       A more relevant example is the random sequence model for DNA.
       Biological sequences are strings over a finite alphabet of residues, most
       commonly either four nucleotides or twenty amino acids.
       Suppose residues occur independently and a residue a occurs with probability
       q_a. If a protein or DNA sequence is denoted x_1 ... x_n, then the probability
       of the whole sequence is:

       \[ q_{x_1} q_{x_2} \cdots q_{x_n} = \prod_{i=1}^{n} q_{x_i} \]
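The product above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function name and the uniform frequencies are assumptions chosen for the example, and the sum of logarithms is used instead of the raw product only to avoid numerical underflow on long sequences.

```python
import math


def sequence_log_probability(seq, q):
    """Log-probability of seq when each residue is drawn independently from q."""
    # log(prod q_{x_i}) = sum log(q_{x_i})
    return sum(math.log(q[residue]) for residue in seq)


# Hypothetical uniform nucleotide frequencies, chosen only for illustration.
q_uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}

log_p = sequence_log_probability("ACGT", q_uniform)
p = math.exp(log_p)  # equals 0.25 ** 4 up to floating-point rounding
print(p)
```

With real data, the frequencies q would be estimated from observed residue counts rather than fixed by hand.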
Sequence Alignment




       Sequence alignment is a way of arranging the sequences of DNA, RNA, or protein
       to identify regions of similarity that may be a consequence of functional,
       structural, or evolutionary relationships between the sequences.
       A variety of computational approaches have been applied to the sequence
       alignment problem, including dynamic programming, heuristic algorithms, and
       probabilistic methods.
       Common text formats used with alignment tools include FASTA and GenBank.
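To make the FASTA format concrete, here is a minimal reader sketch. It assumes the usual convention that a record starts with a ">" header line followed by one or more sequence lines; the function name and the two records are hypothetical and serve only as an illustration.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a dict mapping header -> sequence."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:]  # drop the leading '>'
            records[header] = []
        elif header is not None:
            records[header].append(line)  # sequence may span several lines
    return {name: "".join(parts) for name, parts in records.items()}


# Hypothetical records, made up for this example.
example = """>seq1 hypothetical record
ACACACTA
>seq2 hypothetical record
AGCA
CACA
"""

fasta_records = parse_fasta(example)
print(fasta_records)
```

For real work an established parser is usually preferable, since this sketch ignores comments, duplicate headers and format variants.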




Pairwise Alignment


       Pairwise sequence alignment methods find the best-matching piecewise (local)
       or global alignment of two query sequences.
       The three primary methods of producing pairwise alignments are dot-matrix
       methods, dynamic programming, and word methods.
       Needleman-Wunsch algorithm (global alignment, dynamic programming)
       Smith-Waterman algorithm (local alignment, dynamic programming)
       FASTA/BLAST algorithms (k-tuple heuristic word methods, often combined with
       dynamic programming)
       Gap penalties: modeling the cost of a gap in aligned sequences (linear, affine,
       etc.)
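The two gap-penalty schemes named above can be sketched as follows. The function names and the parameter values (d for opening or per-position cost, e for extension cost) are assumptions for illustration, not values taken from the slides.

```python
def linear_gap_penalty(length, d=1.0):
    """Cost of a gap of `length` positions when every position costs d."""
    return d * length


def affine_gap_penalty(length, d=4.0, e=1.0):
    """Cost of a gap: d to open it, plus e for each additional position."""
    if length <= 0:
        return 0.0
    return d + e * (length - 1)


# A single long gap is cheaper under the affine scheme than several short ones.
one_long_gap = affine_gap_penalty(3)          # 4 + 2 * 1 = 6
three_short_gaps = 3 * affine_gap_penalty(1)  # 3 * 4 = 12
print(one_long_gap, three_short_gaps)
```

The affine form is often preferred for biological sequences because a single insertion or deletion event tends to affect several adjacent residues at once.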



Example - Smith-Waterman: A matrix H is built as follows:

\[ H(i, 0) = 0, \quad 0 \le i \le m \]
\[ H(0, j) = 0, \quad 0 \le j \le n \]

\[ w(a_i, b_j) =
   \begin{cases}
     w(\text{match})    & \text{if } a_i = b_j \\
     w(\text{mismatch}) & \text{if } a_i \ne b_j
   \end{cases} \]

\[ H(i, j) = \max
   \begin{Bmatrix}
     0 & \\
     H(i-1, j-1) + w(a_i, b_j) & \text{Match/Mismatch} \\
     H(i-1, j) + w(a_i, -)     & \text{Deletion} \\
     H(i, j-1) + w(-, b_j)     & \text{Insertion}
   \end{Bmatrix},
   \quad 1 \le i \le m, \ 1 \le j \le n \]



Sequence 1 = ACACACTA, Sequence 2 = AGCACACA
w(match) = +2
w(a,-) = w(-,b) = w(mismatch) = -1

H =
          -    A    C    A    C    A    C    T    A
     -    0    0    0    0    0    0    0    0    0
     A    0    2    1    2    1    2    1    0    2
     G    0    1    1    1    1    1    1    0    1
     C    0    0    3    2    3    2    3    2    1
     A    0    2    2    5    4    5    4    3    4
     C    0    1    4    4    7    6    7    6    5
     A    0    2    3    6    6    9    8    7    8
     C    0    1    4    5    8    8   11   10    9
     A    0    2    3    6    7   10   10   10   12

In the example, the highest value, 12, is in cell (8,8). The traceback visits
(8,8), (7,7), (7,6), (6,5), (5,4), (4,3), (3,2), (2,1), (1,1), and (0,0), which
gives the alignment:
Sequence 1 = A-CACACTA, Sequence 2 = AGCACAC-A
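The worked example can be reproduced with a short Python sketch of the Smith-Waterman recurrence and traceback. This is an illustrative implementation under stated assumptions, not the author's code: the function name is hypothetical, a single linear gap score stands in for both w(a,-) and w(-,b), and the first argument indexes the rows of H, so Sequence 2 is passed first to match the matrix layout of the example.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Local alignment of a (rows) and b (columns); returns H, score, alignment."""
    m, n = len(a), len(b)
    # Row 0 and column 0 stay at zero, as in the boundary conditions.
    H = [[0] * (n + 1) for _ in range(m + 1)]
    best, best_pos = 0, (0, 0)
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            w = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(
                0,
                H[i - 1][j - 1] + w,  # match/mismatch
                H[i - 1][j] + gap,    # residue of a aligned to a gap
                H[i][j - 1] + gap,    # residue of b aligned to a gap
            )
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Traceback from the highest-scoring cell until a zero cell is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        w = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + w:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return H, best, "".join(reversed(out_a)), "".join(reversed(out_b))


# Sequence 2 indexes the rows and Sequence 1 the columns, as in the example.
H, score, aligned_seq2, aligned_seq1 = smith_waterman("AGCACACA", "ACACACTA")
print(score)
print(aligned_seq1)
print(aligned_seq2)
```

When several cells tie for the maximum, or several moves explain a cell's value, this sketch takes the first one it finds; other implementations may report a different but equally scoring alignment.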
Multiple Sequence Alignment Models



       A multiple sequence alignment (MSA) is a sequence alignment of three or more
       biological sequences, commonly protein, DNA, or RNA.
       We usually want to do multiple alignments to find a homologous sequences that
       point to a shared evolutionary origins that can be used for further phylogenetic
       analysis.
       Progressive Alignment Methods - constructing succession of a pairwise alignment.




                                                                                                                    EVEN BRIDGES
                                                                                                                        G E N O M I C S, LLC




             Igor Bogicevic (igor.bogicevic@sbgenomics.com)   Introduction to Probabilistic Models for Bioinformatics
Short introduction to Bioinformatics
                        What are the Probabilistic Models?
                                       Sequence Alignment
                                        Pairwise Alignment
                       Multiple Sequence Alignment Models
                                    What is Phylogenetics?
                                Building Phylogenetic Trees
                                              Other Models
                                               Conctact Us


Multiple Sequence Alignment Models



       A multiple sequence alignment (MSA) is a sequence alignment of three or more
       biological sequences, commonly protein, DNA, or RNA.
       We usually build multiple alignments to find homologous sequences, which
       point to a shared evolutionary origin and can be used for further phylogenetic
       analysis.
       Progressive alignment methods - build the MSA through a succession of
       pairwise alignments.
       Hidden Markov models - represent the MSA as a directed acyclic graph (DAG);
       the observed states are the individual alignment columns, and the hidden
       states represent the presumed ancestral sequence.
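
To make the idea of alignment columns as observations more concrete, here is a minimal sketch that estimates per-column emission probabilities from a small MSA and scores a sequence against them. It is not a full profile HMM: there are no insert or delete states and no transition probabilities. The toy alignment, the DNA alphabet, the pseudocount of 1, the uniform background, and the names `column_profile` and `log_odds` are all assumptions made for illustration.

```python
import math
from collections import Counter

ALPHABET = "ACGT"  # assumed DNA alphabet


def column_profile(msa, pseudocount=1.0):
    """Estimate emission probabilities for each alignment column."""
    length = len(msa[0])
    if any(len(row) != length for row in msa):
        raise ValueError("all rows of an MSA must have the same length")
    profile = []
    for k in range(length):
        # Gap characters are ignored when counting residues in a column.
        counts = Counter(row[k] for row in msa if row[k] != "-")
        total = sum(counts[b] for b in ALPHABET) + pseudocount * len(ALPHABET)
        profile.append({b: (counts[b] + pseudocount) / total for b in ALPHABET})
    return profile


def log_odds(profile, seq, background=0.25):
    """Log-odds of an ungapped sequence versus a uniform background."""
    return sum(math.log(col[b] / background) for col, b in zip(profile, seq))


# Hypothetical toy alignment, not taken from the slides.
msa = ["ACGT-A", "ACGTTA", "ACCT-A", "A-GTTA"]
profile = column_profile(msa)
print(log_odds(profile, "ACGTTA"))  # consensus-like sequence
print(log_odds(profile, "TTTTTT"))  # unrelated sequence
```

A sequence that agrees with the columns scores higher than one that does not. A real profile HMM would add insert and delete states so that sequences of different lengths can be scored.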




What is Phylogenetics?



       Phylogenetics is the study of evolutionary relatedness among groups of organisms
       (e.g. species, populations), which is discovered through molecular sequencing
       data and morphological data matrices.
       Evolution is regarded as a branching process, whereby populations are altered
       over time and may speciate into separate branches, hybridize together, or
       terminate by extinction. This may be visualized in a phylogenetic tree.
       Ernst Haeckel's recapitulation theory ("ontogeny recapitulates phylogeny") is a
       historical, now largely discredited hypothesis that in developing from embryo
       to adult, animals go through stages resembling or representing successive
       stages in the evolution of their remote ancestors.



Building Phylogenetic Trees


       Phylogenetic trees among a nontrivial number of input sequences are constructed
       using computational phylogenetics methods.
       A common approach is to search for the maximum-likelihood tree, often within a
       Bayesian framework, applying an explicit model of evolution to phylogenetic
       tree estimation.
       Identifying the optimal tree using many of these techniques is NP-hard, so
       heuristic search and optimization methods are used in combination with
       tree-scoring functions to identify a reasonably good tree that fits the data.
       The resulting trees do not necessarily represent the species' true evolutionary
       history: the underlying data are noisy, and the analysis can be confounded by
       horizontal gene transfer, hybridisation between species that were not nearest
       neighbors on the tree before the hybridisation took place, convergent
       evolution, and conserved sequences.
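
As an illustration of a tree-scoring function under an explicit model of evolution, the following sketch computes the log-likelihood of a fixed tree with Felsenstein's pruning recursion. The slides do not name a substitution model; the Jukes-Cantor (JC69) model is assumed here because it is the simplest one. The tree encoding (a leaf is a name, an internal node is a tuple of `(child, branch_length)` pairs), the branch lengths of 0.1, the toy alignment, and the function names are all hypothetical.

```python
import math

BASES = "ACGT"


def jc69(t):
    """JC69 probabilities of (no change, a specific change) on a branch of length t."""
    e = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * e, 0.25 - 0.25 * e


def partial(node, site):
    """P(observed leaves below node | state at node), for each of the 4 states."""
    if isinstance(node, str):  # leaf: probability 1 for the observed base
        return [1.0 if b == site[node] else 0.0 for b in BASES]
    result = [1.0] * 4
    for child, t in node:
        same, diff = jc69(t)
        below = partial(child, site)
        for x in range(4):
            result[x] *= sum((same if x == y else diff) * below[y]
                             for y in range(4))
    return result


def log_likelihood(tree, alignment):
    """Sum of per-site log-likelihoods, with a uniform prior on the root state."""
    n_sites = len(next(iter(alignment.values())))
    total = 0.0
    for k in range(n_sites):
        site = {name: seq[k] for name, seq in alignment.items()}
        total += math.log(sum(0.25 * v for v in partial(tree, site)))
    return total


# Hypothetical data: A and B are identical, as are C and D.
alignment = {"A": "ACGTAC", "B": "ACGTAC", "C": "ACTTGC", "D": "ACTTGC"}
tree_ab_cd = (((("A", 0.1), ("B", 0.1)), 0.1), ((("C", 0.1), ("D", 0.1)), 0.1))
tree_ac_bd = (((("A", 0.1), ("C", 0.1)), 0.1), ((("B", 0.1), ("D", 0.1)), 0.1))
print(log_likelihood(tree_ab_cd, alignment))
print(log_likelihood(tree_ac_bd, alignment))
```

The topology that groups the identical sequences scores higher. A heuristic search would call such a function repeatedly while proposing rearrangements of the tree and adjusting branch lengths, since scoring every possible topology is infeasible for more than a handful of sequences.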

Other Models




       Transformational grammars (Chomsky hierarchy)
       RNA structure analysis models (in RNA, evolution tends to conserve the
       base-pairing interactions rather than the sequence itself)
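
To show what modelling RNA interactions can look like in code, here is a sketch of the Nussinov base-pair maximisation recursion, a non-probabilistic dynamic program that grammar-based RNA models build on. The slides do not mention this algorithm; it is used here only as a simple stand-in. The allowed pairs (Watson-Crick plus G-U wobble), the minimum loop length of 3, and the function name `nussinov` are assumptions.

```python
def nussinov(seq, min_loop=3):
    """Maximum number of nested base pairs in an RNA sequence (sketch)."""
    pairs = {("A", "U"), ("U", "A"), ("G", "C"),
             ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    if n == 0:
        return 0
    # best[i][j] = maximum number of pairs within seq[i..j]
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]            # case 1: position j is unpaired
            for k in range(i, j - min_loop):  # case 2: j pairs with some k
                if (seq[k], seq[j]) in pairs:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best[0][n - 1]


print(nussinov("GGGAAAUCC"))
```

The score depends only on which positions can pair, not on the exact bases, which is why two RNAs with different sequences but compatible pairings can receive the same structure.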




Contact Us




       We are Hiring!




Strategic AI Integration in Engineering TeamsStrategic AI Integration in Engineering Teams
Strategic AI Integration in Engineering TeamsUXDXConf
 
AI presentation and introduction - Retrieval Augmented Generation RAG 101
AI presentation and introduction - Retrieval Augmented Generation RAG 101AI presentation and introduction - Retrieval Augmented Generation RAG 101
AI presentation and introduction - Retrieval Augmented Generation RAG 101vincent683379
 
Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdf
Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdfIntroduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdf
Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdfFIDO Alliance
 
The Value of Certifying Products for FDO _ Paul at FIDO Alliance.pdf
The Value of Certifying Products for FDO _ Paul at FIDO Alliance.pdfThe Value of Certifying Products for FDO _ Paul at FIDO Alliance.pdf
The Value of Certifying Products for FDO _ Paul at FIDO Alliance.pdfFIDO Alliance
 
SOQL 201 for Admins & Developers: Slice & Dice Your Org’s Data With Aggregate...
SOQL 201 for Admins & Developers: Slice & Dice Your Org’s Data With Aggregate...SOQL 201 for Admins & Developers: Slice & Dice Your Org’s Data With Aggregate...
SOQL 201 for Admins & Developers: Slice & Dice Your Org’s Data With Aggregate...CzechDreamin
 
How Red Hat Uses FDO in Device Lifecycle _ Costin and Vitaliy at Red Hat.pdf
How Red Hat Uses FDO in Device Lifecycle _ Costin and Vitaliy at Red Hat.pdfHow Red Hat Uses FDO in Device Lifecycle _ Costin and Vitaliy at Red Hat.pdf
How Red Hat Uses FDO in Device Lifecycle _ Costin and Vitaliy at Red Hat.pdfFIDO Alliance
 
Introduction to Open Source RAG and RAG Evaluation
Introduction to Open Source RAG and RAG EvaluationIntroduction to Open Source RAG and RAG Evaluation
Introduction to Open Source RAG and RAG EvaluationZilliz
 

Kürzlich hochgeladen (20)

UiPath Test Automation using UiPath Test Suite series, part 1
UiPath Test Automation using UiPath Test Suite series, part 1UiPath Test Automation using UiPath Test Suite series, part 1
UiPath Test Automation using UiPath Test Suite series, part 1
 
Future Visions: Predictions to Guide and Time Tech Innovation, Peter Udo Diehl
Future Visions: Predictions to Guide and Time Tech Innovation, Peter Udo DiehlFuture Visions: Predictions to Guide and Time Tech Innovation, Peter Udo Diehl
Future Visions: Predictions to Guide and Time Tech Innovation, Peter Udo Diehl
 
WSO2CONMay2024OpenSourceConferenceDebrief.pptx
WSO2CONMay2024OpenSourceConferenceDebrief.pptxWSO2CONMay2024OpenSourceConferenceDebrief.pptx
WSO2CONMay2024OpenSourceConferenceDebrief.pptx
 
Powerful Start- the Key to Project Success, Barbara Laskowska
Powerful Start- the Key to Project Success, Barbara LaskowskaPowerful Start- the Key to Project Success, Barbara Laskowska
Powerful Start- the Key to Project Success, Barbara Laskowska
 
Linux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdf
Linux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdfLinux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdf
Linux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdf
 
Top 10 Symfony Development Companies 2024
Top 10 Symfony Development Companies 2024Top 10 Symfony Development Companies 2024
Top 10 Symfony Development Companies 2024
 
What's New in Teams Calling, Meetings and Devices April 2024
What's New in Teams Calling, Meetings and Devices April 2024What's New in Teams Calling, Meetings and Devices April 2024
What's New in Teams Calling, Meetings and Devices April 2024
 
Simplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdf
Simplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdfSimplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdf
Simplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdf
 
PLAI - Acceleration Program for Generative A.I. Startups
PLAI - Acceleration Program for Generative A.I. StartupsPLAI - Acceleration Program for Generative A.I. Startups
PLAI - Acceleration Program for Generative A.I. Startups
 
UiPath Test Automation using UiPath Test Suite series, part 2
UiPath Test Automation using UiPath Test Suite series, part 2UiPath Test Automation using UiPath Test Suite series, part 2
UiPath Test Automation using UiPath Test Suite series, part 2
 
Buy Epson EcoTank L3210 Colour Printer Online.pdf
Buy Epson EcoTank L3210 Colour Printer Online.pdfBuy Epson EcoTank L3210 Colour Printer Online.pdf
Buy Epson EcoTank L3210 Colour Printer Online.pdf
 
Optimizing NoSQL Performance Through Observability
Optimizing NoSQL Performance Through ObservabilityOptimizing NoSQL Performance Through Observability
Optimizing NoSQL Performance Through Observability
 
Integrating Telephony Systems with Salesforce: Insights and Considerations, B...
Integrating Telephony Systems with Salesforce: Insights and Considerations, B...Integrating Telephony Systems with Salesforce: Insights and Considerations, B...
Integrating Telephony Systems with Salesforce: Insights and Considerations, B...
 
Strategic AI Integration in Engineering Teams
Strategic AI Integration in Engineering TeamsStrategic AI Integration in Engineering Teams
Strategic AI Integration in Engineering Teams
 
AI presentation and introduction - Retrieval Augmented Generation RAG 101
AI presentation and introduction - Retrieval Augmented Generation RAG 101AI presentation and introduction - Retrieval Augmented Generation RAG 101
AI presentation and introduction - Retrieval Augmented Generation RAG 101
 
Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdf
Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdfIntroduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdf
Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdf
 
The Value of Certifying Products for FDO _ Paul at FIDO Alliance.pdf
The Value of Certifying Products for FDO _ Paul at FIDO Alliance.pdfThe Value of Certifying Products for FDO _ Paul at FIDO Alliance.pdf
The Value of Certifying Products for FDO _ Paul at FIDO Alliance.pdf
 
SOQL 201 for Admins & Developers: Slice & Dice Your Org’s Data With Aggregate...
SOQL 201 for Admins & Developers: Slice & Dice Your Org’s Data With Aggregate...SOQL 201 for Admins & Developers: Slice & Dice Your Org’s Data With Aggregate...
SOQL 201 for Admins & Developers: Slice & Dice Your Org’s Data With Aggregate...
 
How Red Hat Uses FDO in Device Lifecycle _ Costin and Vitaliy at Red Hat.pdf
How Red Hat Uses FDO in Device Lifecycle _ Costin and Vitaliy at Red Hat.pdfHow Red Hat Uses FDO in Device Lifecycle _ Costin and Vitaliy at Red Hat.pdf
How Red Hat Uses FDO in Device Lifecycle _ Costin and Vitaliy at Red Hat.pdf
 
Introduction to Open Source RAG and RAG Evaluation
Introduction to Open Source RAG and RAG EvaluationIntroduction to Open Source RAG and RAG Evaluation
Introduction to Open Source RAG and RAG Evaluation
 

Introduction to Probabilistic Models for Bioinformatics

  • 1. Introduction to Probabilistic Models for Bioinformatics. Igor Bogicevic (igor.bogicevic@sbgenomics.com), Seven Bridges Genomics, LLC. July 3, 2011. Outline: Short introduction to Bioinformatics; What are the Probabilistic Models?; Sequence Alignment; Pairwise Alignment; Multiple Sequence Alignment Models; What is Phylogenetics?; Building Phylogenetic Trees; Other Models; Contact Us.
  • 4. Short introduction to Bioinformatics: Bioinformatics is the application of statistics and computer science to the field of molecular biology. Major research efforts in the field include sequence alignment, gene finding, genome assembly, drug design, drug discovery, protein structure alignment, protein structure prediction, prediction of gene expression and protein-protein interactions, genome-wide association studies, and the modeling of evolution. Given the enormous volumes of sequence data now available, one of the biggest challenges is no longer producing the data but understanding it.
  • 8. What are the Probabilistic Models? There are two basic definitions: (1) a statistical analysis tool that estimates, on the basis of past (historical) data, the probability of an event occurring again; (2) a system that simulates the object under consideration and produces different outcomes with different probabilities. A simple example is rolling a die. A more relevant example is the random sequence model for DNA. Biological sequences are strings over a finite alphabet of residues, most commonly either four nucleotides or twenty amino acids. If a residue $a$ occurs with probability $q_a$, independently at each position, and a protein or DNA sequence is denoted $x_1 \ldots x_n$, then the probability of the whole sequence is
    $$q_{x_1} q_{x_2} \cdots q_{x_n} = \prod_{i=1}^{n} q_{x_i}$$
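The product formula on this slide can be sketched in a few lines of Python. This is only an illustration: the function name and the uniform nucleotide frequencies are assumptions made here, not values from the slides. The sum is taken in log space because the raw product underflows quickly for long sequences.

```python
import math


def sequence_log_probability(sequence, residue_probs):
    """Log-probability of a sequence under the random sequence model.

    Each residue is assumed to be drawn independently with probability
    residue_probs[residue], so the log of the product is a sum of logs.
    """
    return sum(math.log(residue_probs[residue]) for residue in sequence)


# Hypothetical uniform nucleotide frequencies, chosen only for illustration.
uniform_dna = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}

log_p = sequence_log_probability("ACGT", uniform_dna)
probability = math.exp(log_p)  # equals 0.25 ** 4 up to rounding
```

With real data the `residue_probs` table would be estimated from observed residue frequencies rather than fixed by hand.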
  • 11. Sequence Alignment: Sequence alignment is a way of arranging DNA, RNA, or protein sequences to identify regions of similarity that may be a consequence of functional, structural, or evolutionary relationships between the sequences. A variety of computational approaches have been applied to the sequence alignment problem, e.g. dynamic programming, heuristic algorithms, and probabilistic methods. Common formats for representing alignments are FASTA and GenBank.
  • 18. Pairwise Alignment: Pairwise sequence alignment methods are used to find the best-matching piecewise (local) or global alignments of two query sequences. The three primary methods of producing pairwise alignments are dot-matrix methods, dynamic programming, and word methods. Needleman-Wunsch algorithm (global alignment). Smith-Waterman algorithm (local alignment). FASTA/BLAST algorithms (k-tuple heuristic methods, often combined with dynamic programming). Gap penalties: modeling the cost of a gap in matched sequences (linear, affine, etc.).
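The two gap penalty shapes named on this slide can be sketched as follows. The function names and the default costs are hypothetical, picked only to make the difference visible: a linear penalty charges every gap position the same amount, while an affine penalty charges more for opening a gap than for extending it.

```python
def linear_gap_penalty(length, gap_cost=1.0):
    """Linear model: every gap position costs the same amount."""
    return -gap_cost * length


def affine_gap_penalty(length, gap_open=5.0, gap_extend=1.0):
    """Affine model: a one-time opening cost plus a smaller cost per extension."""
    if length == 0:
        return 0.0
    return -(gap_open + gap_extend * (length - 1))


# One gap of length 3 versus three separate gaps of length 1.
one_long_gap = affine_gap_penalty(3)
three_short_gaps = 3 * affine_gap_penalty(1)
```

Under the affine model the single long gap is penalized less than three scattered gaps, which is the usual motivation for preferring it; under the linear model the two cases score the same.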
  • 19. Example - Smith-Waterman: for sequences $a = a_1 \ldots a_m$ and $b = b_1 \ldots b_n$, a matrix $H$ is built as follows:
    $$H(i,0) = 0,\quad 0 \le i \le m \qquad H(0,j) = 0,\quad 0 \le j \le n$$
    $$w(a_i,b_j) = \begin{cases} w(\text{match}) & \text{if } a_i = b_j \\ w(\text{mismatch}) & \text{if } a_i \ne b_j \end{cases}$$
    $$H(i,j) = \max \begin{cases} 0 \\ H(i-1,j-1) + w(a_i,b_j) & \text{match/mismatch} \\ H(i-1,j) + w(a_i,-) & \text{deletion} \\ H(i,j-1) + w(-,b_j) & \text{insertion} \end{cases} \qquad 1 \le i \le m,\; 1 \le j \le n$$
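As a rough sketch, the recurrence above translates almost line for line into Python. The function name, the default weights, and the short test sequences are assumptions made for this illustration; a single `gap` value stands in for both $w(a_i,-)$ and $w(-,b_j)$.

```python
def smith_waterman_matrix(a, b, match=2, mismatch=-1, gap=-1):
    """Fill the Smith-Waterman score matrix H for sequences a (rows) and b (columns)."""
    m, n = len(a), len(b)
    # Row 0 and column 0 stay at zero: H(i, 0) = H(0, j) = 0.
    H = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            w = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(
                0,                    # start a new local alignment here
                H[i - 1][j - 1] + w,  # match or mismatch
                H[i - 1][j] + gap,    # deletion
                H[i][j - 1] + gap,    # insertion
            )
    return H


# Hypothetical toy input: "ACG" occurs inside "TACGT".
H = smith_waterman_matrix("TACGT", "ACG")
best_score = max(max(row) for row in H)
```

The zero in the `max` is what makes the alignment local: a poorly scoring prefix is dropped instead of dragging the score down.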
  • 22. Sequence 1 = ACACACTA, Sequence 2 = AGCACACA, with $w(\text{match}) = +2$ and $w(a,-) = w(-,b) = w(\text{mismatch}) = -1$. Columns are labeled by Sequence 1 and rows by Sequence 2:
    $$H = \begin{array}{c|rrrrrrrrr}
      & - & A & C & A & C & A & C & T & A \\ \hline
    - & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
    A & 0 & 2 & 1 & 2 & 1 & 2 & 1 & 0 & 2 \\
    G & 0 & 1 & 1 & 1 & 1 & 1 & 1 & 0 & 1 \\
    C & 0 & 0 & 3 & 2 & 3 & 2 & 3 & 2 & 1 \\
    A & 0 & 2 & 2 & 5 & 4 & 5 & 4 & 3 & 4 \\
    C & 0 & 1 & 4 & 4 & 7 & 6 & 7 & 6 & 5 \\
    A & 0 & 2 & 3 & 6 & 6 & 9 & 8 & 7 & 8 \\
    C & 0 & 1 & 4 & 5 & 8 & 8 & 11 & 10 & 9 \\
    A & 0 & 2 & 3 & 6 & 7 & 10 & 10 & 10 & 12
    \end{array}$$
    In the example, the highest value corresponds to the cell in position (8,8). The walk back corresponds to (8,8), (7,7), (7,6), (6,5), (5,4), (4,3), (3,2), (2,1), (1,1), and (0,0), which gives the alignment Sequence 1 = A-CACACTA, Sequence 2 = AGCACAC-A.
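The worked example can be reproduced with a short Python sketch that fills the matrix and then walks back from the highest-scoring cell. The function name and the tie-breaking order (diagonal first, then left, then up) are assumptions of this sketch; the slides do not say how ties are resolved.

```python
def smith_waterman(row_seq, col_seq, match=2, mismatch=-1, gap=-1):
    """Return (H, path, aligned_col_seq, aligned_row_seq) for the best local alignment."""
    n_rows, n_cols = len(row_seq) + 1, len(col_seq) + 1
    H = [[0] * n_cols for _ in range(n_rows)]
    for i in range(1, n_rows):
        for j in range(1, n_cols):
            w = match if row_seq[i - 1] == col_seq[j - 1] else mismatch
            H[i][j] = max(0, H[i - 1][j - 1] + w, H[i - 1][j] + gap, H[i][j - 1] + gap)

    # Start the walk back from the highest-scoring cell.
    i, j = max(
        ((r, c) for r in range(n_rows) for c in range(n_cols)),
        key=lambda cell: H[cell[0]][cell[1]],
    )
    path, col_chars, row_chars = [(i, j)], [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        w = match if row_seq[i - 1] == col_seq[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + w:   # diagonal: residues are paired
            col_chars.append(col_seq[j - 1])
            row_chars.append(row_seq[i - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i][j - 1] + gap:   # left: gap in the row sequence
            col_chars.append(col_seq[j - 1])
            row_chars.append("-")
            j -= 1
        else:                                # up: gap in the column sequence
            col_chars.append("-")
            row_chars.append(row_seq[i - 1])
            i -= 1
        path.append((i, j))
    return H, path, "".join(reversed(col_chars)), "".join(reversed(row_chars))


# Rows follow Sequence 2 and columns follow Sequence 1, as in the matrix above.
H, path, aligned_1, aligned_2 = smith_waterman("AGCACACA", "ACACACTA")
```

Here `aligned_1` and `aligned_2` correspond to Sequence 1 and Sequence 2 of the slide, and `path` lists the visited cells as (row, column) pairs.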
  • 25. Multiple Sequence Alignment Models: A multiple sequence alignment (MSA) is a sequence alignment of three or more biological sequences, commonly protein, DNA, or RNA. We usually perform multiple alignments to find homologous sequences that point to shared evolutionary origins and can be used for further phylogenetic analysis. Progressive alignment methods: constructing a succession of pairwise alignments. Hidden Markov models: representation of the MSA as a directed acyclic graph (DAG), in which the observed states are individual alignment columns and the hidden states represent the presumed ancestral sequence.
  • 29. What is Phylogenetics? Phylogenetics is the study of evolutionary relatedness among groups of organisms (e.g. species, populations), which is discovered through molecular sequencing data and morphological data matrices. Evolution is regarded as a branching process, whereby populations are altered over time and may speciate into separate branches, hybridize together, or terminate by extinction. This may be visualized in a phylogenetic tree. Ernst Haeckel's recapitulation theory ("ontogeny recapitulates phylogeny") is a hypothesis that in developing from embryo to adult, animals go through stages resembling or representing successive stages in the evolution of their remote ancestors.
  • 33. Building Phylogenetic Trees: Phylogenetic trees among a nontrivial number of input sequences are constructed using computational phylogenetics methods. A common method is to search for the maximum-likelihood tree, often within a Bayesian framework, applying an explicit model of evolution to phylogenetic tree estimation. Identifying the optimal tree using many of these techniques is NP-hard, so heuristic search and optimization methods are used in combination with tree-scoring functions to identify a reasonably good tree that fits the data. The resulting trees do not necessarily represent the species' evolutionary history accurately, as the data on which they are based is noisy; the analysis can be confounded by horizontal gene transfer, hybridisation between species that were not nearest neighbors on the tree before hybridisation takes place, convergent evolution, and conserved sequences.
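To make "maximum likelihood under an explicit model of evolution" concrete, here is a minimal sketch for the smallest possible tree: two aligned sequences joined by a single branch. The Jukes-Cantor (JC69) substitution model is swapped in here as an assumption, since the slides do not name a model, and the function names and toy sequences are hypothetical. The sketch compares a brute-force search over branch lengths with the known closed-form JC69 estimate.

```python
import math


def jc69_log_likelihood(seq_a, seq_b, distance):
    """Log-likelihood of two aligned sequences under JC69.

    distance is the branch length in expected substitutions per site (> 0).
    """
    decay = math.exp(-4.0 * distance / 3.0)
    p_same = 0.25 + 0.75 * decay   # probability a site keeps its nucleotide
    p_diff = 0.25 - 0.25 * decay   # probability of one specific other nucleotide
    log_l = 0.0
    for x, y in zip(seq_a, seq_b):
        # 0.25 is the stationary frequency of the starting nucleotide.
        log_l += math.log(0.25 * (p_same if x == y else p_diff))
    return log_l


def jc69_ml_distance(seq_a, seq_b):
    """Closed-form JC69 maximum-likelihood distance (needs mismatch fraction < 0.75)."""
    p = sum(x != y for x, y in zip(seq_a, seq_b)) / len(seq_a)
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Hypothetical toy alignment: 2 mismatches in 10 sites.
seq_a, seq_b = "ACGTACGTAC", "ACGTACGTTT"
grid = [k / 1000.0 for k in range(1, 2001)]
grid_best = max(grid, key=lambda d: jc69_log_likelihood(seq_a, seq_b, d))
closed_form = jc69_ml_distance(seq_a, seq_b)
```

For two sequences the search is one-dimensional and easy. With many sequences, both the branch lengths and the tree shape must be optimized, which is where the heuristic search mentioned on the slide comes in.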
  • 35. Other Models: Transformational grammars (Chomsky hierarchy). RNA structure analysis models (in RNA, the base-pairing interactions tend to be preserved rather than the sequence itself).
  • 36. Contact Us: We are hiring!