SlideShare ist ein Scribd-Unternehmen logo
1 von 47
Downloaden Sie, um offline zu lesen
Using R for GO Terms Analysis

     Boyce Thompson Institute for Plant
                 Research
                Tower Road
       Ithaca, New York 14853-1801
                  U.S.A.

                  by
      Aureliano Bombarely Gomez
Using R for GO Terms Analysis:
   1. What are GO terms ?
   2. What is R ?
   3. What is Bioconductor ?
   4. Bioconductor modules for GO terms
         4.1. Basics: GO.db
         4.2. Gene Enrinchment Analysis: TopGO
         4.3. GO profiles: GOProfiles
         4.4. Gene Similarities: GOSim
    5. Example.
         5.1. Comparing two arabidopsis chromosomes.
Using R for GO Terms Analysis:
   1. What are GO terms ?
   2. What is R ?
   3. What is Bioconductor ?
   4. Bioconductor modules for GO terms
         4.1. Basics: GO.db
         4.2. Gene Enrinchment Analysis: TopGO
         4.3. GO profiles: GOProfiles
         4.4. Gene Similarities: GOSim
    5. Example.
         5.1. Comparing two arabidopsis chromosomes.
1. What are GO Terms ?


Gene ontologies:
    Structured controlled vocabularies
(ontologies) that describe gene products
in terms of their associated
   biological processes,
   cellular components and
   molecular functions
in a species-independent manner



   http://www.geneontology.org/GO.doc.shtml
1. What are GO Terms ?


Biological processes,
   Recognized series of events or molecular functions. A process is
a collection of molecular events with a defined beginning and end.

Cellular components,
  Describes locations, at the levels of subcellular structures and
macromolecular complexes.

Molecular functions
   Describes activities, such as catalytic or binding activities, that
occur at the molecular level.
1. What are GO Terms ?


Gene ontologies:
    Structured controlled vocabulariesbetween the terms
                              relationships
(ontologies) that describe gene products
in terms ofbe described in terms of a acyclic graph, where:
    GO can their associated
      each GO term is a node, and the
   biological processes, the terms are arcs between the nodes
      relationships between
   cellular components and
   molecular functions
in a species-independent manner
Using R for GO Terms Analysis:
   1. What are GO terms ?
   2. What is R ?
   3. What is Bioconductor ?
   4. Bioconductor modules for GO terms
         4.1. Basics: GO.db
         4.2. Gene Enrinchment Analysis: TopGO
         4.3. GO profiles: GOProfiles
         4.4. Gene Similarities: GOSim
    5. Example.
         5.1. Comparing two arabidopsis chromosomes.
2. What is R ?


 R is a language and environment for statistical
            computing and graphics. .
2. What is R ?


WEB:
   OFICIAL WEB: http://www.r-project.org/index.html
   QUICK-R:                 http://www.statmethods.net/index.html


BOOKS:
   Introductory Statistics with R (Statistics and Computing), P. Dalgaard
  [available as manual at R project web]

   The R Book, MJ. Crawley


R itself:                        help() and example()
2. What is R ?


INTEGRATED DEVELOPMENT ENVIRONMENT (IDE):
  R-Studio: http://rstudio.org/
Using R for GO Terms Analysis:
   1. What are GO terms ?
   2. What is R ?
   3. What is Bioconductor ?
   4. Bioconductor modules for GO terms
         4.1. Basics: GO.db
         4.2. Gene Enrinchment Analysis: TopGO
         4.3. GO profiles: GOProfiles
         4.4. Gene Similarities: GOSim
    5. Example.
         5.1. Comparing two arabidopsis chromosomes.
3. What is Bioconductor ?


  Bioconductor is an open source, open development
 software project to provide tools for the analysis and
comprehension of high-throughput genomic data. It is
  based primarily on the R programming language.
3. What is Bioconductor ?


WEB:
   OFICIAL WEB: http://www.bioconductor.org/


BOOKS:
   Bioinformatics and Computational Biology Solutions Using R and
Bioconductor, 2005, Gentleman R. et al. (Springer)
   R Programming for Bioinformatics, 2008, Gentleman R. (CRC Press)


MAILING LIST:
   http://www.bioconductor.org/help/mailing-list/
3. What is Bioconductor ?


Installation of Bioconductor:
3. What is Bioconductor ?


Use of Bioconductor:
Using R for GO Terms Analysis:
   1. What are GO terms ?
   2. What is R ?
   3. What is Bioconductor ?
   4. Bioconductor modules for GO terms
         4.1. Basics: GO.db
         4.2. Gene Similarities: GOSim
         4.3. GO profiles: GOProfiles
         4.4. Gene Enrinchment Analysis:TopGO
    5. Example.
         5.1. Comparing two arabidopsis chromosomes.
4. Bioconductor modules for GOterms


Bioconductor Packages for GO Terms:
4. Bioconductor modules for GOterms


Bioconductor Packages for GO Terms:

 GO.db         A set of annotation maps describing the entire Gene Ontology

 Gostats       Tools for manipulating GO and microarrays

 GOSim         functional similarities between GO terms and gene products

 GOProfiles    Statistical analysis of functional profiles

 TopGO         Enrichment analysis for Gene Ontology
Using R for GO Terms Analysis:
   1. What are GO terms ?
   2. What is R ?
   3. What is Bioconductor ?
   4. Bioconductor modules for GO terms
         4.1. Basics: GO.db
         4.2. Gene Enrinchment Analysis: TopGO
         4.3. GO profiles: GOProfiles
         4.4. Gene Similarities: GOSim
    5. Example.
         5.1. Comparing two arabidopsis chromosomes.
4.1 GO.db

             http://www.bioconductor.org/packages/2.9/data/annotation/html/GO.db.html




  1) GO terms stored in objects
      GOTERM,
      GOBPPARENTS, GOCCPARENTS, GOMFPARENTS
      GOBPANCESTOR, GOCCANCESTOR, GOMFANCESTOR
      GOBPCHILDREN, GOCCCHILDREN, GOMFCHILDREN
      GOBPOFFSPRING, GOCCOFFSPRING, GOMFOFFSPRING


More Information:
http://www.bioconductor.org/packages/2.6/bioc/vignettes/annotate/inst/doc/GOusage.pdf
4.1 GO.db




               GOBPANCESTOR       GOBPANCESTOR$"GO:0006720”




            GOBPPARENTS        GOBPPARENTS$"GO:0006720”

            GOTERM     GOTERM$"GO:0006720”

            GOBPCHILDREN       GOBPCHILDREN$"GO:0006720”

               GOBPOFFSPRING      GOBPOFFSPRING$"GO:0006720”
4.1 GO.db


Exercise 1:

   1.1 Which term correspond top the GOID: GO:0006720 ?

   1.2 How many synonyms has this term ?

   1.3 How many children ?

   1.4 … and parents ?

   1.5 … and offsprings ?

   1.6 … and ancestors ?
4.1 GO.db

2) Mapping between gene and GO terms stored in objects or
dataframes.




org.At.tair.db
Genome wide annotation for Arabidopsis, primarily based on mapping
using TAIR identifiers.
4.1 GO.db

2) Mapping between gene and GO terms stored in objects
(annotate package) or dataframes.

     > library("org.At.tair.db")
     > org.At.tairGO[["AT5G58560"]]

Use a list:

     > org.At.tairGO[["AT5G58560"]][[1]]$Ontology

Functions (“annotate” package):
    + getOntology(inlist, gocategorylist)
    + getEvidence(inlist)
     > getOntology(org.At.tairGO[["AT5G58560"]])
     > getEvidence(org.At.tairGO[["AT5G58560"]])
4.1 GO.db


Exercise 2:

   2.1 How many terms map with the gene: AT5G58560 ?

   2.2 What biological process terms map with this gene?

   2.3 What evidences are associated with these terms ?
Using R for GO Terms Analysis:
   1. What are GO terms ?
   2. What is R ?
   3. What is Bioconductor ?
   4. Bioconductor modules for GO terms
         4.1. Basics: GO.db
         4.2. Gene Enrinchment Analysis: TopGO
         4.3. GO profiles: GOProfiles
         4.4. Gene Similarities: GOSim
    5. Example.
         5.1. Comparing two arabidopsis chromosomes.
4.2 TopGO

Gene Set Enrinchment Analysis (GSEA)
     Computational method that determines whether an a priori
defined set of genes shows statistically significant, concordant
differences between two biological states (e.g. phenotypes).




                    Documentation:   http://www.broadinstitute.org/gsea/index.jsp
4.2 TopGO

    http://www.bioconductor.org/packages/release/bioc/html/topGO.html
4.2 TopGO

Methodology:
  1) Data preparation:
     a) Gene Universe.                                   OBJECT:
     b) GO Annotation.                                  topGOdata
     c) Criteria to select interesting genes.

  > sampleGOdata <- new("topGOdata",
     description = "Simple session",
     ontology = "BP",
     allGenes = geneList,                       Gene Universe

     geneSel = topDiffGenes,                    Selected Genes
     nodeSize = 10,
                                                Function to map data provided
     annot = annFUN.db,                         in the annotation data packages
     affyLib = “hgu95av2.db”)                   Data package
4.2 TopGO
> sampleGOdata
------------------------- topGOdata object -------------------------   OBJECT:
Description:                                                           topGOdata
  - Simple session
Ontology:
  - BP
323 available genes (all genes from the array):
  - symbol: 1095_s_at 1130_at 1196_at 1329_s_at 1340_s_at ...
  - score : 1 1 0.62238 0.541224 1 ...
  - 50 significant genes.
316 feasible genes (genes that can be used in the analysis):
  - symbol: 1095_s_at 1130_at 1196_at 1329_s_at 1340_s_at ...
  - score : 1 1 0.62238 0.541224 1 ...
  - 50 significant genes.
GO graph (nodes with at least 10 genes):
  - a graph with directed edges
  - number of nodes = 672
  - number of edges = 1358
------------------------- topGOdata object -------------------------
4.2 TopGO

Methodology:
  2) Running the enrinchment test: (runTest function)

    > resultFisher <- runTest(sampleGOdata,
                      algorithm = "classic",      whichAlgorithms()
                      statistic = "fisher")       whichTest()
    > resultFisher                                 "fisher"
                                                   "ks"
    Description: Simple session                     "t"
    Ontology: BP                                    "globaltest"
    'classic' algorithm with the 'fisher' test     "sum"
    672 GO terms scored: 66 terms with p < 0.01    "ks.ties"
    Annotation data:
       Annotated genes: 316
       Significant genes: 50
       Min. no. of genes annotated to a GO: 10
       Nontrivial nodes: 607
4.2 TopGO

Methodology:
  2) Running the enrinchment test:

    > resultKS <- runTest(sampleGOdata,
                     algorithm = "classic",       whichAlgorithms()
                     statistic = "ks")            whichTest()
    > resultKS                                    "fisher"
                                                  "ks"
    Description: Simple session                    "t"
    Ontology: BP                                   "globaltest"
    'classic' algorithm with the 'ks' test        "sum"
    672 GO terms scored: 57 terms with p < 0.01   "ks.ties"
    Annotation data:
       Annotated genes: 316
       Significant genes: 50
       Min. no. of genes annotated to a GO: 10
       Nontrivial nodes: 672
4.2 TopGO

Methodology:
   3) Analysis of the results: (GenTable function)

  > allRes <- GenTable(sampleGOdata,
                       classicFisher = resultFisher,
                       classicKS = resultKS,
                       orderBy = "classicKS",
                       ranksOf = "classicFisher",
                       topNodes = 10)

 > allRes
Using R for GO Terms Analysis:
   1. What are GO terms ?
   2. What is R ?
   3. What is Bioconductor ?
   4. Bioconductor modules for GO terms
         4.1. Basics: GO.db
         4.2. Gene Enrinchment Analysis: TopGO
         4.3. GO profiles: GOProfiles
         4.4. Gene Similarities: GOSim
    5. Example.
         5.1. Comparing two arabidopsis chromosomes.
4.3 goProfiles

             http://www.bioconductor.org/packages/2.8/bioc/html/goProfiles.html




1) Profiles are built by slicing the GO graph at a given level

                                                                   Level 1
                                                                   Level 2

                                                                   Level 3


                                                                   Level 4
4.3 goProfiles
2) Functional profile at a given GO level is obtained by counting the
number of identifiers having a hit in each category of this level




3) Profiles comparissons:
    1. How different or representative of a given gene universe is a given set of genes?
        • Universe: All genes analyzed,
          Gene Set: Differentially expressed genes in a microarray experiment
        • Universe: All genes in a database,
          Gene Set: Arbitrarily selected set of genes

    2. How biologically different are two given sets of genes?
        • Differentially expressed genes in two experiments
        • Arbitrarily chosen lists of genes
4.3 goProfiles
Methodology:
   1) Data preparation:
       a) Gene GO term map (object or dataframe).
           Dataframe with 4 columns:
               + GeneID
               + Ontology
               + Evidence
               + GOID


   2) Profile creation:
       a) basicProfile function

          BasicProfile( genelist,                   “Entrez” (default),
                        idType = "Entrez",          “BiocProbes”,
                        onto = "ANY",               “GoTermsFrame”
                        Level = 2,
                        orgPackage=NULL,            Requested for “Entrez”
                        anotPackage=NULL,           Requested for
                        ...)                        “BiocProbes”
4.3 goProfiles
Methodology:
   2) Profile creation:
       a) basicProfile function

        > printProfiles(bpprofile)

        Functional Profile
        ==================
        [1] "BP ontology"
                      Description         GOID          Frequency
        25 cellular component biogen...   GO:0044085    5
        14 cellular component organi...   GO:0016043    6
        12          cellular process      GO:0009987   122
        32 establishment of localiza...   GO:0051234    4
        4     immune system process...    GO:0002376    1
        31            localization        GO:0051179    4
        2         metabolic process       GO:0008152    3
        33 multi-organism process...      GO:0051704    2
        30       response to stimulus     GO:0050896    4
4.3 goProfiles
Methodology:
   2) Profile creation:
       a) basicProfile function

        > plotProfiles(bpprofile)
4.3 goProfiles
Methodology:
   2) Profile creation:
       b) expandedProfile function
            Used mainly for comparisons of profiles.

             ExpandedProfile( genelist,
                        idType = "Entrez",
                        onto = "ANY",
                        Level = 2,
                        orgPackage=NULL,
                        anotPackage=NULL
                        ...)

    3) Profile comparisons:

        - Case I (INCLUSION): One list incluided in the other.

        - Case || (DISJOINT): Non overlapping gene sets

        - Case III (INTERSECTION): Overlapping genes
4.3 goProfiles
Methodology:
   3) Profile comparisons:

       - Case I (INCLUSION):             compareProfilesLists()

       - Case || (DISJOINT):             compareGeneLists()

       - Case III (INTERSECTION):        compareGeneLists()


         > comp_ath_genes <- compareGeneLists(
               genelist1=ath_chl_list,
               genelist2=ath_mit_list,
               idType="Entrez", orgPackage="org.At.tair.db",
               onto="BP", level=2)

         > print(compSummary(comp_ath_genes))

            Sqr.Euc.Dist StdErr   pValue      0.95CI.low   0.95CI.up
            0.031401     0.024101 0.005000    -0.015837    0.078639
4.3 goProfiles
Methodology:
   3) Profile comparisons:

         > basic_mitprof <- basicProfile(ath_mit_list, idType="Entrez",
                 onto="BP", level=2, orgPackage="org.At.tair.db",
                 empty.cats=TRUE)

        > basic_chloprof <- basicProfile(ath_chl_list, idType="Entrez",
                 onto="BP", level=2, orgPackage="org.At.tair.db",
                 empty.cats=TRUE)

        > merged_prof <- mergeProfilesLists(basic_chloprof, basic_mitprof,
                profNames=c("Chloroplast", "Mitochondria"))

        > plotProfiles(merged_prof, percentage=TRUE, legend=TRUE)
4.3 goProfiles
Methodology:
   3) Profile comparisons:
Using R for GO Terms Analysis:
   1. What are GO terms ?
   2. What is R ?
   3. What is Bioconductor ?
   4. Bioconductor modules for GO terms
         4.1. Basics: GO.db
         4.2. Gene Enrinchment Analysis: TopGO
         4.3. GO profiles: GOProfiles
         4.4. Gene Similarities: GOSim
    5. Example.
         5.1. Comparing two arabidopsis chromosomes.
4.4 GOSim
Using R for GO Terms Analysis:
   1. What are GO terms ?
   2. What is R ?
   3. What is Bioconductor ?
   4. Bioconductor modules for GO terms
         4.1. Basics: GO.db
         4.2. Gene Enrinchment Analysis: TopGO
         4.3. GO profiles: GOProfiles
         4.4. Gene Similarities: GOSim
    5. Example.
         5.1. Comparing two arabidopsis chromosomes.
5 Example


Exercise 3:

   Compare the goProfiles for A. thaliana chromosomes 1 and 5.

   Are they significantly differents ?

   Source:
       ftp://ftp.arabidopsis.org/home/tair/Genes/

       TAIR10_genome_release/TAIR10_gene_lists/TAIR10_all_gene_models



   R Package:
       GoProfiles

Weitere ähnliche Inhalte

Was ist angesagt?

NGS data formats and analyses
NGS data formats and analysesNGS data formats and analyses
NGS data formats and analysesrjorton
 
Computational Biology and Bioinformatics
Computational Biology and BioinformaticsComputational Biology and Bioinformatics
Computational Biology and BioinformaticsSharif Shuvo
 
BIOLOGICAL SEQUENCE DATABASES
BIOLOGICAL SEQUENCE DATABASES BIOLOGICAL SEQUENCE DATABASES
BIOLOGICAL SEQUENCE DATABASES nadeem akhter
 
Introduction to RNA-seq and RNA-seq Data Analysis (UEB-UAT Bioinformatics Cou...
Introduction to RNA-seq and RNA-seq Data Analysis (UEB-UAT Bioinformatics Cou...Introduction to RNA-seq and RNA-seq Data Analysis (UEB-UAT Bioinformatics Cou...
Introduction to RNA-seq and RNA-seq Data Analysis (UEB-UAT Bioinformatics Cou...VHIR Vall d’Hebron Institut de Recerca
 
Part 2 of RNA-seq for DE analysis: Investigating raw data
Part 2 of RNA-seq for DE analysis: Investigating raw dataPart 2 of RNA-seq for DE analysis: Investigating raw data
Part 2 of RNA-seq for DE analysis: Investigating raw dataJoachim Jacob
 
BioInformatics Tools -Genomics , Proteomics and metablomics
BioInformatics Tools -Genomics , Proteomics and metablomicsBioInformatics Tools -Genomics , Proteomics and metablomics
BioInformatics Tools -Genomics , Proteomics and metablomicsAyeshaYousaf20
 
The ensembl database
The ensembl databaseThe ensembl database
The ensembl databaseAshfaq Ahmad
 
Databases pathways of genomics and proteomics
Databases pathways of genomics and proteomics Databases pathways of genomics and proteomics
Databases pathways of genomics and proteomics Sachin Kumar
 
bioinformatics simple
bioinformatics simple bioinformatics simple
bioinformatics simple nadeem akhter
 
Recent trends in bioinformatics
Recent trends in bioinformaticsRecent trends in bioinformatics
Recent trends in bioinformaticsZeeshan Hanjra
 
Protein function and bioinformatics
Protein function and bioinformaticsProtein function and bioinformatics
Protein function and bioinformaticsNeil Saunders
 

Was ist angesagt? (20)

Protein Predictinon
Protein PredictinonProtein Predictinon
Protein Predictinon
 
NGS data formats and analyses
NGS data formats and analysesNGS data formats and analyses
NGS data formats and analyses
 
Biological Database
Biological DatabaseBiological Database
Biological Database
 
Computational Biology and Bioinformatics
Computational Biology and BioinformaticsComputational Biology and Bioinformatics
Computational Biology and Bioinformatics
 
BIOLOGICAL SEQUENCE DATABASES
BIOLOGICAL SEQUENCE DATABASES BIOLOGICAL SEQUENCE DATABASES
BIOLOGICAL SEQUENCE DATABASES
 
Ensembl genome
Ensembl genomeEnsembl genome
Ensembl genome
 
Proteome databases
Proteome databasesProteome databases
Proteome databases
 
Introduction to RNA-seq and RNA-seq Data Analysis (UEB-UAT Bioinformatics Cou...
Introduction to RNA-seq and RNA-seq Data Analysis (UEB-UAT Bioinformatics Cou...Introduction to RNA-seq and RNA-seq Data Analysis (UEB-UAT Bioinformatics Cou...
Introduction to RNA-seq and RNA-seq Data Analysis (UEB-UAT Bioinformatics Cou...
 
Part 2 of RNA-seq for DE analysis: Investigating raw data
Part 2 of RNA-seq for DE analysis: Investigating raw dataPart 2 of RNA-seq for DE analysis: Investigating raw data
Part 2 of RNA-seq for DE analysis: Investigating raw data
 
BioInformatics Tools -Genomics , Proteomics and metablomics
BioInformatics Tools -Genomics , Proteomics and metablomicsBioInformatics Tools -Genomics , Proteomics and metablomics
BioInformatics Tools -Genomics , Proteomics and metablomics
 
The ensembl database
The ensembl databaseThe ensembl database
The ensembl database
 
Databases pathways of genomics and proteomics
Databases pathways of genomics and proteomics Databases pathways of genomics and proteomics
Databases pathways of genomics and proteomics
 
bioinformatics simple
bioinformatics simple bioinformatics simple
bioinformatics simple
 
Biological networks - building and visualizing
Biological networks - building and visualizingBiological networks - building and visualizing
Biological networks - building and visualizing
 
Protein database
Protein databaseProtein database
Protein database
 
Pymol
PymolPymol
Pymol
 
Recent trends in bioinformatics
Recent trends in bioinformaticsRecent trends in bioinformatics
Recent trends in bioinformatics
 
Protein function and bioinformatics
Protein function and bioinformaticsProtein function and bioinformatics
Protein function and bioinformatics
 
Genome analysis2
Genome analysis2Genome analysis2
Genome analysis2
 
ProCheck
ProCheckProCheck
ProCheck
 

Ähnlich wie GoTermsAnalysisWithR

Bio-protocol4446.GO_KEGG_RICE.pdf
Bio-protocol4446.GO_KEGG_RICE.pdfBio-protocol4446.GO_KEGG_RICE.pdf
Bio-protocol4446.GO_KEGG_RICE.pdfssuserb500f8
 
Bioinfomatics Presentation
Bioinfomatics PresentationBioinfomatics Presentation
Bioinfomatics PresentationZhenhong Bao
 
Cool Informatics Tools and Services for Biomedical Research
Cool Informatics Tools and Services for Biomedical ResearchCool Informatics Tools and Services for Biomedical Research
Cool Informatics Tools and Services for Biomedical ResearchDavid Ruau
 
Behind the Scenes of KnetMiner: Towards Standardised and Interoperable Knowle...
Behind the Scenes of KnetMiner: Towards Standardised and Interoperable Knowle...Behind the Scenes of KnetMiner: Towards Standardised and Interoperable Knowle...
Behind the Scenes of KnetMiner: Towards Standardised and Interoperable Knowle...Rothamsted Research, UK
 
ICAR2016 TAIR talk
ICAR2016 TAIR talkICAR2016 TAIR talk
ICAR2016 TAIR talkDonghui Li
 
Towards semantic systems chemical biology
Towards semantic systems chemical biology Towards semantic systems chemical biology
Towards semantic systems chemical biology Bin Chen
 
Functional annotation
Functional annotationFunctional annotation
Functional annotationRavi Gandham
 
Neo4j_Cypher.pdf
Neo4j_Cypher.pdfNeo4j_Cypher.pdf
Neo4j_Cypher.pdfJaberRad1
 
Genome annotation with open source software: Apollo, Jbrowse and the GO in Ga...
Genome annotation with open source software: Apollo, Jbrowse and the GO in Ga...Genome annotation with open source software: Apollo, Jbrowse and the GO in Ga...
Genome annotation with open source software: Apollo, Jbrowse and the GO in Ga...Nathan Dunn
 
BITS: Overview of important biological databases beyond sequences
BITS: Overview of important biological databases beyond sequencesBITS: Overview of important biological databases beyond sequences
BITS: Overview of important biological databases beyond sequencesBITS
 
LODを用いたバイオインフォマティクスアプリケーション
LODを用いたバイオインフォマティクスアプリケーションLODを用いたバイオインフォマティクスアプリケーション
LODを用いたバイオインフォマティクスアプリケーションKazuki Oshita
 
Gene Ontology Network Enrichment Analysis
Gene Ontology Network Enrichment AnalysisGene Ontology Network Enrichment Analysis
Gene Ontology Network Enrichment AnalysisUC Davis
 
TAIR -Using biological ontologies to accelerate progress in plant biology res...
TAIR -Using biological ontologies to accelerate progress in plant biology res...TAIR -Using biological ontologies to accelerate progress in plant biology res...
TAIR -Using biological ontologies to accelerate progress in plant biology res...Phoenix Bioinformatics
 
GPU-accelerated Virtual Screening
GPU-accelerated Virtual ScreeningGPU-accelerated Virtual Screening
GPU-accelerated Virtual ScreeningOlexandr Isayev
 
Gene Ontology Enrichment Network Analysis -Tutorial
Gene Ontology Enrichment Network Analysis -TutorialGene Ontology Enrichment Network Analysis -Tutorial
Gene Ontology Enrichment Network Analysis -TutorialDmitry Grapov
 
The Gene Ontology & Gene Ontology Annotation resources
The Gene Ontology & Gene Ontology Annotation resourcesThe Gene Ontology & Gene Ontology Annotation resources
The Gene Ontology & Gene Ontology Annotation resourcesMelanie Courtot
 

Ähnlich wie GoTermsAnalysisWithR (20)

Bio-protocol4446.GO_KEGG_RICE.pdf
Bio-protocol4446.GO_KEGG_RICE.pdfBio-protocol4446.GO_KEGG_RICE.pdf
Bio-protocol4446.GO_KEGG_RICE.pdf
 
Bioinfomatics Presentation
Bioinfomatics PresentationBioinfomatics Presentation
Bioinfomatics Presentation
 
Cool Informatics Tools and Services for Biomedical Research
Cool Informatics Tools and Services for Biomedical ResearchCool Informatics Tools and Services for Biomedical Research
Cool Informatics Tools and Services for Biomedical Research
 
Behind the Scenes of KnetMiner: Towards Standardised and Interoperable Knowle...
Behind the Scenes of KnetMiner: Towards Standardised and Interoperable Knowle...Behind the Scenes of KnetMiner: Towards Standardised and Interoperable Knowle...
Behind the Scenes of KnetMiner: Towards Standardised and Interoperable Knowle...
 
ICAR2016 TAIR talk
ICAR2016 TAIR talkICAR2016 TAIR talk
ICAR2016 TAIR talk
 
Towards semantic systems chemical biology
Towards semantic systems chemical biology Towards semantic systems chemical biology
Towards semantic systems chemical biology
 
Functional annotation
Functional annotationFunctional annotation
Functional annotation
 
Neo4j_Cypher.pdf
Neo4j_Cypher.pdfNeo4j_Cypher.pdf
Neo4j_Cypher.pdf
 
Genome annotation with open source software: Apollo, Jbrowse and the GO in Ga...
Genome annotation with open source software: Apollo, Jbrowse and the GO in Ga...Genome annotation with open source software: Apollo, Jbrowse and the GO in Ga...
Genome annotation with open source software: Apollo, Jbrowse and the GO in Ga...
 
BITS: Overview of important biological databases beyond sequences
BITS: Overview of important biological databases beyond sequencesBITS: Overview of important biological databases beyond sequences
BITS: Overview of important biological databases beyond sequences
 
LODを用いたバイオインフォマティクスアプリケーション
LODを用いたバイオインフォマティクスアプリケーションLODを用いたバイオインフォマティクスアプリケーション
LODを用いたバイオインフォマティクスアプリケーション
 
Gene Ontology Network Enrichment Analysis
Gene Ontology Network Enrichment AnalysisGene Ontology Network Enrichment Analysis
Gene Ontology Network Enrichment Analysis
 
TAIR -Using biological ontologies to accelerate progress in plant biology res...
TAIR -Using biological ontologies to accelerate progress in plant biology res...TAIR -Using biological ontologies to accelerate progress in plant biology res...
TAIR -Using biological ontologies to accelerate progress in plant biology res...
 
GPU-accelerated Virtual Screening
GPU-accelerated Virtual ScreeningGPU-accelerated Virtual Screening
GPU-accelerated Virtual Screening
 
Intro to databases
Intro to databasesIntro to databases
Intro to databases
 
Chado introduction
Chado introductionChado introduction
Chado introduction
 
Neo4j and bioinformatics
Neo4j and bioinformaticsNeo4j and bioinformatics
Neo4j and bioinformatics
 
Gene Ontology Enrichment Network Analysis -Tutorial
Gene Ontology Enrichment Network Analysis -TutorialGene Ontology Enrichment Network Analysis -Tutorial
Gene Ontology Enrichment Network Analysis -Tutorial
 
The Gene Ontology & Gene Ontology Annotation resources
The Gene Ontology & Gene Ontology Annotation resourcesThe Gene Ontology & Gene Ontology Annotation resources
The Gene Ontology & Gene Ontology Annotation resources
 
GiTools
GiToolsGiTools
GiTools
 

Mehr von Aureliano Bombarely (11)

Lesson mitochondria bombarely_a20180927
Lesson mitochondria bombarely_a20180927Lesson mitochondria bombarely_a20180927
Lesson mitochondria bombarely_a20180927
 
Genome Assembly 2018
Genome Assembly 2018Genome Assembly 2018
Genome Assembly 2018
 
PlastidEvolution
PlastidEvolutionPlastidEvolution
PlastidEvolution
 
Genomic Data Analysis: From Reads to Variants
Genomic Data Analysis: From Reads to VariantsGenomic Data Analysis: From Reads to Variants
Genomic Data Analysis: From Reads to Variants
 
Genome Assembly
Genome AssemblyGenome Assembly
Genome Assembly
 
RNAseq Analysis
RNAseq AnalysisRNAseq Analysis
RNAseq Analysis
 
BasicLinux
BasicLinuxBasicLinux
BasicLinux
 
PerlTesting
PerlTestingPerlTesting
PerlTesting
 
PerlScripting
PerlScriptingPerlScripting
PerlScripting
 
Introduction2R
Introduction2RIntroduction2R
Introduction2R
 
BasicGraphsWithR
BasicGraphsWithRBasicGraphsWithR
BasicGraphsWithR
 

Kürzlich hochgeladen

On National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan FellowsOn National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan FellowsMebane Rash
 
COMMUNICATING NEGATIVE NEWS - APPROACHES .pptx
COMMUNICATING NEGATIVE NEWS - APPROACHES .pptxCOMMUNICATING NEGATIVE NEWS - APPROACHES .pptx
COMMUNICATING NEGATIVE NEWS - APPROACHES .pptxannathomasp01
 
Exploring_the_Narrative_Style_of_Amitav_Ghoshs_Gun_Island.pptx
Exploring_the_Narrative_Style_of_Amitav_Ghoshs_Gun_Island.pptxExploring_the_Narrative_Style_of_Amitav_Ghoshs_Gun_Island.pptx
Exploring_the_Narrative_Style_of_Amitav_Ghoshs_Gun_Island.pptxPooja Bhuva
 
How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17Celine George
 
Single or Multiple melodic lines structure
Single or Multiple melodic lines structureSingle or Multiple melodic lines structure
Single or Multiple melodic lines structuredhanjurrannsibayan2
 
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptxHMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptxEsquimalt MFRC
 
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptxBasic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptxDenish Jangid
 
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdf
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdfUnit 3 Emotional Intelligence and Spiritual Intelligence.pdf
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdfDr Vijay Vishwakarma
 
Wellbeing inclusion and digital dystopias.pptx
Wellbeing inclusion and digital dystopias.pptxWellbeing inclusion and digital dystopias.pptx
Wellbeing inclusion and digital dystopias.pptxJisc
 
General Principles of Intellectual Property: Concepts of Intellectual Proper...
General Principles of Intellectual Property: Concepts of Intellectual  Proper...General Principles of Intellectual Property: Concepts of Intellectual  Proper...
General Principles of Intellectual Property: Concepts of Intellectual Proper...Poonam Aher Patil
 
REMIFENTANIL: An Ultra short acting opioid.pptx
REMIFENTANIL: An Ultra short acting opioid.pptxREMIFENTANIL: An Ultra short acting opioid.pptx
REMIFENTANIL: An Ultra short acting opioid.pptxDr. Ravikiran H M Gowda
 
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...Pooja Bhuva
 
Jamworks pilot and AI at Jisc (20/03/2024)
Jamworks pilot and AI at Jisc (20/03/2024)Jamworks pilot and AI at Jisc (20/03/2024)
Jamworks pilot and AI at Jisc (20/03/2024)Jisc
 
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...ZurliaSoop
 
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...Nguyen Thanh Tu Collection
 
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptx
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptxHMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptx
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptxmarlenawright1
 
Plant propagation: Sexual and Asexual propapagation.pptx
Plant propagation: Sexual and Asexual propapagation.pptxPlant propagation: Sexual and Asexual propapagation.pptx
Plant propagation: Sexual and Asexual propapagation.pptxUmeshTimilsina1
 
Food safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdfFood safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdfSherif Taha
 
How to Add New Custom Addons Path in Odoo 17
How to Add New Custom Addons Path in Odoo 17How to Add New Custom Addons Path in Odoo 17
How to Add New Custom Addons Path in Odoo 17Celine George
 
Interdisciplinary_Insights_Data_Collection_Methods.pptx
Interdisciplinary_Insights_Data_Collection_Methods.pptxInterdisciplinary_Insights_Data_Collection_Methods.pptx
Interdisciplinary_Insights_Data_Collection_Methods.pptxPooja Bhuva
 

Kürzlich hochgeladen (20)

On National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan FellowsOn National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan Fellows
 
COMMUNICATING NEGATIVE NEWS - APPROACHES .pptx
COMMUNICATING NEGATIVE NEWS - APPROACHES .pptxCOMMUNICATING NEGATIVE NEWS - APPROACHES .pptx
COMMUNICATING NEGATIVE NEWS - APPROACHES .pptx
 
Exploring_the_Narrative_Style_of_Amitav_Ghoshs_Gun_Island.pptx
Exploring_the_Narrative_Style_of_Amitav_Ghoshs_Gun_Island.pptxExploring_the_Narrative_Style_of_Amitav_Ghoshs_Gun_Island.pptx
Exploring_the_Narrative_Style_of_Amitav_Ghoshs_Gun_Island.pptx
 
How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17
 
Single or Multiple melodic lines structure
Single or Multiple melodic lines structureSingle or Multiple melodic lines structure
Single or Multiple melodic lines structure
 
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptxHMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
 
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptxBasic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
 
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdf
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdfUnit 3 Emotional Intelligence and Spiritual Intelligence.pdf
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdf
 
Wellbeing inclusion and digital dystopias.pptx
Wellbeing inclusion and digital dystopias.pptxWellbeing inclusion and digital dystopias.pptx
Wellbeing inclusion and digital dystopias.pptx
 
General Principles of Intellectual Property: Concepts of Intellectual Proper...
General Principles of Intellectual Property: Concepts of Intellectual  Proper...General Principles of Intellectual Property: Concepts of Intellectual  Proper...
General Principles of Intellectual Property: Concepts of Intellectual Proper...
 
REMIFENTANIL: An Ultra short acting opioid.pptx
REMIFENTANIL: An Ultra short acting opioid.pptxREMIFENTANIL: An Ultra short acting opioid.pptx
REMIFENTANIL: An Ultra short acting opioid.pptx
 
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...
 
Jamworks pilot and AI at Jisc (20/03/2024)
Jamworks pilot and AI at Jisc (20/03/2024)Jamworks pilot and AI at Jisc (20/03/2024)
Jamworks pilot and AI at Jisc (20/03/2024)
 
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
 
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
 
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptx
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptxHMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptx
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptx
 
Plant propagation: Sexual and Asexual propapagation.pptx
Plant propagation: Sexual and Asexual propapagation.pptxPlant propagation: Sexual and Asexual propapagation.pptx
Plant propagation: Sexual and Asexual propapagation.pptx
 
Food safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdfFood safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdf
 
How to Add New Custom Addons Path in Odoo 17
How to Add New Custom Addons Path in Odoo 17How to Add New Custom Addons Path in Odoo 17
How to Add New Custom Addons Path in Odoo 17
 
Interdisciplinary_Insights_Data_Collection_Methods.pptx
Interdisciplinary_Insights_Data_Collection_Methods.pptxInterdisciplinary_Insights_Data_Collection_Methods.pptx
Interdisciplinary_Insights_Data_Collection_Methods.pptx
 

GoTermsAnalysisWithR

  • 1. Using R for GO Terms Analysis Boyce Thompson Institute for Plant Research Tower Road Ithaca, New York 14853-1801 U.S.A. by Aureliano Bombarely Gomez
  • 2. Using R for GO Terms Analysis: 1. What are GO terms ? 2. What is R ? 3. What is Bioconductor ? 4. Bioconductor modules for GO terms 4.1. Basics: GO.db 4.2. Gene Enrinchment Analysis: TopGO 4.3. GO profiles: GOProfiles 4.4. Gene Similarities: GOSim 5. Example. 5.1. Comparing two arabidopsis chromosomes.
  • 3. Using R for GO Terms Analysis: 1. What are GO terms ? 2. What is R ? 3. What is Bioconductor ? 4. Bioconductor modules for GO terms 4.1. Basics: GO.db 4.2. Gene Enrinchment Analysis: TopGO 4.3. GO profiles: GOProfiles 4.4. Gene Similarities: GOSim 5. Example. 5.1. Comparing two arabidopsis chromosomes.
  • 4. 1. What are GO Terms ? Gene ontologies: Structured controlled vocabularies (ontologies) that describe gene products in terms of their associated biological processes, cellular components and molecular functions in a species-independent manner http://www.geneontology.org/GO.doc.shtml
  • 5. 1. What are GO Terms ? Biological processes, Recognized series of events or molecular functions. A process is a collection of molecular events with a defined beginning and end. Cellular components, Describes locations, at the levels of subcellular structures and macromolecular complexes. Molecular functions Describes activities, such as catalytic or binding activities, that occur at the molecular level.
  • 6. 1. What are GO Terms ? Gene ontologies: Structured controlled vocabulariesbetween the terms relationships (ontologies) that describe gene products in terms ofbe described in terms of a acyclic graph, where: GO can their associated each GO term is a node, and the biological processes, the terms are arcs between the nodes relationships between cellular components and molecular functions in a species-independent manner
  • 7. Using R for GO Terms Analysis: 1. What are GO terms ? 2. What is R ? 3. What is Bioconductor ? 4. Bioconductor modules for GO terms 4.1. Basics: GO.db 4.2. Gene Enrinchment Analysis: TopGO 4.3. GO profiles: GOProfiles 4.4. Gene Similarities: GOSim 5. Example. 5.1. Comparing two arabidopsis chromosomes.
  • 8. 2. What is R ? R is a language and environment for statistical computing and graphics. .
  • 9. 2. What is R ? WEB: OFICIAL WEB: http://www.r-project.org/index.html QUICK-R: http://www.statmethods.net/index.html BOOKS: Introductory Statistics with R (Statistics and Computing), P. Dalgaard [available as manual at R project web] The R Book, MJ. Crawley R itself: help() and example()
  • 10. 2. What is R ? INTEGRATED DEVELOPMENT ENVIRONMENT (IDE): R-Studio: http://rstudio.org/
  • 11. Using R for GO Terms Analysis: 1. What are GO terms ? 2. What is R ? 3. What is Bioconductor ? 4. Bioconductor modules for GO terms 4.1. Basics: GO.db 4.2. Gene Enrinchment Analysis: TopGO 4.3. GO profiles: GOProfiles 4.4. Gene Similarities: GOSim 5. Example. 5.1. Comparing two arabidopsis chromosomes.
  • 12. 3. What is Bioconductor ? Bioconductor is an open source, open development software project to provide tools for the analysis and comprehension of high-throughput genomic data. It is based primarily on the R programming language.
  • 13. 3. What is Bioconductor ? WEB: OFICIAL WEB: http://www.bioconductor.org/ BOOKS: Bioinformatics and Computational Biology Solutions Using R and Bioconductor, 2005, Gentleman R. et al. (Springer) R Programming for Bioinformatics, 2008, Gentleman R. (CRC Press) MAILING LIST: http://www.bioconductor.org/help/mailing-list/
  • 14. 3. What is Bioconductor ? Installation of Bioconductor:
  • 15. 3. What is Bioconductor ? Use of Bioconductor:
  • 16. Using R for GO Terms Analysis: 1. What are GO terms ? 2. What is R ? 3. What is Bioconductor ? 4. Bioconductor modules for GO terms 4.1. Basics: GO.db 4.2. Gene Similarities: GOSim 4.3. GO profiles: GOProfiles 4.4. Gene Enrinchment Analysis:TopGO 5. Example. 5.1. Comparing two arabidopsis chromosomes.
  • 17. 4. Bioconductor modules for GOterms Bioconductor Packages for GO Terms:
  • 18. 4. Bioconductor modules for GOterms Bioconductor Packages for GO Terms: GO.db A set of annotation maps describing the entire Gene Ontology Gostats Tools for manipulating GO and microarrays GOSim functional similarities between GO terms and gene products GOProfiles Statistical analysis of functional profiles TopGO Enrichment analysis for Gene Ontology
  • 19. Using R for GO Terms Analysis: 1. What are GO terms ? 2. What is R ? 3. What is Bioconductor ? 4. Bioconductor modules for GO terms 4.1. Basics: GO.db 4.2. Gene Enrinchment Analysis: TopGO 4.3. GO profiles: GOProfiles 4.4. Gene Similarities: GOSim 5. Example. 5.1. Comparing two arabidopsis chromosomes.
  • 20. 4.1 GO.db http://www.bioconductor.org/packages/2.9/data/annotation/html/GO.db.html 1) GO terms stored in objects GOTERM, GOBPPARENTS, GOCCPARENTS, GOMFPARENTS GOBPANCESTOR, GOCCANCESTOR, GOMFANCESTOR GOBPCHILDREN, GOCCCHILDREN, GOMFCHILDREN GOBPOFFSPRING, GOCCOFFSPRING, GOMFOFFSPRING More Information: http://www.bioconductor.org/packages/2.6/bioc/vignettes/annotate/inst/doc/GOusage.pdf
  • 21. 4.1 GO.db GOBPANCESTOR GOBPANCESTOR$"GO:0006720” GOBPPARENTS GOBPPARENTS$"GO:0006720” GOTERM GOTERM$"GO:0006720” GOBPCHILDREN GOBPCHILDREN$"GO:0006720” GOBPOFFSPRING GOBPOFFSPRING$"GO:0006720”
  • 22. 4.1 GO.db Exercise 1: 1.1 Which term correspond top the GOID: GO:0006720 ? 1.2 How many synonyms has this term ? 1.3 How many children ? 1.4 … and parents ? 1.5 … and offsprings ? 1.6 … and ancestors ?
  • 23. 4.1 GO.db 2) Mapping between gene and GO terms stored in objects or dataframes. org.At.tair.db Genome wide annotation for Arabidopsis, primarily based on mapping using TAIR identifiers.
  • 24. 4.1 GO.db 2) Mapping between gene and GO terms stored in objects (annotate package) or dataframes. > library("org.At.tair.db") > org.At.tairGO[["AT5G58560"]] Use a list: > org.At.tairGO[["AT5G58560"]][[1]]$Ontology Functions (“annotate” package): + getOntology(inlist, gocategorylist) + getEvidence(inlist) > getOntology(org.At.tairGO[["AT5G58560"]]) > getEvidence(org.At.tairGO[["AT5G58560"]])
  • 25. 4.1 GO.db Exercise 2: 2.1 How many terms map with the gene: AT5G58560 ? 2.2 What biological process terms map with this gene? 2.3 What evidences are associated with these terms ?
  • 26. Using R for GO Terms Analysis: 1. What are GO terms ? 2. What is R ? 3. What is Bioconductor ? 4. Bioconductor modules for GO terms 4.1. Basics: GO.db 4.2. Gene Enrinchment Analysis: TopGO 4.3. GO profiles: GOProfiles 4.4. Gene Similarities: GOSim 5. Example. 5.1. Comparing two arabidopsis chromosomes.
  • 27. 4.2 TopGO Gene Set Enrinchment Analysis (GSEA) Computational method that determines whether an a priori defined set of genes shows statistically significant, concordant differences between two biological states (e.g. phenotypes). Documentation: http://www.broadinstitute.org/gsea/index.jsp
  • 28. 4.2 TopGO http://www.bioconductor.org/packages/release/bioc/html/topGO.html
  • 29. 4.2 TopGO Methodology: 1) Data preparation: a) Gene Universe. OBJECT: b) GO Annotation. topGOdata c) Criteria to select interesting genes. > sampleGOdata <- new("topGOdata", description = "Simple session", ontology = "BP", allGenes = geneList, Gene Universe geneSel = topDiffGenes, Selected Genes nodeSize = 10, Function to map data provided annot = annFUN.db, in the annotation data packages affyLib = “hgu95av2.db”) Data package
  • 30. 4.2 TopGO > sampleGOdata ------------------------- topGOdata object ------------------------- OBJECT: Description: topGOdata - Simple session Ontology: - BP 323 available genes (all genes from the array): - symbol: 1095_s_at 1130_at 1196_at 1329_s_at 1340_s_at ... - score : 1 1 0.62238 0.541224 1 ... - 50 significant genes. 316 feasible genes (genes that can be used in the analysis): - symbol: 1095_s_at 1130_at 1196_at 1329_s_at 1340_s_at ... - score : 1 1 0.62238 0.541224 1 ... - 50 significant genes. GO graph (nodes with at least 10 genes): - a graph with directed edges - number of nodes = 672 - number of edges = 1358 ------------------------- topGOdata object -------------------------
  • 31. 4.2 TopGO Methodology: 2) Running the enrinchment test: (runTest function) > resultFisher <- runTest(sampleGOdata, algorithm = "classic", whichAlgorithms() statistic = "fisher") whichTest() > resultFisher "fisher" "ks" Description: Simple session "t" Ontology: BP "globaltest" 'classic' algorithm with the 'fisher' test "sum" 672 GO terms scored: 66 terms with p < 0.01 "ks.ties" Annotation data: Annotated genes: 316 Significant genes: 50 Min. no. of genes annotated to a GO: 10 Nontrivial nodes: 607
  • 32. 4.2 TopGO Methodology: 2) Running the enrinchment test: > resultKS <- runTest(sampleGOdata, algorithm = "classic", whichAlgorithms() statistic = "ks") whichTest() > resultKS "fisher" "ks" Description: Simple session "t" Ontology: BP "globaltest" 'classic' algorithm with the 'ks' test "sum" 672 GO terms scored: 57 terms with p < 0.01 "ks.ties" Annotation data: Annotated genes: 316 Significant genes: 50 Min. no. of genes annotated to a GO: 10 Nontrivial nodes: 672
  • 33. 4.2 TopGO Methodology: 3) Analysis of the results: (GenTable function) > allRes <- GenTable(sampleGOdata, classicFisher = resultFisher, classicKS = resultKS, orderBy = "classicKS", ranksOf = "classicFisher", topNodes = 10) > allRes
  • 34. Using R for GO Terms Analysis: 1. What are GO terms ? 2. What is R ? 3. What is Bioconductor ? 4. Bioconductor modules for GO terms 4.1. Basics: GO.db 4.2. Gene Enrinchment Analysis: TopGO 4.3. GO profiles: GOProfiles 4.4. Gene Similarities: GOSim 5. Example. 5.1. Comparing two arabidopsis chromosomes.
  • 35. 4.3 goProfiles http://www.bioconductor.org/packages/2.8/bioc/html/goProfiles.html 1) Profiles are built by slicing the GO graph at a given level Level 1 Level 2 Level 3 Level 4
  • 36. 4.3 goProfiles 2) Functional profile at a given GO level is obtained by counting the number of identifiers having a hit in each category of this level 3) Profiles comparissons: 1. How different or representative of a given gene universe is a given set of genes? • Universe: All genes analyzed, Gene Set: Differentially expressed genes in a microarray experiment • Universe: All genes in a database, Gene Set: Arbitrarily selected set of genes 2. How biologically different are two given sets of genes? • Differentially expressed genes in two experiments • Arbitrarily chosen lists of genes
  • 37. 4.3 goProfiles Methodology: 1) Data preparation: a) Gene GO term map (object or dataframe). Dataframe with 4 columns: + GeneID + Ontology + Evidence + GOID 2) Profile creation: a) basicProfile function BasicProfile( genelist, “Entrez” (default), idType = "Entrez", “BiocProbes”, onto = "ANY", “GoTermsFrame” Level = 2, orgPackage=NULL, Requested for “Entrez” anotPackage=NULL, Requested for ...) “BiocProbes”
  • 38. 4.3 goProfiles Methodology: 2) Profile creation: a) basicProfile function > printProfiles(bpprofile) Functional Profile ================== [1] "BP ontology" Description GOID Frequency 25 cellular component biogen... GO:0044085 5 14 cellular component organi... GO:0016043 6 12 cellular process GO:0009987 122 32 establishment of localiza... GO:0051234 4 4 immune system process... GO:0002376 1 31 localization GO:0051179 4 2 metabolic process GO:0008152 3 33 multi-organism process... GO:0051704 2 30 response to stimulus GO:0050896 4
  • 39. 4.3 goProfiles Methodology: 2) Profile creation: a) basicProfile function > plotProfiles(bpprofile)
  • 40. 4.3 goProfiles Methodology: 2) Profile creation: b) expandedProfile function Used mainly for comparisons of profiles. ExpandedProfile( genelist, idType = "Entrez", onto = "ANY", Level = 2, orgPackage=NULL, anotPackage=NULL ...) 3) Profile comparisons: - Case I (INCLUSION): One list incluided in the other. - Case || (DISJOINT): Non overlapping gene sets - Case III (INTERSECTION): Overlapping genes
  • 41. 4.3 goProfiles Methodology: 3) Profile comparisons: - Case I (INCLUSION): compareProfilesLists() - Case || (DISJOINT): compareGeneLists() - Case III (INTERSECTION): compareGeneLists() > comp_ath_genes <- compareGeneLists( genelist1=ath_chl_list, genelist2=ath_mit_list, idType="Entrez", orgPackage="org.At.tair.db", onto="BP", level=2) > print(compSummary(comp_ath_genes)) Sqr.Euc.Dist StdErr pValue 0.95CI.low 0.95CI.up 0.031401 0.024101 0.005000 -0.015837 0.078639
  • 42. 4.3 goProfiles Methodology: 3) Profile comparisons: > basic_mitprof <- basicProfile(ath_mit_list, idType="Entrez", onto="BP", level=2, orgPackage="org.At.tair.db", empty.cats=TRUE) > basic_chloprof <- basicProfile(ath_chl_list, idType="Entrez", onto="BP", level=2, orgPackage="org.At.tair.db", empty.cats=TRUE) > merged_prof <- mergeProfilesLists(basic_chloprof, basic_mitprof, profNames=c("Chloroplast", "Mitochondria")) > plotProfiles(merged_prof, percentage=TRUE, legend=TRUE)
  • 43. 4.3 goProfiles Methodology: 3) Profile comparisons:
  • 44. Using R for GO Terms Analysis: 1. What are GO terms ? 2. What is R ? 3. What is Bioconductor ? 4. Bioconductor modules for GO terms 4.1. Basics: GO.db 4.2. Gene Enrinchment Analysis: TopGO 4.3. GO profiles: GOProfiles 4.4. Gene Similarities: GOSim 5. Example. 5.1. Comparing two arabidopsis chromosomes.
  • 46. Using R for GO Terms Analysis: 1. What are GO terms ? 2. What is R ? 3. What is Bioconductor ? 4. Bioconductor modules for GO terms 4.1. Basics: GO.db 4.2. Gene Enrinchment Analysis: TopGO 4.3. GO profiles: GOProfiles 4.4. Gene Similarities: GOSim 5. Example. 5.1. Comparing two arabidopsis chromosomes.
  • 47. 5 Example Exercise 3: Compare the goProfiles for A. thaliana chromosomes 1 and 5. Are they significantly differents ? Source: ftp://ftp.arabidopsis.org/home/tair/Genes/ TAIR10_genome_release/TAIR10_gene_lists/TAIR10_all_gene_models R Package: GoProfiles