Basic bioinformatics concepts,
                      databases and tools
                                                       Module 4
                                       Beyond the sequences

                                                    Dr. Joachim Jacob
                                                http://www.bits.vib.be

Updated Nov 2011
http://dl.dropbox.com/u/18352887/BITS_training_material/Link%20to%20mod4-intro_H1_2011_otherRelevantData.pdf
Module 4 broadens our view
To understand life, we need not only
sequences, but many other concepts
      
          Bioinformatics also covers storing and analyzing
             −   gene information: variations, isoforms, ...
             −   expression data
             −   3D protein structure data
             −   interaction data
             −   pathways and networks


                     “Storing all relevant biological data”
Schematic view II
GeneA                sequence     annotations – gene expr – pathway – struct,...

GeneB                sequence     annotations – gene expr – pathway – struct,...

GeneC                sequence     annotations – gene expr – pathway – struct,...


Figure labels: primary database; other sequence databases; analysis and results; additional information sources
The indispensable databases
      
          Gene Ontology – structuring
      
          KEGG – biochemical pathways
      
          PDB – Structure of proteins
      
          IntAct – Interaction data
      
          dbSNP – database of genomic variation
      
          Expression sources – Microarray data
Gene Ontology structures the way we
communicate about life




Gene translation                  Protein production                 Protein synthesis



                                            http://www.arabidopsis.org/help/tutorials/go1.jsp
  http://www.geneontology.org/teaching_resources/tutorials/2005-09_BiB-journal-tutorial_jlomax
Gene Ontology structures life
               http://www.geneontology.org/
               Agreement on standardized keywords (often referred to as
                 'controlled vocabularies'), describing all natural processes in a
                 hierarchical way (ontology).
               Keywords are assigned to genes based on different kinds of evidence.
               Keywords are ordered in a hierarchical, tree-like structure (a 'directed
                 acyclic graph').
               Three GO 'trees' exist, describing:
                                 "Biological Process"
                                 "Cellular Component"
                                 "Molecular Function"
A gene can be given
different GO terms

 Example, cytochrome c:

     molecular function: oxidoreductase activity,

     biological process: oxidative phosphorylation and
 induction of cell death,

     cellular component: mitochondrial matrix and
 mitochondrial inner membrane.

 In each tree, the terms are organised in a directed acyclic
 graph: a network consisting of parents and child-terms (as
 nodes) and lines between them as relationships.
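
 A minimal sketch of such a directed acyclic graph, assuming a toy set of
 terms (the names and parent links below are illustrative, not an extract of
 the real ontology). It shows why a DAG differs from a plain tree: a term may
 have more than one parent, and all ancestors are found by following every
 parent link.

```python
# Toy directed acyclic graph: each term maps to its direct parents.
# Term names and links are illustrative assumptions, not real GO content.
PARENTS = {
    "biological_process": [],
    "metabolic process": ["biological_process"],
    "cellular process": ["biological_process"],
    # Two parents: this is what makes the structure a DAG rather than a tree.
    "cellular metabolic process": ["metabolic process", "cellular process"],
}


def ancestors(term, parents=PARENTS):
    """Return every term reachable by following parent links from `term`."""
    seen = set()
    stack = list(parents[term])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


print(sorted(ancestors("cellular metabolic process")))
```

 A gene annotated with a child term is implicitly annotated with all of that
 term's ancestors, which is why this kind of traversal is commonly used when
 collecting genes for a broad term.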
Different evidence codes can assign a
degree of confidence to the assignment
         http://www.geneontology.org/GO.evidence.shtml

         Evidence codes can be grouped by:
         
             Experimental (e.g. IDA – inferred from direct assay)
         
             Computational analysis
         
             Author statement
         
             Curator statement
         
             Inferred from electronic annotation (IEA)
         If available, each annotation also has a reference
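
         The grouping above can be sketched as a small lookup table. The code
         lists are a partial selection of GO evidence codes, and the
         annotation tuples are made-up placeholders; the helper names are
         hypothetical.

```python
# Partial mapping from evidence group to GO evidence codes.
EVIDENCE_GROUPS = {
    "experimental": {"EXP", "IDA", "IPI", "IMP", "IGI", "IEP"},
    "computational analysis": {"ISS", "ISO", "ISA", "ISM", "IGC", "RCA"},
    "author statement": {"TAS", "NAS"},
    "curator statement": {"IC", "ND"},
    "electronic annotation": {"IEA"},
}


def evidence_group(code):
    """Return the group an evidence code belongs to, or 'unknown'."""
    for group, codes in EVIDENCE_GROUPS.items():
        if code.upper() in codes:
            return group
    return "unknown"


def keep_experimental(annotations):
    """Keep (gene, term, evidence_code) tuples backed by experiments."""
    return [a for a in annotations if evidence_group(a[2]) == "experimental"]


# Placeholder annotations, invented for the example.
annotations = [
    ("geneA", "EX:0000001", "IDA"),
    ("geneB", "EX:0000001", "IEA"),
    ("geneC", "EX:0000002", "TAS"),
]
print(keep_experimental(annotations))
```

         Filtering on the evidence group is one simple way to restrict an
         analysis to the annotations you trust most.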
Gene Ontology structures all genes
according to their biological significance
         The GO structure and the terms can be browsed with a browser
           called AmiGO.
         QuickGO from EBI offers some nice visualisations.
         There is an excellent GO wiki for all your questions.
GO can be used to retrieve all gene
(products) related to one specific term
         You can search broadly, e.g. an AmiGO search for "diabetes"
           leads to the following GO term
         http://amigo.geneontology.org/
GO is also useful to analyze and compare
different gene lists
          Many tools that work with GO are listed on the GO website.




                                http://www.geneontology.org/GO.tools.shtml
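
          One common way such tools compare a gene list against GO is an
          over-representation test. The sketch below uses a one-sided
          hypergeometric test; the slide names no particular method, so this
          choice, the function name and the numbers are assumptions made for
          illustration.

```python
from math import comb


def hypergeom_pvalue(k, n, K, N):
    """P(X >= k): chance of seeing at least k annotated genes in a list.

    N = genes in the background, K = background genes carrying the GO term,
    n = genes in your list, k = genes in your list carrying the GO term.
    """
    upper = min(n, K)
    total = comb(N, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / total


# Invented numbers: 8 of 50 listed genes carry a term that 100 of the
# 10000 background genes carry.
print(hypergeom_pvalue(8, 50, 100, 10000))
```

          Real tools usually add a multiple-testing correction, because many
          GO terms are tested at once.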
Some things to know about GO
         For analyses, one can make use of reduced GO sets,
           the so-called GO slims
                –   GO slims are subsets of the biologically more
                    relevant GO terms (available per species)
                –   GO ontologies can be downloaded in .obo
                    format.
         Not all information is captured by GO; some needs to be
           retrieved from other databases
                Metabolic pathways: KEGG, …
                Phenotype/diseases
                       •   Mapping files exist, e.g. kegg2go
                              http://www.geneontology.org/GO.slims.shtml
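
         A minimal sketch of reading [Term] stanzas from an .obo file,
         assuming the usual "key: value" layout with optional "! comment"
         suffixes. The sample text uses placeholder identifiers (EX:...)
         rather than real GO terms, and the parser handles only the few tags
         shown.

```python
# Placeholder stanzas in OBO style; identifiers and names are invented.
SAMPLE_OBO = """\
[Term]
id: EX:0000001
name: example child term
namespace: biological_process
is_a: EX:0000002 ! example parent term

[Term]
id: EX:0000002
name: example parent term
namespace: biological_process
"""


def parse_obo_terms(text):
    """Return {term id: {tag: value, 'is_a': [parent ids]}} for [Term] stanzas."""
    terms = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line == "[Term]":
            current = {"is_a": []}
            continue
        if line.startswith("["):  # other stanza types, e.g. [Typedef]
            current = None
            continue
        if not line or current is None or ":" not in line:
            continue
        key, _, value = line.partition(":")
        value = value.split("!")[0].strip()  # drop a trailing "! comment"
        if key == "is_a":
            current["is_a"].append(value)
        else:
            current[key] = value
        if key == "id":
            terms[value] = current
    return terms


terms = parse_obo_terms(SAMPLE_OBO)
print(terms["EX:0000001"]["name"], "->", terms["EX:0000001"]["is_a"])
```

         For real work a dedicated OBO library is the safer choice, since the
         format has more tags and escaping rules than this sketch covers.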
Biological pathways databases organise
genes by molecular reactions
        3 important databases on biological pathways
        
            http://www.kegg.jp/




           http://www.reactome.org/ - EBI
           http://metacyc.org
Proteins with enzymatic function receive
an Enzyme Commission (EC) number
        http://www.chem.qmul.ac.uk/iubmb/enzyme/
        EC 6   Ligases
        EC 5   Isomerases
        EC 4   Lyases
        EC 3   Hydrolases
        EC 2   Transferases
        EC 1   Oxidoreductases
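
        The first digit of an EC number gives the enzyme class listed above.
        A small sketch, with a hypothetical helper name, that maps an EC
        number to its class:

```python
# The six top-level EC classes listed on the slide.
EC_CLASSES = {
    1: "Oxidoreductases",
    2: "Transferases",
    3: "Hydrolases",
    4: "Lyases",
    5: "Isomerases",
    6: "Ligases",
}


def ec_class(ec_number):
    """Return the class name for strings like 'EC 1.1.1.1' or '3.4.21.4'."""
    text = ec_number.strip()
    if text.upper().startswith("EC"):
        text = text[2:].strip()
    first = text.split(".")[0]
    if not first.isdigit():
        return None
    return EC_CLASSES.get(int(first))


print(ec_class("EC 1.1.1.1"))  # alcohol dehydrogenase
print(ec_class("3.4.21.4"))    # trypsin
```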
IntAct database contains interaction
information of proteins
         http://www.ebi.ac.uk/intact
         Three types of interactions stored
            
                Protein-protein
            
                Protein-DNA
            
                Protein-small molecule
IntAct database represents all
interactions as binary: caution!
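
The caution can be illustrated with a sketch of how one observation of a
multi-protein complex turns into binary pairs. Two common conventions are
shown, "spoke" (bait to each prey) and "matrix" (all against all); which one
a given export applies is an assumption you should check, and the protein
names are placeholders.

```python
from itertools import combinations


def spoke_expansion(bait, preys):
    """Pair the bait with every prey: n preys give n binary interactions."""
    return [(bait, prey) for prey in preys]


def matrix_expansion(members):
    """Pair every member with every other member of the complex."""
    return list(combinations(members, 2))


# One pull-down experiment with bait A that co-purified B, C and D.
print(spoke_expansion("A", ["B", "C", "D"]))
print(matrix_expansion(["A", "B", "C", "D"]))
```

Neither list proves that any two proteins touch each other directly: the
pairs are a representation of a co-complex observation, which is why binary
interaction data needs careful interpretation.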
Interaction networks can be analysed on
your computer using Cytoscape




                    Cytoscape training material on the BITS website
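
A minimal sketch of preparing binary interactions for Cytoscape, assuming
you want the simple interaction format (SIF), in which each line reads
"source, interaction type, target". The helper name is hypothetical and the
edges are placeholders.

```python
def to_sif(edges, interaction="pp"):
    """Return SIF text: one tab-separated 'source type target' line per edge."""
    return "\n".join(f"{a}\t{interaction}\t{b}" for a, b in edges)


# Placeholder protein-protein edges.
edges = [("A", "B"), ("A", "C")]
print(to_sif(edges))
```

Saving the returned text to a file with the .sif extension gives a network
file that can be imported into Cytoscape.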
PDB hosts 3-dimensional
structural data on molecules

         PDB = Protein DataBank
             http://www.pdb.org/pdb/home/home.do
         Only structures resolved through NMR and X-ray
           (or other accurate techniques)
         
             Proteins
         
             DNA
         
             RNA
         
             Ligands

         Understanding PDB data: tutorial
PDB files can be read by a lot of different
  tools to display the structure
                       Every entry in PDB has its own PDB accession
                         number (four characters: one digit followed by three
                         letters or digits)
                       The PDB file contains the 3D coordinates of every
                         single atom in the structure, together with the
                         variability of that position (the last two numeric columns)
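
A minimal sketch of reading one ATOM record from a PDB file, assuming the
classic fixed-width column layout (x, y, z in columns 31-54, followed by
occupancy and temperature factor). The atom values are invented, and
`format_atom_line` is a hypothetical helper that only exists to build a
correctly aligned demo line.

```python
def format_atom_line(serial, name, res_name, chain, res_seq,
                     x, y, z, occupancy, b_factor, element):
    """Build a fixed-width ATOM record (hypothetical helper for the demo)."""
    return (
        f"{'ATOM':<6}{serial:>5} {name:<4} {res_name:>3} {chain:1}{res_seq:>4}    "
        f"{x:>8.3f}{y:>8.3f}{z:>8.3f}{occupancy:>6.2f}{b_factor:>6.2f}"
        f"          {element:>2}"
    )


def parse_atom_line(line):
    """Slice the fixed-width fields of an ATOM record into a dict."""
    return {
        "serial": int(line[6:11]),
        "atom": line[12:16].strip(),
        "residue": line[17:20].strip(),
        "chain": line[21],
        "res_seq": int(line[22:26]),
        "x": float(line[30:38]),
        "y": float(line[38:46]),
        "z": float(line[46:54]),
        "occupancy": float(line[54:60]),   # fraction of molecules with this position
        "b_factor": float(line[60:66]),    # temperature factor (positional spread)
        "element": line[76:78].strip(),
    }


# Invented coordinates for a backbone nitrogen of a methionine residue.
line = format_atom_line(1, " N", "MET", "A", 1, 27.340, 24.430, 2.614, 1.00, 9.67, "N")
record = parse_atom_line(line)
print(line)
print(record["x"], record["y"], record["z"], record["b_factor"])
```

Fields are sliced by position rather than split on whitespace because
neighbouring columns in real files can touch each other.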




http://www.bits.vib.be/index.php?option=com_content&view=article&id=17203817:protein-structure-
PDB files can be read by a lot of different
tools to display the structure
         Tools to visualize structures, and some to analyze
           them (see BITS wiki)




                      http://www.bits.vib.be/wiki/index.php/Protein_structure
To find a structure for your protein
  sequence, search for similarity
               Homology modeling
               Similarity on sequence level projected to a structure
                    BLAST your query against the PDB database with CBLAST, or at ExPASy
                    PSI-BLAST can detect sequences with similar structures
                     (twilight zone!)
                    If still no success: 3D-Jury (a meta approach, including fold
                     recognition and local structure prediction)
               Similarity on structural level: aligning structures
                    VAST (structure)
                    DALI (Distance mAtrix aLIgnment)
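
A small sketch of the quantity behind "similarity on sequence level":
percent identity between two already aligned sequences. The "twilight zone"
is often quoted as roughly 20 to 35% identity, a range in which sequence
similarity alone no longer reliably implies a shared fold; the exact bounds
and the toy sequences here are assumptions.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identical positions, ignoring columns that contain a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


# Toy aligned fragments, invented for the example.
print(percent_identity("ACDEFGHIKL", "ACDQFGHLKM"))
```

Conventions differ on the denominator (aligned columns, shorter sequence,
full alignment length), so identity values from different tools are not
always directly comparable.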

                                             BITS training on protein structure analysis
                http://www.ii.uib.no/~slars/bioinfocourse/PDFs/structpred_tutorial.pdf
Tools at EBI                           http://consurf.tau.ac.il/pe/protexpl/psbiores.htm
Structural information is used to classify
proteins
             Database cross-references in PDB entry




             
                 SCOP
             Groups proteins based on evolutionary relationships, domain
               architecture and structural information.
             
                 CATH
             Manually curated classification of protein domains

                                           http://scop.mrc-lmb.cam.ac.uk/scop/
                                                        http://www.cathdb.info/
dbSNP is a public-domain archive for
simple genetic polymorphisms
      
          Single Nucleotide Polymorphism database (NCBI)
      
          Each dbSNP entry has a code rsxx (RefSNP) or ssxx
          (submitted SNP)
          
              single-base nucleotide substitutions (also known as
              single nucleotide polymorphisms or SNPs),
          
              small-scale multi-base deletions or insertions (also
              called deletion insertion polymorphisms or DIPs)
          
              retroposable element insertions and microsatellite
              repeat variations (also called short tandem repeats or
              STRs).
      
          Synchronized with new genome builds
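
      A minimal sketch that tells the two identifier types apart, assuming
      identifiers of the form "rs" or "ss" followed by digits. The helper
      name is hypothetical and the numbers are format examples only.

```python
import re

_DBSNP_ID = re.compile(r"^(rs|ss)(\d+)$", re.IGNORECASE)
_KIND = {"rs": "RefSNP", "ss": "submitted SNP"}


def classify_dbsnp_id(identifier):
    """Return (kind, number) for rs/ss identifiers, or None if malformed."""
    match = _DBSNP_ID.match(identifier.strip())
    if match is None:
        return None
    return _KIND[match.group(1).lower()], int(match.group(2))


print(classify_dbsnp_id("rs12345"))
print(classify_dbsnp_id("ss678"))
```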
Expression data can be sequence-based
or hybridisation-based
      Sequence-based (ESTs - RNA seq - SAGE)
        
            Digital gene expression/northern
      Microarray databases – hybridisation based:
        
            GEO: gene expression omnibus (NCBI)
             −   Platform: GPLxxxxxxx
             −   Experiment: GSExxxxxx (= several samples)
             −   Sample: GSMxxxxxxxx
             −   Some experiments are curated: GDSxxxxx (online
                 analysis possible)
        
            ArrayExpress (EBI)
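
      The GEO accession prefixes above can be recognised with a short
      helper. This is a sketch with a hypothetical function name; the
      accession numbers are format examples, not references to specific
      records.

```python
import re

# Prefix to record type, following the list on the slide.
GEO_TYPES = {
    "GPL": "platform",
    "GSE": "experiment (series of samples)",
    "GSM": "sample",
    "GDS": "curated dataset",
}


def geo_record_type(accession):
    """Return the GEO record type for an accession, or None if unrecognised."""
    match = re.fullmatch(r"(GPL|GSE|GSM|GDS)(\d+)", accession.strip().upper())
    return GEO_TYPES[match.group(1)] if match else None


for acc in ["GPL570", "GSE1000", "GSM25000", "GDS100", "rs123"]:
    print(acc, "->", geo_record_type(acc))
```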
Example of expression data at GEO
Example at ArrayExpress
Entrez interconnects the databases at
NCBI for easy querying
        
            UniGene : sequences grouped by gene
        
            PopSet : sequence alignments for population
            studies and phylogeny
        
            Structure : 3D structures (PDB)
        
            Genome : genomic maps of chromosomes and
            plasmids
        
            UniSTS (Sequence Tagged Sites)
        
            PubMed : literature abstracts (MEDLINE,…)
        
            OMIM (Online Mendelian Inheritance in Man) :
            literature reviews
        
            MeSH (Medical Subject Headings) : keywords
        
            Taxonomy
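
The same databases can also be queried programmatically. The sketch below
only builds a search URL for NCBI's E-utilities web service, which the slide
does not mention, so treat the endpoint, the parameter choice and the helper
name as assumptions to check against the NCBI documentation. No request is
sent.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities service (assumed, see lead-in).
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(db, term, retmax=20):
    """Build an ESearch URL for one Entrez database and a free-text term."""
    query = urlencode({"db": db, "term": term, "retmax": retmax})
    return f"{EUTILS_BASE}/esearch.fcgi?{query}"


print(esearch_url("pubmed", "cytochrome c", retmax=5))
print(esearch_url("taxonomy", "Arabidopsis thaliana"))
```

Switching the `db` value is what makes the cross-database querying
convenient: the same call pattern works for literature, taxonomy, structure
and other Entrez databases.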
Finding relevant data
Summarizing most important links to
discover everything you need ...
             Protein data
               InterPro (heavily integrated with EBI resources)
               http://www.interpro.org

             Gene data
               Entrez at NCBI : 'Entrez Gene'
               http://www.ncbi.nlm.nih.gov/Entrez/
               EB-eye Search at EBI : excellent for cross-species searches
               http://www.ebi.ac.uk/ebisearch/
Hold your horses!

            Phew, where do I put all this?
Bioinformatics is all about different data,
as versatile as life itself
            Due to the strong cross-references between
              different databases, new databases and
              relevant info are rapidly integrated in existing
              databases.
            You can discover them by taking time to read the
              entries.
New tools are emerging every day to
enable you to browse all data sources...
         BioGPS, all in one window!
Integrative resources are increasingly
being organised on a species basis
        
            EMAGE database of in situ gene expression in mouse
        
            OMIM Database of diseases in man
        
            Websites providing an interface to integrate all
            this data are increasingly important
        
            Often organized on a species basis
             −   TAIR
             −   FlyBase
             −   WormBase
Organizing biological
information by species

                     By species, why?
  There is one biological information resource which stays
           more or less unchanged per species ...

Weitere ähnliche Inhalte

Was ist angesagt? (20)

Databases short nucletide polymorphism
Databases short nucletide polymorphismDatabases short nucletide polymorphism
Databases short nucletide polymorphism
 
Bioinformatics data mining
Bioinformatics data miningBioinformatics data mining
Bioinformatics data mining
 
Whole genome sequence
Whole genome sequenceWhole genome sequence
Whole genome sequence
 
Biological databases
Biological databasesBiological databases
Biological databases
 
Molecular modeling database
Molecular modeling database Molecular modeling database
Molecular modeling database
 
Swiss PROT
Swiss PROT Swiss PROT
Swiss PROT
 
Functional proteomics, methods and tools
Functional proteomics, methods and toolsFunctional proteomics, methods and tools
Functional proteomics, methods and tools
 
Protein Database
Protein DatabaseProtein Database
Protein Database
 
BLAST(Basic Local Alignment Tool)
BLAST(Basic Local Alignment Tool)BLAST(Basic Local Alignment Tool)
BLAST(Basic Local Alignment Tool)
 
Genomic Data Analysis
Genomic Data AnalysisGenomic Data Analysis
Genomic Data Analysis
 
Protein databases
Protein databasesProtein databases
Protein databases
 
Proteomic databases
Proteomic databasesProteomic databases
Proteomic databases
 
ENTREZ.ppt
ENTREZ.pptENTREZ.ppt
ENTREZ.ppt
 
Molecular phylogenetics
Molecular phylogeneticsMolecular phylogenetics
Molecular phylogenetics
 
Swiss pdb viewer
Swiss pdb viewerSwiss pdb viewer
Swiss pdb viewer
 
Entrez databases
Entrez databasesEntrez databases
Entrez databases
 
Structural genomics
Structural genomicsStructural genomics
Structural genomics
 
(Expasy)
(Expasy)(Expasy)
(Expasy)
 
Functional proteomics, and tools
Functional proteomics, and toolsFunctional proteomics, and tools
Functional proteomics, and tools
 
Proteomics, definatio , general concept, signficance
Proteomics,  definatio , general concept, signficanceProteomics,  definatio , general concept, signficance
Proteomics, definatio , general concept, signficance
 

Andere mochten auch

BITs: Genome browsers and interpretation of gene lists.
BITs: Genome browsers and interpretation of gene lists.BITs: Genome browsers and interpretation of gene lists.
BITs: Genome browsers and interpretation of gene lists.BITS
 
BITS: Basics of sequence databases
BITS: Basics of sequence databasesBITS: Basics of sequence databases
BITS: Basics of sequence databasesBITS
 
BITS: Basics of Sequence similarity
BITS: Basics of Sequence similarityBITS: Basics of Sequence similarity
BITS: Basics of Sequence similarityBITS
 
BITS: Basics of sequence analysis
BITS: Basics of sequence analysisBITS: Basics of sequence analysis
BITS: Basics of sequence analysisBITS
 
Bioinformatics
BioinformaticsBioinformatics
BioinformaticsJTADrexel
 
The important bits of cloud computing
The important bits of cloud computingThe important bits of cloud computing
The important bits of cloud computingCarsonified Team
 
L01 ecture 01-
L01 ecture 01-L01 ecture 01-
L01 ecture 01-MUBOSScz
 
Bioinformatics in dermato-oncology
Bioinformatics in dermato-oncologyBioinformatics in dermato-oncology
Bioinformatics in dermato-oncologyJoaquin Dopazo
 
B.sc biochem i bobi u 3.2 algorithm + blast
B.sc biochem i bobi u 3.2 algorithm + blastB.sc biochem i bobi u 3.2 algorithm + blast
B.sc biochem i bobi u 3.2 algorithm + blastRai University
 
Biological Database Systems
Biological Database SystemsBiological Database Systems
Biological Database SystemsDenis Shestakov
 
B.sc biochem i bobi u 4 gene prediction
B.sc biochem i bobi u 4 gene predictionB.sc biochem i bobi u 4 gene prediction
B.sc biochem i bobi u 4 gene predictionRai University
 
Features of biological databases
Features of biological databasesFeatures of biological databases
Features of biological databasesCharu Sharma
 
September 1 Day Workshop
September 1 Day WorkshopSeptember 1 Day Workshop
September 1 Day WorkshopThe Biome
 
DRUG DESIGN BASED ON BIOINFORMATICS TOOLS
DRUG DESIGN BASED ON BIOINFORMATICS TOOLSDRUG DESIGN BASED ON BIOINFORMATICS TOOLS
DRUG DESIGN BASED ON BIOINFORMATICS TOOLSNIPER MOHALI
 
Dotplots for Bioinformatics
Dotplots for BioinformaticsDotplots for Bioinformatics
Dotplots for Bioinformaticsavrilcoghlan
 
BIOLOGICAL SEQUENCE DATABASES
BIOLOGICAL SEQUENCE DATABASES BIOLOGICAL SEQUENCE DATABASES
BIOLOGICAL SEQUENCE DATABASES nadeem akhter
 
Computer aided drug designing
Computer aided drug designing Computer aided drug designing
Computer aided drug designing Ayesha Aftab
 

Andere mochten auch (20)

BITs: Genome browsers and interpretation of gene lists.
BITs: Genome browsers and interpretation of gene lists.BITs: Genome browsers and interpretation of gene lists.
BITs: Genome browsers and interpretation of gene lists.
 
BITS: Basics of sequence databases
BITS: Basics of sequence databasesBITS: Basics of sequence databases
BITS: Basics of sequence databases
 
BITS: Basics of Sequence similarity
BITS: Basics of Sequence similarityBITS: Basics of Sequence similarity
BITS: Basics of Sequence similarity
 
BITS: Basics of sequence analysis
BITS: Basics of sequence analysisBITS: Basics of sequence analysis
BITS: Basics of sequence analysis
 
Bioinformatics
BioinformaticsBioinformatics
Bioinformatics
 
The important bits of cloud computing
The important bits of cloud computingThe important bits of cloud computing
The important bits of cloud computing
 
L01 ecture 01-
L01 ecture 01-L01 ecture 01-
L01 ecture 01-
 
Bioinformatics in dermato-oncology
Bioinformatics in dermato-oncologyBioinformatics in dermato-oncology
Bioinformatics in dermato-oncology
 
B.sc biochem i bobi u 3.2 algorithm + blast
B.sc biochem i bobi u 3.2 algorithm + blastB.sc biochem i bobi u 3.2 algorithm + blast
B.sc biochem i bobi u 3.2 algorithm + blast
 
Biological Database Systems
Biological Database SystemsBiological Database Systems
Biological Database Systems
 
B.sc biochem i bobi u 4 gene prediction
B.sc biochem i bobi u 4 gene predictionB.sc biochem i bobi u 4 gene prediction
B.sc biochem i bobi u 4 gene prediction
 
Features of biological databases
Features of biological databasesFeatures of biological databases
Features of biological databases
 
September 1 Day Workshop
September 1 Day WorkshopSeptember 1 Day Workshop
September 1 Day Workshop
 
DRUG DESIGN BASED ON BIOINFORMATICS TOOLS
DRUG DESIGN BASED ON BIOINFORMATICS TOOLSDRUG DESIGN BASED ON BIOINFORMATICS TOOLS
DRUG DESIGN BASED ON BIOINFORMATICS TOOLS
 
Dotplots for Bioinformatics
Dotplots for BioinformaticsDotplots for Bioinformatics
Dotplots for Bioinformatics
 
Bioinformatics and Drug Discovery
Bioinformatics and Drug DiscoveryBioinformatics and Drug Discovery
Bioinformatics and Drug Discovery
 
BIOLOGICAL SEQUENCE DATABASES
BIOLOGICAL SEQUENCE DATABASES BIOLOGICAL SEQUENCE DATABASES
BIOLOGICAL SEQUENCE DATABASES
 
Multiple sequence alignment
Multiple sequence alignmentMultiple sequence alignment
Multiple sequence alignment
 
Biological databases
Biological databasesBiological databases
Biological databases
 
Computer aided drug designing
Computer aided drug designing Computer aided drug designing
Computer aided drug designing
 

Ähnlich wie BITS: Overview of important biological databases beyond sequences

Presentation on Biological database By Elufer Akram @ University Of Science ...
Presentation on Biological database  By Elufer Akram @ University Of Science ...Presentation on Biological database  By Elufer Akram @ University Of Science ...
Presentation on Biological database By Elufer Akram @ University Of Science ...Elufer Akram
 
Sequencedatabases
SequencedatabasesSequencedatabases
SequencedatabasesAbhik Seal
 
Database in bioinformatics
Database in bioinformaticsDatabase in bioinformatics
Database in bioinformaticsVinaKhan1
 
Bioinformatics introduction
Bioinformatics introductionBioinformatics introduction
Bioinformatics introductionDrGopaSarma
 
Data retriveal ,srg and dbget
Data retriveal ,srg and dbgetData retriveal ,srg and dbget
Data retriveal ,srg and dbgetSurendraKumar338
 
Informal presentation on bioinformatics
Informal presentation on bioinformaticsInformal presentation on bioinformatics
Informal presentation on bioinformaticsAtai Rabby
 
Biological Database (1)pptxpdfpdfpdf.pdf
Biological Database (1)pptxpdfpdfpdf.pdfBiological Database (1)pptxpdfpdfpdf.pdf
Biological Database (1)pptxpdfpdfpdf.pdfBioinformaticsCentre
 
Introduction to Bioinformatics-1.pdf
Introduction to Bioinformatics-1.pdfIntroduction to Bioinformatics-1.pdf
Introduction to Bioinformatics-1.pdfkigaruantony
 
Introduction to Biological database ppt(1).pptx
Introduction to Biological database ppt(1).pptxIntroduction to Biological database ppt(1).pptx
Introduction to Biological database ppt(1).pptxRAJESHKUMAR428748
 
Role of bioinformatics in life sciences research
Role of bioinformatics in life sciences researchRole of bioinformatics in life sciences research
Role of bioinformatics in life sciences researchAnshika Bansal
 

Ähnlich wie BITS: Overview of important biological databases beyond sequences (20)

Presentation on Biological database By Elufer Akram @ University Of Science ...
Presentation on Biological database  By Elufer Akram @ University Of Science ...Presentation on Biological database  By Elufer Akram @ University Of Science ...
Presentation on Biological database By Elufer Akram @ University Of Science ...
 
Sequencedatabases
SequencedatabasesSequencedatabases
Sequencedatabases
 
Proteome databases
Proteome databasesProteome databases
Proteome databases
 
Database in bioinformatics
Database in bioinformaticsDatabase in bioinformatics
Database in bioinformatics
 
Intro to databases
Intro to databasesIntro to databases
Intro to databases
 
Bioinformatics introduction
Bioinformatics introductionBioinformatics introduction
Bioinformatics introduction
 
Biological database
Biological databaseBiological database
Biological database
 
Introduction to databases.pptx
Introduction to databases.pptxIntroduction to databases.pptx
Introduction to databases.pptx
 
Data Retrieval Systems
Data Retrieval SystemsData Retrieval Systems
Data Retrieval Systems
 
Databases
DatabasesDatabases
Databases
 
Bioinformatics introduction
Bioinformatics introductionBioinformatics introduction
Bioinformatics introduction
 
Data retriveal ,srg and dbget
Data retriveal ,srg and dbgetData retriveal ,srg and dbget
Data retriveal ,srg and dbget
 
Proteins databases
Proteins databasesProteins databases
Proteins databases
 
Informal presentation on bioinformatics
Informal presentation on bioinformaticsInformal presentation on bioinformatics
Informal presentation on bioinformatics
 
Biological Database (1)pptxpdfpdfpdf.pdf
Biological Database (1)pptxpdfpdfpdf.pdfBiological Database (1)pptxpdfpdfpdf.pdf
Biological Database (1)pptxpdfpdfpdf.pdf
 
Introduction to Bioinformatics-1.pdf
Introduction to Bioinformatics-1.pdfIntroduction to Bioinformatics-1.pdf
Introduction to Bioinformatics-1.pdf
 
Chibucos annot go_final
Chibucos annot go_finalChibucos annot go_final
Chibucos annot go_final
 
Introduction to Biological database ppt(1).pptx
Introduction to Biological database ppt(1).pptxIntroduction to Biological database ppt(1).pptx
Introduction to Biological database ppt(1).pptx
 
bioinformatics enabling knowledge generation from agricultural omics data
bioinformatics enabling knowledge generation from agricultural omics databioinformatics enabling knowledge generation from agricultural omics data
bioinformatics enabling knowledge generation from agricultural omics data
 
Role of bioinformatics in life sciences research
Role of bioinformatics in life sciences researchRole of bioinformatics in life sciences research
Role of bioinformatics in life sciences research
 

Mehr von BITS

RNA-seq for DE analysis: detecting differential expression - part 5
RNA-seq for DE analysis: detecting differential expression - part 5RNA-seq for DE analysis: detecting differential expression - part 5
RNA-seq for DE analysis: detecting differential expression - part 5BITS
 
RNA-seq for DE analysis: extracting counts and QC - part 4
RNA-seq for DE analysis: extracting counts and QC - part 4RNA-seq for DE analysis: extracting counts and QC - part 4
RNA-seq for DE analysis: extracting counts and QC - part 4BITS
 
RNA-seq for DE analysis: the biology behind observed changes - part 6
RNA-seq for DE analysis: the biology behind observed changes - part 6RNA-seq for DE analysis: the biology behind observed changes - part 6
RNA-seq for DE analysis: the biology behind observed changes - part 6BITS
 
RNA-seq: analysis of raw data and preprocessing - part 2
RNA-seq: analysis of raw data and preprocessing - part 2RNA-seq: analysis of raw data and preprocessing - part 2
RNA-seq: analysis of raw data and preprocessing - part 2BITS
 
RNA-seq: general concept, goal and experimental design - part 1
RNA-seq: general concept, goal and experimental design - part 1RNA-seq: general concept, goal and experimental design - part 1
RNA-seq: general concept, goal and experimental design - part 1BITS
 
RNA-seq: Mapping and quality control - part 3
RNA-seq: Mapping and quality control - part 3RNA-seq: Mapping and quality control - part 3
RNA-seq: Mapping and quality control - part 3BITS
 
Productivity tips - Introduction to linux for bioinformatics
Productivity tips - Introduction to linux for bioinformaticsProductivity tips - Introduction to linux for bioinformatics
Productivity tips - Introduction to linux for bioinformaticsBITS
 
Text mining on the command line - Introduction to linux for bioinformatics
Text mining on the command line - Introduction to linux for bioinformaticsText mining on the command line - Introduction to linux for bioinformatics
Text mining on the command line - Introduction to linux for bioinformaticsBITS
 
The structure of Linux - Introduction to Linux for bioinformatics
The structure of Linux - Introduction to Linux for bioinformaticsThe structure of Linux - Introduction to Linux for bioinformatics
The structure of Linux - Introduction to Linux for bioinformaticsBITS
 
Managing your data - Introduction to Linux for bioinformatics
Managing your data - Introduction to Linux for bioinformaticsManaging your data - Introduction to Linux for bioinformatics
Managing your data - Introduction to Linux for bioinformaticsBITS
 
Introduction to Linux for bioinformatics
Introduction to Linux for bioinformaticsIntroduction to Linux for bioinformatics
Introduction to Linux for bioinformaticsBITS
 
BITS - Genevestigator to easily access transcriptomics data
BITS - Genevestigator to easily access transcriptomics dataBITS - Genevestigator to easily access transcriptomics data
BITS - Genevestigator to easily access transcriptomics dataBITS
 
BITS - Comparative genomics: the Contra tool
BITS - Comparative genomics: the Contra toolBITS - Comparative genomics: the Contra tool
BITS - Comparative genomics: the Contra toolBITS
 
BITS - Comparative genomics on the genome level
BITS - Comparative genomics on the genome levelBITS - Comparative genomics on the genome level
BITS - Comparative genomics on the genome levelBITS
 
BITS - Comparative genomics: gene family analysis
BITS - Comparative genomics: gene family analysisBITS - Comparative genomics: gene family analysis
BITS - Comparative genomics: gene family analysisBITS
 
BITS - Introduction to comparative genomics
BITS - Introduction to comparative genomicsBITS - Introduction to comparative genomics
BITS - Introduction to comparative genomicsBITS
 
BITS - Protein inference from mass spectrometry data
BITS - Protein inference from mass spectrometry dataBITS - Protein inference from mass spectrometry data
BITS - Protein inference from mass spectrometry dataBITS
 
BITS - Overview of sequence databases for mass spectrometry data analysis
BITS - Overview of sequence databases for mass spectrometry data analysisBITS - Overview of sequence databases for mass spectrometry data analysis
BITS - Overview of sequence databases for mass spectrometry data analysisBITS
 
BITS - Search engines for mass spec data
BITS - Search engines for mass spec dataBITS - Search engines for mass spec data
BITS - Search engines for mass spec dataBITS
 
BITS - Introduction to proteomics
BITS - Introduction to proteomicsBITS - Introduction to proteomics
BITS - Introduction to proteomicsBITS
 

Mehr von BITS (20)

RNA-seq for DE analysis: detecting differential expression - part 5
RNA-seq for DE analysis: detecting differential expression - part 5RNA-seq for DE analysis: detecting differential expression - part 5
RNA-seq for DE analysis: detecting differential expression - part 5
 
RNA-seq for DE analysis: extracting counts and QC - part 4
RNA-seq for DE analysis: extracting counts and QC - part 4RNA-seq for DE analysis: extracting counts and QC - part 4
RNA-seq for DE analysis: extracting counts and QC - part 4
 
RNA-seq for DE analysis: the biology behind observed changes - part 6

BITS: Overview of important biological databases beyond sequences

  • 1. Basic bioinformatics concepts, databases and tools Module 4 Beyond the sequences Dr. Joachim Jacob http://www.bits.vib.be Updated Nov 2011 http://dl.dropbox.com/u/18352887/BITS_training_material/Link%20to%20mod4-intro_H1_2011_otherRelevantData.pdf
  • 2. Module 4 broadens our view
  • 3. To understand life, we need not only sequences, but many other concepts. Bioinformatics is also about storing and analyzing: gene information (variations, isoforms, ...); expression data; 3D protein structure data; interaction data; pathways and networks. “Storing all relevant biological data”
  • 4. Schematic view II GeneA sequence annotations – gene expr – pathway – struct,... GeneB sequence annotations – gene expr – pathway – struct,... GeneC sequence annotations – gene expr – pathway – struct,... analysis Additional information sources results results Primary database Other sequence databases
  • 5. The indispensable databases. Gene Ontology: structuring; KEGG: biochemical pathways; PDB: structure of proteins; IntAct: interaction data; dbSNP: database of genomic variation; expression sources: microarray data
  • 6. Gene Ontology structures the way we communicate about life Gene translation Protein production Protein synthesis http://www.arabidopsis.org/help/tutorials/go1.jsp http://www.geneontology.org/teaching_resources/tutorials/2005-09_BiB-journal-tutorial_jlomax
  • 7. Gene Ontology structures life http://www.geneontology.org/ Agreement on standardized keywords (often referred to as 'controlled vocabularies'), describing all natural processes in a hierarchical way (ontology). Keywords are assigned to genes based on different kinds of evidence. Keywords are ordered in a hierarchical tree-like structure ('directed acyclic graphs'). Three GO 'trees' exist, describing: "Biological Process", "Cellular Component", "Molecular Function". http://www.arabidopsis.org/help/tutorials/go1.jsp http://www.geneontology.org/teaching_resources/tutorials/2005-09_BiB-journal-tutorial_jlomax
  • 8. A gene can be given different GO terms. Example: cytochrome c. Molecular function: oxidoreductase activity; biological process: oxidative phosphorylation and induction of cell death; cellular component: mitochondrial matrix and mitochondrial inner membrane. In each tree, the terms are organised in a directed acyclic graph: a network with parent and child terms as nodes and the relationships between them as lines.
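
The directed acyclic graph described on slide 8 can be illustrated with a minimal sketch. The term identifiers and parent links below are hypothetical placeholders, not real GO records. The point is only that a term may have more than one parent, so collecting its ancestors needs a graph traversal rather than a walk along a single path.

```python
# Toy DAG: each term maps to its parent terms.
# All identifiers are hypothetical, not real GO terms.
PARENTS = {
    "T:root": [],
    "T:metabolism": ["T:root"],
    "T:cell_death": ["T:root"],
    "T:energy": ["T:metabolism"],
    "T:oxphos": ["T:energy", "T:metabolism"],  # two parents: a DAG, not a tree
}


def ancestors(term, parents=PARENTS):
    """Return every term reachable from `term` by following parent links."""
    seen = set()
    stack = list(parents[term])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen
```

Calling `ancestors("T:oxphos")` collects both direct parents and the shared root exactly once, even though the root is reachable along two paths.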
  • 9.
  • 10. Different evidence codes can assign a degree of confidence to the assignment http://www.geneontology.org/GO.evidence.shtml Evidence codes can be grouped by: experimental (e.g. IDA, inferred from direct assay); computational analysis; author statement; curator statement; inferred from electronic annotation (IEA). If available, each annotation also has a reference.
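
As a rough sketch of how the evidence groups on slide 10 might be used in practice, the snippet below keeps only annotations whose evidence code falls into chosen groups. The code lists are a subset chosen for illustration, and the annotation tuples (gene, term, evidence code) are hypothetical; the GO evidence code documentation remains the authoritative list.

```python
# Evidence codes grouped as on the slide. Each set is an illustrative
# subset, not the complete list maintained by the GO Consortium.
EVIDENCE_GROUPS = {
    "experimental": {"EXP", "IDA", "IPI", "IMP", "IGI", "IEP"},
    "computational": {"ISS", "ISO", "ISA", "ISM", "IGC", "RCA"},
    "author_statement": {"TAS", "NAS"},
    "curator_statement": {"IC", "ND"},
    "electronic": {"IEA"},
}


def evidence_group(code):
    """Return the group name for an evidence code, or None if unknown."""
    for group, codes in EVIDENCE_GROUPS.items():
        if code in codes:
            return group
    return None


def keep_by_evidence(annotations, allowed_groups):
    """Keep (gene, term, code) tuples whose code is in an allowed group."""
    return [a for a in annotations if evidence_group(a[2]) in allowed_groups]
```

Restricting an analysis to the `experimental` group, for example, drops electronically inferred (IEA) annotations, which are not reviewed by a curator.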
  • 11. Different evidence codes can assign a degree of confidence to the assignment
  • 12. Gene Ontology structures all genes according to their biological significance. The GO structure and its terms can be browsed with a browser called AmiGO. QuickGO from EBI offers some nice visualisations. There is an excellent GO wiki for all your questions.
  • 13. GO can be used to retrieve all genes (gene products) related to one specific term. You can search broadly: e.g. an AmiGO search for 'diabetes' leads to the following GO term http://amigo.geneontology.org/
  • 14. GO can be used to retrieve all genes (gene products) related to one specific term. AmiGO search for 'diabetes'
  • 15. GO can be used to retrieve all genes (gene products) related to one specific term. AmiGO search for 'diabetes'
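
Retrieving the gene products for a term, as on slide 13, usually means collecting those annotated to the term itself and to any of its child terms. A minimal sketch under that assumption follows; the terms, child links and gene names are all hypothetical toy data, not AmiGO output.

```python
# Hypothetical toy data: child links between terms and direct annotations.
CHILDREN = {
    "T:metabolism": ["T:glucose_metabolism", "T:lipid_metabolism"],
    "T:glucose_metabolism": ["T:insulin_response"],
    "T:lipid_metabolism": [],
    "T:insulin_response": [],
}
DIRECT_ANNOTATIONS = {
    "T:metabolism": {"geneA"},
    "T:glucose_metabolism": {"geneB"},
    "T:lipid_metabolism": {"geneC"},
    "T:insulin_response": {"geneB", "geneD"},
}


def gene_products_for(term):
    """Collect genes annotated to `term` or to any of its descendants."""
    genes = set()
    visited = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current in visited:
            continue
        visited.add(current)
        genes |= DIRECT_ANNOTATIONS.get(current, set())
        stack.extend(CHILDREN.get(current, []))
    return genes
```

Searching a broad term therefore returns more gene products than searching one of its specific children, which matches the 'search broadly' advice on the slide.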
  • 16. GO is also useful to analyze and compare different gene lists. Many GO tools are listed on the GO website. http://www.geneontology.org/GO.tools.shtml
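
The slides do not say how gene lists are compared. One common approach, used here as an assumption rather than as the method of any particular tool, is an over-representation test based on the hypergeometric distribution: given a population of genes of which some carry a GO term, how likely is it to see at least the observed number of annotated genes in a list of a given size? A minimal sketch using only the standard library:

```python
from math import comb


def enrichment_p_value(population, annotated, sample, hits):
    """One-sided hypergeometric tail probability P(X >= hits).

    population: number of genes in the background set
    annotated:  background genes carrying the GO term
    sample:     size of the gene list under study
    hits:       genes in the list carrying the GO term
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total
```

In a real analysis this test is run for many terms at once, so the resulting p-values would need a multiple-testing correction, which this sketch leaves out.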
  • 17. Some things to know about GO. For analyses, one can use reduced GO sets, the so-called GO slims: GO slims are subsets of the more biologically relevant GO terms (available per species). GO ontologies can be downloaded in .obo format. Not all information is captured by GO; some needs to be retrieved from other databases: metabolic pathways (KEGG, …), phenotypes/diseases. Mapping files exist, e.g. kegg2go. http://www.geneontology.org/GO.slims.shtml
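
To give an idea of what the .obo download mentioned on slide 17 looks like, here is a minimal sketch of a reader for `[Term]` stanzas. It handles only the `id`, `name`, `namespace` and `is_a` tags and is not a full OBO parser. The sample text uses hypothetical `EX:` identifiers rather than real GO terms.

```python
SAMPLE_OBO = """\
format-version: 1.2

[Term]
id: EX:0000001
name: example root process
namespace: biological_process

[Term]
id: EX:0000002
name: example child process
namespace: biological_process
is_a: EX:0000001 ! example root process

[Typedef]
id: part_of
name: part of
"""


def parse_obo_terms(text):
    """Collect id, name, namespace and is_a parents from [Term] stanzas."""
    terms = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            # A new stanza begins; only [Term] stanzas are kept.
            current = {"is_a": []} if line == "[Term]" else None
        elif current is not None and ": " in line:
            key, value = line.split(": ", 1)
            value = value.split(" ! ")[0].strip()  # drop the trailing comment
            if key == "is_a":
                current["is_a"].append(value)
            elif key in ("id", "name", "namespace"):
                current[key] = value
                if key == "id":
                    terms[value] = current
    return terms
```

The `is_a` lists recovered this way are the parent links of the directed acyclic graph, so they can feed directly into a traversal over the ontology.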
  • 18. Biological pathway databases organise genes by molecular reactions. Three important databases on biological pathways: http://www.kegg.jp/ ; http://www.reactome.org/ (EBI) ; http://metacyc.org
  • 19. Proteins with enzymatic function receive an Enzyme Commission (EC) number http://www.chem.qmul.ac.uk/iubmb/enzyme/ EC 1 Oxidoreductases; EC 2 Transferases; EC 3 Hydrolases; EC 4 Lyases; EC 5 Isomerases; EC 6 Ligases
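
The first digit of an EC number gives the top-level enzyme class listed on slide 19. The lookup below is a small sketch of that idea; the function name is made up for illustration, and the table contains only the six classes shown on the slide (a seventh class, translocases, was added to the EC system in 2018, after these slides were written).

```python
# Top-level EC classes as listed on the slide.
EC_CLASSES = {
    1: "Oxidoreductases",
    2: "Transferases",
    3: "Hydrolases",
    4: "Lyases",
    5: "Isomerases",
    6: "Ligases",
}


def ec_class(ec_number):
    """Return the top-level class name for an EC number such as '1.1.1.1'."""
    text = ec_number.strip()
    if text.upper().startswith("EC"):
        text = text[2:].strip()  # accept both '1.1.1.1' and 'EC 1.1.1.1'
    first_digit = int(text.split(".")[0])
    return EC_CLASSES.get(first_digit, "unknown")
```

For example, `ec_class("1.1.1.1")` (alcohol dehydrogenase) falls under the oxidoreductases.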
  • 20. The IntAct database contains interaction information on proteins http://www.ebi.ac.uk/intact Three types of interactions are stored: protein-protein; protein-DNA; protein-small molecule
  • 21. IntAct database represents all interactions as binary: caution!
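
One reason for the caution on slide 21: experiments such as co-purification identify a group of proteins, and storing that group as pairs requires an expansion model. The sketch below contrasts two commonly described models, spoke and matrix, on hypothetical protein names. Which model applies to a given record is not stated in the slides and should be checked in the record itself.

```python
from itertools import combinations


def spoke_expansion(bait, preys):
    """Pair the bait with every prey; preys are not paired with each other."""
    return [(bait, prey) for prey in preys]


def matrix_expansion(members):
    """Pair every complex member with every other member."""
    return list(combinations(sorted(members), 2))


# Hypothetical complex: bait 'A' pulled down together with 'B', 'C' and 'D'.
spoke_pairs = spoke_expansion("A", ["B", "C", "D"])
matrix_pairs = matrix_expansion(["A", "B", "C", "D"])
```

Here the spoke model yields three pairs and the matrix model six, so the same experiment produces different binary networks. A listed pair therefore does not by itself show that the two proteins touch each other directly.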
  • 22. Interaction networks can be analysed on your computer using Cytoscape Cytoscape training material on the BITS website
  • 24. PDB hosts 3-dimensional structural data on molecules. PDB = Protein Data Bank http://www.pdb.org/pdb/home/home.do Only structures resolved through NMR and X-ray (or other accurate techniques): proteins; DNA; RNA; ligands. Understanding PDB data: tutorial
  • 25. PDB files can be read by many different tools to display the structure. Every entry in PDB has its own four-character PDB accession number (often one digit followed by three letters). The PDB file contains the 3D coordinates of every single atom in the structure, together with the variability of that position (the last two numeric columns of each atom record) http://www.bits.vib.be/index.php?option=com_content&view=article&id=17203817:protein-structure-
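
To make the per-atom coordinates on slide 25 concrete, the sketch below reads one `ATOM` record using the fixed column layout of the legacy PDB text format. The sample record is assembled from padded pieces so that the column widths stay visible; its coordinate values are made up for illustration. The two fields after the coordinates are the occupancy and the temperature factor (B-factor).

```python
# One ATOM record built from fixed-width pieces (values are invented).
SAMPLE_ATOM = (
    "ATOM  " "    1" " " " N  " " " "MET" " " "A" "   1" " " "   "
    "  27.340" "  24.430" "   2.614" "  1.00" "  9.67" "          " " N"
)


def parse_atom_line(line):
    """Parse one ATOM/HETATM record using the fixed column positions."""
    if not line.startswith(("ATOM", "HETATM")):
        raise ValueError("not an ATOM/HETATM record")
    return {
        "serial": int(line[6:11]),
        "name": line[12:16].strip(),
        "residue": line[17:20].strip(),
        "chain": line[21],
        "residue_number": int(line[22:26]),
        "x": float(line[30:38]),
        "y": float(line[38:46]),
        "z": float(line[46:54]),
        "occupancy": float(line[54:60]),
        "b_factor": float(line[60:66]),
    }
```

Slicing by column rather than splitting on whitespace matters because adjacent fields in real files can run together without a separating space.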
  • 26. PDB files can be read by a lot of different tools to display the structure Tools to visualize (and some to analyze structures) (see BITS wiki) http://www.bits.vib.be/wiki/index.php/Protein_structure
  • 27. Finding a structure for your protein sequence means searching for similarity. Homology modeling, i.e. similarity at the sequence level projected onto a structure: BLAST your query against the PDB database with cblast, or at ExPASy; PSI-BLAST can detect sequences with similar structures (twilight zone!); if still no success, 3D-Jury (a meta approach, including fold recognition and local structure prediction). Similarity at the structural level, i.e. aligning structures: VAST (structure); DALI (Distance mAtrix aLIgnment). BITS training on protein structure analysis http://www.ii.uib.no/~slars/bioinfocourse/PDFs/structpred_tutorial.pdf Tools at EBI http://consurf.tau.ac.il/pe/protexpl/psbiores.htm
  • 28. Structural information is used to classify proteins. Database cross-references in a PDB entry: SCOP groups proteins based on evolutionary, domain architecture and structural information; CATH is a manually curated classification of protein domains. http://scop.mrc-lmb.cam.ac.uk/scop/ http://www.cathdb.info/
  • 29. dbSNP is a public-domain archive for simple genetic polymorphisms. Single Nucleotide Polymorphism database (NCBI). Each dbSNP entry has a code rsxx (RefSNP) or ssxx (submitted SNP). It contains: single-base nucleotide substitutions (also known as single nucleotide polymorphisms or SNPs); small-scale multi-base deletions or insertions (also called deletion insertion polymorphisms or DIPs); retroposable element insertions and microsatellite repeat variations (also called short tandem repeats or STRs). Synchronized with new genome builds.
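
The two identifier prefixes on slide 29 can be told apart with a small helper, sketched below. The function name and the example identifiers are hypothetical; only the `rs` and `ss` prefixes come from the slide.

```python
import re

_SNP_ID = re.compile(r"^(rs|ss)(\d+)$", re.IGNORECASE)


def classify_snp_id(identifier):
    """Return (kind, number) for an rs/ss identifier, or None if malformed."""
    match = _SNP_ID.match(identifier.strip())
    if match is None:
        return None
    kind = "RefSNP" if match.group(1).lower() == "rs" else "submitted SNP"
    return kind, int(match.group(2))
```

A check like this is useful when cleaning variant lists that mix reference and submitter identifiers, or that contain free-text entries.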
  • 30. Expression data can be sequence-based or hybridisation-based. Sequence-based (ESTs, RNA-seq, SAGE): digital gene expression/northern. Hybridisation-based (microarray databases): GEO, the Gene Expression Omnibus (NCBI), with platforms (GPLxxxxxxx), experiments (GSExxxxxx, = several samples), samples (GSMxxxxxxxx) and some curated experiments (GDSxxxxx, online analysis possible); ArrayExpress (EBI)
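
The GEO accession prefixes on slide 30 indicate the record type. A minimal sketch of that mapping follows; the function name and the accession numbers used to try it out are hypothetical.

```python
import re

# Record types by accession prefix, as listed on the slide.
GEO_PREFIXES = {
    "GPL": "platform",
    "GSE": "experiment (series)",
    "GSM": "sample",
    "GDS": "curated dataset",
}


def geo_record_type(accession):
    """Return the GEO record type for an accession, or None if unrecognised."""
    match = re.fullmatch(r"(GPL|GSE|GSM|GDS)(\d+)", accession.strip().upper())
    return GEO_PREFIXES[match.group(1)] if match else None
```

Since one experiment (GSE) groups several samples (GSM) measured on a platform (GPL), knowing the record type tells you what kind of data to expect behind an accession.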
  • 31. Example of expression data at GEO
  • 32. Example of expression data at GEO
  • 33. Example of expression data at GEO
  • 36. Entrez interconnects the databases at NCBI for easy querying. UniGene: sequences grouped by gene; PopSet: sequence alignments for population studies and phylogeny; Structure: 3D structures (PDB); Genome: genomic maps of chromosomes and plasmids; UniSTS (Sequence Tagged Sites); PubMed: literature abstracts (MEDLINE, …); OMIM (Online Mendelian Inheritance in Man): literature reviews; MeSH (Medical Subject Headings): keywords; Taxonomy
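
Besides the web interface, NCBI offers programmatic access to Entrez through the E-utilities, which the slides do not cover. As a hedged sketch, the helper below only builds a search URL for the `esearch` endpoint and does not send any request. The function name, the default result count and the example query are illustrative choices.

```python
from urllib.parse import urlencode

EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def esearch_url(database, term, max_results=20):
    """Build (but do not send) an E-utilities esearch URL."""
    params = {
        "db": database,          # Entrez database name, e.g. 'gene' or 'pubmed'
        "term": term,            # free-text query
        "retmax": max_results,   # maximum number of IDs to return
        "retmode": "json",       # ask for a JSON response
    }
    return EUTILS_ESEARCH + "?" + urlencode(params)
```

Fetching the URL returns a list of record identifiers for the chosen database. NCBI's usage guidelines on request rates apply to anything beyond occasional queries.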
  • 38. Summary of the most important links to discover everything you need ... Protein data: InterPro (heavily integrated with EBI resources) http://www.interpro.org Gene data: Entrez at NCBI, 'Entrez Gene' http://www.ncbi.nlm.nih.gov/Entrez/ ; EB-eye Search at EBI, excellent for cross-species searches http://www.ebi.ac.uk/ebisearch/
  • 39. Hold your horses! Phew, where do I fit all of this?
  • 40. Bioinformatics is all about different data, as versatile as life itself Due to the strong cross-references between different databases, new databases and relevant info are rapidly integrated in existing databases. You can discover them by taking time to read the entries.
  • 41. New tools are emerging every day to enable you to browse all data sources... BioGPS, all in one window!
  • 42. New tools are emerging every day to enable you to browse all data sources...
  • 43. Integrative resources are increasingly being organised on a species basis. EMAGE: database of in situ gene expression in mouse; OMIM: database of diseases in man. Websites providing an interface to integrate all this data are increasingly important, and are often organized on a species basis: TAIR; FlyBase; WormBase
  • 44. Organizing biological information by species. By species, why? There is one biological information resource which stays more or less unchanged per species ...

Editor's notes

  1. 'translation', whereas another uses the phrase 'protein synthesis',
  2. GO hierarchy can be downloaded (obo format). GO Slim: selection of categories
  3. Different types: ribbon, cartoon, ball and stick, space filling