Microarray Dataset: quick mining
and gene profile analysis using
online tools
Dr. Etienne Z. GNIMPIEBA
Sioux Falls, March 2013
Etienne.gnimpieba@usd.edu
Plan
• Gene expression measurement
• Microarray process
• Gene expression data stores
• Data mining / querying
• Data analysis
• Example: ATP13A2 profile in stress conditions
Gene expression
measurement
Higher-plex techniques:
SAGE
DNA microarray
Tiling array
RNA-Seq
NGS
Low-to-mid-plex techniques:
Reporter gene
Northern blot
Western blot
Fluorescent in situ
hybridization
Reverse transcription PCR
What is a Microarray?
“A DNA microarray is a multiplex technology
consisting of thousands of oligonucleotide
spots, each containing picomoles of a
specific DNA sequence.”
• Used to quantitate mRNA or DNA
• Many applications:
◦ mRNA or DNA levels
◦ SNP identification
◦ ChIP-on-Chip
Hypotheses
• Microarrays are usually hypothesis-generating:
◦ They highlight specific genes or features that are particularly interesting for follow-up experiments
◦ There are many interesting exceptions
▪ Biomarkers
▪ Pathway analyses
• This does not reduce the importance of experimental design
◦ The low statistical power of array studies makes good design even more important and very challenging
Microarray process (1/3)
• Image analysis (GenePix)
• Normalization (R)
• Pre-treatment
• Differential expression
• Clustering
• Data mining
• Annotation
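As a rough illustration of the normalization step, the sketch below log2-transforms each array and subtracts its median so that arrays share a common center. This is only one simple option and is not necessarily the method used in the talk (the slide only says the step is done in R); the function name `log2_median_center` and the toy intensities are made up for the example.

```python
import math
from statistics import median

def log2_median_center(arrays):
    """Log2-transform each array and subtract its own median.

    `arrays` maps an array name to a list of positive raw intensities.
    """
    normalized = {}
    for name, intensities in arrays.items():
        logged = [math.log2(v) for v in intensities]  # requires v > 0
        center = median(logged)
        normalized[name] = [x - center for x in logged]
    return normalized

# Toy intensities, invented for illustration: array_B is uniformly 4x brighter.
raw = {
    "array_A": [100.0, 200.0, 400.0],
    "array_B": [400.0, 800.0, 1600.0],
}
norm = log2_median_center(raw)
print(norm)
```

After centering, the two toy arrays have the same profile, because the global 4x brightness difference is a constant shift on the log2 scale.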
Microarray process (2/3)
Microarray process (3/3)
High-density filters (macroarrays)
Size: 12 cm x 8 cm
• 2,400 clones per membrane
• Radioactive labelling
• 1 experimental condition per membrane

Glass slides (microarrays)
Size: 5.4 cm x 0.9 cm
• 10,000 clones per slide
• Fluorescent labelling
• 2 experimental conditions per slide

Oligonucleotide chips
Size: 1.28 cm x 1.28 cm
• 300,000 oligonucleotides per slide
• Fluorescent labelling
• 1 experimental condition per slide
Gene expression data
management
Database                             Microarray experiment sets   Sample profiles   Date reported
ArrayExpress at EBI                  24,838                       708,914           October 28, 2011
ArrayTrack™                          1,622                        50,953            February 11, 2012
caArray at NCI                       41                           1,741             November 15, 2006
Gene Expression Omnibus - NCBI       25,859                       641,770           October 28, 2011
Genevestigator database              2,500                        65,000            January 2012
MUSC database                        ~45                          555               April 1, 2007
Stanford Microarray database         82,542                       Not reported      October 23, 2011
UNC Microarray database              ~31                          2,093             April 1, 2007
UNC modENCODE Microarray database    ~6                           180               July 17, 2009
UPenn RAD database                   ~100                         ~2,500            September 1, 2007
UPSC-BASE                            ~100                         Not reported      November 15, 2007
SAGE
GEO
GUDMAP (421)
MGI
BIOGPS
Data mining / querying
• Problem specification
• Query
• Extraction
• Storage
• Load
• Pretreat / prepare for analysis
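The query step can be sketched as building a search URL for the GEO DataSets database through the NCBI E-utilities `esearch` endpoint. This is a minimal sketch that only constructs the URL and does not contact the server; the helper name `build_geo_query_url` and the search term are illustrative assumptions, not part of the original slides.

```python
from urllib.parse import urlencode

# NCBI E-utilities search endpoint; "gds" is the GEO DataSets database.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def build_geo_query_url(gene, keyword, retmax=20):
    """Return an esearch URL for GEO DataSets records matching gene AND keyword."""
    params = {
        "db": "gds",
        "term": f"{gene} AND {keyword}",
        "retmax": retmax,      # maximum number of record IDs to return
        "retmode": "json",
    }
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"

url = build_geo_query_url("ATP13A2", "stress")
print(url)
```

Fetching the URL (for example with `urllib.request.urlopen`) would return matching record IDs, which would then feed the extraction, storage, and load steps.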
Data analysis (1/3)
• Question-Answer
◦ Experimental condition profile: group comparison
◦ Annotation profile: biological systems involved
◦ Clustering profile: co-regulation
◦ Time course profile: time variation
◦ …
• Descriptive
◦ Boxplot (SD, mean, median, …)
◦ Scatter plot
• Predictive / inference (clustering)
• Modeling (machine learning, simulation)
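The descriptive statistics behind a boxplot can be computed directly. The sketch below is an illustration only: the function name `describe` and the toy log2 expression values are assumptions, and the quartiles use the "inclusive" method of Python's `statistics.quantiles`, which may differ slightly from the convention a given plotting tool applies.

```python
from statistics import mean, median, quantiles, stdev

def describe(values):
    """Return the summary statistics usually shown in or beside a boxplot."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "q1": q1,
        "median": median(values),
        "q3": q3,
        "max": max(values),
        "mean": mean(values),
        "sd": stdev(values),  # sample standard deviation
    }

# Toy log2 expression values for one sample, invented for illustration.
summary = describe([5.0, 6.0, 7.0, 8.0, 9.0])
print(summary)
```

Comparing these summaries across samples is a quick check of whether the arrays in a dataset have similar distributions before any group comparison.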
Data analysis (2/3)
• Questions
◦ What is the right dataset (experimental condition)?
◦ Is the dataset ready for analysis (quality)?
◦ What is the expression profile for a given gene?
◦ Is there significant differential expression in the group comparison?
• Tools
◦ ArrayExpress (EBI)
◦ Boxplot
◦ GEO2R (limma, profile graph, …)
◦ …
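To make the group-comparison question concrete, the sketch below computes a log2 fold change and a Welch t-statistic for one gene across two groups. It is a simplified stand-in and not what GEO2R does: GEO2R relies on limma's moderated statistics, and no p-value is computed here. The function name `welch_t` and the expression values are invented for illustration.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(group_a, group_b):
    """Return (difference of means, Welch t-statistic, approximate degrees of freedom).

    Values are assumed to be on the log2 scale, so the difference of means
    is the log2 fold change of group_b relative to group_a.
    """
    na, nb = len(group_a), len(group_b)
    va, vb = variance(group_a), variance(group_b)  # sample variances
    se2 = va / na + vb / nb                        # squared standard error
    diff = mean(group_b) - mean(group_a)
    t = diff / sqrt(se2)
    # Welch-Satterthwaite approximation of the degrees of freedom
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return diff, t, df

# Toy log2 expression values for a single gene, invented for illustration.
control = [7.0, 7.2, 6.8, 7.0]
stress = [8.0, 8.2, 7.8, 8.0]
log2_fc, t_stat, dof = welch_t(control, stress)
print(log2_fc, t_stat, dof)
```

With only a few samples per group, a per-gene test like this has low power, which is one reason array analyses usually borrow information across genes, as limma does.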
Data analysis (3/3)
Boxplot
Example: ATP13A2 profile in stress
conditions
• Specification: ATP13A2 profile in stress conditions
• Data querying:
◦ GEO
◦ ArrayExpress
◦ Gene Atlas
• Data analysis:
◦ Online: GEO2R, Genospace, …
◦ Desktop: R, ArrayTrack, …
Significant differential expression
!!!
Kerry Bemis slides


Editor's notes

  1. I cannot claim to be a statistician in 20 minutes. I will just give you a few elements for a rapid analysis of microarray data.
  2. The following experimental techniques are used to measure gene expression and are listed in roughly chronological order, starting with the older, more established technologies. They are divided into two groups based on their degree of multiplexity.
  3. ArrayTrack™ provides an integrated solution for managing, analyzing, and interpreting microarray gene expression data. Specifically, ArrayTrack™ is MIAME (Minimum Information About A Microarray Experiment)-supportive for storing both microarray data and experiment parameters associated with a pharmacogenomics or toxicogenomics study. Many statistical and visualization tools are available with ArrayTrack™ which provides a rich collection of functional information about genes, proteins, and pathways for biological interpretation. The primary emphasis of ArrayTrack™ is the direct linking of analysis results with functional information to facilitate the interaction between the choice of analysis methods and the biological relevance of analysis results. Using ArrayTrack™, users can easily select a statistical method applied to stored microarray data to determine a list of differentially expressed genes. The gene list can then be directly linked to pathways and gene ontology for functional analysis.
  4. Boxplots are useful for determining where the majority of the data lies.