Session ii g1 overview genomics and gene expression mmc-good
1. Microarray Dataset: quick mining
and gene profile analysis using
online tools
Dr. Etienne Z. GNIMPIEBA
Sioux Falls, March 2013
Etienne.gnimpieba@usd.edu
2. Plan
Gene expression measurement
Microarray process
Gene expression data stores
Data mining / querying
Data analysis
Example: ATP13A2 profile in stress
conditions
4. What is a Microarray?
“A DNA microarray is a multiplex technology
consisting of thousands of oligonucleotide
spots, each containing picomoles of a
specific DNA sequence.”
Used to quantitate mRNA or DNA
Many applications:
◦ mRNA or DNA levels
◦ SNP identification
◦ ChIP-on-Chip
5. Hypotheses
Microarrays are usually hypothesis-generating:
◦ They highlight specific genes or features that are
particularly interesting for follow-up experiments
◦ There are many interesting exceptions
Biomarkers
Pathway analyses
This does not reduce the importance of
experimental design
◦ the low statistical power of array studies makes good
design even more important and very challenging
8. Microarray process (3/3)
High-density filters (macroarrays)
Detail: size 12 cm x 8 cm
• 2,400 clones per membrane
• radioactive labelling
• 1 experimental condition per membrane
Glass slides (microarrays)
Detail: size 5.4 cm x 0.9 cm
• 10,000 clones per slide
• fluorescent labelling
• 2 experimental conditions per slide
Oligonucleotide chips
Detail: size 1.28 cm x 1.28 cm
• 300,000 oligonucleotides per slide
• fluorescent labelling
• 1 experimental condition per slide
9. Gene expression data management
Database                            Microarray Experiment Sets   Sample Profiles   Date Reported
ArrayExpress at EBI                 24,838                       708,914           October 28, 2011
ArrayTrack™                         1,622                        50,953            February 11, 2012
caArray at NCI                      41                           1,741             November 15, 2006
Gene Expression Omnibus - NCBI      25,859                       641,770           October 28, 2011
Genevestigator database             2,500                        65,000            January 2012
MUSC database                       ~45                          555               April 1, 2007
Stanford Microarray database        82,542                       Not reported      October 23, 2011
UNC Microarray database             ~31                          2,093             April 1, 2007
UNC modENCODE Microarray database   ~6                           180               July 17, 2009
UPenn RAD database                  ~100                         ~2,500            September 1, 2007
UPSC-BASE                           ~100                         Not reported      November 15, 2007
SAGE
GEO
GUDMAP (421)
MGI
BIOGPS
10. Data mining / querying
Problem specification
Query
Extraction
Storage
Load
Pretreat / prepare for analysis
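The steps listed on this slide can be sketched as a small pipeline. This is only a minimal illustration, not the workflow of any particular repository: the `REPOSITORY` dictionary, the dataset ids, the intensity values and all function names are hypothetical and invented for the example, and the log2 transform is just one common choice of pretreatment.

```python
import csv
import io
import math

# Hypothetical in-memory "repository" with made-up toy intensities:
# dataset id -> (experimental condition, {gene: raw intensities}).
REPOSITORY = {
    "DS1": ("stress", {"ATP13A2": [120.0, 150.0], "ACTB": [900.0, 0.0]}),
    "DS2": ("control", {"ATP13A2": [80.0, 95.0], "ACTB": [880.0, 910.0]}),
}

def query(condition):
    """Query: return the ids of datasets matching an experimental condition."""
    return [ds_id for ds_id, (cond, _) in REPOSITORY.items() if cond == condition]

def extract(ds_id, gene):
    """Extraction: pull the raw intensities of one gene from one dataset."""
    return REPOSITORY[ds_id][1][gene]

def store(ds_id, gene, values):
    """Storage: serialise the extracted values as CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["dataset", "gene", "sample", "intensity"])
    for i, value in enumerate(values, start=1):
        writer.writerow([ds_id, gene, i, value])
    return buf.getvalue()

def load(csv_text):
    """Load: read the stored CSV text back into a list of intensities."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [float(row["intensity"]) for row in reader]

def pretreat(values):
    """Pretreat: drop non-positive intensities, then log2-transform the rest."""
    return [math.log2(v) for v in values if v > 0]

if __name__ == "__main__":
    for ds_id in query("stress"):
        stored = store(ds_id, "ATP13A2", extract(ds_id, "ATP13A2"))
        print(ds_id, pretreat(load(stored)))
```

Keeping each step as its own function mirrors the slide: the problem specification decides what is passed to `query`, and everything after `load` works on local data only.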
12. Data analysis (2/3)
3 Questions
◦ What is the right dataset (experimental condition)?
◦ Is the dataset ready for analysis (quality)?
◦ What is the expression profile for a given gene?
◦ Is there significant differential expression between
the compared groups?
Tools
◦ ArrayExpress (EBI)
◦ Boxplot
◦ GEO2R (limma, profile graph)
◦ ….
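To make the group-comparison question concrete, here is a minimal sketch of a per-gene comparison between two groups of samples. It is a deliberate simplification: it uses a plain Welch t-statistic from the Python standard library, not the moderated statistics that limma computes behind GEO2R, and it assumes the expression values have already been log2-transformed. The function names are hypothetical.

```python
import math
import statistics

def log2_fold_change(group_a, group_b):
    """Difference of mean log2 expression between two groups.

    Assumes the inputs are already log2-transformed, so the difference
    of means is the log2 fold change of group_a over group_b.
    """
    return statistics.mean(group_a) - statistics.mean(group_b)

def welch_t(group_a, group_b):
    """Welch t-statistic for two groups with possibly unequal variances."""
    var_a = statistics.variance(group_a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(group_b)
    standard_error = math.sqrt(var_a / len(group_a) + var_b / len(group_b))
    return log2_fold_change(group_a, group_b) / standard_error

if __name__ == "__main__":
    # Toy log2 values for one gene, invented purely for illustration.
    stress = [3.0, 4.0, 5.0]
    control = [1.0, 2.0, 3.0]
    print(log2_fold_change(stress, control), welch_t(stress, control))
```

With only a few samples per group the variance estimates are unstable, which is one reason array tools favour moderated statistics over this plain version.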
I cannot turn you into a statistician in 20 minutes. I will just give you a few elements for a rapid analysis of microarray data.
The following experimental techniques are used to measure gene expression and are listed in roughly chronological order, starting with the older, more established technologies. They are divided into two groups based on their degree of multiplexity.
ArrayTrack™ provides an integrated solution for managing, analyzing, and interpreting microarray gene expression data. Specifically, ArrayTrack™ is MIAME (Minimum Information About A Microarray Experiment)-supportive for storing both microarray data and experiment parameters associated with a pharmacogenomics or toxicogenomics study. Many statistical and visualization tools are available with ArrayTrack™ which provides a rich collection of functional information about genes, proteins, and pathways for biological interpretation. The primary emphasis of ArrayTrack™ is the direct linking of analysis results with functional information to facilitate the interaction between the choice of analysis methods and the biological relevance of analysis results. Using ArrayTrack™, users can easily select a statistical method applied to stored microarray data to determine a list of differentially expressed genes. The gene list can then be directly linked to pathways and gene ontology for functional analysis.
Boxplots are useful for determining where the majority of the data lies.
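The numbers a boxplot draws can be computed directly, which helps when comparing samples without plotting. The sketch below is an illustration only: the function name is hypothetical, and the quartile convention (`method="inclusive"` in `statistics.quantiles`, Python 3.8+) is one of several in use, so plotting tools may report slightly different quartiles.

```python
import statistics

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for one sample.

    The box of a boxplot spans Q1 to Q3, i.e. the middle half of the data.
    """
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)

if __name__ == "__main__":
    # Toy log2 intensities for one sample, invented for illustration.
    print(five_number_summary([1, 2, 3, 4, 5]))
```

Samples whose medians and box heights differ strongly from the others are candidates for a closer quality check before any group comparison.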