Summary slides by Prabhakar Chalise of the Oberg et al. 2012 article "Technical and biological variance structure in mRNA-Seq data:life in the real world" by
1. Technical and biological variance structure in
mRNA-Seq data: life in the real world
Paper by
Ann Oberg, et al.
October 2, 2013
2. Concept
Suppose x is helpful in predicting y.
y = β0 + β1x + ε (1)
ε ∼ N(0, σ²)
No variation, no model
°C = (°F − 32) × 5/9 (2)
3. Concept
Sources of variation in RNA-Seq studies
Technical variation: flow cell, replication in lanes, library
preparation, etc.
Biological variation: person to person
Observed count data: a combination of both types of variation.
4. Concept
Technical variation → Poisson distribution: Var(Y) = µ
Total variation → over-dispersion: Var(Y) > µ
Within-sample variation ∼ Poisson distribution
Between-sample variation ∼ Gamma distribution
Together these give rise to the Negative Binomial distribution
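The Poisson-Gamma construction can be checked by simulation. The sketch below is illustrative only: the values µ = 10 and φ = 0.5, the function names, and the Gamma parameterization (shape 1/φ, scale µφ, chosen so that the mixture has variance µ + φµ²) are assumptions and do not come from the paper.

```python
import math
import random
import statistics


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) value with Knuth's multiplication method.

    Adequate for the small rates used here; not meant for large lam.
    """
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def sample_gamma_poisson(mu, phi, n, seed=0):
    """Simulate n counts: a Gamma-distributed rate per sample, then Poisson noise."""
    rng = random.Random(seed)
    shape = 1.0 / phi   # assumed parameterization: Gamma mean = shape * scale = mu
    scale = mu * phi    # Gamma variance = shape * scale**2 = phi * mu**2
    counts = []
    for _ in range(n):
        rate = rng.gammavariate(shape, scale)    # between-sample variation
        counts.append(sample_poisson(rate, rng)) # within-sample variation
    return counts


# Illustrative values only (not from the paper).
counts = sample_gamma_poisson(mu=10.0, phi=0.5, n=20000)
mean = statistics.fmean(counts)
var = statistics.variance(counts)
print(f"sample mean = {mean:.2f}, sample variance = {var:.2f}")
print(f"Negative Binomial variance mu + phi*mu^2 = {10.0 + 0.5 * 10.0**2:.2f}")
```

The sample variance should land near µ + φµ² rather than near µ, which is the over-dispersion the slide describes.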
5. Purpose of the paper
Describe the mean-variance relationship in mRNA-Seq data
1. Var(Y) = µ: Poisson
2. Var(Y) = kµ: Overdispersed Poisson (OD)
3. Var(Y) = µ + φµ²: Negative Binomial distribution
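To see how the three mean-variance relationships diverge as the mean grows, the short sketch below evaluates each one. The function names and the values k = 2 and φ = 0.1 are hypothetical, chosen only for illustration.

```python
def poisson_var(mu):
    """Poisson: variance equals the mean."""
    return mu


def od_poisson_var(mu, k):
    """Overdispersed Poisson: variance is a constant multiple of the mean."""
    return k * mu


def nb_var(mu, phi):
    """Negative Binomial: variance grows quadratically with the mean."""
    return mu + phi * mu ** 2


# k and phi below are illustrative assumptions, not estimates from the paper.
for mu in (1, 10, 100, 1000):
    print(mu, poisson_var(mu), od_poisson_var(mu, k=2.0), nb_var(mu, phi=0.1))
```

At small means the three models are hard to tell apart; at large means the quadratic term φµ² dominates the Negative Binomial variance.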
6. Purpose of the paper
Estimating φ is a crucial step
1. Per gene: glm.nb function in the MASS package
2. Local: empirical Bayes estimate that shrinks the per-gene estimate
towards the global estimate (edgeR)
3. Global: quantile-adjusted conditional maximum likelihood
(edgeR)
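As a rough intuition for per-gene, global, and shrunken ("local") dispersion values, the sketch below uses a simple method-of-moments estimate, φ̂ = (S² − x̄) / x̄², obtained by solving Var(Y) = µ + φµ² for φ. This is a swapped-in technique for illustration: it is not what glm.nb or edgeR compute, and the linear shrinkage with a fixed weight, the gene names, and the counts are all hypothetical.

```python
import statistics


def moment_dispersion(counts):
    """Method-of-moments estimate of phi from Var(Y) = mu + phi * mu^2.

    Negative estimates (under-dispersion) are truncated at 0. A gene with
    mean 0 carries no information; returning 0.0 is a convention chosen here.
    """
    m = statistics.fmean(counts)
    if m == 0:
        return 0.0
    s2 = statistics.variance(counts)
    return max(0.0, (s2 - m) / m ** 2)


def shrink_toward_global(phi_gene, phi_global, weight):
    """Illustrative linear shrinkage; NOT edgeR's empirical Bayes procedure."""
    return weight * phi_gene + (1.0 - weight) * phi_global


# Hypothetical counts for two genes across five subjects.
genes = {"geneA": [2, 8, 5, 12, 3], "geneB": [5, 5, 5, 5, 5]}
per_gene = {name: moment_dispersion(c) for name, c in genes.items()}
phi_global = statistics.fmean(list(per_gene.values()))
phi_local = {
    name: shrink_toward_global(phi, phi_global, weight=0.5)
    for name, phi in per_gene.items()
}
print(per_gene, phi_global, phi_local)
```

With only a handful of subjects per gene, per-gene estimates are noisy, which is the motivation for borrowing strength across genes.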
7. Data and Statistical Experimental Design, Figure 1
25 study subjects (all female Caucasians): 12 high and 13 low
antibody responders
13 flow cells, each with 8 lanes: 4 for high response, 4 for low
response
For each response group, two specimens: unstimulated and
stimulated
2 technical replicates each for the unstimulated and stimulated specimens
2 subjects from the high-response group failed, leaving 10 high
and 13 low responders
Only the unstimulated specimens were used, to avoid correlation
8. Figure: 2
9. Statistical Analysis
Models were fit to unstimulated specimens only, to focus on
biological variation.
Counts for the two technical replicates were summed for the
models.
Models were fit with no normalization, or with the total count per
lane-pair or the 75th percentile count per lane-pair as the
normalization constant.
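Summing technical replicates and deriving a 75th percentile normalization constant can be sketched as follows. The counts are made up, the function names are hypothetical, and two choices are assumptions not stated in the slides: the percentile is taken over genes with nonzero counts, and the constant enters a log-link count model as an offset on the log scale.

```python
import math
import statistics


def sum_replicates(rep1, rep2):
    """Add the per-gene counts of two technical replicates (one lane-pair)."""
    return [a + b for a, b in zip(rep1, rep2)]


def upper_quartile(counts):
    """75th percentile of the nonzero counts (zero exclusion is an assumption)."""
    nonzero = [c for c in counts if c > 0]
    return statistics.quantiles(nonzero, n=4, method="inclusive")[2]


# Made-up counts for five genes in one subject.
rep1 = [0, 3, 10, 50, 7]
rep2 = [0, 5, 12, 46, 9]

summed = sum_replicates(rep1, rep2)
total_count = sum(summed)       # total-count normalization constant
q75 = upper_quartile(summed)    # 75th percentile normalization constant
offset = math.log(q75)          # offset for a log-link count model
print(summed, total_count, q75, offset)
```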
10. Technical variation
Representative scatter plot of technical replicate 1 versus technical
replicate 2 for one subject. Spearman correlation was 0.9941 for
this pair.
Figure: Supplementary plot
11. Technical variation
The vertical axis is the difference between the counts in the two
replicates on the log2 scale, and the horizontal axis is the average
of the two counts on the log2 scale.
Figure: Bland-Altman plot (supplementary plot)
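The coordinates of such a Bland-Altman plot can be computed as in the sketch below. The counts are made up and the function name is hypothetical; dropping genes with a zero count in either replicate (where log2 is undefined) is an assumption, since the slides do not say how zeros were handled.

```python
import math


def bland_altman_points(rep1, rep2):
    """Return (average, difference) pairs on the log2 scale, one per gene.

    Genes with a zero count in either replicate are skipped (an assumption).
    """
    points = []
    for a, b in zip(rep1, rep2):
        if a > 0 and b > 0:
            log_a, log_b = math.log2(a), math.log2(b)
            points.append(((log_a + log_b) / 2.0, log_a - log_b))
    return points


# Made-up counts for three genes in two technical replicates.
points = bland_altman_points([8, 4, 0], [2, 4, 5])
for average, difference in points:
    print(f"average = {average:.2f}, difference = {difference:.2f}")
```

Points scattered around a difference of 0 across the whole range of averages indicate good agreement between the two replicates.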
12. Technical variation
QQ plots assuming a Poisson distribution are in the additional files.
Technical variation generally follows a Poisson distribution.
13. Biological variation, Figure 3
A. Plot of mean (x̄) and variance (S²)
B. Local estimates of φ and per-group mean count
Figure: 3
14. Goodness of fit
QQ plots
1. Standard Poisson
2. NB with global estimate of φ
3. NB with per-gene estimate of φ
4. NB with local estimate of φ
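One way to compare these candidate models for a single gene is a goodness-of-fit statistic that scales squared deviations by the model variance. The sketch below uses a Pearson-type statistic as an illustration; the slides do not say which GOF statistic the paper used, so the statistic, the function name, the counts, and the assumption of one common fitted mean for all subjects are hypothetical.

```python
def pearson_gof(counts, mu, phi=0.0):
    """Pearson-type statistic: sum of (y - mu)^2 / Var(Y).

    Var(Y) = mu + phi * mu^2, so phi = 0 gives the Poisson case.
    Assumes a single fitted mean mu for every subject (a simplification).
    """
    variance = mu + phi * mu ** 2
    return sum((y - mu) ** 2 for y in counts) / variance


# Made-up counts for one gene across five subjects; the fitted mean is 6.
counts = [2, 8, 5, 12, 3]
gof_poisson = pearson_gof(counts, mu=6.0)        # Poisson: phi = 0
gof_nb = pearson_gof(counts, mu=6.0, phi=0.5)    # NB with an illustrative phi
print(gof_poisson, gof_nb)
```

A statistic far above its degrees of freedom suggests the model explains too little variance; allowing φ > 0 enlarges the model variance and pulls the statistic down.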
15. Figure: 4
16. Experimental variation
Potential sources of experimental variation examined (by including
experimental factors in the model):
flow cell, lane-pair, and library preparation batch
Figure 5
17. Figure: 5
18. Flow cell: the observed counts were all smaller than the
expected counts.
The reason was a software upgrade midway through the experiment.
The number of reads increased with the software upgrade (Figure 6A).
After the 75th percentile offset was used, there was no clear flow-cell effect.
19. Figure: 6
20. Characterizing genes with poor model fit
Effect of genes with small counts.
1. Smallest GOF statistics: indicative of overfitting
2. Largest GOF statistics: indicative of underfitting (not
explaining enough variance)
Filtering out genes with total counts up to 10,000 had only a minor impact
GOF statistics for genes with an average count < 5 per subject
were distributed throughout the range (Figure 7A).
21. Figure: 7
22. Data records of genes with very small GOF statistics:
1. All zero counts in one response group and nonzero counts in the
other
2. Counts very consistent, with small variance
Data records of genes with very large GOF statistics:
1. The variance is very high. An example of one such gene is in Figure
7B