7. Statistics are scary
cool
We have to deal with
them anyway, so we
had better enjoy them!
Statistics
(You at the end of the talk)
8. Press the
t-test button and
you'll be done!
Did you check
the normality of
your data first?
9. Why should you care about statistics?
http://www.nature.com/nature/authors/gta/2e_Statistical_checklist.pdf
10. Why should you care about statistics?
Advances in Physiology Education
"Explorations in Statistics" series (2008-present)
(Douglas Curran-Everett)
11. Why should you care about statistics?
"Statistical Perspectives" series (2011-present)
(Gordon Drummond)
The Journal of Physiology
Experimental Physiology
The British Journal of Pharmacology
Microcirculation
The British Journal of Nutrition
http://jp.physoc.org/cgi/collection/stats_reporting
12. Why should you care about statistics?
Importance of being uncertain (September 2013)
How samples are used to estimate population statistics and what this means in terms of
uncertainty.
Error Bars (October 2013)
The use of error bars to represent uncertainty and advice on how to interpret them.
Significance, P values and t-tests (November 2013)
Introduction to the concept of statistical significance and the one-sample t-test.
http://blogs.nature.com/methagora/2013/08/giving_statistics_the_attention_it_deserves.html
13. Why should you care about statistics?
"Journals [...] fail to exert sufficient scrutiny over the results
that they publish"
"Nature research journals will introduce editorial measures to
address the problem by improving the consistency and quality of
reporting in life-sciences articles"
"We will examine statistics more closely and encourage authors
to be transparent, for example by including their raw data"
15. A picture is worth a thousand words
John Snow
(1813-1858)
Location of deaths in the 1854 London Cholera Epidemic
16. Why visualize your data?
Anscombe's quartet example
Dataset #1      Dataset #2      Dataset #3      Dataset #4
 x     y         x     y         x     y         x     y
10    8.04      10    9.14      10    7.46       8    6.58
 8    6.95       8    8.14       8    6.77       8    5.76
13    7.58      13    8.74      13   12.74       8    7.71
 9    8.81       9    8.77       9    7.11       8    8.84
11    8.33      11    9.26      11    7.81       8    8.47
14    9.96      14    8.10      14    8.84       8    7.04
 6    7.24       6    6.13       6    6.08       8    5.25
 4    4.26       4    3.10       4    5.39      19   12.50
12   10.84      12    9.13      12    8.15       8    5.56
 7    4.82       7    7.26       7    6.42       8    7.91
 5    5.68       5    4.74       5    5.73       8    6.89
Anscombe, F. J. (1973). "Graphs in Statistical Analysis". American Statistician 27 (1): 17-21
17. Why visualize your data?
Anscombe's quartet example
Property in each case        Value
Mean of x                    9 (exact)
Variance of x                11 (exact)
Mean of y                    7.5
Variance of y                4.122 or 4.127
Correlation of x and y       0.816
Linear regression line       y = 3.00 + 0.500x
Anscombe, F. J. (1973). "Graphs in Statistical Analysis". American Statistician 27 (1): 17-21
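The shared summary statistics can be checked directly. The sketch below recomputes them in plain Python from the Anscombe (1973) values; the names `QUARTET` and `summarize` are illustrative only and are not part of the talk.

```python
from math import sqrt

# Data from Anscombe (1973); datasets 1 to 3 share the same x values.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    1: (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    2: (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    3: (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    4: ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
        [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarize(x, y):
    """Return mean, sample variance, Pearson r and least-squares line for (x, y)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return {
        "mean_x": mean_x,
        "var_x": sxx / (n - 1),   # sample variance (n - 1 denominator)
        "mean_y": mean_y,
        "var_y": syy / (n - 1),
        "r": sxy / sqrt(sxx * syy),
        "slope": slope,
        "intercept": mean_y - slope * mean_x,
    }


for label, (x, y) in QUARTET.items():
    s = summarize(x, y)
    print(f"Dataset #{label}: mean_y={s['mean_y']:.2f} var_y={s['var_y']:.3f} "
          f"r={s['r']:.3f} line: y = {s['intercept']:.2f} + {s['slope']:.3f}x")
```

The four printed lines agree to the precision shown on the slide, which is exactly why the numbers alone cannot tell the datasets apart.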
18. Why visualize your data?
Anscombe's quartet example
[Figure: scatter plots of Dataset #1, Dataset #2, Dataset #3 and Dataset #4]
Anscombe, F. J. (1973). "Graphs in Statistical Analysis". American Statistician 27 (1): 17-21
20. Visualize your data in their raw form!
Aim for revelation rather than mere summary
A great graphic with raw data reveals
unexpected patterns and invites us to
make comparisons we might not have
thought of beforehand.
21. If you are still not convinced ...
Mean: 16 / Stdv: 5
23. If you are still not convinced ...
Mean: 16 / Stdv: 5
Daniel's Journal Club paper
[Figure, panel e: WBM secondary transplantation (16 weeks); y-axis: donor engraftment (%), 0 to 80; P < 0.05; x-axis groups: mH19 DMR/+ genotypes]
24. Avoid making bar graphs
"To maintain the highest level of trustworthiness of data,
we are encouraging authors to display data in their raw
form and not in a fashion that conceals their variance.
Presenting data as columns with error bars (dynamite
plunger plots) conceals data. We recommend that
individual data be presented as dot plots shown next to
the average for the group with appropriate error bars
(Figure 1)."
Rockman H.A. (2012). "Great expectations". J Clin Invest 122 (4): 1133
25. Avoid making bar graphs
Error bars
Different types, different meanings
• descriptive statistics (Range, SD)
• inferential statistics (SE, CI)
[Figure: cartoon bar graph with error bars (y-axis 0 to 100), caption fragment "SORRY, WE JUST ... YOU..."]
Cumming, G. et al. (2007). "Error bars in experimental biology". J Cell Biol 177 (1): 7-11
26. Avoid making bar graphs
Error bars
Different types, different meanings
• descriptive statistics (Range, SD)
• inferential statistics (SE, CI)
Often, they also imply a
symmetrical distribution of the
data.
Cumming, G. et al. (2007). "Error bars in experimental biology". J Cell Biol 177 (1): 7-11
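To make the descriptive/inferential distinction concrete, here is a minimal sketch that computes the three most common error bars for one sample. The data are hypothetical, and the critical value 2.262 is the two-sided 95% Student's t value for 9 degrees of freedom, hard-coded for this sample size only.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical measurements (n = 10); not taken from the talk.
values = [12.1, 15.3, 14.8, 16.0, 13.5, 17.2, 15.9, 14.1, 16.6, 15.0]

n = len(values)
m = mean(values)
sd = stdev(values)      # descriptive: spread of the individual values
se = sd / sqrt(n)       # inferential: uncertainty of the mean
T_CRIT = 2.262          # two-sided 95% t critical value for n - 1 = 9 df
ci = (m - T_CRIT * se, m + T_CRIT * se)   # inferential: 95% CI of the mean

print(f"mean = {m:.2f}")
print(f"SD   = {sd:.2f}  (does not shrink as n grows)")
print(f"SE   = {se:.2f}  (shrinks as n grows)")
print(f"95% CI = [{ci[0]:.2f}, {ci[1]:.2f}]")
```

The same bar drawn on a graph can therefore be several times longer or shorter depending on which of these was plotted, which is why the figure legend should always say which one it is.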
27. Avoid making bar graphs
Mean and standard deviation are only useful in the
context of a "normal distribution"
[Figure: normal curve centred on the mean µ, with the central 95% marked]
About 95% of a normal distribution lies within two
standard deviations (σ) of the mean (µ)
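The "two standard deviations" rule is a rounded figure. A short check with the standard library's `statistics.NormalDist` (Python 3.8+) shows both the exact coverage of ±2σ and the multiplier that gives exactly 95%; the helper name `coverage` is illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1


def coverage(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)


within_2sd = coverage(2.0)           # about 0.9545
z95 = std_normal.inv_cdf(0.975)      # about 1.96, the exact 95% multiplier

print(f"within 2 SD: {within_2sd:.4f}")
print(f"95% multiplier: {z95:.3f}")
```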
28. Avoid making bar graphs
[Figure panels: symmetrical distribution, skewed distribution]
Data presentation to reveal the distribution of the data
• Display data in their raw form.
• A dot plot is a good start.
• "Dynamite plunger plots" conceal data.
• Check the pattern of distribution of the values.
29. Avoid making bar graphs
[Figure panels: symmetrical distribution, skewed distribution]
• First set: Gaussian (or normal) distribution (symmetrically distributed)
• Second set: right skewed, lognormal (few large values)
"This type of distribution of values is quite common in biology (ex: plasma concentrations
of immune or inflammatory mediators)"
"Plunger plots only: who would know that the values were skewed ...
... and that the common statistical tests would be inappropriate?"
30. Avoid making bar graphs
Don't tell me no one warned you before!
Bar graph
Dynamite plunger
31. Summary
Why visualize your data?
For others ...
Providing a narrative for the reader
But primarily for you ...
Looking for patterns and relationships
Summarize complex data structures
Help avoid erroneous conclusions based upon questionable or
unexpected data
37. Is the mean always a good descriptor?
# of children per household in China (2012)
• mean: 1.35
http://www.globalhealthfacts.org/data/topic/map.aspx?ind=87
38. Is the mean always a good descriptor?
# of children per household in China (2012)
• mean: 1.35
• median: 1
more representative of the
"typical" family (one-child policy)
http://www.globalhealthfacts.org/data/topic/map.aspx?ind=87
39. Any measure is wrong!
"Whenever you make a measurement, you must
know the uncertainty otherwise it is meaningless"
Walter Lewin (MIT)
183.3cm
185.7cm
http://www.youtube.com/watch?v=JUxHebuXviM
40. Any measure is wrong!
"Whenever you make a measurement, you must
know the uncertainty otherwise it is meaningless"
Walter Lewin (MIT)
The same concept applies when you
report your data!
Provide the uncertainty of your descriptor
hint: this is NOT the standard deviation
Report the Confidence Interval of your descriptor
42. The Bootstrap: origin
Modern electronic computation has encouraged a host of new statistical methods
that require fewer distributional assumptions than their predecessors and
can be applied to more complicated statistical estimators. These methods allow
[...] to explore and describe data and draw valid statistical inferences without the
usual concerns for mathematical tractability.
Efron B. and Tibshirani R. (1991), Science, Jul 26;253(5018):390-5
43. Computing the bootstrap 95% CI
Original sample A0 (mean m0): a1, a2, a3, a4, a5, ..., an
Calmettes G. et al. (2012), "Making do with what we have: use your bootstrap", J Physiol, 590(15):3403-3406
44. Computing the bootstrap 95% CI
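Original sample A0 (mean m0): a1, a2, a3, a4, a5, ..., an
Pseudo-samples drawn from A0 with replacement, each with its own mean:
A1: a4, a3, a1, a2, a2, a1 -> mA1
A2: a5, a2, an, a1, a3, a5 -> mA2
A3: an, a1, an, a1, a3, a4 -> mA3
A4: a4, a3, an, a5, a1, a3 -> mA4
...
Calmettes G. et al. (2012), "Making do with what we have: use your bootstrap", J Physiol, 590(15):3403-3406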
47. Computing the bootstrap 95% CI
Original sample A0 (mean m0): a1, a2, a3, a4, a5, ..., an
[Diagram: pseudo-samples A1, A2, A3, A4, ... resampled from A0, with means mA1, mA2, mA3, mA4, ...]
5.18 [4.91, 4.47]
Calmettes G. et al. (2012), "Making do with what we have: use your bootstrap", J Physiol, 590(15):3403-3406
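The resampling scheme in the diagram can be sketched in a few lines. This is a minimal percentile-bootstrap illustration, not the code from the cited paper: the function name `bootstrap_ci`, the fixed seed, and the sample values are all assumptions made for the example.

```python
import random
from statistics import mean


def bootstrap_ci(sample, stat=mean, n_boot=10000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for `stat` of `sample`."""
    rng = random.Random(seed)          # fixed seed so the sketch is repeatable
    n = len(sample)
    # Each pseudo-sample has the same size as the original and is drawn
    # with replacement, so some values repeat and others are left out.
    boot = sorted(stat(rng.choices(sample, k=n)) for _ in range(n_boot))
    lower_rank = max(round(n_boot * alpha / 2), 1)      # 250th value
    upper_rank = round(n_boot * (1 - alpha / 2))        # 9750th value
    return boot[lower_rank - 1], boot[upper_rank - 1]


# Hypothetical measurements, for illustration only.
data = [4.2, 5.1, 5.6, 4.8, 5.9, 5.3, 4.7, 5.5, 6.1, 4.9]
low, high = bootstrap_ci(data)
print(f"{mean(data):.2f} [{low:.2f}, {high:.2f}]")
```

Passing a different `stat` (for example `statistics.median`) gives an interval for that descriptor without any change to the procedure, which is the main practical appeal of the method.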
49. Choose your statistical test wisely
Authors Guidelines
Every paper that contains statistical testing should state
[...] a justification for the use of that test (including, for
example, a discussion of the normality of the data when the
test is appropriate only for normal data), [...], whether the
tests were one-tailed or two-tailed, and the actual P value
for each test (not merely "significant" or "P < 0.05").
http://www.nature.com/nature/authors/gta/#a5.6
50. The simple case (How to)
mean/std
135.9 ± 19.0
Female
mean/std
187.0 ± 19.8
Male
51. The simple case (How to)
Distribution of the data?
mean/std
135.9 ± 19.0
Female
mean/std
187.0 ± 19.8
Male
52. The simple case (How to)
Distribution of the data?
difference/ci
51.2 [50.4, 51.9]
mean/std
135.9 ± 19.0
Female
mean/std
187.0 ± 19.8
Male
53. The simple case (How to)
Distribution of the data?
difference/ci
51.2 [50.4, 51.9]
mean/std
135.9 ± 19.0
Female
mean/std
187.0 ± 19.8
Male
• fit of the histogram
55. The simple case (How to)
Distribution of the data?
difference/ci
51.2 [50.4, 51.9]
mean/std
135.9 ± 19.0
Female
mean/std
187.0 ± 19.8
Male
• fit of the histogram
• QQ plot
[Figure: QQ plot (Male); the i-th ordered data point A(i) is plotted against the theoretical quantiles of the distribution]
$\Phi^{-1}\left(\frac{i - 3/8}{n + 1/4}\right)$
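The plotting positions in the formula above can be computed with the standard library alone. The sketch below pairs each ordered data value with its theoretical normal quantile; `theoretical_quantiles` and `qq_points` are illustrative names, and the sample values are hypothetical.

```python
from statistics import NormalDist


def theoretical_quantiles(n):
    """Standard-normal quantiles at positions (i - 3/8) / (n + 1/4), i = 1..n."""
    std_normal = NormalDist()
    return [std_normal.inv_cdf((i - 3 / 8) / (n + 1 / 4)) for i in range(1, n + 1)]


def qq_points(sample):
    """Pair each theoretical quantile with the corresponding ordered data value."""
    ordered = sorted(sample)
    return list(zip(theoretical_quantiles(len(ordered)), ordered))


# Hypothetical sample; points close to a straight line suggest normality.
sample = [187.0, 201.5, 165.2, 190.3, 178.8, 210.1, 172.4]
for q, value in qq_points(sample):
    print(f"{q:+.3f}  {value:.1f}")
```

Plotting the second column against the first gives the QQ plot; systematic curvature away from a straight line is the visual sign of non-normal data.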
56. The simple case (How to)
Distribution of the data?
difference/ci
51.2 [50.4, 51.9]
mean/std
135.9 ± 19.0
Female
mean/std
187.0 ± 19.8
Male
• fit of the histogram
• QQ plot
not "normal"
57. The simple case (How to)
Distribution of the data?
difference/ci
51.2 [50.4, 51.9]
mean/std
135.9 ± 19.0
Female
mean/std
187.0 ± 19.8
Male
• fit of the histogram
• QQ plot
[Figure: QQ plots for Female and Male]
58. The simple case (How to)
Distribution of the data?
difference/ci
51.2 [50.4, 51.9]
mean/std
135.9 ± 19.0
Female
mean/std
187.0 ± 19.8
Male
Visual inspection:
• fit of the histogram
• QQ plot
[Figure: QQ plots for Female and Male]
59. The simple case (How to)
Distribution of the data?
difference/ci
51.2 [50.4, 51.9]
mean/std
135.9 ± 19.0
Female
mean/std
187.0 ± 19.8
Male
Visual inspection:
• fit of the histogram
• QQ plot
Test:
• Shapiro-Wilk test
60. The simple case (How to)
Distribution of the data?
difference/ci
51.2 [50.4, 51.9]
mean/std
135.9 ± 19.0
Female
mean/std
187.0 ± 19.8
Male
Visual inspection:
• fit of the histogram
• QQ plot
Test:
• Shapiro-Wilk test
Null Hypothesis for the SW test:
Data are normally distributed
Female p-value: 0.9195
Male p-value: 0.3866
61. The simple case (How to)
difference/ci
51.2 [50.4, 51.9]
mean/std
135.9 ± 19.0
Female
mean/std
187.0 ± 19.8
Male
Distribution of the data?
Normally distributed
64. The simple case (How to)
difference/ci
51.2 [50.4, 51.9]
mean/std
135.9 ± 19.0
Female
mean/std
187.0 ± 19.8
Male
Distribution of the data?
Normally distributed
Statistical test?
t-test
65. The simple case (How to)
difference/ci
51.2 [50.4, 51.9]
mean/std
135.9 ± 19.0
Female
mean/std
187.0 ± 19.8
Male
Distribution of the data?
Normally distributed
Statistical test?
t-test
Null Hypothesis for the t-test:
Data belong to the same population
t-test
p-value < 2.2e-16
75. Computing the bootstrap p-value
Are the two samples different?
Observed difference = 0.44
If the two samples were from the same population,
what would be the probability that the observed
difference arose from chance alone?
82. Computing the bootstrap p-value
A0: a1, a2, a3, a4, a5, ..., an
B0: b1, b2, b3, b4, b5, ..., bn
D0 = mA - mB (0.44)
Pool the values of A0 and B0, then resample from the pool to build pseudo-samples:
A1: a4, b3, a1, a2, b2, b1 -> mA1
B1: b5, b2, an, b1, a3, a5 -> mB1
D1 = mA1 - mB1
Repeat 10000 times (D1 ... D10000)
How many pseudo-differences are
greater than or equal to the observed
difference D0 (0.44)?
83. Computing the bootstrap p-value
A0: a1, a2, a3, a4, a5, ..., an
B0: b1, b2, b3, b4, b5, ..., bn
D0 = mA - mB (0.44)
[Diagram: pooled values of A0 and B0 are resampled into pseudo-samples A1 and B1, with means mA1 and mB1; D1 = mA1 - mB1]
Repeat 10000 times (D1 ... D10000)
How many pseudo-differences are
greater than or equal to the observed
difference D0 (0.44)?
9829 < D0
171 > D0
84. Computing the bootstrap p-value
A0: a1, a2, a3, a4, a5, ..., an
B0: b1, b2, b3, b4, b5, ..., bn
D0 = mA - mB (0.44)
[Diagram: pooled values of A0 and B0 are resampled into pseudo-samples A1 and B1, with means mA1 and mB1; D1 = mA1 - mB1]
Repeat 10000 times (D1 ... D10000)
How many pseudo-differences are
greater than or equal to the observed
difference D0 (0.44)?
9829 < D0
171 > D0
p = 171 / 10000 = 0.0171 (one-tailed)
85. Computing the bootstrap p-value
A0: a1, a2, a3, a4, a5, ..., an
B0: b1, b2, b3, b4, b5, ..., bn
D0 = mA - mB (0.44)
[Diagram: pooled values of A0 and B0 are resampled into pseudo-samples A1 and B1, with means mA1 and mB1; D1 = mA1 - mB1]
Repeat 10000 times (D1 ... D10000)
How many pseudo-differences are
greater than or equal to the observed
difference D0 (0.44)?
9829 < D0
171 > D0
p = 171 / 10000 = 0.0171 (one-tailed)
MW: p = 0.0169
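The counting procedure on these slides can be written as a short function. This is a sketch under the assumption that pseudo-samples are drawn with replacement from the pooled values, as the repeated elements in the diagram suggest; `bootstrap_p_value`, the seed, and the two example groups are hypothetical.

```python
import random
from statistics import mean


def bootstrap_p_value(a, b, n_boot=10000, seed=0):
    """One-tailed p-value for mean(a) - mean(b) under a pooled-resampling null."""
    rng = random.Random(seed)
    observed = mean(a) - mean(b)           # D0
    pooled = list(a) + list(b)             # null hypothesis: a single population
    count = 0
    for _ in range(n_boot):
        pseudo_a = rng.choices(pooled, k=len(a))   # A1, drawn with replacement
        pseudo_b = rng.choices(pooled, k=len(b))   # B1, drawn with replacement
        if mean(pseudo_a) - mean(pseudo_b) >= observed:
            count += 1                     # pseudo-difference at least as large as D0
    return observed, count / n_boot


# Hypothetical groups, for illustration only.
group_a = [5.1, 5.8, 6.2, 5.5, 6.0, 5.9]
group_b = [5.0, 5.3, 5.6, 5.2, 5.7, 5.4]
d0, p = bootstrap_p_value(group_a, group_b)
print(f"D0 = {d0:.2f}, one-tailed p = {p:.4f}")
```

With 171 of 10000 pseudo-differences at or above D0, this function would return 0.0171, matching the calculation shown on the slide.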
86. Summary
What do my data look like?
Distribution?
• visual inspection (hist. / QQ plot)
• normality test
What do I want to compare?
Right statistical test?
• parametric test
• non-parametric test
• resampling statistics
90. Statistical significance (example)
"The percentage of neurons showing cue-related activity
increased with training in the mutant mice (P<0.05) but
not in the control mice (P>0.05)."
91. Statistical significance (example)
"The percentage of neurons showing cue-related activity
increased with training in the mutant mice (P<0.05) but
not in the control mice (P>0.05)."
Training has a larger effect in the mutant
mice than in the control mice!
93. Statistical significance (example)
"The percentage of neurons showing cue-related activity
increased with training in the mutant mice (P<0.05) but
not in the control mice (P>0.05)."
Extreme scenario:
- training-induced activity barely reaches
significance in mutant mice (e.g., 0.049) and
barely fails to reach significance for control
mice (e.g., 0.051)
[Figure: activity (y-axis) without (-) and with (+) training for control and mutant mice, with a significance star (*)]
Does not test whether the training effect for mutant mice differs
statistically from that for control mice.
94. Statistical significance (example)
"The percentage of neurons showing cue-related activity
increased with training in the mutant mice (P<0.05) but
not in the control mice (P>0.05)."
When making a comparison between two
effects, always report the statistical
significance of their difference rather than
the difference between significance levels.
Nieuwenhuis S. et al. (2011), "Erroneous analyses of interactions in neuroscience: a problem of significance",
Nat Neuroscience, 14(9):1105-1107
95. P-values do not convey information
Mean: 16
SD: 5
Mean: 20
SD: 5
Difference = 4
p-value = 0.1090
96. P-values do not convey information
Mean: 16
SD: 5
Mean: 20
SD: 5
Difference = 4
p-value = 0.1090
0.0367
97. P-values do not convey information
Mean: 16
SD: 5
Mean: 20
SD: 5
Difference = 4
p-value = 0.1090
0.0367
0.0009
98. P-values do not convey information
Fact: Most applied scientists use p-values as a measure of evidence
and of the size of the effect
- The probability of hypotheses depends on much more than just the p-value.
- This topic has renewed importance with the advent of the massive multiple
testing often seen in genomics studies
[Figure: "Manhattan plot"; y-axis: -log10(P), 0 to 8; x-axis ticks: 1 to 20]
Ioannidis JP, (2005) PLoS Med 2(8):e124
100. The P-value is a function of the sample size
Measured Effect Size:
difference = 0.018 mV
[Figure: example traces for Control and Atropine (scale bars 0.5 mV, 100 ms); amplitude (mV, 0 to 0.4) for control (n=6777) and atropine (n=5272)]
Hentschke, H. et al. (2011). "Computation of measures of effect size for neuroscience data sets". Eur J Neurosci. 34(12):1887-94
101. The P-value is a function of the sample size
Measured Effect Size:
difference = 0.018 mV
p = 10^-5
[Figure: example traces for Control and Atropine (scale bars 0.5 mV, 100 ms); amplitude (mV, 0 to 0.4) for control (n=6777) and atropine (n=5272)]
Hentschke, H. et al. (2011). "Computation of measures of effect size for neuroscience data sets". Eur J Neurosci. 34(12):1887-94
102. The P-value is a function of the sample size
[Figure, top panel: P (t-test) versus sample size (10^1 to 10^3); y-axis from 10^0 (not significant) down to 10^-4 (significant)]
[Figure, bottom panel: Hedges' g versus sample size (10^1 to 10^3); y-axis from -0.4 to 0.4; measured difference 0.018 mV]
Hentschke, H. et al. (2011). "Computation of measures of effect size for neuroscience data sets". Eur J Neurosci. 34(12):1887-94
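The dependence of the p-value on sample size can be reproduced with a fixed effect. The sketch below is an illustration only: it uses a large-sample z approximation instead of the t-test from the figure, and the standardized effect of 0.1 and the helper `two_sided_p` are assumptions, not values from the cited study.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_p(effect_size, n_per_group):
    """Two-sided p-value for a standardized mean difference between two
    equal-sized groups, using a normal (z) approximation."""
    z = abs(effect_size) * sqrt(n_per_group / 2)
    return 2 * (1 - NormalDist().cdf(z))


EFFECT = 0.1  # hypothetical standardized difference, held constant
for n in (10, 100, 1000, 10000):
    print(f"n = {n:>5} per group: p = {two_sided_p(EFFECT, n):.3g}")
```

The effect never changes, yet the p-value moves from clearly non-significant to vanishingly small as n grows, so a tiny p-value on its own says nothing about how large the effect is.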
107. Bootstrap effect size and 95% CIs
Do the 95% confidence intervals of
the observed effect size include
zero (no difference)?
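Eff. size = 0.44 [0.042, 0.853]
[Figure: samples A and B; sorted bootstrap effect sizes, with the 250th and 9750th values marking the bounds of the 95% CI]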
109. Statistical vs Biological significance
"The P value reported by tests is a probabilistic significance, not a
biological one."
"Statistical significance suggests but does not imply biological
significance."
Krzywinski M and Altman N (2013) "Points of significance: Significance, P values and t-tests".
Nature Methods 10, 1041-1042
110. Statistical vs Biological significance
Statistical significance has a meaning in a specific context
No change
Small change
Large change
Biological consequences?
111. Statistical vs Biological significance
Stomatogastric ganglion
"Good enough" solutions
[Figure: cells AB, PD, LP and PY (LP 1 versus LP 2); conductances at +15 mV (µS/nF, 0 to 0.60) for Kd, KCa and A-type; mRNA copy number (0 to 1,600) for shab, BK-KC and shal]
Schulz D.J. et al. (2006) "Variable channel expression in identified single and electrically coupled neurons
in different animals". Nat Neurosci. 9: 356-362
112. Statistical vs Biological significance
Madhvani R.V. et al. (2011) "Shaping a new Ca2+ conductance to suppress early afterdepolarizations in
cardiac myocytes". J Physiol 589(Pt 24):6081-92
113. Statistical vs Biological significance
Breast cancer study
Difference in cancer returning between control vs
low-fat diet groups.
Authors' conclusion:
People with low-fat diets had a 25% lower chance of cancer returning
114. Statistical vs Biological significance
Breast cancer study
Difference in cancer returning between control vs
low-fat diet groups.
Authors' conclusion:
People with low-fat diets had a 25% lower chance of cancer returning
Actual return rates:
- control: 12.4%
- low-fat diet: 9.8%
Difference: 2.6 percentage points
2.6 / 9.8 = 26.5%
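The arithmetic behind the headline figure is easy to replay. The sketch below uses the two return rates from the slide; the variable names are illustrative. It also shows the ratio taken against the control rate, which is a common convention for relative reductions and is added here only for comparison.

```python
control_rate = 12.4   # % of patients whose cancer returned, control group
low_fat_rate = 9.8    # % of patients whose cancer returned, low-fat diet group

absolute_difference = control_rate - low_fat_rate              # percentage points
relative_to_low_fat = absolute_difference / low_fat_rate * 100  # ratio used on the slide
relative_to_control = absolute_difference / control_rate * 100  # alternative baseline

print(f"absolute difference: {absolute_difference:.1f} percentage points")
print(f"relative to low-fat rate: {relative_to_low_fat:.1f}%")
print(f"relative to control rate: {relative_to_control:.1f}%")
```

Whichever baseline is chosen, the relative figure is roughly ten times larger than the absolute one, which is why both should be reported.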
115. Beware of false positives
(from the authors)
Bennett C. et al. (2010) "Neural Correlates of Interspecies Perspective Taking in the Post-Mortem Atlantic
Salmon: An Argument For Proper Multiple Comparisons Correction". JSUR, 2010. 1(1):1-5
116. Beware of false positives
Bennett C. et al. (2010) "Neural Correlates of Interspecies Perspective Taking in the Post-Mortem Atlantic
Salmon: An Argument For Proper Multiple Comparisons Correction". JSUR, 2010. 1(1):1-5
117. Beware of false positives
2012
Bennett C. et al. (2010) "Neural Correlates of Interspecies Perspective Taking in the Post-Mortem Atlantic
Salmon: An Argument For Proper Multiple Comparisons Correction". JSUR, 2010. 1(1):1-5
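Why uncorrected multiple testing produces false positives can be shown with one line of probability. The sketch assumes independent tests, which real data rarely satisfy exactly, and uses the Bonferroni correction as one simple example of a correction; the function names are illustrative.

```python
def family_wise_error_rate(alpha, n_tests):
    """Probability of at least one false positive among n independent tests,
    each run at significance level alpha, when every null hypothesis is true."""
    return 1 - (1 - alpha) ** n_tests


def bonferroni_threshold(alpha, n_tests):
    """Per-test threshold that keeps the family-wise error rate at or below alpha."""
    return alpha / n_tests


for m in (1, 20, 100, 1000):
    print(f"{m:>4} tests: P(at least one false positive) = "
          f"{family_wise_error_rate(0.05, m):.3f}, "
          f"Bonferroni threshold = {bonferroni_threshold(0.05, m):.2g}")
```

With 20 tests at the 0.05 level the chance of at least one spurious "hit" is already about 64%, and an imaging study runs many thousands of tests.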
122. Know your audience
Who? who is my audience? level of understanding? what do they already know?
Why?
What?
How?
123. Know your audience
Who? who is my audience? level of understanding? what do they already know?
Why? why am I presenting? what does my audience want to achieve?
What?
How?
124. Know your audience
Who? who is my audience? level of understanding? what do they already know?
Why? why am I presenting? what does my audience want to achieve?
What? what do I want my audience to know? which story will captivate the audience?
How?
125. Know your audience
Who? who is my audience? level of understanding? what do they already know?
Why? why am I presenting? what does my audience want to achieve?
What? what do I want my audience to know? which story will captivate the audience?
How? what medium will support the message the best? what format/layout will appeal to the audience?
126. Color blindness is a common disease
Males: one in 12 (8%) / Females: one in 200 (0.5%)
127. Color blindness is a common disease
"Anyone who needs to be convinced that making scientific
images more accessible is a worthwhile task [...]: if your next
grant or manuscript submission contains color figures, what if
some of your reviewers are color blind? Will they be able to
appreciate your figures? Considering the competition for funding
and for publication, can you afford the possibility of frustrating
your audience? The solution is at hand."
Clarke, M. (2007). "Making figures comprehensible for color-blind readers" Nature blog
(http://blogs.nature.com/nautilus/2007/02/post_4.html)
128. Making figures for color blind people
Wong, B. (2011). "Points of view: Color blindness". Nature Methods 8, 441
131. Telling stories with data
"The Martini Glass Structure"
http://vis.stanford.edu/files/2010-Narrative-InfoVis.pdf
132. Telling stories with data
"The Martini Glass Structure"
[Diagram: START, GUIDED NARRATIVE, then EXPLORE]
http://vis.stanford.edu/files/2010-Narrative-InfoVis.pdf
141. Common mistakes in data reporting
E. Tufte's "Lie Factor"
Make things appear to be "better" than they are
by fiddling with the scales of things
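Tufte defines the Lie Factor as the size of the effect shown in the graphic divided by the size of the effect in the data, so a faithful graphic scores close to 1. A minimal sketch follows; the function names and the numbers are hypothetical and chosen only to illustrate the ratio.

```python
def relative_change(first, second):
    """Relative change from `first` to `second` (0.5 means +50%)."""
    return (second - first) / first


def lie_factor(data_first, data_second, graphic_first, graphic_second):
    """Effect shown in the graphic divided by the effect present in the data."""
    return (relative_change(graphic_first, graphic_second)
            / relative_change(data_first, data_second))


# Hypothetical example: the data rise from 10 to 15 (+50%), but the bars
# are drawn 1.0 cm and 4.0 cm tall (+300%), e.g. because of a truncated axis.
print(lie_factor(10, 15, 1.0, 4.0))   # 6.0: the graphic exaggerates sixfold
print(lie_factor(10, 15, 2.0, 3.0))   # 1.0: the graphic is proportional
```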
147. Common mistakes in data reporting
Fig 1I
"We found that relative to WT mice, the luminal
microbiota of Il10-/- mice exhibited a ~100-fold
increase in E. coli (Fig. 1I)"
Arthur et al, (2012) Science 5;338(6103):120-3
152. Common mistakes in data reporting
[Figure: two versions of the same chart, "Percent Return on Investment" (y-axis 0 to 40) for Group A and Group B over year1 to year4]
153. Thank you!
"The important thing is not to stop questioning.
Curiosity has its own reason for existing"
- Albert Einstein