1. How to improve
the chance of getting your manuscript
accepted for publication
Jonas Ranstam PhD
3. Evidence based medicine (The Cochrane collaboration 1993)
Cohort study of smoking and lung cancer (1954) (Bradford Hill)
Case-control study of smoking and lung cancer (1950) (Bradford Hill)
Randomised clinical trial of streptomycin and tuberculosis (1948) (Bradford Hill)
Anecdotal evidence (Case reports)
4. Mandatory disclosure of trial results (2008)
Trial registration (2005)
EU directive (2001)
ICH GCP (1996)
CONSORT (1996)
WHO CIOMS (1993)
ICMJE Uniform Requirements (1978)
Helsinki declaration (1964)
Nürnberg convention (1949)
5. Plan
1. Methodological background
2. General guidelines
3. Special recommendations
a) case reports
b) mechanical experiments
c) in vitro/cadaver experiments
d) cross-sectional studies
e) epidemiological studies
f) randomized trials
4. Summary
7. What is statistics used for?
1. Describing data (statistics in the plural)
2. Interpreting uncertain data (statistics in the singular)
8. Two kinds of uncertainty
1. Uncertainty of measurement
2. Uncertainty of sampling
9. 1. Uncertainty of measurement
The precision of the measurement instrument used.
The precision of the Finapres non-invasive blood pressure monitor
is on average 12.1 mm Hg.
10. 2. Uncertainty of sampling
Individual effects vary between subjects. Different
samples of subjects yield different observed mean
effects.
11. Example
Assume that the cumulative 10-year revision rate
of the Oxford knee prosthesis is 8% and that two
groups of 100 patients receiving the prosthesis are
randomly selected and followed over time.
The two groups are likely to get different numbers
of patients revised during follow up.
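The sampling variation in this example can be illustrated with a small simulation. This is only a sketch: the function name, the seed and the use of independent Bernoulli draws for each patient are assumptions made for illustration, not part of the original example.

```python
import random


def simulate_revisions(n_patients=100, revision_rate=0.08, rng=None):
    """Count revised patients when every patient has the same true revision risk."""
    if rng is None:
        rng = random.Random()
    return sum(rng.random() < revision_rate for _ in range(n_patients))


rng = random.Random(2024)  # arbitrary seed, used only to make the run repeatable
group_a = simulate_revisions(rng=rng)
group_b = simulate_revisions(rng=rng)
print(f"Group A: {group_a}/100 revised, group B: {group_b}/100 revised")
```

Both groups are drawn from the same population, so any difference between the two counts reflects sampling uncertainty only.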
15. 6% revised
12% revised
H0: The two samples represent the same population
H1: The two samples represent different populations
16. P-value
The probability of observing an effect at least as large
as the one observed if only sampling uncertainty is at
play (i.e. if H0 is true).
12/100 vs. 6/100, Fisher's exact test p = 0.22
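The test result can be checked with a few lines of code. The sketch below implements a two-sided Fisher's exact test directly from the hypergeometric distribution, using only the standard library; the function name and the convention of summing all tables that are at most as probable as the observed one are assumptions, since the slide does not say how the p-value was computed.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability of k events in the first row, margins fixed
        return comb(row1, k) * comb(row2, col1 - k) / denom

    p_observed = prob(a)
    k_min, k_max = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables that are no more likely than the observed one
    return sum(
        p for p in (prob(k) for k in range(k_min, k_max + 1))
        if p <= p_observed * (1 + 1e-9)
    )


# 12 revised and 88 not revised in one group, 6 revised and 94 not revised in the other
p_value = fisher_exact_two_sided(12, 88, 6, 94)
print(f"p = {p_value:.2f}")
```

The result should be close to the p-value given above.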
17. P-values are often misunderstood
They cannot
- describe clinical relevance (they depend on sample
size)
- show that a difference “does not exist”, because
n.s. is absence of evidence, not evidence of
absence
19. Confidence interval
A range of values that, at the specified confidence
level, is expected to include the population parameter
being estimated.
12/100 vs. 6/100, RR = 2.0 (95% CI: 0.7 - 5.6)
[Figure: relative risk axis (1/2, 1, 2) with the labels p < 0.05 and n.s.]
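A relative risk with a confidence interval can be computed as sketched below. The slide does not state which interval method was used; this sketch assumes the common log-scale (Katz) approximation, and the function name is hypothetical. With these data the approximation gives limits of roughly 0.8 and 5.1, somewhat narrower than the interval on the slide, which was presumably obtained with another method.

```python
from math import exp, log, sqrt


def relative_risk_ci(events_1, n_1, events_2, n_2, z=1.96):
    """Relative risk with an approximate 95% CI on the log scale (Katz method)."""
    rr = (events_1 / n_1) / (events_2 / n_2)
    # Standard error of log(RR) under the usual large-sample approximation
    se_log_rr = sqrt(1 / events_1 - 1 / n_1 + 1 / events_2 - 1 / n_2)
    lower = exp(log(rr) - z * se_log_rr)
    upper = exp(log(rr) + z * se_log_rr)
    return rr, lower, upper


rr, lower, upper = relative_risk_ci(12, 100, 6, 100)
print(f"RR = {rr:.1f} (95% CI: {lower:.1f} - {upper:.1f})")
```

Either way the interval includes 1, which agrees with the non-significant test result.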
21. Important assumptions
Many statistical methods like the Student's t-test and
ANOVA are based on the assumption of Gaussian
distribution and homogeneous variance.
If the assumptions are not met, use alternative (non-
parametric) methods, like the Mann-Whitney U-test or
Kruskal-Wallis non-parametric ANOVA.
23. Important assumptions
Most conventional methods (both parametric and non-
parametric) require independent observations.
- Patients are independent
- Patients' knees, hips, shoulders, feet, etc. are not
25. How Many Patients? How Many Limbs? Analysis
of Patients or Limbs in the Orthopaedic Literature:
A Systematic Review
Bryant et al. JBJS Am. 2006;88:41-45.
Our findings suggest that a high proportion (42%) of
clinical studies in high-impact-factor orthopaedic journals
involve the inappropriate use of multiple observations from
single individuals, potentially biasing results. Orthopaedic
researchers should attend to this issue when reporting
results.
26. Important assumptions
Most conventional methods (both parametric and
non-parametric) require independent observations.
Include only one observation per patient, or use a
statistical method that can handle dependent data,
e.g. multilevel or mixed effects models.
Always present both number of observations and
patients.
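The first option, one observation per patient, can be sketched as follows. The data and the function name are hypothetical, and averaging within patient is only one of several possible ways to reduce the data to one value per patient.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical data: (patient id, measurement); patients 1 and 3 contribute two knees each
observations = [(1, 4.0), (1, 6.0), (2, 5.0), (3, 7.0), (3, 9.0), (4, 3.0)]


def one_value_per_patient(observations):
    """Average repeated measurements so that each patient contributes one value."""
    by_patient = defaultdict(list)
    for patient_id, value in observations:
        by_patient[patient_id].append(value)
    return {patient_id: mean(values) for patient_id, values in by_patient.items()}


per_patient = one_value_per_patient(observations)
print(f"{len(observations)} observations from {len(per_patient)} patients")
```

The printed line reports both counts, as recommended above.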
27. Multiplicity
In contrast to many other forms of precision,
statistical precision depends on the number of
measurements (significance tests) performed.
28. Multiplicity
Each significance test at a 5% significance level
has 5% risk of a false positive test.
Repeated testing increases the risk of at least one
false positive test.
Number of tests    Risk of at least one false positive
1                  0.05
2                  0.10
5                  0.23
10                 0.40
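The risks in the table follow from 1 - (1 - 0.05)^n for n tests. A short sketch, assuming that the tests are independent (an assumption the slide does not state explicitly) and using a hypothetical function name:

```python
def risk_of_false_positive(n_tests, alpha=0.05):
    """Risk of at least one false positive among n independent tests at level alpha."""
    return 1 - (1 - alpha) ** n_tests


for n_tests in (1, 2, 5, 10):
    print(n_tests, round(risk_of_false_positive(n_tests), 2))
```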
36. Statistical Methods
“Describe statistical methods with enough detail to
enable a knowledgeable reader with access to the
original data to verify the reported results.”
Required for analytical methods (statistical models,
hypothesis tests, confidence intervals).
Descriptions are often unclear, vague or ambiguous.
They need to be clear and detailed.
38. Results
“When possible, quantify findings and present them
with appropriate indicators of measurement error or
uncertainty (such as confidence intervals).”
Measures of statistical precision (p-values and confidence
intervals) are necessary for generalizing results beyond
the patients examined.
40. Results
“Avoid relying solely on statistical hypothesis testing,
such as the use of P values, which fails to convey
important information about effect size.”
Describe both your observations and how you interpret
them (use confidence intervals or p-values).
41.
                          Statistically significant
Clinically significant    yes     no
yes                       a       b
no                        c       d
"There was, or was no, (statistically significant) difference" is too simplistic.
42. Example
Two side effects with a new osteoporosis treatment:
- A statistically significant reduction in body hair
growth rate by 5% (p = 0.04)
- A statistically insignificant increase in systolic
blood pressure by 25 mmHg (p = 0.06)
43. Confidence intervals are better
than p-values
In contrast to p-values they do
- relate to clinical significance
- show when a difference “does not exist”
because they present lower and upper limits of
potential clinical effects/differences
44. P-value and confidence interval
P-values           Conclusion from confidence intervals
[2 alternatives]   [6 alternatives]
p < 0.05           Statistically but not clinically significant effect
p < 0.05           Statistically and clinically significant effect
p < 0.05           Statistically, but not necessarily clinically, significant effect
n.s.               Inconclusive
n.s.               Neither statistically nor clinically significant effect
p < 0.05           Statistically significant reversed effect
[Figure: each confidence interval plotted on an effect axis, relative to 0 and to the region of clinically significant effects]
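The six conclusions can be written as a simple decision rule on the limits of the confidence interval. This is a sketch under stated assumptions: the effect is measured on a scale where 0 means no effect and positive values are beneficial, a single positive threshold (here called clinical_threshold, a hypothetical name) marks the smallest clinically significant effect, and the handling of limits that fall exactly on 0 or on the threshold is a choice made for illustration.

```python
def interpret_interval(lower, upper, clinical_threshold):
    """Classify a confidence interval against 0 and a clinical significance threshold."""
    if lower > upper or clinical_threshold <= 0:
        raise ValueError("need lower <= upper and a positive clinical threshold")
    if upper < 0:
        return "statistically significant reversed effect"
    if lower > 0:
        # The interval excludes 0: statistically significant
        if upper < clinical_threshold:
            return "statistically but not clinically significant effect"
        if lower >= clinical_threshold:
            return "statistically and clinically significant effect"
        return "statistically, but not necessarily clinically, significant effect"
    # The interval includes 0: not statistically significant
    if upper < clinical_threshold:
        return "neither statistically nor clinically significant effect"
    return "inconclusive"


print(interpret_interval(-5, 15, clinical_threshold=10))
```

The example interval includes both 0 and clinically significant effects, so it is classified as inconclusive.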
45. When there is a difference in data
Do not write that there is not a difference!
47. There were indeed
differences: they are
0.45 and 0.57
Better alternative:
“The observed differences
in extraction torques
between the two types of
uncoated distal pins can
be explained by chance.”
48. Avoid non-technical use of technical
terms and use clear expressions
- significant: clinically or statistically?
- no difference: statistically insignificant?
- statistical difference: statistically significant?
- matched: selected or just comparable?
- correlation: relation, regression?
- normal: Gaussian distribution?
- random: mathematical algorithm?
- etc.
55. Mechanical experiments
What do p-values and confidence intervals
relate to?
- Measurement uncertainty (Perhaps)
- Sampling uncertainty (No, there is no
information on subject variation. The
findings cannot be generalized beyond
the device).
57. In vitro/cadaver experiments
What do p-values and confidence intervals relate
to?
- Measurement uncertainty (Perhaps)
- Sampling uncertainty (Perhaps, if the
observations provide information on
variation between subjects)
58. Example
In a study with 60 observations, 20 specimens
had been taken from each of 3 subjects.
The specimens were distributed randomly
between one control group and one
experimental group.
What do significance tests of these two groups
tell us?
62. Epidemiological studies
- Exploratory, hypothesis generating,
multiplicity issues considered less
important than validity issues
- External validity (source of subjects)
- Internal validity (confounding)
64. Results
Uniform Requirements: “Where scientifically
appropriate, analyses of the data by variables such as
age and sex should be included.”
Observational studies require adjustment for known
and suspected confounding factors to produce valid
effect estimates.
This adjustment is usually performed using statistical
modelling (e.g. ANCOVA or regression analysis). The
purpose is to increase validity.
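The effect of adjustment can be illustrated with made-up numbers. The sketch below uses stratification (a Mantel-Haenszel risk ratio) as a simpler stand-in for the regression modelling mentioned above; the counts, the stratum labels and the function names are all hypothetical.

```python
# Hypothetical counts per stratum of a confounder:
# (exposed events, exposed total, unexposed events, unexposed total)
strata = [
    (2, 20, 8, 80),   # low-risk stratum: risk 0.10 in both exposure groups
    (32, 80, 8, 20),  # high-risk stratum: risk 0.40 in both exposure groups
]


def crude_risk_ratio(strata):
    """Risk ratio from the pooled data, ignoring the confounder."""
    exposed_events = sum(s[0] for s in strata)
    exposed_total = sum(s[1] for s in strata)
    unexposed_events = sum(s[2] for s in strata)
    unexposed_total = sum(s[3] for s in strata)
    return (exposed_events / exposed_total) / (unexposed_events / unexposed_total)


def mantel_haenszel_risk_ratio(strata):
    """Risk ratio adjusted for the stratification variable (Mantel-Haenszel)."""
    numerator = denominator = 0.0
    for exposed_events, exposed_total, unexposed_events, unexposed_total in strata:
        total = exposed_total + unexposed_total
        numerator += exposed_events * unexposed_total / total
        denominator += unexposed_events * exposed_total / total
    return numerator / denominator


print(f"crude RR = {crude_risk_ratio(strata):.2f}")
print(f"adjusted RR = {mantel_haenszel_risk_ratio(strata):.2f}")
```

In these made-up data the exposure has no effect within either stratum, yet the crude estimate suggests a doubled risk because exposed subjects are concentrated in the high-risk stratum.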
67. Clinical trials
“The ICMJE member journals will require, as a
condition of consideration for publication in their
journals, registration in a public trials registry.”
“The ICMJE recommends that journals publish the trial
registration number at the end of the Abstract.”
68. Clinical trials
“When reporting experiments on human subjects,
authors should indicate whether the procedures
followed were in accordance with the ethical
standards of the responsible committee on human
experimentation (institutional and national) and with
the Helsinki Declaration of 1975, as revised in 2000
(5).”
69. WORLD MEDICAL ASSOCIATION DECLARATION OF HELSINKI
Ethical Principles for Medical Research Involving Human Subjects
27. ...Reports of experimentation not in accordance
with the principles laid down in this Declaration
should not be accepted for publication.
70. Purpose of a randomized trial
To test a hypothesis with control of random and
systematic errors.
- No bias (randomization & blinding)
- No multiplicity problems
72. Study populations
Intention-to-treat (ITT) principle
    Analyze all randomized subjects according to planned
    treatment regimen.
Full analysis set (FAS)
    The set of subjects that is as close as possible to the
    ideal implied by the ITT-principle.
Per protocol (PP) set
    The set of subjects who complied with the protocol
    sufficiently to ensure that they are likely to exhibit the
    effects of treatment according to the underlying
    scientific model.
73. FAS vs. PP-set
FAS       + no selection bias
          - misclassification problem (effect dilution)
PP-set    + no contamination problem
          - possible selection bias (confounding)
When the FAS and PP-set lead to essentially the same
conclusions, confidence in the trial is supported.
74. Endpoints
Primary
    The variable capable of providing the most clinically
    relevant evidence directly related to the primary
    objective of the trial
Secondary
    Either measurements supporting the primary endpoint
    or effects related to secondary objectives
75. Statistical analyses
Confirmatory
    The result concerns a primary endpoint and the p-value
    or confidence interval accounts for potential multiplicity.
    The result can support a claim of superiority, equivalence
    or non-inferiority.
Exploratory
    All other analyses.
    The result is either supporting or explanatory, or simply
    just a new hypothesis.
76. Reporting
“For reports of randomized controlled trials authors
should refer to the CONSORT statement.”
80. Include with the manuscript
Study Protocol
Statistical Analysis Plan
81. Clinical trials
International regulatory guidelines
ICH Topic E9 - Statistical Principles for Clinical Trials
EMEA Points to consider:
- baseline covariates
- missing data
- multiplicity issues
- etc.
and similar documents from the FDA
These guidelines can all be found on the internet.
83. The responsibilities of a statistical reviewer
“To make sure that the authors spell out for the reader
the limitations imposed upon the conclusions by the
design of the study, the collection of data, and the
analyses performed.”
Shor S. The responsibilities of a statistical reviewer. Chest 1972;61:486-487.
84. Read the manuscript from end to beginning, and look
for weaknesses in the links between:
1. Conclusion
2. Discussion (Discussion section)
3. Results (Results section)
4. Methods (Material & methods section)
5. Data (Material & methods section)
6. Hypothesis (Introduction)
Make sure the chain holds all the way!
85. Summary
1. Present statistical methods in detail, and the number
of observations included in each analysis.
2. Present data, statistical results and your conclusions
- data description vs. results interpretation
- clinical vs. statistical significance
- absence of evidence is not evidence of
absence
3. Adjust for confounding factors in observational
studies (but do not use stepwise regression)
4. Comply with the CONSORT checklist in randomized
studies