2. What are levels of evidence?
The hierarchy is not an absolute measure of evidence. It
is a logical way to demonstrate the differing strengths of
studies.
[Figure labels: "Study quality and reliability"; "Susceptibility to bias"]
3. What do you mean by bias?
Bias is any systematic error in the conduct of
the study that affects the outcome.
Bias in quantitative studies:
Selection bias: how subjects were chosen to be studied
Allocation bias: how the subjects were assembled into groups
Attrition bias: accounting for subjects at the close of the study
Confounding: other factors that affect both the intervention and the
outcome being studied (randomisation aims to reduce this risk)
Detection bias: whether assessors know which group a result comes
from (blinding the assessors aims to reduce this)
Data Collection: were valid and reliable instruments used to assess
outcomes?
Statistical Analysis: did the study have enough power (a large enough
sample size) to detect an effect?
Integrity of intervention: was the intervention carried out as planned?
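The power question above can be made concrete with a rough sample-size calculation. The sketch below is only an illustration, not a method taken from the slides: it assumes a two-sided comparison of two group means, uses the normal approximation, and takes a standardised effect size as input. The function name and default values are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(effect_size: float,
                          alpha: float = 0.05,
                          power: float = 0.80) -> int:
    """Approximate participants needed per group (normal approximation).

    effect_size is the standardised difference between the two group
    means (difference divided by the common standard deviation).
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    # Critical value for a two-sided test at significance level alpha.
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    # Quantile corresponding to the desired power.
    z_power = NormalDist().inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


if __name__ == "__main__":
    # Smaller effects need many more participants to be detected.
    for d in (0.8, 0.5, 0.2):
        print(d, sample_size_per_group(d))
```

A study that recruited far fewer participants per group than this kind of estimate suggests may simply have been too small to detect the effect it was looking for.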
4. That pyramid thing again …
For commonly encountered clinical questions about
interventions, I look for …
[Figure: evidence pyramid. Annotations: "Not much information in this category but high likelihood of clinical relevance"; "Lots of information in this category: ideas and lab research category, or expert opinion category"]
6. Systematic Reviews & Meta-analysis
A systematic review is a type of literature review that asks a focused
question or questions. Explicit methods are used to identify,
appraise, select and synthesise all high-quality evidence relevant to
the question/s. These methods are rigorous and transparent. Some
systematic reviews include meta-analysis. This is a statistical
process where all study results are pooled and analysed. Systematic
reviews should, but do not always, employ a librarian to develop
exhaustive search strategies that cover relevant information sources,
employ relevant thesaurus terms and keywords, and use appropriate
filters. To assess whether a systematic review is current, the date
when the search strategy was last run should be checked.
Systematic reviews are good for answering intervention questions.
Meta-analysis – A systematic review that summarises and analyses
statistics from the studies included in the review (some statisticians rank
this above the level of Systematic Review).
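One common way to pool study results is a fixed-effect, inverse-variance weighted average. The sketch below assumes each study reports an effect estimate and its standard error on the same scale; the function name and the numbers are hypothetical and for illustration only. Real meta-analyses often use random-effects models and dedicated software.

```python
from math import sqrt


def pool_fixed_effect(effects, standard_errors):
    """Inverse-variance fixed-effect pooling of study estimates.

    Returns the pooled effect, its standard error, and an approximate
    95% confidence interval.
    """
    if len(effects) != len(standard_errors) or not effects:
        raise ValueError("need one standard error per effect estimate")
    # More precise studies (smaller standard error) get more weight.
    weights = [1 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / total_weight
    pooled_se = sqrt(1 / total_weight)
    ci = (pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se)
    return pooled, pooled_se, ci


if __name__ == "__main__":
    # Hypothetical effect estimates from three studies.
    pooled, se, ci = pool_fixed_effect([0.30, 0.10, 0.25], [0.10, 0.20, 0.15])
    print(round(pooled, 3), round(se, 3), [round(x, 3) for x in ci])
```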
7. Randomised Controlled Trials
An RCT uses a control group and an experimental group to which
participants are randomly assigned, with a comparison
made at the end of the study. Assessors and the researchers
administering the interventions may also be blinded.
RCTs are mainly used for intervention/therapy studies.
Advantages: they can assess causality and clearly
demonstrate that the intervention caused the results.
Disadvantages: expense, time, and risk of
bias if participants are not properly blinded. A further
disadvantage is that the patient group selected
may not be clinically relevant.
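Random assignment can be sketched in a few lines. This is a minimal illustration of simple 1:1 randomisation by shuffling, assuming a list of participant identifiers. The function name is hypothetical, and real trials normally use concealed, centrally managed allocation rather than a script like this.

```python
import random


def randomise(participants, seed=None):
    """Randomly split participants into control and experimental groups."""
    rng = random.Random(seed)  # a seed makes the allocation reproducible
    shuffled = list(participants)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"control": shuffled[:half], "experimental": shuffled[half:]}


if __name__ == "__main__":
    groups = randomise(["P01", "P02", "P03", "P04", "P05", "P06"], seed=42)
    print(groups)
```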
8. Cohort and case control studies
Cohort Studies – Groups of individuals are followed over time,
starting before they develop a disease or experience the outcome of an
exposure. Cohort studies answer diagnostic test accuracy questions,
aetiology/risk questions where common outcomes result from an
unusual exposure, and (as longitudinal cohorts) prognosis questions.
Advantages: Researchers can identify the relative risk of developing a
disease based on different exposures. Disadvantages: time and cost.
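Relative risk is the risk of the outcome in the exposed group divided by the risk in the unexposed group. A minimal sketch, with a hypothetical function name and made-up counts:

```python
def relative_risk(exposed_cases, exposed_total,
                  unexposed_cases, unexposed_total):
    """Risk of the outcome in the exposed group relative to the unexposed."""
    risk_exposed = exposed_cases / exposed_total
    risk_unexposed = unexposed_cases / unexposed_total
    return risk_exposed / risk_unexposed


if __name__ == "__main__":
    # Hypothetical cohort: 20 of 100 exposed and 10 of 100 unexposed
    # people develop the disease during follow-up.
    print(relative_risk(20, 100, 10, 100))
```

A value above 1 suggests the exposure is associated with higher risk, and a value below 1 with lower risk.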
Case-control Studies – Patients with the same condition (cases) are
matched with controls who do not have it. These studies begin with the
outcome. The cases are reviewed to identify what exposures they had,
and these are then compared with the exposures of the controls. The
comparison is between the odds of exposure among those with the
outcome and the odds of exposure among those without the outcome.
Case-control studies are good for answering
aetiology/risk questions where a rare outcome resulted from a
common exposure.
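The comparison of odds in a case-control study is usually reported as an odds ratio. A minimal sketch, with a hypothetical function name and made-up counts:

```python
def odds_ratio(cases_exposed, cases_unexposed,
               controls_exposed, controls_unexposed):
    """Odds of exposure among cases divided by odds among controls."""
    odds_cases = cases_exposed / cases_unexposed
    odds_controls = controls_exposed / controls_unexposed
    return odds_cases / odds_controls


if __name__ == "__main__":
    # Hypothetical study: 40 of 100 cases and 20 of 100 controls
    # report the exposure.
    print(round(odds_ratio(40, 60, 20, 80), 2))
```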
9. Applying Levels of Evidence
The type of study you would
look for in the primary
literature depends on the
question being asked. The
table below summarizes
thinking on the best research
study designs corresponding
to common categories of
clinical questions.
12. What are n-of-1 trials?
Elizabeth O Lillie, Bradley Patay, Joel Diamant et al. The n-of-1 clinical
trial: the ultimate strategy for individualizing medicine? Per Med.
Mar 2011; 8(2): 161–173.
Abstract: N-of-1 or single subject clinical trials consider an individual
patient as the sole unit of observation in a study investigating the
efficacy or side-effect profiles of different interventions. The ultimate
goal of an n-of-1 trial is to determine the optimal or best intervention
for an individual patient using objective data-driven criteria. Such
trials can leverage study design and statistical techniques associated
with standard population-based clinical trials, including
randomization, washout and crossover periods, as well as placebo
controls. Despite their obvious appeal and wide use in educational
settings, n-of-1 trials have been used sparingly in medical and
general clinical settings.
16. JBI. (2000). Appraising systematic
reviews. Changing Practice: evidence
based practice sheets for health
professionals, Supplement 1, 1-6.
http://connect.jbiconnectplus.org/ViewSourceFile.aspx?0=4311
17. What are grades of recommendation then?
This is a method used by guideline developers to give a
judgement / grade to the body of evidence underpinning
each recommendation per clinical question.
GRADE – Grading of Recommendations Assessment,
Development and Evaluation (GRADE) Working Group
NHMRC – 2009 Levels of Evidence and Grades of
Recommendation, rev. ed.
21. And another system …
Owens DK, Lohr KN, Atkins D, et al. AHRQ series paper 5: grading the
strength of a body of evidence when comparing medical
interventions--Agency for Healthcare Research and Quality
and the Effective Health-Care Program. J Clin Epidemiol. 2010
May;63(5):513-23.
http://www.ncbi.nlm.nih.gov/pubmed/19595577
RESULTS: The EPC approach is conceptually similar to the GRADE
system of evidence rating; it requires assessment of four domains:
risk of bias, consistency, directness, and precision. Additional
domains to be used when appropriate include dose-response
association, presence of confounders that would diminish an
observed effect, strength of association, and publication bias.
Strength of evidence receives a single grade: high, moderate, low, or
insufficient. We give definitions, examples, mechanisms for scoring
domains, and an approach for assigning strength of evidence.
22. History stuff
First introduced by Stephen Toulmin in 1976 in the
Journal of Medicine and Philosophy: On the Nature of the
Physician's Understanding
http://jmp.oxfordjournals.org/content/1/1/32.extract
In 1979, the Canadian Task Force on the Periodic Health
Examination published one of the first efforts to explicitly
characterise the level and strength of evidence
underlying healthcare recommendations: The periodic
health examination. Canadian Task Force on the Periodic
Health Examination
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1704686/?tool=pubmed