Journal of Evaluation in Clinical Practice, 7, 2, 135–148




An illustrated guide to the methods of meta-analysis
Alexander J. Sutton BSc MSc (1), Keith R. Abrams BSc MSc PhD (2) and David R. Jones BA MSc PhD CStat CMath DipTCDHE (3)
(1) Lecturer in Medical Statistics, Department of Epidemiology and Public Health, University of Leicester, UK
(2) Reader in Medical Statistics, Department of Epidemiology and Public Health, University of Leicester, UK
(3) Professor of Medical Statistics, Department of Epidemiology and Public Health, University of Leicester, UK




Correspondence
Mr Alex J Sutton
Department of Epidemiology and Public Health
University of Leicester
22-28 Princess Road West
Leicester LE1 6TP
UK

Keywords: Bayesian methods, hospital discharge, meta-analysis, methods, re-admission, review

Accepted for publication: 22 July 2000

Abstract

Meta-analysis is now accepted as a necessary tool for the evaluation of health care. Such analyses have been carried out in virtually every area of medicine to evaluate a wide spectrum of health care interventions and policies. This paper has three broad aims: (1) to describe the basic principles of meta-analysis, using a meta-analysis of interventions intended to reduce hospital re-admission rates for illustration; (2) to consider threats to the internal validity of meta-analysis, and the measures which can be taken to minimize their impact; and (3) to present an overview of more specialist and developing methods for synthesizing data, with the intention of outlining the directions meta-analysis may take in the future. The methods used to synthesize studies, which take ‘weighted averages’ of effect sizes, have been refined to a high degree, while the methods for dealing with threats to the validity of meta-analyses such as publication bias, and variations in quality of the primary studies, are at a less advanced stage. However, many consider this standard ‘weighted average’ approach to meta-analysis not to be ‘state of the art’ in at least some situations, where the use of more sophisticated methods, generally to explain variation in estimates from different studies and synthesize a broader base of evidence, would be advantageous. Currently, approaches which attempt to do this are mainly still in the experimental stage and, unfortunately, ideas which sound natural and appealing are often difficult to implement in practice. Clearly, it will be some time before they are used routinely, but significant steps have been made.




1 Introduction

Meta-analysis is now accepted as a necessary tool for the evaluation of health care. Such analyses have been carried out in virtually every area of medicine, to evaluate a wide spectrum of health-care interventions and policies. The primary aim of many meta-analyses is to produce a more accurate estimate of the effect of a particular intervention, or group of interventions, than is possible using only a single study. Since different studies are carried out using different populations, different designs and a whole range of other study-specific factors, it has been suggested that combining them will produce an estimate that has broader generalizability than any single study. Additionally, it may be possible to explain the differences between results from individual studies by carrying out a meta-analysis. Such an assessment may even provide further insight into the intervention, and develop our understanding of how it works.

© 2001 Blackwell Science




Concurrent with the explosion in the use of meta-analysis is the continued development and refinement of the methods used to carry out such analyses. This is an important endeavour, because the science of meta-analysis is still in its infancy, and in the past over-simplistic methods have led to misleading conclusions (Hunt 1997). A systematic review of methodology for meta-analysis carried out by the authors (Sutton et al. 1998) informed the writing of this paper, and is recommended further reading for more technical details on the material presented here. The reader should note, however, that several important developments which are noted here have been published in the short time since the review was written, confirming the speed with which this field continues to develop.

This paper has three broad aims: (1) to describe the basic principles of meta-analysis using a worked example; (2) to consider the threats to the validity of meta-analysis and the measures which can be taken to minimize their impact; and (3) to present an overview of more specialist and developing methods, with the intention of outlining the directions meta-analysis may take in the future. The term ‘meta-analysis’ is used to describe different aspects of research synthesis by different people. In some contexts it is used to indicate the whole review process, including aspects such as literature searching and data extraction, as well as the statistical combination of quantitative results. We prefer to use the term ‘systematic review’ to indicate the whole review process, restricting the term ‘meta-analysis’ to describe the synthesis of quantitative data from multiple studies. Although many recent advances in pre-synthesis review methods have been made, such as the development of sophisticated searching methods (Sutton et al. 1998; Dickersin et al. 1994), this paper focuses solely on aspects of quantitative data synthesis, or meta-analysis. [Note: very often a systematic review will include a meta-analysis; however, if no quantitative data are available from the primary reports, or that which is available is deemed too heterogeneous to be meaningfully combined, then only a narrative description of the studies may be carried out (Sutton et al. 1998).] Guidelines for good practice for the pre-synthesis aspects of systematic reviews have been described comprehensively elsewhere (Deeks et al. 1996; Oxman 1996). Very importantly, a recent initiative has produced a checklist addressing the quality of reporting of meta-analyses (QUORUM) (Moher et al. 1999b). This statement is in the same spirit as the CONSORT statement for reporting randomized clinical trials (RCTs) (Begg et al. 1996) and is recommended reading for those preparing reports of meta-analyses of RCTs.

2 The synthesis of estimates of effectiveness from multiple primary studies

This section focuses on pooling results from a number of studies investigating the relative effectiveness of an intervention. Often, meta-analyses of this sort include only RCTs, typically with two arms: one arm receiving experimental treatment and the other control, placebo or standard treatment. (The issue of variable quality of studies, and the synthesis of studies with different designs, is considered in sections 3 and 4, respectively.) Data from a meta-analysis of interventions intended to improve the process of hospital discharge of older people, published elsewhere (Parker et al. 2001), are used to illustrate the methods. Thirty two-arm RCTs are included in the meta-analysis, and the outcome focused on here is the re-admission rate to hospital following discharge. In the remainder of this section the principal ideas involved in performing a meta-analysis are explained and, where possible, the calculations required are reproduced to aid understanding. In practice, the use of computer software greatly facilitates the analyses required. The meta-analysis capabilities of many common statistical analysis packages are limited; however, much specialist software has been developed recently (Sutton et al. 2000b; Sterne et al. 2001).

Calculation of an effect size for each study

Broadly speaking, quantitative outcomes from any study can be classified as belonging to one of three data types: (i) binary, e.g. often indicating the presence or absence of the event of interest in each patient; (ii) continuous, where outcome is measured on a continuous scale, e.g. this could be change in blood pressure, etc.; or (iii) ordinal, where outcome is measured on an ordered categorical scale, e.g. a disease severity scale, where a patient can be classified as belonging to one of several distinct categories.





The approaches used to combine either binary or continuous outcomes are often similar, while ordinal data are somewhat more complex and require specialist methods, discussed elsewhere (Whitehead & Jones 1994).

Table 1 provides a sample of the data extracted from reports of 30 RCTs to be included in the meta-analysis (for a list of references for these RCTs see the original report (Parker et al. 2001); numbers used to identify these RCTs in this report are provided here in the final column of Table 1). Columns three and six provide the number of patients randomized to the experimental and control arms of each study, respectively. [Note: analysis should usually be calculated on the basis of intention to treat (Hollis & Campbell 1999); if the analysis in the original study report was not performed using this method it may still be possible to extract the correct figures for the purposes of the meta-analysis.] Columns four and seven indicate the number of re-admission episodes. Note that an individual can have multiple re-admissions; for example, the new intervention arm of study 8 included 142 patients, while 554 events were reported. [Note: the fact that more than one re-admission is permitted for each patient means that an individual’s outcome is not binary.] Column two indicates the length of follow-up of the studies, which ranges from 1 to 12 months; it is necessary to account for follow-up when calculating effect sizes, since the number of re-admissions may be critically dependent on the length of the observation period of the trial.

An outcome measure which takes into account length of follow-up is the re-admission rate ratio (RRR). As the name suggests, this is the ratio of the re-admission rates (per month) in the two arms. The re-admission rate (RR) in each arm is calculated by:

\[ \mathrm{RR} = \frac{\text{number of re-admissions}}{\text{number of patients} \times \text{length of follow-up}} \]

For example, there are two re-admissions in 37 patients over 1.5 months in trial 1, so the RR is 2/(37 × 1.5) = 0.036. [Note: more decimal places are used in the working of the calculations in this paper than are printed.] Similarly, the RR in the control group is 0.162. The outcome of interest can now be calculated by dividing the RR in the treatment arm by the RR in the control arm, 0.036/0.162, which produces an RRR of 0.222. This RRR is less than one, which indicates the re-admission rate is lower in the treatment arm, suggesting that the intervention is beneficial. In this instance the estimated effect is large (a long way from 1). The RRs for each arm are provided in columns 5 and 8, and the RRR in column 9.

Although the RRR is the measure of interest, due to theoretical statistical considerations (including improved approximate normality), a natural logarithm transformation is used (ln(RRR)) for the purpose of combining studies via a meta-analysis (Fleiss 1994). The pooled result can be back-transformed afterwards by taking the exponential of the pooled ln(RRR), $e^{\ln(\mathrm{RRR})}$, to convert the answer back to the RRR scale, allowing easier interpretation. The ln(RRR) estimates for each study are given in column 10 of Table 1.

A further value, the standard error (SE) of the ln(RRR), is required for the meta-analysis calculation. The SE gives an indication of the degree of precision to which each study estimates the effect size; a small SE indicates a precise estimate, usually from a large study. The SE for the ln(RRR) is calculated by:

\[ \mathrm{SE}(\ln(\mathrm{RRR})) = \sqrt{\frac{1}{\text{num. of re-admiss. in exp. group}} + \frac{1}{\text{num. of re-admiss. in control group}}} \]

Hence, for study 1 the SE(ln(RRR)) is $\sqrt{1/2 + 1/9} = 0.782$. Standard errors for the remaining studies are provided in column 11 of Table 1. It is common practice to calculate 95% confidence intervals for each study; these indicate the interval in which the estimate of effect size would be expected to fall 95 times out of every 100 replications of the trial. Hence, a 95% confidence interval provides a range in which one can be reasonably sure the true effect size lies. The formula for calculating a 95% confidence interval for a ln(RRR) is:

\[ \ln(\mathrm{RRR}) \pm 1.96 \times \mathrm{SE}(\ln(\mathrm{RRR})) \]

For study 1 the ln(RRR) 95% confidence interval is given by -1.504 ± 1.96(0.782) = (-3.04 to 0.03).





                                                                                      Table 1 Data and calculations for the hospital re-admissions meta-analysis

                                                                                                                     Experimental group                    Control group
                                                                                                                                                                                      Re-                                                                                                      Number
                                                                                              Length of                  Re-         Re-                     Re-         Re-       admission                                                                                         EPOC      used in
                                                                                      Study   follow-up   Patients    admissions   admission   Patients   admissions   admission   rate ratio                 SE          95% CI            95% CI                  Intervention     quality   original
                                                                                      ID      (months)      (n)          (n)         rate        (n)         (n)         rate        (RRR)      ln(RRR)   (ln(RRR))       ln(RRR)            RRR          Weight   administration   measure    report*
                                                                                      1            2         3            4           5           6           7           8            9           10         11             12               13           14            15            16         17

                                                                                       1         1.5         37            2         0.036        37           9           0.162     0.222      -1.504     0.782      (-3.04 - 0.03)      (0.05 - 1.03)     1.64       Single         5           53
                                                                                       2         3          464          102         0.073       439         102           0.077     0.946      -0.055     0.140      (-0.33 - 0.22)      (0.72 - 1.24)    51.00       Single         3           59
                                                                                       3         6          499          347         0.116       502         340           0.113     1.027       0.026     0.076      (-0.12 - 0.18)      (0.88 - 1.19)   171.73       Single         6           60
                                                                                       4         6           86           36         0.070        87          26           0.050     1.401       0.337     0.257      (-0.17 - 0.84)      (0.85 - 2.32)    15.10       Single         4           69
                                                                                       5        12           57            9         0.013        56           6           0.009     1.474       0.388     0.527      (-0.65 - 1.42)      (0.52 - 4.14)     3.60       Team           3           82
                                                                                       6         2           39           29         0.372        41          35           0.427     0.871      -0.138     0.251      (-0.63 - 0.35)      (0.53 - 1.42)    15.86       Single         3           88
                                                                                       7         3           20            3         0.050        20          13           0.217     0.231      -1.466     0.641      (-2.72 to - 0.21)   (0.07 - 0.81)     2.44       Single         6          177
                                                                                       8         3          142          554         1.300       140         868           2.067     0.629      -0.463     0.054      (-0.57 to - 0.36)   (0.57 - 0.70)   338.16       Team           4          187
                                                                                       9         6          695          343         0.082       701         310           0.074     1.116       0.110     0.078      (-0.04 - 0.26)      (0.96 - 1.30)   162.83       Team           6          222
                                                                                      10         2          178           43         0.121       176          37           0.105     1.149       0.139     0.224      (-0.30 - 0.58)      (0.74 - 1.78)    19.89       Single         2          228
                                                                                      11         6           30            9         0.050        30           6           0.033     1.500       0.405     0.527      (-0.63 - 1.44)      (0.53 - 4.21)     3.60       Team           5          231
                                                                                      12         6           96           42         0.073        97          62           0.107     0.684      -0.379     0.200      (-0.77 - 0.01)      (0.46 - 1.01)    25.04       Team           3          236
                                                                                      13         3          303          104         0.114       300         109           0.121     0.945      -0.057     0.137      (-0.33 - 0.21)      (0.72 - 1.24)    53.22       Team           4          275
                                                                                      14         6          150           51         0.057        99          32           0.054     1.052       0.051     0.226      (-0.39 - 0.49)      (0.68 - 1.64)    19.66       Team           4          283
                                                                                      15         1           20            4         0.200        20           6           0.300     0.667      -0.405     0.645      (-1.67 - 0.86)      (0.19 - 2.36)     2.40       Team           1          312
                                                                                      16         1.5         29            4         0.092        25           9           0.240     0.383      -0.959     0.601      (-2.14 - 0.22)      (0.12 - 1.24)     2.77       Single         4          334
                                                                                      17        12          333          396         0.099       335         410           0.102     0.972      -0.029     0.070      (-0.17 - 0.11)      (0.85 - 1.12)   201.44       Single         3          339
                                                                                      18         3          140           18         0.043       136          16           0.039     1.093       0.089     0.344      (-0.58 - 0.76)      (0.56 - 2.14)     8.47       Single         4          351
                                                                                      19         9          418          495         0.132       417         549           0.146     0.899      -0.106     0.062      (-0.23 - 0.02)      (0.80 - 1.02)   260.31       Single         3          397
                                                                                      20         6           62           21         0.056        58          35           0.101     0.561      -0.578     0.276      (-1.12 to - 0.04)   (0.33 - 0.96)    13.13       Team           4          403
                                                                                      21        12          199          107         0.045       205         111           0.045     0.993      -0.007     0.135      (-0.27 - 0.26)      (0.76 - 1.30)    54.48       Team           3          416
                                                                                      22        12           63           22         0.029        60          30           0.042     0.698      -0.359     0.281      (-0.91 - 0.19)      (0.40 - 1.21)    12.69       Team           4          691
                                                                                      23         6           35           10         0.048        40          51           0.213     0.224      -1.496     0.346      (-2.17 to - 0.82)   (0.11 - 0.44)     8.36       Single         4         1793
                                                                                      24         6          102           49         0.080       102          51           0.083     0.961      -0.040     0.200      (-0.43 - 0.35)      (0.65 - 1.42)    24.99       Single         7         1796
                                                                                      25         6          140           24         0.029        97          29           0.050     0.573      -0.556     0.276      (-1.10 to - 0.02)   (0.33 - 0.98)    13.13       Team           3.5       2211
                                                                                      26         3           45            5         0.037        46           5           0.036     1.022       0.022     0.632      (-1.22 - 1.26)      (0.30 - 3.53)     2.50       Team           6         2229
                                                                                      27         4           49           11         0.056        51           7           0.034     1.636       0.492     0.483      (-0.46 - 1.44)      (0.63 - 4.22)     4.28       Single         3.5       2657
                                                                                      28         6          177           49         0.046       186         107           0.096     0.481      -0.731     0.172      (-1.07 to - 0.39)   (0.34 - 0.67)    33.61       Single         3         3632
                                                                                      29         3          381          154         0.135       381         197           0.172     0.782      -0.246     0.108      (-0.46 to - 0.04)   (0.63 - 0.97)    86.43       Team           4         3636
                                                                                      30         2           96           22         0.019       110          43           0.033     0.586      -0.534     0.262      (-1.05 to - 0.02)   (0.35 - 0.98)    14.55       Single         6         4460

                                                                                      *Parker et al. 2000. n = number.
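
The per-study quantities in columns 5 and 8 to 13 of Table 1 can be reproduced with a few lines of code. The following Python sketch is illustrative only and is not the software used by the authors: the function name `rate_ratio_summary` and its argument names are assumptions introduced here, and the inputs are the study 1 counts from Table 1. It assumes both arms share the same follow-up and that each arm has at least one re-admission, since the SE formula is undefined for zero events.

```python
import math


def rate_ratio_summary(exp_events, exp_n, ctl_events, ctl_n, months, z=1.96):
    """Re-admission rate ratio (RRR), ln(RRR), its SE and 95% CIs for one study.

    Hypothetical helper: assumes equal follow-up (in months) in both arms and
    at least one re-admission per arm (otherwise the SE cannot be computed).
    """
    # Re-admission rate per patient-month in each arm.
    rate_exp = exp_events / (exp_n * months)
    rate_ctl = ctl_events / (ctl_n * months)

    # Ratio of the two rates and its natural logarithm.
    rrr = rate_exp / rate_ctl
    ln_rrr = math.log(rrr)

    # Standard error of ln(RRR): sqrt(1/events_exp + 1/events_ctl).
    se = math.sqrt(1.0 / exp_events + 1.0 / ctl_events)

    # 95% CI on the log scale, then back-transformed to the RRR scale.
    ci_ln = (ln_rrr - z * se, ln_rrr + z * se)
    ci_rrr = (math.exp(ci_ln[0]), math.exp(ci_ln[1]))

    return {
        "rate_exp": rate_exp,
        "rate_ctl": rate_ctl,
        "rrr": rrr,
        "ln_rrr": ln_rrr,
        "se": se,
        "ci_ln": ci_ln,
        "ci_rrr": ci_rrr,
    }


# Study 1 of Table 1: 2 re-admissions in 37 patients versus 9 in 37,
# followed up for 1.5 months.
study1 = rate_ratio_summary(2, 37, 9, 37, 1.5)
for name, value in study1.items():
    print(name, value)
```

Rounded to the precision printed in Table 1, the study 1 values agree with the table row (RRR 0.222, ln(RRR) -1.504, SE 0.782, RRR interval 0.05 to 1.03).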




Figure 1 Forest plot of 30 RCTs examining the effect on re-admission rates of interventions aimed at modifying the hospital discharge process for elderly people.




Confidence intervals for RRR are obtained by taking the exponential of this ln(RRR) interval; hence, the RRR 95% confidence interval for study 1 is (0.05–1.03). This interval includes 1, which indicates that on its own the trial is inconclusive, because both beneficial and harmful effect size estimates are included in the interval and are in some sense plausible. This highlights the need to consider the precision of the estimate; the study estimated a very large treatment effect, but did so very imprecisely; the true effect could be much smaller (or larger) than the point estimate. The 95% confidence intervals for ln(RRR) and RRR for the remaining studies are provided in columns 12 and 13, respectively. To aid examination of the results of the individual studies, these intervals can be plotted on the same axis, as in Fig. 1. The RRR estimate for each study is plotted, with the size of the plotting symbol proportional to the precision of the estimate. The 95% confidence interval for each RRR estimate is also plotted (the more precise estimates having the smaller confidence intervals) (other features of this figure will be explained in due course). This plot highlights the variability in the estimates and in the precisions between studies. The issue of variability between estimates from individual studies is considered further in later sections.

Combining effect sizes: calculating weighted averages

The previous section illustrated how an RRR estimate and corresponding standard error could be calculated from summary data extracted from individual study reports. In other instances different effect measures may be more appropriate, but the general principle that an estimate and SE are required from each study remains. When outcomes are reported on a binary scale, the odds ratio, risk ratio or risk difference measures are commonly used, while outcomes measured on a continuous scale can be combined directly, or standardized if different scales of measurement have been used in the individual studies. Descriptions and formulae for each of these outcome measures and others are available elsewhere (Fleiss 1993; Sutton et al. 2000c).

The simplest way to combine estimates is to average them. Since different studies estimate the true effect size with varying degrees of precision, a weighted average is used. The weight given to each study in the re-admissions meta-analysis is calculated by:

\[ \text{weight} = \frac{1}{\mathrm{SE}(\ln(\mathrm{RRR}))^{2}} \]
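
The weighting above, and the weighted average it feeds into, can be sketched in Python using columns 10 (ln(RRR)) and 14 (weight) of Table 1. This is a minimal illustration under stated assumptions rather than the authors' implementation: the names `inverse_variance_weight`, `pooled_estimate`, `LN_RRR` and `WEIGHT` are introduced here for the example, and the inputs are the rounded values printed in the table, so the result can differ slightly from a calculation on unrounded data.

```python
import math


def inverse_variance_weight(se):
    """Weight for one study: 1 / SE^2 (hypothetical helper name)."""
    return 1.0 / se ** 2


def pooled_estimate(effects, weights):
    """Weighted average of study effects: sum(w_i * y_i) / sum(w_i)."""
    if len(effects) != len(weights):
        raise ValueError("effects and weights must have the same length")
    return sum(w * y for w, y in zip(weights, effects)) / sum(weights)


# Column 10 of Table 1: ln(RRR) for studies 1 to 30 (as printed, rounded).
LN_RRR = [
    -1.504, -0.055, 0.026, 0.337, 0.388, -0.138, -1.466, -0.463, 0.110, 0.139,
    0.405, -0.379, -0.057, 0.051, -0.405, -0.959, -0.029, 0.089, -0.106, -0.578,
    -0.007, -0.359, -1.496, -0.040, -0.556, 0.022, 0.492, -0.731, -0.246, -0.534,
]

# Column 14 of Table 1: inverse-variance weight for studies 1 to 30.
WEIGHT = [
    1.64, 51.00, 171.73, 15.10, 3.60, 15.86, 2.44, 338.16, 162.83, 19.89,
    3.60, 25.04, 53.22, 19.66, 2.40, 2.77, 201.44, 8.47, 260.31, 13.13,
    54.48, 12.69, 8.36, 24.99, 13.13, 2.50, 4.28, 33.61, 86.43, 14.55,
]

# Weight for study 1 from its unrounded SE, sqrt(1/2 + 1/9).
w1 = inverse_variance_weight(math.sqrt(1 / 2 + 1 / 9))

# Pooled estimate on the log scale, then back-transformed to the RRR scale.
pooled_ln = pooled_estimate(LN_RRR, WEIGHT)
pooled_rrr = math.exp(pooled_ln)

print(round(w1, 2), round(pooled_ln, 3), round(pooled_rrr, 2))
```

With the rounded table values this gives a weight of about 1.64 for study 1 and a pooled ln(RRR) of roughly -0.164, which corresponds to a pooled RRR of about 0.85. Note that the large studies (for example study 8, with a weight of 338.16) dominate the average, which is the intended effect of weighting by precision.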





The square of the standard error is often known as the variance, so combining studies using this
weighting is often called the inverse variance-weighted method (Fleiss 1993). The weightings for
each study are provided in Table 1, column 14. If an effect measure other than the RRR is being
used, then the weightings are calculated by the same principle, using the inverse of the variance of
that effect measure.
   Once the weight for each study has been calculated, a pooled estimate of ln(RRR) is calculated by
multiplying each study’s weight by its ln(RRR) and summing the resulting values, and then dividing
this value by the sum of the weights. Using figures from Table 1, the outline calculation for the
re-admissions data is:

$$\ln(\text{pooled RRR}) = \frac{(1.64 \times (-1.504)) + \dots + (14.55 \times (-1.504))}{1.64 + \dots + 14.55} = -0.164$$

The variance for ln(pooled RRR) (or any other effect measure used) is then calculated by taking the
reciprocal of the sum of the weights (1/sum of weights):

$$\operatorname{var}(\ln(\text{pooled RRR})) = \frac{1}{1.64 + \dots + 14.55} = 0.0006$$

Using these figures, an approximate 95% confidence interval for the pooled estimate can be
calculated in the same manner as confidence intervals were produced for the individual study
estimates above. The pooled estimate of RRR for the re-admissions dataset is 0.85 with 95% CI
(0.81–0.89), indicating a modest, statistically significant treatment benefit at the 5% level. This
estimate is plotted using a diamond shape in Fig. 1 directly below the 30 individual studies.
Figure 1 is often called a forest plot and is commonly used to display the results of a
meta-analysis.
   This approach is often known as a fixed-effect approach, to distinguish it from the random-effect
models described below. It can be used to combine outcomes on any scale; however, other related
fixed-effect methods specifically for combining odds ratios also exist (Fleiss 1993; Sutton et al.
2000c). These fixed-effect methods all make the strong assumption that each study is estimating the
same underlying treatment effect. Many people feel that in medical and related research such an
assumption is unrealistic (Thompson 1993) because studies are never identical replications of one
another, and study design and conduct differences will inevitably have some degree of influence on
study outcome. Models which account for underlying variability in the treatment effect estimates are
considered in the next section.

Heterogeneity and random effect models

When performing a meta-analysis, although the overall aim may be to produce a pooled estimate of
treatment effect, it is crucial to assess the variation between results of the primary studies and,
if possible, to investigate why they differ. Clearly, it would be remarkable if all studies being
meta-analysed produced exactly the same treatment effect estimate. Some variation in results is
expected, due simply to the play of chance; this is often called random variation. However, if
effect size estimates vary between studies to a greater extent than expected on the basis of chance
alone, the studies are considered to be heterogeneous, and it is necessary to account for the extra
variation, above that expected by chance, in the meta-analysis model. The way this is usually
performed is through the use of a random-effect model. Essentially, this relaxes the assumption that
each study is estimating exactly the same underlying treatment effect, and instead assumes that the
underlying effect sizes are drawn from a distribution of effect sizes. This distribution is usually
assumed to be Normal, with a variance determined by the data. In practical terms, accounting for
between-study heterogeneity in this way produces a pooled point estimate which is often (but not
always) similar to the one produced by fixed-effect methods. However, taking into account
between-study heterogeneity produces a wider 95% confidence interval, so the estimate is more
conservative.
   The whole issue of appropriateness and suitability of fixed- and random-effect models for
meta-analysis has been much discussed (Thompson 1993; Peto 1987). A test for heterogeneity exists
(Fleiss 1993), and the result of this test can then be used to inform model choice. If it is
non-significant, a fixed-effect model is to be used, and if it is significant, a random-effect model
should be used. This seemingly sensible





approach has a flaw because the test has low power. This implies that heterogeneity may exist even
when the test produces a non-significant result (Boissel et al. 1989). An alternative approach is to
always use a random-effect model. The inflation of the confidence interval is dictated by the degree
of variation between studies, so when between-study variation is small the inflation will be
negligible, producing a result which would be very similar to the fixed-effect approach.
   A detailed description of the random-effect meta-analysis model is beyond the scope of this
paper, but clear accounts are given elsewhere (DerSimonian & Laird 1986; Shadish & Haddock 1994).
Combining the 30 studies evaluating interventions to prevent re-admission using a random-effect
model produces a RRR of 0.83 (0.73–0.93). This estimate is plotted below the fixed-effect one in
Fig. 1. The estimate of the between-study variance is 0.057, which is quite small but non-negligible
(the test for between-study heterogeneity is highly significant (P < 0.001)). Accounting for this
heterogeneity has produced a wider confidence interval compared to the fixed-effect approach, which
is a typical finding. Modifications to the way the parameters in a random-effect meta-analysis model
are calculated have been developed (Hardy & Thompson 1996; Biggerstaff & Tweedie 1997). One of these
should be used if the number of studies in the meta-analysis is small (fewer than about 10) as it
overcomes problems with a previous simplification in the model calculations, which can be important
in meta-analyses of small numbers of studies.
   A final point concerning between-study heterogeneity is that there is little explicit guidance to
offer regarding the point at which study estimates should not be pooled at all because heterogeneity
is deemed too great, but alternative approaches are discussed below.

Exploring and explaining heterogeneity

Until now, the impression has been given that heterogeneity is a nuisance factor which needs
accounting for when performing a meta-analysis. However, investigating why between-study variation
exists offers the meta-analyst unique opportunities. More desirable than using random-effect models
to allow for heterogeneity is to try to explain the heterogeneity. This may lead to the
identification of associations between study or patient characteristics and the outcome measure,
which would not have been possible in single studies. This may lead in turn to clinically important
findings and may eventually assist in individualizing treatment regimes (Lau et al. 1998). Both
subgroup analyses and regression methods can be used to do this.
   Potential study-level factors, pertaining to either study design or patient characteristics,
which could affect study results should ideally be identified before a meta-analysis is conducted.
If this is carried out, data on these factors can then be obtained at the data extraction stage of a
review, and such explicit a priori specification also reduces the temptation of ‘data dredging’.
   Returning to the re-admission dataset, one potential factor which could affect results is whether
the intervention was administered by a team or an individual. This information is given for each
study in column 15 of Table 1. In 16 of the studies the intervention was administered by an
individual and in 14 it was administered by a team. Separate meta-analyses can be performed for
these two subgroups in an attempt to see if the effectiveness of the intervention depends on whether
an individual or team implements it, and whether between-study heterogeneity is reduced in the
subgroups. Pooled estimates for these subgroups turn out to be almost identical. The subgroup in
which the intervention was administered by an individual has a RRR of 0.83 (0.70–0.97) and an
estimate of between-study heterogeneity of 0.056 (test for heterogeneity highly significant at
P < 0.001). For the studies where the intervention was administered by a team the RRR was 0.83
(0.69–0.99) and the estimate of between-study heterogeneity 0.062 (test for heterogeneity highly
significant at P < 0.001). Hence, it would appear that whether the intervention is administered by
an individual or a team makes very little difference to the effectiveness of the intervention and,
hence, does not explain any of the variation between study results.
   If the factor of interest is measured on a continuous scale, or dummy indicator variables are
created for the levels of categorical factors, then meta-regression can be used to explore their
impact. Meta-regression models are very similar in principle to ordinary simple linear regression
models, the main difference being that individual observations (the primary studies), unlike
individual patients, are not given equal weight in the analysis (i.e. each study should be weighted
according to its precision). Additionally, it may be desirable to include a random-effect term to
account for residual heterogeneity not explained by the covariate(s); such a model can be thought of
as an extension to the random-effect model described above (Berkey et al. 1995). An example of a
meta-regression analysis is given in section 3.
   Meta-regression techniques are currently used relatively rarely, and the authors believe not to
their full potential, but examples are emerging (Freemantle et al. 1999; von Dadelszen et al. 2000).
Although a powerful tool, they do have their limitations. Regression analyses of this type are also
susceptible to aggregation bias, which occurs if the relation between patient characteristic study
means and outcomes does not directly reflect the relation between individuals’ values and
individuals’ outcomes (Greenland 1987). Additionally, meta-regression type analyses are often
limited by the number of studies included in the meta-analysis. Special regression models have also
been developed to explore the effect of patients’ underlying risk on intervention effect (Senn
et al. 1996; Walter 1997) which are necessary to avoid producing incorrect results when exploring
the effect of such a factor (Schmid et al. 1998; Senn et al. 1996).

3 Threats to the validity of a meta-analysis

Although meta-analyses are often considered to provide the highest grade of evidence available
regarding the effectiveness of an intervention, higher than an individual trial, it should not be
forgotten that they are a type of observational study, and as such are open to biases which may
threaten their validity. Perhaps the two most serious problems which can potentially lead to biased
estimates are publication bias and variable study quality of the primary studies. These two issues
are considered further below.

Publication and related biases

Publication bias exists because research with statistically significant or interesting results is
potentially more likely to be submitted, published or published more rapidly than work with null or
non-significant results (Song et al. 2000). When only the published literature is included in a
meta-analysis, this can potentially lead to biased over-optimistic conclusions. Related biases which
can also distort the results of a meta-analysis include (i) pipeline bias, when significant results
are published quicker than non-significant ones; and (ii) language bias, when researchers whose
native tongue is not English are more likely to publish their non-significant results in
non-English-language journals, but are more likely to publish their significant results in English.
If this happens, a meta-analysis including only study reports in English may be based on a biased
collection of studies. Perhaps an appropriate term which includes all these sources of bias is
‘dissemination bias’ (Song et al. 2000).
   Long-term initiatives to alleviate the problem of publication bias have commenced, including
trial amnesties (Horton 1997) to encourage publication of previously unpublished trials, and the
creation of registries for prospective registration of trials (Horton & Smith 1999). However, the
issue is currently still a big concern for researchers carrying out meta-analyses. There are certain
measures which can be taken to assess the presence and minimize the impact of publication bias in a
meta-analysis dataset. Currently, however, there is much debate, and some dispute, as to the
approach researchers should take to deal with publication bias in meta-analyses.
   The presence of publication bias in a meta-analysis dataset can be assessed informally by
inspection of a funnel plot (Light & Pillemar 1984). This plots the effect size for each study
against some measure of its precision, e.g. 1/standard error of the effect size. The resulting plot
should be shaped like a funnel if no publication bias is present. This shape is expected because
trials of decreasing size have increasingly large variation in their effect size estimates due to
random variation becoming increasingly influential. However, if the chance of publication is greater
for larger trials or trials with statistically significant results, some small non-significant
studies may not appear in the literature.
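The construction of a funnel plot described above can be sketched in a few lines of Python. The data are invented for illustration and the function names are hypothetical. The second function is only an assumption about how asymmetry might be quantified: it returns the intercept of an Egger-type regression (Egger et al. 1997) of effect/SE on 1/SE, without the accompanying significance test, so it should be read as a rough indicator rather than a formal result.

```python
import math

def funnel_points(log_effects, standard_errors):
    """Return (effect on the ratio scale, precision = 1/SE) for each study."""
    return [(math.exp(y), 1.0 / se) for y, se in zip(log_effects, standard_errors)]

def egger_intercept(log_effects, standard_errors):
    """Intercept from ordinary least squares of (effect / SE) on (1 / SE).

    An intercept near zero is what a symmetric funnel would give;
    no standard error or P-value is computed in this sketch.
    """
    x = [1.0 / se for se in standard_errors]
    z = [y / se for y, se in zip(log_effects, standard_errors)]
    n = len(x)
    mean_x = sum(x) / n
    mean_z = sum(z) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxz = sum((xi - mean_x) * (zi - mean_z) for xi, zi in zip(x, z))
    slope = sxz / sxx
    return mean_z - slope * mean_x

# Hypothetical ln(RRR) values and standard errors (NOT from the re-admissions data).
# The least precise studies are given the most extreme beneficial effects.
log_rrr = [-0.9, -0.6, -0.3, -0.2, -0.15]
se = [0.5, 0.4, 0.2, 0.1, 0.05]

points = funnel_points(log_rrr, se)       # coordinates to plot: x = RRR, y = 1/SE
asymmetry = egger_intercept(log_rrr, se)  # negative here: small studies look more beneficial
```

Plotting `points` with the rate ratio on a log-scaled horizontal axis and precision on the vertical axis gives the funnel layout; any plotting library could be used for that step.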





This leads to omission of trials in one corner of the plot (the bottom right-hand corner when an
‘undesirable’ outcome such as the re-admission rate is being considered), and hence to a degree of
asymmetry in the funnel. A funnel plot for the 30 RCTs in the re-admissions dataset is provided in
Fig. 2. Visual inspection would suggest that there is little evidence of publication bias in this
dataset; however, there are a few small studies with extremely beneficial RRRs at the bottom
left-hand corner of the plot, for which there are no symmetric counterparts with extreme positive
RRRs in the bottom right-hand corner.

Figure 2 Funnel plot of studies included in the hospital discharge meta-analysis examining the
effect of interventions on re-admission rates. Horizontal axis: re-admission rate ratio (log scale);
vertical axis: 1/standard error (ln(RRR)).

   Publication bias can be tested for more formally using statistical tests which are based on the
same symmetry assumptions as a funnel plot assessment (Begg & Mazumdar 1994; Egger et al. 1997;
Duval & Tweedie 1998). One formal test (Egger et al. 1997) produces a non-significant P-value of
0.57 for the re-admissions dataset, which is consistent with the inconclusive visual assessment.
   Disagreement exists about how to proceed if publication bias is suspected, after an assessment
for its presence has been made. Methods to assess the likely impact of publication bias on the
pooled outcome estimate have been developed (Duval & Tweedie 1998; Givens et al. 1997; Copas 1999;
Song et al. 2000) but they are not widely used, due partly to the fact that many are complex and
hence difficult to implement, and due partly to concerns about their applicability. We believe that
the use of such methods as part of a sensitivity analysis is sensible (Sutton et al. 2000a) but more
research is needed in this area.

Study quality

It is rare that all the studies available for a meta-analysis are of uniformly high quality. More
likely there will be a range in the quality of the research pertaining to the intervention of
interest. Restricting a meta-analysis to include only RCTs is a safeguard taken by such groups as
the Cochrane Collaboration, in an attempt to include only evidence which potentially produces the
least biased results. Restricting analyses only to RCTs does not guarantee the meta-analysis will
produce an unbiased result, however, as there can still be methodological flaws in the design,
conduct and analysis of a trial. Clearly, the inclusion of poor or flawed studies in a meta-analysis
may be problematic because their influence may bias the pooled result and even mean the
meta-analysis comes to the wrong qualitative conclusions. Unfortunately, most studies are flawed to
some degree, and excluding all but ‘perfect’ studies (which may not be possible to conduct due to
ethical or practical constraints in some fields) may leave the meta-analyst with few if any data.
The problem of dealing with study quality in a meta-analysis is similar to that for publication
bias, in the sense that there is agreement that some assessment of quality should always be made,
but little consensus on how to make such an assessment, or how to incorporate the results into the
meta-analysis.
   There have been many scales and checklists developed to aid in the assessment of study quality
(Moher et al. 1995) but many of them have come under heavy criticism for not being constructed
scientifically (Moher et al. 1999a). Further, recent work has demonstrated that different results
can be obtained in a meta-analysis depending on the checklist used (Juni et al. 1999). A further
problem is the fact that it is often difficult to ascertain all the required details of the trial
from a study report (Begg et al. 1996). Often, this means that an assessment of the trial report and
not of the trial itself is in effect being made. The underlying problem with the use of a scale or
checklist is that it is impossible to predict which design aspects cause the most bias and, more
fundamentally, it is often impossible to predict even the direction in





which any bias will be acting (Schulz et al. 1995). This makes the direct adjustment of study
estimates for study quality impossible.
   Several ways in which study quality can be incorporated into a meta-analysis have been suggested.
Perhaps the simplest is to use a quality threshold to include or exclude studies. This could be
defined using a cut-off value on a particular quality scale, or as a requirement of having several
design aspects present. A further possibility is to use a quality score to weight study results, or
incorporate such a score into the standard precision weightings (Berard & Bravo 1998). Finally, an
approach which appears to be gaining support is the exploration of quality via meta-regression. In
such an approach a quality score, or individual markers of study quality, such as the degree of
blinding or method of treatment allocation, are included in a regression model as explanatory
variables. Examining individual markers of quality separately eliminates the problems with the
somewhat arbitrary construction of the quality scale scoring systems (Detsky et al. 1992).
   Returning to the re-admissions meta-analysis, study quality was rated crudely using a count of
effective practice and organization of care (EPOC) quality criteria (Cochrane Effective Practice &
Organization of Care Review Group 1998) that were satisfied for each study. The scores obtained by
each trial using this method are given in the penultimate column of Table 1. When these scores are
included in a random effect regression model, the equation ln(RRR) = -0.22 + 0.007 × quality score
is obtained. This regression line, together with the primary studies (the size of the plotting
symbol is proportional to the precision of the effect size estimate), is plotted in Fig. 3. The
quality score coefficient is small and not statistically significant (P = 0.88). This means study
quality, at least as measured in this way, would not appear to affect the study results
systematically, or to explain the between-study heterogeneity.

Figure 3 Regression line examining the impact of quality score, using a random effect
meta-regression model for re-admission rate in the hospital discharge meta-analysis. Horizontal
axis: EPOC quality score; vertical axis: re-admission rate ratio (log scale).

4 Further developments in methods of meta-analysis

Specialist meta-analysis methods

While section 2 provided a summary of the most commonly used methods in meta-analysis, many further
developments have been made. A proportion of these focus on the synthesis of less standard data
types. For example, specialist methods are required to pool the results of diagnostic tests because
two outcomes, specificity and sensitivity, require simultaneous consideration (Irwig et al. 1995).
Another area which requires special methods is the analysis of survival data because account has to
be made of censored observations (Dear 1994). Other data types for which specialist methods have
been developed are dose–response data (Tweedie & Mengersen 1995) and economic data (Jefferson et al.
1996). Meta-analysis of individual patient data (Stewart & Clarke 1995), where original study
datasets are pooled rather than relying on published summary data, has been described as the gold
standard. It is considered by some to be the only way to carry out a meta-analysis of survival data,
but it is much more time consuming and costly than meta-analysis of summary data. It is currently
unclear whether the extra effort required is worthwhile. For an overview of these and further
meta-analytical developments see Sutton et al. (2000c).

New directions for meta-analysis using Bayesian statistics

In addition to the above developments, more advanced methods for synthesis of information have been
developed. Although not currently used routinely, these provide potentially more powerful and
flexible tools for synthesizing evidence. Many of these methods use Bayesian statistics, in contrast
to the more commonly employed classical approach.
   A full description of Bayesian methods is not possible here, but for a recent review of their use
in assessing health technologies see Spiegelhalter et al. (2000a). The key element of the Bayesian
approach is that it introduces the idea of subjective probability (O’Hagan 1988) in contrast to the
objective probabilities traditionally attached to specific, often repeatable, events. Before
carrying out a piece of research, an investigator would have formed some prior beliefs regarding its
outcome, possibly derived from results of previous research in the same field. These a priori
beliefs are combined with the data from the current investigation to produce results which reflect
the researcher’s beliefs having conducted the research. These posterior beliefs are calculated by
combining the prior beliefs with the new data using Bayes’ Theorem, which forms the backbone of all
Bayesian analysis.
   The advantages of using such an approach are often subtle, but important. Perhaps most notable
from a health-care context is the ability to make direct probability statements regarding quantities
of interest, for example, the probability that patients receiving drug A have better survival than
those who receive drug B. There are good reasons, however, why the Bayesian approach has largely
been neglected in routine use. The most serious is that, generally, the computations required in
Bayesian models are very complex. Additionally, the expressing of prior beliefs in a form which can
be included in analysis is a non-trivial task. Excitingly,

ment many existing meta-analysis methods developed classically and, more importantly, develop models
not possible using more traditional classical software. This has potentially huge benefits for
synthesizing information and builds on earlier pioneering work by Eddy et al. (1992), whose ‘new’
graphical approach to meta-analysis can now be implemented using WinBUGS (Spiegelhalter et al.
2000b). Issues being addressed by these methods are outlined below:
1 Data from an RCT may be of direct interest, but not of a form which can simply be included in a
meta-analysis. For example, data from an RCT which uses the intervention of interest in the
treatment arm, but a different intervention from the other studies in the control arm, may be
available. Methods to include such data have been developed (Higgins & Whitehead 1996).
2 In some assessments considering only randomized evidence may not be the optimal approach.
Observational studies, which could potentially be very large, providing valuable data on thousands
of patients, may be available. It may sometimes seem unjust to exclude these from a meta-analysis,
particularly if they are of high quality, as they may have particular strengths and weaknesses,
different from those of randomized studies (Droitcour et al. 1993). Special methods have been
developed to account for different study designs in a meta-analysis (Prevost et al. 2000; Larose &
Dey 1997). In other instances data on the effect of a drug of interest in animals may be available
and provide valuable information which can be incorporated (DuMouchel & Harris 1983).
3 There may be benefits to including information included in previous trials or meta-analyses on
many of the computational difficulties have been                                 similar topics using similar interventions and out-
addressed recently, with the development of special-                            come measures (Higgins & Whitehead 1996).
ist software, most notably WinBUGS (Spiegelhalter                               4 A study may not provide any quantitative data at
et al. 2000b). The problem of expressing prior beliefs                          all, being qualitative in design, but this qualitative
remains; however, there are practical ‘solutions’,                              data may be of direct relevance to the topic under
including using ‘off-the-shelf’ priors, which can                               assessment (Roberts et al. 1998).
express the presence of a range of degrees of prior                                Bayesian modelling gives us the potential to
knowledge, and can be used in a sensitivity analysis.                           include all these types of data in a variety of ways,
Use of ‘vague priors’, which essentially means prior                            including direct input into the model, or incorporated
information is ignored, is also possible.                                       through the specification of prior beliefs.
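As a minimal sketch of how prior beliefs and data combine through Bayes’ Theorem, the following Python fragment updates a normal prior on a treatment effect with a normally distributed estimate, and repeats the calculation under several priors as a simple sensitivity analysis. It is an illustration under stated assumptions, not the method used in this paper or in WinBUGS: the conjugate normal model, the function names, the three priors and the numerical values (a log odds ratio of -0.30 with standard error 0.15) are all hypothetical.

```python
import math


def normal_posterior(prior_mean, prior_sd, estimate, se):
    """Posterior for a normal prior combined with a normal likelihood.

    Precisions (1 / variance) add, and the posterior mean is the
    precision-weighted average of the prior mean and the data estimate.
    """
    prior_prec = 1.0 / prior_sd ** 2
    data_prec = 1.0 / se ** 2
    post_var = 1.0 / (prior_prec + data_prec)
    post_mean = post_var * (prior_prec * prior_mean + data_prec * estimate)
    return post_mean, math.sqrt(post_var)


def prob_below(threshold, mean, sd):
    """P(theta < threshold) when theta ~ Normal(mean, sd^2)."""
    z = (threshold - mean) / sd
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Hypothetical data: estimated log odds ratio and its standard error.
# Negative values favour the new treatment in this made-up example.
estimate, se = -0.30, 0.15

# Hypothetical 'off-the-shelf' priors as (mean, sd) on the log odds ratio.
priors = {
    "vague": (0.0, 10.0),           # prior information effectively ignored
    "sceptical": (0.0, 0.20),       # centred on no effect
    "enthusiastic": (-0.30, 0.20),  # centred on a beneficial effect
}

results = {}
for name, (prior_mean, prior_sd) in priors.items():
    post_mean, post_sd = normal_posterior(prior_mean, prior_sd, estimate, se)
    # Direct probability statement: chance that the treatment is beneficial.
    p_benefit = prob_below(0.0, post_mean, post_sd)
    results[name] = (post_mean, post_sd, p_benefit)
    print(f"{name:>12}: mean={post_mean:.3f} sd={post_sd:.3f} "
          f"P(benefit)={p_benefit:.3f}")
```

Under the vague prior the posterior is driven almost entirely by the data, whereas the sceptical prior pulls the estimate towards no effect. Comparing the rows is the kind of sensitivity analysis described in the text.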
   The new WinBUGS software is able to perform the calculations required for a wide range of Bayesian analyses. The user has freedom to imple-
   Other new approaches to meta-analysis have been suggested, but the corresponding methodology is at the conceptual rather than practical stage of development. The extension of meta-regression to the simultaneous modelling of multiple scientific factors, with the intention of producing a response surface of treatment effects rather than a single pooled result, has been advocated (Rubin 1992). This may allow a more detailed examination of the science underlying the results synthesized (Rubin 1992; Lau et al. 1998). Further, it may be possible to model different aspects of the processes under study separately. For example, if one were interested in the effect of lowering cholesterol on clinical outcomes, in a first stage of the analysis data relating to the degree to which different interventions lower cholesterol levels could be synthesized. Then, in a second stage, the relationship between cholesterol level and various clinical outcomes could be modelled (Katerndahl & Lawler 1999). A further utilization of Bayesian modelling could allow meta-analysis to be placed within a decision theoretical framework (Berger 1980) which can also take into account utilities when making health care or policy decisions (Midgette et al. 1994).
   However, there is no magic wand to make all this happen. While Bayesian modelling provides flexibility and a framework, it does not dictate how models should be specified, how data should be incorporated, or how priors should be elicited. There is much methodological work required to further develop the ideas outlined above.

5 Conclusion

Much has been written on meta-analysis and the synthesis of evidence within the medical literature over the past two decades. During this time, the basic synthesizing of effect measures using weighted averages has been refined to a high degree, and much of the methodology required to do so is in place for most situations encountered. Threats to the validity of meta-analysis exist, and the methods for dealing with problems such as publication bias and variations in quality of the primary studies are at a less refined stage. Additionally, many consider the standard ‘weighted average approach’ to meta-analysis not to be ‘state of the art’ in at least some situations, where the use of more sophisticated methods, generally to synthesize a broader base of evidence, would be advantageous. Currently, such approaches are still firmly in the experimental stage and, unfortunately, ideas which sound natural and appealing are often difficult to implement in practice. Clearly, it will be some time before they are used routinely, but significant steps have been made. Moving the synthesis of evidence beyond calculating simple averages is timely, feasible and, indeed, essential.

Acknowledgements

The research on which this paper is based was funded, in part, by the NHS Research and Development Health Technology Assessment Programme (Methodology Project Numbers 93/52/3 & 95/09/03).

References

Begg C., Cho M., Eastwood S., Horton R., Moher D. & Olkin I. (1996) Improving the quality of reporting of randomised controlled trials: the CONSORT statement. Journal of the American Medical Association 276, 637–639.
Begg C.B. & Mazumdar M. (1994) Operating characteristics of a rank correlation test for publication bias. Biometrics 50, 1088–1101.
Berard A. & Bravo G. (1998) Combining studies using effect sizes and quality scores: application to bone loss in postmenopausal women. Journal of Clinical Epidemiology 51, 801–807.
Berger J.O. (1980) Statistical Decision Theory and Bayesian Analysis, 2nd edn. Springer-Verlag, New York.
Berkey C.S., Hoaglin D.C., Mosteller F. & Colditz G.A. (1995) A random-effects regression model for meta-analysis. Statistics in Medicine 14, 395–411.
Biggerstaff B.J. & Tweedie R.L. (1997) Incorporating variability in estimates of heterogeneity in the random effects model in meta-analysis. Statistics in Medicine 16, 753–768.
Boissel J.P., Blanchard J., Panak E., Peyrieux J.C. & Sacks H. (1989) Considerations for the meta-analysis of randomized clinical trials: summary of a panel discussion. Controlled Clinical Trials 10, 254–281.
Cochrane Effective Practice and Organisation of Care Review Group (1998) The Data Collection Checklist. University of Aberdeen, HSRU, Aberdeen.
Copas J. (1999) What works?: selectivity models and meta-analysis. Journal of the Royal Statistical Society, Series A 161, 95–105.
von Dadelszen P., Ornstein M.P., Bull S.B., Logan A.G., Koren G. & Magee L.A. (2000) Fall in mean arterial pressure and fetal growth restriction in pregnancy hypertension: a meta-analysis. Lancet 355, 87–92.

Dear K.B.G. (1994) Iterative generalized least squares for meta-analysis of survival data at multiple times. Biometrics 50, 989–1002.
Deeks J., Glanville J. & Sheldon T. (1996) Undertaking systematic reviews of research on effectiveness: CRD guidelines for those carrying out or commissioning reviews. Report no. 4. Centre for Reviews and Dissemination. York Publishing Services Ltd, York.
DerSimonian R. & Laird N. (1986) Meta-analysis in clinical trials. Controlled Clinical Trials 7, 177–188.
Detsky A.S., Naylor C.D., O’Rourke K., McGeer A.J. & L’Abbe K.A. (1992) Incorporating variations in the quality of individual randomized trials into meta-analysis. Journal of Clinical Epidemiology 45, 255–265.
Dickersin K., Scherer R. & Lefebvre C. (1994) Systematic reviews – identifying relevant studies for systematic reviews. British Medical Journal 309, 1286–1291.
Droitcour J., Silberman G. & Chelimsky E. (1993) Cross-design synthesis: a new form of meta-analysis for combining results from randomized clinical trials and medical-practice databases. International Journal of Technology Assessment in Health Care 9, 440–449.
DuMouchel W.H. & Harris J.E. (1983) Bayes methods for combining the results of cancer studies in humans and other species (with comment). Journal of the American Statistical Association 78, 293–308.
Duval S. & Tweedie R. (1998) Practical estimates of the effect of publication bias in meta-analysis. Australasian Epidemiologist 5, 14–17.
Eddy D.M., Hasselblad V. & Shachter R. (1992) Meta-Analysis by the Confidence Profile Method. Academic Press, San Diego.
Egger M., Smith G.D., Schneider M. & Minder C. (1997) Bias in meta-analysis detected by a simple, graphical test. British Medical Journal 315, 629–634.
Fleiss J.L. (1993) The statistical basis of meta-analysis. Statistical Methods in Medical Research 2, 121–145.
Fleiss J.L. (1994) Measures of effect size for categorical data. In The Handbook of Research Synthesis (eds H. Cooper & L.V. Hedges), pp. 245–260. Russell Sage Foundation, New York.
Freemantle N., Cleland J., Young P., Mason J. & Harrison J. (1999) β-Blockade after myocardial infarction: systematic review and meta regression analysis. British Medical Journal 318, 1730–1737.
Givens G.H., Smith D.D. & Tweedie R.L. (1997) Publication bias in meta-analysis: a Bayesian data-augmentation approach to account for issues exemplified in the passive smoking debate. Statistical Science 12, 221–250.
Greenland S. (1987) Quantitative methods in the review of epidemiological literature. Epidemiological Review 9, 1–30.
Hardy R.J. & Thompson S.G. (1996) A likelihood approach to meta-analysis with random effects. Statistics in Medicine 15, 619–629.
Higgins J.P.T. & Whitehead A. (1996) Borrowing strength from external trials in a meta-analysis. Statistics in Medicine 15, 2733–2749.
Hollis S. & Campbell F. (1999) What is meant by intention to treat analysis? Survey of published randomised controlled trials. British Medical Journal 319, 670–674.
Horton R. (1997) Medical editors trial amnesty. Lancet 350, 756.
Horton R. & Smith R. (1999) Time to register randomised trials – the case is now unanswerable. British Medical Journal 319, 865–866.
Hunt M. (1997) How Science Takes Stock: the story of meta-analysis. Russell Sage Foundation, New York.
Irwig L., Macaskill P., Glasziou P. & Fahey M. (1995) Meta-analytic methods for diagnostic test accuracy. Journal of Clinical Epidemiology 48, 119–130.
Jefferson T., Mugford M., Gray A. & DeMicheli V. (1996) An exercise in the feasibility of carrying out secondary economic analysis. Health Economics 5, 155–165.
Juni P., Witschi A., Bloch R. & Egger M. (1999) The hazards of scoring the quality of clinical trials for meta-analysis. Journal of the American Medical Association 282, 1054–1060.
Katerndahl D.A. & Lawler W.R. (1999) Variability in meta-analytic results concerning the value of cholesterol reduction in coronary heart disease: a meta-meta-analysis. American Journal of Epidemiology 149, 429–441.
Larose D.T. & Dey D.K. (1997) Grouped random effects models for Bayesian meta-analysis. Statistics in Medicine 16, 1817–1829.
Lau J., Ioannidis J.P. & Schmid C.H. (1998) Summing up evidence: one answer is not always enough. Lancet 351, 123–127.
Light R.J. & Pillemer D.B. (1984) Summing Up: the science of reviewing research. Harvard University Press, Cambridge, MA.
Midgette A.S., Wong J.B., Beshansky J.R., Porath A., Fleming C. & Pauker S.G. (1994) Cost-effectiveness of streptokinase for acute myocardial-infarction – a combined metaanalysis and decision-analysis of the effects of infarct location and of likelihood of infarction. Medical Decision Making 14, 108–117.
Moher D., Cook D.J., Eastwood S., Olkin I., Rennie D. & Stroup D. for the QUORUM Group (1999b) Improving the quality of reporting of meta-analysis of randomised controlled trials: the QUORUM statement. Lancet 354, 1896–1900.

Moher D., Jadad A.R., Nichol G., Penman M., Tugwell P. & Walsh S. (1995) Assessing the quality of randomized controlled trials – an annotated bibliography of scales and checklists. Controlled Clinical Trials 12, 62–73.
Moher D., Klassen T.P., Jadad A.R., Tugwell P., Moher M. & Jones A.L. (1999a) Assessing the quality of randomised controlled trials: implications for the conduct of meta-analyses. Health Technology Assessment 3(12), 1–98.
O’Hagan A. (1988) Probability: methods and measurement. Chapman & Hall, London.
Oxman A.D. (1996) The Cochrane Collaboration Handbook: preparing and maintaining systematic reviews, 2nd edn. Cochrane Collaboration, Oxford.
Parker S.G., Peet S.M., McPherson A., Cannaby A.M., Baker R., Wilson A., Lindesay J., Parker G., Abrams K.R. & Jones D.R. (2001) A systematic review of discharge arrangements for older people. Health Technology Assessment (in press).
Peto R. (1987) Why do we need systematic overviews of randomised trials? Statistics in Medicine 6, 233–240.
Prevost T.C., Abrams K.R. & Jones D.R. (2000) Hierarchical models in generalized synthesis of evidence: an example based on studies of breast cancer screening. Statistics in Medicine 19, 3359–3376.
Roberts K.A., Jones D.R., Abrams K.R., Dixon-Woods M. & Fitzpatrick R. (1998) Meta-analysis of qualitative and quantitative evidence: an example based on studies of patient satisfaction. Technical Report 98–01, University of Leicester: Department of Epidemiology and Public Health, Leicester.
Rubin D. (1992) A new perspective. In The Future of Meta-Analysis (eds K.W. Wachter & M.L. Straf), pp. 155–165. Russell Sage Foundation, New York.
Schmid C.H., Lau J., McIntosh M.W. & Cappelleri J.C. (1998) An empirical study of the effect of the control rate as a predictor of treatment efficacy in meta-analysis of clinical trials. Statistics in Medicine 17, 1923–1942.
Schulz K.F., Chalmers I., Hayes R.J. & Altman D.G. (1995) Empirical evidence of bias: dimensions of methodological quality associated with estimates of treatment effects in controlled trials. Journal of the American Medical Association 273, 408–412.
Senn S., Sharp S., Thompson S. & Altman D. (1996) Relation between treatment benefit and underlying risk in meta-analysis. British Medical Journal 313, 1550–1551.
Shadish W.R. & Haddock C.K. (1994) Combining estimates of effect size. In The Handbook of Research Synthesis (eds H. Cooper & L.V. Hedges), pp. 261–284. Russell Sage Foundation, New York.
Song F., Eastwood A., Gilbody S., Duley L. & Sutton A.J. (2000) Publication and other selection biases in systematic reviews. Health Technology Assessment 4(10), 1–115.
Spiegelhalter D.J., Miles J.P., Jones D.R. & Abrams K.R. (2000a) Bayesian methods in health technology assessment. Health Technology Assessment 4(38), 1–142.
Spiegelhalter D.J., Thomas A. & Best N.G. (2000b) WinBUGS, version 1.2. User manual. MRC Biostatistics Unit, Cambridge.
Sterne J.A.C., Egger M. & Sutton A.J. (2001) Meta-analysis software. In Systematic Reviews in Health Care: meta-analysis in context, 2nd edn (eds M. Egger, G. Davey Smith & D.G. Altman), pp. 336–346. BMJ Books, London.
Stewart L.A. & Clarke M.J. (1995) Practical methodology of meta-analyses (overviews) using updated individual patient data. Cochrane Working Group on Statistical Medicine 14, 2057–2079.
Sutton A.J., Abrams K.R., Jones D.R., Sheldon T.A. & Song F. (1998) Systematic reviews of trials and other studies. Health Technology Assessment 2(19), 1–310.
Sutton A.J., Abrams K.R., Jones D.R., Sheldon T.A. & Song F. (2000c) Methods for Meta-Analysis in Medical Research. John Wiley, London.
Sutton A.J., Duval S.J., Tweedie R.L., Abrams K.R. & Jones D.R. (2000a) Empirical assessment of effect of publication bias on meta-analyses. British Medical Journal 320, 1574–1577.
Sutton A.J., Lambert P.C., Hellmich M., Abrams K.R. & Jones D.R. (2000b) Meta-analysis in practice: a critical review of available software. In Meta-Analysis in Medicine and Health Policy (eds D.A. Berry & D.K. Stangl). Marcel Dekker, New York.
Thompson S.G. (1993) Controversies in meta-analysis: the case of the trials of serum cholesterol reduction. Statistical Methods in Medical Research 2, 173–192.
Tweedie R.L. & Mengersen K.L. (1995) Meta-analytic approaches to dose–response relationships, with application in studies of lung cancer and exposure to environmental tobacco smoke. Statistics in Medicine 14, 545–569.
Walter S.D. (1997) Variation in baseline risk as an explanation of heterogeneity in meta-analysis. Statistics in Medicine 16, 2883–2900.
Whitehead A. & Jones N.M.B. (1994) A meta-analysis of clinical trials involving different classifications of response into ordered categories. Statistics in Medicine 13, 2503–2515.




148                                                      © 2001 Blackwell Science, Journal of Evaluation in Clinical Practice, 7, 2, 135–148

Weitere ähnliche Inhalte

Was ist angesagt?

Lancaster design and analysis of pilot studies
Lancaster design and analysis of pilot studiesLancaster design and analysis of pilot studies
Lancaster design and analysis of pilot studiesnoorafifah
 
Meta analysis ppt
Meta analysis pptMeta analysis ppt
Meta analysis pptSKVA
 
演講-Meta analysis in medical research-張偉豪
演講-Meta analysis in medical research-張偉豪演講-Meta analysis in medical research-張偉豪
演講-Meta analysis in medical research-張偉豪Beckett Hsieh
 
Statistical methods for cardiovascular researchers
Statistical methods for cardiovascular researchersStatistical methods for cardiovascular researchers
Statistical methods for cardiovascular researchersRamachandra Barik
 
Evaluating the Quality of Trauma Care A Literature Review
Evaluating the Quality of Trauma Care A Literature ReviewEvaluating the Quality of Trauma Care A Literature Review
Evaluating the Quality of Trauma Care A Literature ReviewKaylie Butt
 
Meta analysis
Meta analysisMeta analysis
Meta analysisJunaidAKG
 
Meta analysis techniques in epidemiology
Meta analysis techniques in epidemiologyMeta analysis techniques in epidemiology
Meta analysis techniques in epidemiologyBhoj Raj Singh
 
Critical appraisal of meta-analysis
Critical appraisal of meta-analysisCritical appraisal of meta-analysis
Critical appraisal of meta-analysisSamir Haffar
 
Meta analysis: Made Easy with Example from RevMan
Meta analysis: Made Easy with Example from RevManMeta analysis: Made Easy with Example from RevMan
Meta analysis: Made Easy with Example from RevManGaurav Kamboj
 
9-Meta Analysis/ Systematic Review
9-Meta Analysis/ Systematic Review9-Meta Analysis/ Systematic Review
9-Meta Analysis/ Systematic ReviewResearchGuru
 
Anatomy of a meta analysis i like
Anatomy of a meta analysis i likeAnatomy of a meta analysis i like
Anatomy of a meta analysis i likeJames Coyne
 
Meta-analysis and systematic reviews
Meta-analysis and systematic reviews Meta-analysis and systematic reviews
Meta-analysis and systematic reviews coolboy101pk
 

Was ist angesagt? (19)

Metaanalysis copy
Metaanalysis    copyMetaanalysis    copy
Metaanalysis copy
 
Meta analysis
Meta analysisMeta analysis
Meta analysis
 
Lancaster design and analysis of pilot studies
Lancaster design and analysis of pilot studiesLancaster design and analysis of pilot studies
Lancaster design and analysis of pilot studies
 
Meta analysis
Meta analysisMeta analysis
Meta analysis
 
Meta analysis ppt
Meta analysis pptMeta analysis ppt
Meta analysis ppt
 
演講-Meta analysis in medical research-張偉豪
演講-Meta analysis in medical research-張偉豪演講-Meta analysis in medical research-張偉豪
演講-Meta analysis in medical research-張偉豪
 
Statistical methods for cardiovascular researchers
Statistical methods for cardiovascular researchersStatistical methods for cardiovascular researchers
Statistical methods for cardiovascular researchers
 
Evaluating the Quality of Trauma Care A Literature Review
Evaluating the Quality of Trauma Care A Literature ReviewEvaluating the Quality of Trauma Care A Literature Review
Evaluating the Quality of Trauma Care A Literature Review
 
Seminar in Meta-analysis
Seminar in Meta-analysisSeminar in Meta-analysis
Seminar in Meta-analysis
 
Meta analysis
Meta analysisMeta analysis
Meta analysis
 
Meta analysis techniques in epidemiology
Meta analysis techniques in epidemiologyMeta analysis techniques in epidemiology
Meta analysis techniques in epidemiology
 
Critical appraisal of meta-analysis
Critical appraisal of meta-analysisCritical appraisal of meta-analysis
Critical appraisal of meta-analysis
 
Brief overview on meta analysis
Brief overview  on meta analysisBrief overview  on meta analysis
Brief overview on meta analysis
 
Meta analysis: Made Easy with Example from RevMan
Meta analysis: Made Easy with Example from RevManMeta analysis: Made Easy with Example from RevMan
Meta analysis: Made Easy with Example from RevMan
 
9-Meta Analysis/ Systematic Review
9-Meta Analysis/ Systematic Review9-Meta Analysis/ Systematic Review
9-Meta Analysis/ Systematic Review
 
Anatomy of a meta analysis i like
Anatomy of a meta analysis i likeAnatomy of a meta analysis i like
Anatomy of a meta analysis i like
 
Meta analysis_Sharanbasappa
Meta analysis_SharanbasappaMeta analysis_Sharanbasappa
Meta analysis_Sharanbasappa
 
Meta-analysis and systematic reviews
Meta-analysis and systematic reviews Meta-analysis and systematic reviews
Meta-analysis and systematic reviews
 
Introduction to meta analysis
Introduction to meta analysisIntroduction to meta analysis
Introduction to meta analysis
 

Ähnlich wie An illustrated guide to the methods of meta analysi

Automated Extraction Of Reported Statistical Analyses Towards A Logical Repr...
Automated Extraction Of Reported Statistical Analyses  Towards A Logical Repr...Automated Extraction Of Reported Statistical Analyses  Towards A Logical Repr...
Automated Extraction Of Reported Statistical Analyses Towards A Logical Repr...Nat Rice
 
EJ1241940.pdf
EJ1241940.pdfEJ1241940.pdf
EJ1241940.pdfDeheMail
 
How to structure your table for systematic review and meta analysis – Pubrica
How to structure your table for systematic review and meta analysis – PubricaHow to structure your table for systematic review and meta analysis – Pubrica
How to structure your table for systematic review and meta analysis – PubricaPubrica
 
Systematic reviews and meta-analyses in the medical sciences
Systematic reviews and meta-analyses in the medical sciencesSystematic reviews and meta-analyses in the medical sciences
Systematic reviews and meta-analyses in the medical sciencesPubrica
 
Systematic reviews and meta-analyses in the medical sciences
Systematic reviews and meta-analyses in the medical sciencesSystematic reviews and meta-analyses in the medical sciences
Systematic reviews and meta-analyses in the medical sciencesPubrica
 
A practical guide to do primary research on meta analysis methodology - Pubrica
A practical guide to do primary research on meta analysis methodology - PubricaA practical guide to do primary research on meta analysis methodology - Pubrica
A practical guide to do primary research on meta analysis methodology - PubricaPubrica
 
How Randomized Controlled Trials are Used in Meta-Analysis
How Randomized Controlled Trials are Used in Meta-Analysis How Randomized Controlled Trials are Used in Meta-Analysis
How Randomized Controlled Trials are Used in Meta-Analysis Pubrica
 
Conduct title screening for systemic review- using Endnote Covidence – Pubric...
Conduct title screening for systemic review- using Endnote Covidence – Pubric...Conduct title screening for systemic review- using Endnote Covidence – Pubric...
Conduct title screening for systemic review- using Endnote Covidence – Pubric...Pubrica
 
A research study Writing a Systematic Review in Clinical Research – Pubrica
A research study Writing a Systematic Review in Clinical Research – PubricaA research study Writing a Systematic Review in Clinical Research – Pubrica
A research study Writing a Systematic Review in Clinical Research – PubricaPubrica
 
A research study writing a systematic review in clinical research – pubrica
A research study writing a systematic review in clinical research – pubricaA research study writing a systematic review in clinical research – pubrica
A research study writing a systematic review in clinical research – pubricaPubrica
 
A practical guide to do primary research on meta analysis methodology - Pubrica
A practical guide to do primary research on meta analysis methodology - PubricaA practical guide to do primary research on meta analysis methodology - Pubrica
A practical guide to do primary research on meta analysis methodology - PubricaPubrica
 
A Guide to Conducting a Meta-Analysis.pdf
A Guide to Conducting a Meta-Analysis.pdfA Guide to Conducting a Meta-Analysis.pdf
A Guide to Conducting a Meta-Analysis.pdfTina Gabel
 
Systematic review and meta analysis methodology
Systematic review and meta analysis methodologySystematic review and meta analysis methodology
Systematic review and meta analysis methodologyRajni Rathore
 
Systematic review article and Meta-analysis: Main steps for Successful writin...
Systematic review article and Meta-analysis: Main steps for Successful writin...Systematic review article and Meta-analysis: Main steps for Successful writin...
Systematic review article and Meta-analysis: Main steps for Successful writin...Pubrica
 
Application Of Statistical Process Control In Manufacturing Company To Unders...
Application Of Statistical Process Control In Manufacturing Company To Unders...Application Of Statistical Process Control In Manufacturing Company To Unders...
Application Of Statistical Process Control In Manufacturing Company To Unders...Martha Brown
 
Guide for conducting meta analysis in health research
Guide for conducting meta analysis in health researchGuide for conducting meta analysis in health research
Guide for conducting meta analysis in health researchYogitha P
 
Implementation Of Electronic Medical Records In Hospitals Two Case Studies
Implementation Of Electronic Medical Records In Hospitals  Two Case StudiesImplementation Of Electronic Medical Records In Hospitals  Two Case Studies
Implementation Of Electronic Medical Records In Hospitals Two Case StudiesMichelle Singh
 
Systematic review and meta analysis
Systematic review and meta analysisSystematic review and meta analysis
Systematic review and meta analysisumaisashraf
 

An illustrated guide to the methods of meta analysi

  • 1. Journal of Evaluation in Clinical Practice, 7, 2, 135–148

An illustrated guide to the methods of meta-analysis

Alexander J. Sutton BSc MSc¹, Keith R. Abrams BSc MSc PhD² and David R. Jones BA MSc PhD CStat CMath DipTCDHE³
¹ Lecturer in Medical Statistics, Department of Epidemiology and Public Health, University of Leicester, UK
² Reader in Medical Statistics, Department of Epidemiology and Public Health, University of Leicester, UK
³ Professor of Medical Statistics, Department of Epidemiology and Public Health, University of Leicester, UK

Correspondence: Mr Alex J Sutton, Department of Epidemiology and Public Health, University of Leicester, 22-28 Princess Road West, Leicester LE1 6TP, UK

Keywords: Bayesian methods, hospital discharge, meta-analysis, methods, re-admission, review

Accepted for publication: 22 July 2000

Abstract

Meta-analysis is now accepted as a necessary tool for the evaluation of health care. Such analyses have been carried out in virtually every area of medicine to evaluate a wide spectrum of health care interventions and policies. This paper has three broad aims: (1) to describe the basic principles of meta-analysis, using a meta-analysis of interventions intended to reduce hospital re-admission rates for illustration; (2) to consider threats to the internal validity of meta-analysis, and the measures which can be taken to minimize their impact; and (3) to present an overview of more specialist and developing methods for synthesizing data, with the intention of outlining the directions meta-analysis may take in the future. The methods used to synthesize studies, which take ‘weighted averages’ of effect sizes, have been refined to a high degree, while the methods for dealing with threats to the validity of meta-analyses, such as publication bias and variations in quality of the primary studies, are at a less advanced stage. However, many consider this standard ‘weighted average’ approach to meta-analysis not to be ‘state of the art’ in at least some situations, where the use of more sophisticated methods, generally to explain variation in estimates from different studies and synthesize a broader base of evidence, would be advantageous. Currently, approaches which attempt to do this are mainly still in the experimental stage and, unfortunately, ideas which sound natural and appealing are often difficult to implement in practice. Clearly, it will be some time before they are used routinely, but significant steps have been made.

1 Introduction

Meta-analysis is now accepted as a necessary tool for the evaluation of health care. Such analyses have been carried out in virtually every area of medicine, to evaluate a wide spectrum of health-care interventions and policies. The primary aim of many meta-analyses is to produce a more accurate estimate of the effect of a particular intervention, or group of interventions, than is possible using only a single study. Since different studies are carried out using different populations, different designs and a whole range of other study-specific factors, it has been suggested that combining them will produce an estimate that has broader generalizability than any single study. Additionally, it may be possible to explain the differences between results from individual studies by carrying out a meta-analysis. Such an assessment may even provide further insight into the intervention, and develop our understanding of how it works.

© 2001 Blackwell Science
  • 2. Concurrent with the explosion in the use of meta-analysis is the continued development and refinement of the methods used to carry out such analyses. This is an important endeavour, because the science of meta-analysis is still in its infancy, and in the past over-simplistic methods have led to misleading conclusions (Hunt 1997). A systematic review of methodology for meta-analysis carried out by the authors (Sutton et al. 1998) informed the writing of this paper, and is recommended further reading for more technical details on the material presented here. The reader should note, however, that several important developments which are noted here have been published in the short time since the review was written, confirming the speed with which this field continues to develop.

This paper has three broad aims: (1) to describe the basic principles of meta-analysis using a worked example; (2) to consider the threats to the validity of meta-analysis and the measures which can be taken to minimize their impact; and (3) to present an overview of more specialist and developing methods, with the intention of outlining the directions meta-analysis may take in the future.

The term ‘meta-analysis’ is used to describe different aspects of research synthesis by different people. In some contexts it is used to indicate the whole review process, including aspects such as literature searching and data extraction, as well as the statistical combination of quantitative results. We prefer to use the term ‘systematic review’ to indicate the whole review process, restricting the term ‘meta-analysis’ to describe the synthesis of quantitative data from multiple studies. Although many recent advances in pre-synthesis review methods have been made, such as the development of sophisticated searching methods (Sutton et al. 1998; Dickersin et al. 1994), this paper focuses solely on aspects of quantitative data synthesis, or meta-analysis. [Note: very often a systematic review will include a meta-analysis; however, if no quantitative data are available from the primary reports, or those which are available are deemed too heterogeneous to be meaningfully combined, then only a narrative description of the studies may be carried out (Sutton et al. 1998).] Guidelines for good practice for the pre-synthesis aspects of systematic reviews have been described comprehensively elsewhere (Deeks et al. 1996; Oxman 1996). Very importantly, a recent initiative has produced a checklist addressing the quality of reporting of meta-analyses (QUORUM) (Moher et al. 1999b). This statement is in the same spirit as the CONSORT statement for reporting randomized clinical trials (RCTs) (Begg et al. 1996) and is recommended reading for those preparing reports of meta-analyses of RCTs.

2 The synthesis of estimates of effectiveness from multiple primary studies

This section focuses on pooling results from a number of studies investigating the relative effectiveness of an intervention. Often, meta-analyses of this sort include only RCTs, typically with two arms: one arm receiving experimental treatment and the other control, placebo or standard treatment. (The issues of variable study quality and of synthesizing studies with different designs are considered in sections 3 and 4, respectively.) Data from a meta-analysis of interventions intended to improve the process of hospital discharge of older people, published elsewhere (Parker et al. 2001), are used to illustrate the methods. Thirty two-arm RCTs are included in the meta-analysis, and the outcome focused on here is the re-admission rate to hospital following discharge. In the remainder of this section the principal ideas involved in performing a meta-analysis are explained and, where possible, the calculations required are reproduced to aid understanding. In practice, the use of computer software greatly facilitates the analyses required. The meta-analysis capabilities of many common statistical analysis packages are limited; however, much specialist software has been developed recently (Sutton et al. 2000b; Sterne et al. 2001).

Calculation of an effect size for each study

Broadly speaking, quantitative outcomes from any study can be classified as belonging to one of three data types: (i) binary, often indicating the presence or absence of the event of interest in each patient; (ii) continuous, where outcome is measured on a continuous scale, e.g. change in blood pressure; or (iii) ordinal, where outcome is measured on an ordered categorical scale, e.g. a disease severity scale, where a patient can be classified as belonging to one of several distinct categories.
  • 3. The approaches used to combine either binary or continuous outcomes are often similar, while ordinal data are somewhat more complex and require specialist methods, discussed elsewhere (Whitehead & Jones 1994).

Table 1 provides a sample of the data extracted from reports of 30 RCTs to be included in the meta-analysis (for a list of references for these RCTs see the original report (Parker et al. 2001); the numbers used to identify these RCTs in that report are provided here in the final column of Table 1). Columns three and six provide the number of patients randomized to the experimental and control arms of each study, respectively. [Note: analysis should usually be calculated on the basis of intention to treat (Hollis & Campbell 1999); if the analysis in the original study report was not performed using this method it may still be possible to extract the correct figures for the purposes of the meta-analysis.] Columns four and seven indicate the number of re-admission episodes. Note that an individual can have multiple re-admissions; for example, the new intervention arm of study 8 included 142 patients, while 554 events were reported. [Note: the fact that more than one re-admission is permitted for each patient means that an individual’s outcome is not binary.] Column two indicates the length of follow-up of the studies, which ranges from 1 to 12 months; it is necessary to account for follow-up when calculating effect sizes, since the number of re-admissions may be critically dependent on the length of the observation period of the trial.

An outcome measure which takes into account length of follow-up is the re-admission rate ratio (RRR). As the name suggests, this is the ratio of the re-admission rates (per month) in the two arms. The re-admission rate (RR) in each arm is calculated by:

$$\mathrm{RR} = \frac{\text{number of re-admissions}}{\text{number of patients} \times \text{length of follow-up}}$$

For example, there are two re-admissions in 37 patients over 1.5 months in the experimental arm of trial 1, so the RR is 2/(37 × 1.5) = 0.036. [Note: more decimal places are used in the working of the calculations in this paper than are printed.] Similarly, the RR in the control group is 0.162. The outcome of interest can now be calculated by dividing the RR in the treatment arm by the RR in the control arm, 0.036/0.162, which produces an RRR of 0.222. This RRR is less than one, which indicates the re-admission rate is lower in the treatment arm, suggesting that the intervention is beneficial. In this instance the estimated effect is large (a long way from 1). The RRs for each arm are provided in columns 5 and 8, and the RRR in column 9.

Although the RRR is the measure of interest, due to theoretical statistical considerations (including improved approximate normality), a natural logarithm transformation is used (ln(RRR)) for the purpose of combining studies via a meta-analysis (Fleiss 1994). The pooled result can be back-transformed afterwards by taking the exponential of the pooled ln(RRR), $e^{\ln(\mathrm{RRR})}$, to convert the answer back to the RRR scale, allowing easier interpretation. The ln(RRR) estimates for each study are given in column 10 of Table 1.

A further value, the standard error (SE) of the ln(RRR), is required for the meta-analysis calculation. The SE gives an indication of the degree of precision to which each study estimates the effect size; a small SE indicates a precise estimate, usually from a large study. The SE for the ln(RRR) is calculated by:

$$\mathrm{SE}(\ln(\mathrm{RRR})) = \sqrt{\frac{1}{\text{number of re-admissions in experimental group}} + \frac{1}{\text{number of re-admissions in control group}}}$$

Hence, for study 1 the SE(ln(RRR)) is √(1/2 + 1/9) = 0.782. Standard errors for the remaining studies are provided in column 11 of Table 1.

It is common practice to calculate 95% confidence intervals for each study: if the trial were replicated many times, intervals constructed in this way would be expected to contain the true effect size in 95 out of every 100 replications. Hence, a 95% confidence interval provides a range in which one can be reasonably sure the true effect size lies. The formula for calculating a 95% confidence interval for a ln(RRR) is:

$$\ln(\mathrm{RRR}) \pm 1.96 \times \mathrm{SE}(\ln(\mathrm{RRR}))$$

For study 1 the ln(RRR) 95% confidence interval is given by -1.504 ± 1.96(0.782) = (-3.04 to 0.03).

  • 4. Table 1 Data and calculations for the hospital re-admissions meta-analysis. Columns 3 to 5 refer to the experimental group and columns 6 to 8 to the control group.

| 1 Study ID | 2 Length of follow-up (months) | 3 Patients (n) | 4 Re-admissions (n) | 5 Re-admission rate | 6 Patients (n) | 7 Re-admissions (n) | 8 Re-admission rate | 9 Re-admission rate ratio (RRR) | 10 ln(RRR) | 11 SE(ln(RRR)) | 12 95% CI ln(RRR) | 13 95% CI RRR | 14 Weight | 15 Intervention administration | 16 EPOC quality measure | 17 Number used in original report* |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1.5 | 37 | 2 | 0.036 | 37 | 9 | 0.162 | 0.222 | -1.504 | 0.782 | (-3.04 to 0.03) | (0.05 to 1.03) | 1.64 | Single | 5 | 53 |
| 2 | 3 | 464 | 102 | 0.073 | 439 | 102 | 0.077 | 0.946 | -0.055 | 0.140 | (-0.33 to 0.22) | (0.72 to 1.24) | 51.00 | Single | 3 | 59 |
| 3 | 6 | 499 | 347 | 0.116 | 502 | 340 | 0.113 | 1.027 | 0.026 | 0.076 | (-0.12 to 0.18) | (0.88 to 1.19) | 171.73 | Single | 6 | 60 |
| 4 | 6 | 86 | 36 | 0.070 | 87 | 26 | 0.050 | 1.401 | 0.337 | 0.257 | (-0.17 to 0.84) | (0.85 to 2.32) | 15.10 | Single | 4 | 69 |
| 5 | 12 | 57 | 9 | 0.013 | 56 | 6 | 0.009 | 1.474 | 0.388 | 0.527 | (-0.65 to 1.42) | (0.52 to 4.14) | 3.60 | Team | 3 | 82 |
| 6 | 2 | 39 | 29 | 0.372 | 41 | 35 | 0.427 | 0.871 | -0.138 | 0.251 | (-0.63 to 0.35) | (0.53 to 1.42) | 15.86 | Single | 3 | 88 |
| 7 | 3 | 20 | 3 | 0.050 | 20 | 13 | 0.217 | 0.231 | -1.466 | 0.641 | (-2.72 to -0.21) | (0.07 to 0.81) | 2.44 | Single | 6 | 177 |
| 8 | 3 | 142 | 554 | 1.300 | 140 | 868 | 2.067 | 0.629 | -0.463 | 0.054 | (-0.57 to -0.36) | (0.57 to 0.70) | 338.16 | Team | 4 | 187 |
| 9 | 6 | 695 | 343 | 0.082 | 701 | 310 | 0.074 | 1.116 | 0.110 | 0.078 | (-0.04 to 0.26) | (0.96 to 1.30) | 162.83 | Team | 6 | 222 |
| 10 | 2 | 178 | 43 | 0.121 | 176 | 37 | 0.105 | 1.149 | 0.139 | 0.224 | (-0.30 to 0.58) | (0.74 to 1.78) | 19.89 | Single | 2 | 228 |
| 11 | 6 | 30 | 9 | 0.050 | 30 | 6 | 0.033 | 1.500 | 0.405 | 0.527 | (-0.63 to 1.44) | (0.53 to 4.21) | 3.60 | Team | 5 | 231 |
| 12 | 6 | 96 | 42 | 0.073 | 97 | 62 | 0.107 | 0.684 | -0.379 | 0.200 | (-0.77 to 0.01) | (0.46 to 1.01) | 25.04 | Team | 3 | 236 |
| 13 | 3 | 303 | 104 | 0.114 | 300 | 109 | 0.121 | 0.945 | -0.057 | 0.137 | (-0.33 to 0.21) | (0.72 to 1.24) | 53.22 | Team | 4 | 275 |
| 14 | 6 | 150 | 51 | 0.057 | 99 | 32 | 0.054 | 1.052 | 0.051 | 0.226 | (-0.39 to 0.49) | (0.68 to 1.64) | 19.66 | Team | 4 | 283 |
| 15 | 1 | 20 | 4 | 0.200 | 20 | 6 | 0.300 | 0.667 | -0.405 | 0.645 | (-1.67 to 0.86) | (0.19 to 2.36) | 2.40 | Team | 1 | 312 |
| 16 | 1.5 | 29 | 4 | 0.092 | 25 | 9 | 0.240 | 0.383 | -0.959 | 0.601 | (-2.14 to 0.22) | (0.12 to 1.24) | 2.77 | Single | 4 | 334 |
| 17 | 12 | 333 | 396 | 0.099 | 335 | 410 | 0.102 | 0.972 | -0.029 | 0.070 | (-0.17 to 0.11) | (0.85 to 1.12) | 201.44 | Single | 3 | 339 |
| 18 | 3 | 140 | 18 | 0.043 | 136 | 16 | 0.039 | 1.093 | 0.089 | 0.344 | (-0.58 to 0.76) | (0.56 to 2.14) | 8.47 | Single | 4 | 351 |
| 19 | 9 | 418 | 495 | 0.132 | 417 | 549 | 0.146 | 0.899 | -0.106 | 0.062 | (-0.23 to 0.02) | (0.80 to 1.02) | 260.31 | Single | 3 | 397 |
| 20 | 6 | 62 | 21 | 0.056 | 58 | 35 | 0.101 | 0.561 | -0.578 | 0.276 | (-1.12 to -0.04) | (0.33 to 0.96) | 13.13 | Team | 4 | 403 |
| 21 | 12 | 199 | 107 | 0.045 | 205 | 111 | 0.045 | 0.993 | -0.007 | 0.135 | (-0.2 to 0.26) | (0.76 to 1.30) | 54.48 | Team | 3 | 416 |
| 22 | 12 | 63 | 22 | 0.029 | 60 | 30 | 0.042 | 0.698 | -0.359 | 0.281 | (-0.91 to 0.19) | (0.40 to 1.21) | 12.69 | Team | 4 | 691 |
| 23 | 6 | 35 | 10 | 0.048 | 40 | 51 | 0.213 | 0.224 | -1.496 | 0.346 | (-2.17 to -0.82) | (0.11 to 0.44) | 8.36 | Single | 4 | 1793 |
| 24 | 6 | 102 | 49 | 0.080 | 102 | 51 | 0.083 | 0.961 | -0.040 | 0.200 | (-0.43 to 0.35) | (0.65 to 1.42) | 24.99 | Single | 7 | 1796 |
| 25 | 6 | 140 | 24 | 0.029 | 97 | 29 | 0.050 | 0.573 | -0.556 | 0.276 | (-1.10 to -0.02) | (0.33 to 0.98) | 13.13 | Team | 3.5 | 2211 |
| 26 | 3 | 45 | 5 | 0.037 | 46 | 5 | 0.036 | 1.022 | 0.022 | 0.632 | (-1.22 to 1.26) | (0.30 to 3.53) | 2.50 | Team | 6 | 2229 |
| 27 | 4 | 49 | 11 | 0.056 | 51 | 7 | 0.034 | 1.636 | 0.492 | 0.483 | (-0.46 to 1.44) | (0.63 to 4.22) | 4.28 | Single | 3.5 | 2657 |
| 28 | 6 | 177 | 49 | 0.046 | 186 | 107 | 0.096 | 0.481 | -0.731 | 0.172 | (-1.07 to -0.39) | (0.34 to 0.67) | 33.61 | Single | 3 | 3632 |
| 29 | 3 | 381 | 154 | 0.135 | 381 | 197 | 0.172 | 0.782 | -0.246 | 0.108 | (-0.46 to -0.04) | (0.63 to 0.97) | 86.43 | Team | 4 | 3636 |
| 30 | 2 | 96 | 22 | 0.019 | 110 | 43 | 0.033 | 0.586 | -0.534 | 0.262 | (-1.05 to -0.02) | (0.35 to 0.98) | 14.55 | Single | 6 | 4460 |

*Parker et al. 2000. n = number.

  • 5. Figure 1 Forest plot of 30 RCTs examining the effect on re-admission rates of interventions aimed at modifying the hospital discharge process for elderly people.

Confidence intervals for RRR are obtained by taking the exponential of the ln(RRR) interval; hence, the RRR 95% confidence interval for study 1 is (0.05–1.03). This interval includes 1, which indicates that on its own the trial is inconclusive, because both beneficial and harmful effect size estimates are included in the interval and are in some sense plausible. This highlights the need to consider the precision of the estimate; the study estimated a very large treatment effect, but did so very imprecisely; the true effect could be much smaller (or larger) than the point estimate. The 95% confidence intervals for ln(RRR) and RRR for the remaining studies are provided in columns 12 and 13, respectively.

To aid examination of the results of the individual studies, these intervals can be plotted on the same axis, as in Fig. 1. The RRR estimate for each study is plotted, with the size of the plotting symbol proportional to the precision of the estimate. The 95% confidence interval for each RRR estimate is also plotted (the more precise estimates having the narrower confidence intervals); other features of this figure will be explained in due course. This plot highlights the variability in the estimates and in the precisions between studies. The issue of variability between estimates from individual studies is considered further in later sections.

Combining effect sizes – calculating weighted averages

The previous section illustrated how an RRR estimate and corresponding standard error could be calculated from summary data extracted from individual study reports. In other instances different effect measures may be more appropriate, but the general principle that an estimate and SE are required from each study remains. When outcomes are reported on a binary scale, the odds ratio, risk ratio or risk difference measures are commonly used, while outcomes measured on a continuous scale can be combined directly, or standardized if different scales of measurement have been used in the individual studies. Descriptions and formulae for each of these outcome measures and others are available elsewhere (Fleiss 1993; Sutton et al. 2000c).

The simplest way to combine estimates is to average them. Since different studies estimate the true effect size with varying degrees of precision, a weighted average is used. The weight given to each study in the re-admissions meta-analysis is calculated by:

$$\text{weight} = \frac{1}{\mathrm{SE}(\ln(\mathrm{RRR}))^{2}}$$
  • 6. The square of the standard error is known as the variance, so combining studies using this weighting is often called the inverse variance-weighted method (Fleiss 1993). The weightings for each study are provided in Table 1, column 14. If an effect measure other than the RRR is being used, then the weightings are calculated by the same principle, using the inverse of the variance of that effect measure.

Once the weight for each study has been calculated, a pooled estimate of ln(RRR) is calculated by multiplying each study’s weight by its ln(RRR), summing the resulting values, and then dividing this value by the sum of the weights. Using figures from Table 1, the outline calculation for the re-admissions data is:

$$\ln(\text{pooled RRR}) = \frac{(1.64 \times (-1.504)) + \dots + (14.55 \times (-0.534))}{1.64 + \dots + 14.55} = -0.164$$

The variance for ln(pooled RRR) (or any other effect measure used) is then calculated by taking the reciprocal of the sum of the weights (1/sum of weights):

$$\mathrm{var}(\ln(\text{pooled RRR})) = \frac{1}{1.64 + \dots + 14.55} = 0.0006$$

Using these figures, an approximate 95% confidence interval for the pooled estimate can be calculated in the same manner as confidence intervals were produced for the individual study estimates above. The pooled estimate of RRR for the re-admissions dataset is 0.85 with 95% CI (0.81–0.89), indicating a modest treatment benefit which is statistically significant at the 5% level. This estimate is plotted using a diamond shape in Fig. 1 directly below the 30 individual studies. Figure 1 is often called a forest plot and is commonly used to display the results of a meta-analysis.

This approach is often known as a fixed-effect approach, to distinguish it from the random-effect models described below. It can be used to combine outcomes on any scale; however, other related fixed-effect methods specifically for combining odds ratios also exist (Fleiss 1993; Sutton et al. 2000c). These fixed-effect methods all make the strong assumption that each study is estimating the same underlying treatment effect. Many people feel that in medical and related research such an assumption is unrealistic (Thompson 1993) because studies are never identical replications of one another, and study design and conduct differences will inevitably have some degree of influence on study outcome. Models which account for underlying variability in the treatment effect estimates are considered in the next section.

Heterogeneity and random effect models

When performing a meta-analysis, although the overall aim may be to produce a pooled estimate of treatment effect, it is crucial to assess the variation between results of the primary studies and, if possible, to investigate why they differ. Clearly, it would be remarkable if all studies being meta-analysed produced exactly the same treatment effect estimate. Some variation in results is expected, due simply to the play of chance; this is often called random variation. However, if effect size estimates vary between studies to a greater extent than expected on the basis of chance alone, the studies are considered to be heterogeneous, and it is necessary to account for the extra variation, above that expected by chance, in the meta-analysis model. The way this is usually performed is through the use of a random-effect model. Essentially, this relaxes the assumption that each study is estimating exactly the same underlying treatment effect, and instead assumes that the underlying effect sizes are drawn from a distribution of effect sizes. This distribution is usually assumed to be Normal, with a variance determined by the data. In practical terms, accounting for between-study heterogeneity in this way produces a pooled point estimate which is often (but not always) similar to the one produced by fixed-effect methods. However, taking into account between-study heterogeneity produces a wider 95% confidence interval, so the estimate is more conservative.

The whole issue of appropriateness and suitability of fixed- and random-effect models for meta-analysis has been much discussed (Thompson 1993; Peto 1987). A test for heterogeneity exists (Fleiss 1993), and the result of this test can then be used to inform model choice. If it is non-significant a fixed-effect model is to be used, and if it is significant a random-effect model should be used. This seemingly sensible
approach has a flaw because the test has low power. This implies that heterogeneity may exist even when it produces a non-significant result (Boissel et al. 1989). An alternative approach is to always use a random-effect model. The inflation of the confidence interval is dictated by the degree of variation between studies, so when between-study variation is small the inflation will be negligible, producing a result which would be very similar to the fixed-effect approach.

A detailed description of the random-effect meta-analysis model is beyond the scope of this paper, but clear accounts are given elsewhere (DerSimonian & Laird 1986; Shadish & Haddock 1994). Combining the 30 studies evaluating interventions to prevent re-admission using a random-effect model produces an RRR of 0.83 (0.73–0.93). This estimate is plotted below the fixed-effect one in Fig. 1. The estimate of the between-study variance is 0.057, which is quite small but non-negligible (the test for between-study heterogeneity is highly significant (P < 0.001)). Accounting for this heterogeneity has produced a wider confidence interval compared to the fixed-effect approach, which is a typical finding. Modifications to the way the parameters in a random-effect meta-analysis model are calculated have been developed (Hardy & Thompson 1996; Biggerstaff & Tweedie 1997). One of these should be used if the number of studies in the meta-analysis is small (approximately less than 10) as it overcomes problems with a previous simplification in the model calculations, which can be important in meta-analyses of small numbers of studies.

A final point concerning between-study heterogeneity is that there is little explicit guidance to offer regarding the point at which study estimates should not be pooled at all because heterogeneity is deemed too great, but alternative approaches are discussed below.

Exploring and explaining heterogeneity

Until now, the impression has been given that heterogeneity is a nuisance factor which needs accounting for when performing a meta-analysis. However, investigating why between-study variation exists offers the meta-analyst unique opportunities. More desirable than using random-effect models to allow for heterogeneity is to try to explain the heterogeneity. This may lead to the identification of associations between study or patient characteristics and the outcome measure, which would not have been possible in single studies. This may lead in turn to clinically important findings and may eventually assist in individualizing treatment regimes (Lau et al. 1998). Both subgroup analyses and regression methods can be used to do this.

Potential study level factors, pertaining to either study design or patient characteristics, which could affect study results should ideally be identified before a meta-analysis is conducted. If this is carried out, the data on these factors can then be obtained at the data extraction stage of a review, and such explicit a priori specification also reduces the temptation of 'data dredging'.

Returning to the re-admission dataset, one potential factor which could affect results is whether the intervention was administered by a team or an individual. This information is given for each study in column 15 of Table 1. In 16 of the studies the intervention was administered by an individual and in 14 it was administered by a team. Separate meta-analyses can be performed for these two subgroups in an attempt to see if the effectiveness of the intervention depends on whether an individual or team implements it, and whether between-study heterogeneity is reduced in the subgroups. Pooled estimates for these subgroups turn out to be almost identical. The subgroup in which the intervention was administered by an individual has an RRR of 0.83 (0.70–0.97) and an estimate of between-study heterogeneity of 0.056 (test for heterogeneity highly significant at P < 0.001). For the studies where the intervention was administered by a team the RRR was 0.83 (0.69–0.99) and the estimate of between-study heterogeneity 0.062 (test for heterogeneity highly significant at P < 0.001). Hence, it would appear that whether the intervention is administered by an individual or a team makes very little difference to the effectiveness of the intervention and, hence, does not explain any of the variation between study results.

If the factor of interest is measured on a continuous scale, or dummy indicator variables are created for the levels of categorical factors, then meta-regression can be used to explore their impact. Meta-regression models are very similar in principle to ordinary simple linear regression models, the main differences being that individual observations (the primary studies), unlike individual patients, are not given equal weight in the analysis (i.e. each study should be weighted according to its precision). Additionally, it may be desirable to include a random-effect term to account for residual heterogeneity not explained by the covariate(s); such a model can be thought of as an extension to the random-effect model described above (Berkey et al. 1995). An example of a meta-regression analysis is given in section 3.

Meta-regression techniques are currently used relatively rarely, and the authors believe not to their full potential, but examples are emerging (Freemantle et al. 1999; von Dadelszen et al. 2000). Although a powerful tool, they do have their limitations. Regression analyses of this type are also susceptible to aggregation bias, which occurs if the relation between patient characteristic study means and outcomes does not directly reflect the relation between individuals' values and individuals' outcomes (Greenland 1987). Additionally, meta-regression type analyses are often limited by the number of studies included in the meta-analysis. Special regression models have also been developed to explore the effect of patients' underlying risk on intervention effect (Senn et al. 1996; Walter 1997) which are necessary to avoid producing incorrect results when exploring the effect of such a factor (Schmid et al. 1998; Senn et al. 1996).

3 Threats to the validity of a meta-analysis

Although meta-analyses are often considered to provide the highest grade of evidence available regarding the effectiveness of an intervention, higher than an individual trial, it should not be forgotten that they are a type of observational study, and as such are open to biases which may threaten their validity. Perhaps the two most serious problems which can potentially lead to biased estimates are publication bias and variable study quality of the primary studies. These two issues are considered further below.

Publication and related biases

Publication bias exists because research with statistically significant or interesting results is potentially more likely to be submitted, published or published more rapidly than work with null or non-significant results (Song et al. 2000). When only the published literature is included in a meta-analysis, this can potentially lead to biased over-optimistic conclusions. Related biases which can also bias the results of a meta-analysis include (i) pipeline bias, when significant results are published more quickly than non-significant ones; and (ii) language bias, when researchers whose native tongue is not English are more likely to publish their non-significant results in non-English-language journals, but are more likely to publish their significant results in English. If this happens, a meta-analysis including only study reports in English may be based on a biased collection of studies. Perhaps an appropriate term which includes all these sources of bias is 'dissemination bias' (Song et al. 2000).

Long-term initiatives to alleviate the problem of publication bias have commenced, including trial amnesties (Horton 1997) to encourage publication of previously unpublished trials, and the creation of registries for prospective registration of trials (Horton & Smith 1999). However, the issue is currently still a big concern for researchers carrying out meta-analyses. There are certain measures which can be taken to assess the presence and minimize the impact of publication bias in a meta-analysis dataset. Currently, however, there is much debate, and some dispute as to the approach researchers should take to deal with publication bias in meta-analyses.

The presence of publication bias in a meta-analysis dataset can be assessed informally by inspection of a funnel plot (Light & Pillemer 1984). This plots the effect size for each study against some measure of its precision, e.g. 1/standard error of the effect size. The resulting plot should be shaped like a funnel if no publication bias is present. This shape is expected because trials of decreasing size have increasingly large variation in their effect size estimates due to random variation becoming increasingly influential. However, if the chance of publication is greater for larger trials or trials with statistically significant results, some small non-significant studies may not appear in the literature.
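The funnel-plot reasoning above can be sketched numerically. The following is a minimal illustration, not the authors' analysis: the study values are hypothetical (they are not the re-admission dataset), and the function names `fixed_effect_pool`, `funnel_points` and `egger_intercept` are made up for this example. It computes the funnel-plot coordinates (effect size against 1/standard error), the inverse-variance weighted average that marks the centre of the funnel, and a simple regression check for asymmetry in the spirit of the test of Egger et al. (1997), in which the standardized effect is regressed on precision and an intercept far from zero suggests an asymmetric funnel.

```python
import math


def fixed_effect_pool(effects, ses):
    """Inverse-variance weighted average of study effects and its standard error."""
    weights = [1.0 / se ** 2 for se in ses]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    return pooled, math.sqrt(1.0 / sum(weights))


def funnel_points(effects, ses):
    """Funnel-plot coordinates: (effect size, precision = 1/standard error)."""
    return [(y, 1.0 / se) for y, se in zip(effects, ses)]


def egger_intercept(effects, ses):
    """Ordinary least squares of (effect / SE) on (1 / SE).

    Returns (intercept, t statistic of the intercept). For a symmetric
    funnel the intercept is expected to be close to zero.
    """
    x = [1.0 / se for se in ses]
    z = [y / se for y, se in zip(effects, ses)]
    n = len(x)
    x_bar, z_bar = sum(x) / n, sum(z) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxz = sum((xi - x_bar) * (zi - z_bar) for xi, zi in zip(x, z))
    slope = sxz / sxx
    intercept = z_bar - slope * x_bar
    rss = sum((zi - intercept - slope * xi) ** 2 for xi, zi in zip(x, z))
    se_int = math.sqrt(rss / (n - 2) * (1.0 / n + x_bar ** 2 / sxx))
    t_stat = intercept / se_int if se_int > 0 else float("nan")
    return intercept, t_stat


# Hypothetical studies on the ln(rate ratio) scale: for each standard error,
# one study lies one SE below and one lies one SE above a true ln(RR) of ln(0.8).
TRUE_LN_RR = math.log(0.8)
SES = [0.05, 0.10, 0.20, 0.40]
symmetric = [(TRUE_LN_RR + sign * se, se) for se in SES for sign in (-1, 1)]
sym_effects = [y for y, _ in symmetric]
sym_ses = [se for _, se in symmetric]

# Mimic publication bias: the small studies (large SE) with the less
# favourable result are assumed never to have been published.
censored = [(y, se) for y, se in symmetric if not (se >= 0.20 and y > TRUE_LN_RR)]
cens_effects = [y for y, _ in censored]
cens_ses = [se for _, se in censored]

if __name__ == "__main__":
    print("pooled ln(RR), symmetric set:", fixed_effect_pool(sym_effects, sym_ses))
    print("asymmetry intercept, symmetric set:", egger_intercept(sym_effects, sym_ses))
    print("asymmetry intercept, censored set:", egger_intercept(cens_effects, cens_ses))
```

In this toy setting the symmetric set gives an intercept of essentially zero, while removing the small unfavourable studies pulls the intercept below zero. No P-value is computed here, since that would need a t distribution that the Python standard library does not provide.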
This leads to omission of trials in one corner of the plot (the bottom right-hand corner when an 'undesirable' outcome such as the re-admission rate is being considered), and hence to a degree of asymmetry in the funnel. A funnel plot for the 30 RCTs in the re-admissions dataset is provided in Fig. 2. Visual inspection would suggest that there is little evidence of publication bias in this dataset; however, there are a few small studies with extremely beneficial RRRs at the bottom left-hand corner of the plot, for which there are no symmetric counterparts with extreme positive RRRs in the bottom right-hand corner.

Figure 2 Funnel plot of studies included in the hospital discharge meta-analysis examining the effect of interventions on the effect of re-admission rates. (Horizontal axis: re-admission rate ratio, log scale; vertical axis: 1/standard error of ln(RRR).)

Publication bias can be tested for more formally using statistical tests which are based on the same symmetry assumptions as a funnel plot assessment (Begg & Mazumdar 1994; Egger et al. 1997; Duval & Tweedie 1998). One formal test (Egger et al. 1997) produces a non-significant P-value of 0.57 for the re-admissions dataset, which is consistent with the inconclusive visual assessment.

Disagreement exists about how to proceed if publication bias is suspected, after an assessment for its presence has been made. Methods to assess the likely impact of publication bias on the pooled outcome estimate have been developed (Duval & Tweedie 1998; Givens et al. 1997; Copas 1999; Song et al. 2000) but they are not widely used, due partly to the fact that many are complex and hence difficult to implement, and due partly to concerns about their applicability. We believe that the use of such methods as part of a sensitivity analysis is sensible (Sutton et al. 2000a) but more research is needed in this area.

Study quality

It is rare that all the studies available for a meta-analysis are of a uniformly high quality. More likely there will be a range in the quality of the research pertaining to the intervention of interest. Restricting a meta-analysis to include only RCTs is a safeguard taken by such groups as the Cochrane Collaboration, in an attempt to include only evidence which potentially produces the least biased results. Restricting analyses only to RCTs does not guarantee the meta-analysis will produce an unbiased result, however, as there can still be methodological flaws in the design, conduct and analysis of a trial. Clearly, the inclusion of poor or flawed studies in a meta-analysis may be problematic because their influence may bias the pooled result and even mean the meta-analysis comes to the wrong qualitative conclusions. Unfortunately, most studies are flawed to some degree, and excluding all but 'perfect' studies (which may not be possible to conduct due to ethical or practical constraints in some fields) may leave the meta-analyst with few if any data. The problem of dealing with study quality in a meta-analysis is similar to that for publication bias, in the sense that there is agreement that some assessment of quality should always be made, but little consensus on how to make such an assessment, or how to incorporate the results into the meta-analysis.

There have been many scales and checklists developed to aid in the assessment of study quality (Moher et al. 1995) but many of them have come under heavy criticism for not being constructed scientifically (Moher et al. 1999a). Further, recent work has demonstrated that different results can be obtained in a meta-analysis depending on the checklist used (Juni et al. 1999). A further problem is the fact that it is often difficult to ascertain all the required details of the trial from a study report (Begg et al. 1996). Often, this means that an assessment of the trial report and not of the trial itself is in effect being made. The underlying problem with the use of a scale or checklist is that it is impossible to predict which design aspects cause the most bias and, more fundamentally, it is often impossible to predict even the direction in
which any bias will be acting (Schulz et al. 1995). This makes the direct adjustment of study estimates for study quality impossible.

Several ways in which study quality can be incorporated into a meta-analysis have been suggested. Perhaps the simplest is to use a quality threshold to include or exclude studies. This could be defined using a cut-off value on a particular quality scale, or as a requirement of having several design aspects present. A further possibility is to use a quality score to weight study results, or incorporate such a score into the standard precision weightings (Berard & Bravo 1998). Finally, an approach which appears to be gaining support is the exploration of quality via meta-regression. In such an approach a quality score, or individual markers of study quality, such as the degree of blinding or method of treatment allocation, are included in a regression model as explanatory variables. Examining individual markers of quality separately eliminates the problems with the somewhat arbitrary construction of the quality scale scoring systems (Detsky et al. 1992).

Returning to the re-admissions meta-analysis, study quality was rated crudely using a count of effective practice and organization of care (EPOC) quality criteria (Cochrane Effective Practice & Organization of Care Review Group 1998) that were satisfied for each study. The scores obtained by each trial using this method are given in the penultimate column of Table 1. When these scores are included in a random effect regression model, the equation ln(RRR) = −0.22 + 0.007 × quality score is obtained. This regression line, together with the primary studies (the size of the plotting symbol is proportional to the precision of the effect size estimate), is plotted in Fig. 3. The quality score coefficient is small and not statistically significant (P = 0.88). This means study quality, at least as measured in this way, would not appear to affect the study results systematically, or to explain the between-study heterogeneity.

Figure 3 Regression line examining the impact of quality score, using a random effect meta-regression model for re-admission rate in the hospital discharge meta-analysis. (Horizontal axis: EPOC quality score; vertical axis: re-admission rate ratio, log scale.)

4 Further developments in methods of meta-analysis

Specialist meta-analysis methods

While section 2 provided a summary of the most commonly used methods in meta-analysis, many further developments have been made. A proportion of these focus on the synthesis of less standard data types. For example, specialist methods are required to pool the results of diagnostic tests because two outcomes, specificity and sensitivity, require simultaneous consideration (Irwig et al. 1995). Another area which requires special methods is the analysis of survival data because account has to be made of censored observations (Dear 1994). Other data types for which specialist methods have been developed are dose–response data (Tweedie & Mengersen 1995) and economic data (Jefferson et al. 1996). Meta-analysis of individual patient data (Stewart & Clarke 1995), where original study datasets are pooled rather than relying on published summary data, has been described as the gold standard; it is considered by some to be the only way to carry out a meta-analysis of survival data, but it is much more time consuming and costly than meta-analysis of summary data. It is currently unclear whether the extra effort required is worthwhile. For an overview of these and further meta-analytical developments see Sutton et al. (2000c).

New directions for meta-analysis using Bayesian statistics

In addition to the above developments, more advanced methods for synthesis of information have been developed. Although not currently used routinely, these provide potentially more powerful and flexible tools for synthesizing evidence. Many of these methods use Bayesian statistics, in contrast to the more commonly employed classical approach.

A full description of Bayesian methods is not possible here, but for a recent review of their use in assessing health technologies see Spiegelhalter et al. (2000a). The key element of the Bayesian approach is that it introduces the idea of subjective probability (O'Hagan 1988) in contrast to the objective probabilities traditionally attached to specific, often repeatable, events. Before carrying out a piece of research, an investigator would have formed some prior beliefs regarding its outcome, possibly derived from results of previous research in the same field. These a priori beliefs are combined with the data from the current investigation to produce results which reflect the researcher's beliefs having conducted the research. These posterior beliefs are calculated by combining the prior beliefs with the new data using Bayes' Theorem, which forms the backbone of all Bayesian analysis.

The advantages of using such an approach are often subtle, but important. Perhaps most notable from a health-care context is the ability to make direct probability statements regarding quantities of interest, for example, the probability that patients receiving drug A have better survival than those who receive drug B. There are good reasons, however, why the Bayesian approach has largely been neglected in routine use. The most serious is that, generally, the computations required in Bayesian models are very complex. Additionally, the expressing of prior beliefs in a form which can be included in analysis is a non-trivial task. Excitingly, many of the computational difficulties have been addressed recently, with the development of specialist software, most notably WinBUGS (Spiegelhalter et al. 2000b). The problem of expressing prior beliefs remains; however, there are practical 'solutions', including using 'off-the-shelf' priors, which can express the presence of a range of degrees of prior knowledge, and can be used in a sensitivity analysis. Use of 'vague priors', which essentially means prior information is ignored, is also possible.

The new WinBUGS software is able to perform the calculations required for a wide range of Bayesian analyses. The user has freedom to implement many existing meta-analysis methods developed classically and, more importantly, develop models not possible using more traditional classical software. This has potentially huge benefits for synthesizing information and builds on earlier pioneering work by Eddy et al. (1992), whose 'new' graphical approach to meta-analysis can now be implemented using WinBUGS (Spiegelhalter et al. 2000b). Issues being addressed by these methods are outlined below:

1 Data from an RCT may be of direct interest, but not of a form which can simply be included in a meta-analysis. For example, data from an RCT which uses the intervention of interest in the treatment arm, but a different intervention from the other studies in the control arm, may be available. Methods to include such data have been developed (Higgins & Whitehead 1996).
2 In some assessments considering only randomized evidence may not be the optimal approach. Observational studies, which could potentially be very large, providing valuable data on thousands of patients, may be available. It may sometimes seem unjust to exclude these from a meta-analysis, particularly if they are of high quality, as they may have particular strengths and weaknesses, different from those of randomized studies (Droitcour et al. 1993). Special methods have been developed to account for different study designs in a meta-analysis (Prevost et al. 2000; Larose & Dey 1997). In other instances data on the effect of a drug of interest in animals may be available and provide valuable information which can be incorporated (DuMouchel & Harris 1983).
3 There may be benefits to including information contained in previous trials or meta-analyses on similar topics using similar interventions and outcome measures (Higgins & Whitehead 1996).
4 A study may not provide any quantitative data at all, being qualitative in design, but these qualitative data may be of direct relevance to the topic under assessment (Roberts et al. 1998).

Bayesian modelling gives us the potential to include all these types of data in a variety of ways, including direct input into the model, or incorporation through the specification of prior beliefs.

Other new approaches to meta-analysis have been suggested, but the corresponding methodology is at the conceptual rather than practical stage of development. The extension of meta-regression to the simultaneous modelling of multiple scientific factors with the intention of producing a response surface of treatment effects, rather than a single pooled result, has been advocated (Rubin 1992). This may allow a more detailed examination of the science underlying the results synthesized (Rubin 1992; Lau et al. 1998). Further, it may be possible to model different aspects of the processes under study separately. For example, if one were interested in the effect of lowering cholesterol on clinical outcomes, in a first stage of the analysis data relating to the degree different interventions lower cholesterol levels could be synthesized. Then, in a second stage, the relationship between cholesterol level and various clinical outcomes could be modelled (Katerndahl & Lawler 1999). A further utilization of Bayesian modelling could allow meta-analysis to be placed within a decision theoretical framework (Berger 1980) which can also take into account utilities when making health care or policy decisions (Midgette et al. 1994).

However, there is no magic wand to make all this happen. While Bayesian modelling provides flexibility and a framework, it does not dictate how models should be specified, how data should be incorporated, or how priors should be elicited. There is much methodological work required to further develop the ideas outlined above.

5 Conclusion

Much has been written on meta-analysis and the synthesis of evidence within the medical literature over the past two decades. During this time, the basic synthesizing of effect measures using weighted averages has been refined to a high degree, and much of the methodology required to do so is in place for most situations encountered. Threats to the validity of meta-analysis exist, and the methods for dealing with problems such as publication bias and variations in quality of the primary studies are at a less refined stage. Additionally, many consider the standard 'weighted average approach' to meta-analysis not to be 'state of the art' in at least some situations, where the use of more sophisticated methods, generally to synthesize a broader base of evidence, would be advantageous. Currently, such approaches are still firmly in the experimental stage and unfortunately ideas which sound natural and appealing are often difficult to implement in practice. Clearly, it will be some time before they are used routinely, but significant steps have been made. Moving the synthesis of evidence beyond calculating simple averages is timely, feasible and, indeed, essential.

Acknowledgements

The research on which this paper is based was funded, in part, by the NHS Research and Development Health Technology Assessment Programme (Methodology Project Numbers 93/52/3 & 95/09/03).

References

Begg C., Cho M., Eastwood S., Horton R., Moher D. & Olkin I. (1996) Improving the quality of reporting of randomised controlled trials: the CONSORT statement. Journal of the American Medical Association 276, 637–639.
Begg C.B. & Mazumdar M. (1994) Operating characteristics of a rank correlation test for publication bias. Biometrics 50, 1088–1101.
Berard A. & Bravo G. (1998) Combining studies using effect sizes and quality scores: application to bone loss in postmenopausal women. Journal of Clinical Epidemiology 51, 801–807.
Berger J.O. (1980) Statistical Decision Theory and Bayesian Analysis, 2nd edn. Springer-Verlag, New York.
Berkey C.S., Hoaglin D.C., Mosteller F. & Colditz G.A. (1995) A random-effects regression model for meta-analysis. Statistics in Medicine 14, 395–411.
Biggerstaff B.J. & Tweedie R.L. (1997) Incorporating variability in estimates of heterogeneity in the random effects model in meta-analysis. Statistics in Medicine 16, 753–768.
Boissel J.P., Blanchard J., Panak E., Peyrieux J.C. & Sacks H. (1989) Considerations for the meta-analysis of randomized clinical trials: summary of a panel discussion. Controlled Clinical Trials 10, 254–281.
Cochrane Effective Practice and Organisation of Care Review Group (1998) The Data Collection Checklist. University of Aberdeen, HSRU, Aberdeen.
Copas J. (1999) What works?: selectivity models and meta-analysis. Journal of the Royal Statistical Society, Series A 161, 95–105.
von Dadelszen P., Ornstein M.P., Bull S.B., Logan A.G., Koren G. & Magee L.A. (2000) Fall in mean arterial pressure and fetal growth restriction in pregnancy hypertension: a meta-analysis. Lancet 355, 87–92.
Dear K.B.G. (1994) Iterative generalized least squares for meta-analysis of survival data at multiple times. Biometrics 50, 989–1002.
Deeks J., Glanville J. & Sheldon T. (1996) Undertaking systematic reviews of research on effectiveness: CRD guidelines for those carrying out or commissioning reviews. Report no. 4. Centre for Reviews and Dissemination. York Publishing Services Ltd, York.
DerSimonian R. & Laird N. (1986) Meta-analysis in clinical trials. Controlled Clinical Trials 7, 177–188.
Detsky A.S., Naylor C.D., O'Rourke K., McGeer A.J. & L'Abbe K.A. (1992) Incorporating variations in the quality of individual randomized trials into meta-analysis. Journal of Clinical Epidemiology 45, 255–265.
Dickersin K., Scherer R. & Lefebvre C. (1994) Systematic reviews: identifying relevant studies for systematic reviews. British Medical Journal 309, 1286–1291.
Droitcour J., Silberman G. & Chelimsky E. (1993) Cross-design synthesis: a new form of meta-analysis for combining results from randomized clinical trials and medical-practice databases. International Journal of Technology Assessment in Health Care 9, 440–449.
DuMouchel W.H. & Harris J.E. (1983) Bayes methods for combining the results of cancer studies in humans and other species (with comment). Journal of the American Statistical Association 78, 293–308.
Duval S. & Tweedie R. (1998) Practical estimates of the effect of publication bias in meta-analysis. Australasian Epidemiologist 5, 14–17.
Eddy D.M., Hasselblad V. & Shachter R. (1992) Meta-Analysis by the Confidence Profile Method. Academic Press, San Diego.
Egger M., Smith G.D., Schneider M. & Minder C. (1997) Bias in meta-analysis detected by a simple, graphical test. British Medical Journal 315, 629–634.
Fleiss J.L. (1993) The statistical basis of meta-analysis. Statistical Methods in Medical Research 2, 121–145.
Fleiss J.L. (1994) Measures of effect size for categorical data. In The Handbook of Research Synthesis (eds H. Cooper & L.V. Hedges), pp. 245–260. Russell Sage Foundation, New York.
Freemantle N., Cleland J., Young P., Mason J. & Harrison J. (1999) β-Blockade after myocardial infarction: systematic review and meta regression analysis. British Medical Journal 318, 1730–1737.
Givens G.H., Smith D.D. & Tweedie R.L. (1997) Publication bias in meta-analysis: a Bayesian data-augmentation approach to account for issues exemplified in the passive smoking debate. Statistical Science 12, 221–250.
Greenland S. (1987) Quantitative methods in the review of epidemiological literature. Epidemiological Review 9, 1–30.
Hardy R.J. & Thompson S.G. (1996) A likelihood approach to meta-analysis with random effects. Statistics in Medicine 15, 619–629.
Higgins J.P.T. & Whitehead A. (1996) Borrowing strength from external trials in a meta-analysis. Statistics in Medicine 15, 2733–2749.
Hollis S. & Campbell F. (1999) What is meant by intention to treat analysis? Survey of published randomised controlled trials. British Medical Journal 319, 670–674.
Horton R. (1997) Medical editors trial amnesty. Lancet 350, 756.
Horton R. & Smith R. (1999) Time to register randomised trials: the case is now unanswerable. British Medical Journal 319, 865–866.
Hunt M. (1997) How Science Takes Stock: the story of meta-analysis. Russell Sage Foundation, New York.
Irwig L., Macaskill P., Glasziou P. & Fahey M. (1995) Meta-analytic methods for diagnostic test accuracy. Journal of Clinical Epidemiology 48, 119–130.
Jefferson T., Mugford M., Gray A. & DeMicheli V. (1996) An exercise in the feasibility of carrying out secondary economic analysis. Health Economics 5, 155–165.
Juni P., Witschi A., Bloch R. & Egger M. (1999) The hazards of scoring the quality of clinical trials for meta-analysis. Journal of the American Medical Association 282, 1054–1060.
Katerndahl D.A. & Lawler W.R. (1999) Variability in meta-analytic results concerning the value of cholesterol reduction in coronary heart disease: a meta-meta-analysis. American Journal of Epidemiology 149, 429–441.
Larose D.T. & Dey D.K. (1997) Grouped random effects models for Bayesian meta-analysis. Statistics in Medicine 16, 1817–1829.
Lau J., Ioannidis J.P. & Schmid C.H. (1998) Summing up evidence: one answer is not always enough. Lancet 351, 123–127.
Light R.J. & Pillemer D.B. (1984) Summing Up: the science of reviewing research. Harvard University Press, Cambridge, MA.
Midgette A.S., Wong J.B., Beshansky J.R., Porath A., Fleming C. & Pauker S.G. (1994) Cost-effectiveness of streptokinase for acute myocardial-infarction: a combined metaanalysis and decision-analysis of the effects of infarct location and of likelihood of infarction. Medical Decision Making 14, 108–117.
Moher D., Cook D.J., Eastwood S., Olkin I., Rennie D. & Stroup D. for the QUORUM Group (1999b) Improving the quality of reporting of meta-analysis of randomised controlled trials: the QUORUM statement. Lancet 354, 1896–1900.
Moher D., Jadad A.R., Nichol G., Penman M., Tugwell P. & Walsh S. (1995) Assessing the quality of randomized controlled trials: an annotated bibliography of scales and checklists. Controlled Clinical Trials 12, 62–73.
Moher D., Klassen T.P., Jadad A.R., Tugwell P., Moher M. & Jones A.L. (1999a) Assessing the quality of randomised controlled trials: implications for the conduct of meta-analyses. Health Technology Assessment 3(12), 1–98.
O'Hagan A. (1988) Probability: methods and measurement. Chapman & Hall, London.
Oxman A.D. (1996) The Cochrane Collaboration Handbook: preparing and maintaining systematic reviews, 2nd edn. Cochrane Collaboration, Oxford.
Parker S.G., Peet S.M., McPherson A., Cannaby A.M., Baker R., Wilson A., Lindesay J., Parker G., Abrams K.R. & Jones D.R. (2001) A systematic review of discharge arrangements for older people. Health Technology Assessment (in press).
Peto R. (1987) Why do we need systematic overviews of randomised trials? Statistics in Medicine 6, 233–240.
Prevost T.C., Abrams K.R. & Jones D.R. (2000) Hierarchical models in generalized synthesis of evidence: an example based on studies of breast cancer screening. Statistics in Medicine 19, 3359–3376.
Roberts K.A., Jones D.R., Abrams K.R., Dixon-Woods M. & Fitzpatrick R. (1998) Meta-analysis of qualitative and quantitative evidence: an example based on studies of patient satisfaction. Technical Report 98–01, University of Leicester: Department of Epidemiology and Public Health, Leicester.
Rubin D. (1992) A new perspective. In The Future of Meta-Analysis (eds K.W. Wachter & M.L. Straf), pp. 155–165. Russell Sage Foundation, New York.
Schmid C.H., Lau J., McIntosh M.W. & Cappelleri J.C. (1998) An empirical study of the effect of the control rate as a predictor of treatment efficacy in meta-analysis of clinical trials. Statistics in Medicine 17, 1923–1942.
Schulz K.F., Chalmers I., Hayes R.J. & Altman D.G. (1995) Empirical evidence of bias: dimensions of methodological quality associated with estimates of treatment effects in controlled trials. Journal of the American Medical Association 273, 408–412.
Senn S., Sharp S., Thompson S. & Altman D. (1996) Relation between treatment benefit and underlying risk in meta-analysis. British Medical Journal 313, 1550–1551.
Shadish W.R. & Haddock C.K. (1994) Combining estimates of effect size. In The Handbook of Research Synthesis (eds H. Cooper & L.V. Hedges), pp. 261–284. Russell Sage Foundation, New York.
Song F., Eastwood A., Gilbody S., Duley L. & Sutton A.J. (2000) Publication and other selection biases in systematic reviews. Health Technology Assessment 4(10), 1–115.
Spiegelhalter D.J., Myles J.P., Jones D.R. & Abrams K.R. (2000a) Bayesian methods in health technology assessment. Health Technology Assessment 4(38), 1–142.
Spiegelhalter D.J., Thomas A. & Best N.G. (2000b) WinBUGS, version 1.2. User manual. MRC Biostatistics Unit, Cambridge.
Sterne J.A.C., Egger M. & Sutton A.J. (2001) Meta-analysis software. In Systematic Reviews in Health Care: meta-analysis in context, 2nd edn (eds M. Egger, G. Davey Smith & D.G. Altman), pp. 336–346. BMJ Books, London.
Stewart L.A. & Clarke M.J. (1995) Practical methodology of meta-analyses (overviews) using updated individual patient data. Cochrane Working Group. Statistics in Medicine 14, 2057–2079.
Sutton A.J., Abrams K.R., Jones D.R., Sheldon T.A. & Song F. (1998) Systematic reviews of trials and other studies. Health Technology Assessment 2(19), 1–310.
Sutton A.J., Abrams K.R., Jones D.R., Sheldon T.A. & Song F. (2000c) Methods for Meta-Analysis in Medical Research. John Wiley, London.
Sutton A.J., Duval S.J., Tweedie R.L., Abrams K.R. & Jones D.R. (2000a) Empirical assessment of effect of publication bias on meta-analyses. British Medical Journal 320, 1574–1577.
Sutton A.J., Lambert P.C., Hellmich M., Abrams K.R. & Jones D.R. (2000b) Meta-analysis in practice: a critical review of available software. In Meta-Analysis in Medicine and Health Policy (eds D.A. Berry & D.K. Stangl). Marcel Dekker, New York.
Thompson S.G. (1993) Controversies in meta-analysis: the case of the trials of serum cholesterol reduction. Statistical Methods in Medical Research 2, 173–192.
Tweedie R.L. & Mengersen K.L. (1995) Meta-analytic approaches to dose–response relationships, with application in studies of lung cancer and exposure to environmental tobacco smoke. Statistics in Medicine 14, 545–569.
Walter S.D. (1997) Variation in baseline risk as an explanation of heterogeneity in meta-analysis. Statistics in Medicine 16, 2883–2900.
Whitehead A. & Jones N.M.B. (1994) A meta-analysis of clinical trials involving different classifications of response into ordered categories. Statistics in Medicine 13, 2503–2515.