Principles of Diagnostic Testing and ROC 2016

Principles of Diagnostic Testing
Statistics for Research
William F. Auffermann, MD/PhD
Department of Radiology and Imaging Sciences
Emory University School of Medicine

Learning Objectives
• Provide an overview of the basic statistical
concepts needed to critically appraise and
perform research

Diagnostic Testing
• Diagnostic tests are designed to answer
specific medical questions.
• When there is concern for a medical
disease, appropriate diagnostic testing can
be used to better risk stratify patients
• The probability of a disease after testing is a
function of both pre-test probability and the
results of the test.

Diagnostic Testing
• Diagnostic testing may be thought of as a
way of refining the estimate for the
probability of a patient having a particular
disease.
• Understanding the principles of diagnostic
testing requires an understanding of
probability and statistics.

Probability and Statistics
Two Sides of the Same Coin
• Probability: assumes you know the
underlying laws of a process, and can be
used to predict outcomes
• Statistics: used to compare data with
theory/model and look at how well they
agree

Hypothesis
• A proposed explanation for a phenomenon‡
• A key aspect of diagnostic testing and
statistics is formulation of a good
hypothesis
‡ http://en.wikipedia.org/wiki/Hypothesis
Accessed 2014-11-13

Hypothesis
• Hypotheses are often paired with their
logical opposite
• The null hypothesis (H0) is considered the
default hypothesis
• The alternative hypothesis (HA) is its logical
complement

Hypothesis
• H0: the medication does not reduce blood
pressure
• HA: the medication does reduce blood
pressure

Hypothesis
• Hypotheses should address the question of
interest and be testable
• Clear statement of the hypothesis is critical
for appropriate statistical testing

Hypothesis
• H0: mean blood pressure in the treatment group
is the same as in the control group (MBP2 = MBP1)
• HA: mean blood pressure in the treatment
group is lower than in the control group (MBP2
< MBP1)

Probability
• Probability relates to the likelihood of a
particular event occurring
• There is an assumption we know the laws
governing the behavior of the process being
examined
• For example, if we have a fair coin where
the probabilities of heads and tails are both 0.5
(equal), then we can calculate the probability
of flipping the coin four times and obtaining: HHTH
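
The coin example can be worked through directly. The following is a minimal sketch, assuming the flips are independent; the function name `sequence_probability` is illustrative and not part of the original slides.

```python
from itertools import product

def sequence_probability(sequence, p_heads=0.5):
    """Probability of one exact sequence of independent flips, e.g. 'HHTH'."""
    prob = 1.0
    for flip in sequence:
        # Multiply by P(heads) for an 'H' and by P(tails) for anything else.
        prob *= p_heads if flip == "H" else (1.0 - p_heads)
    return prob

p_hhth = sequence_probability("HHTH")

# Sanity check: the 16 possible four-flip sequences together have probability 1.
total = sum(sequence_probability("".join(s)) for s in product("HT", repeat=4))

print(p_hhth)  # 0.0625
print(total)   # 1.0
```

For a fair coin, every specific four-flip sequence has the same probability, 0.5 to the fourth power.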

Pre/Post Test Probability
• Diagnostic testing is useful as it affects the
post test probability of a diagnosis.
• Diagnostic testing which does not
significantly affect the post test probability
may not be clinically useful

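• Let ‘p’ represent the probability of a disease
and ‘t’ the result of a diagnostic test
p2 = LR(t) * p1
• Where p1 and p2 are the pre and post test
probabilities respectively, and LR(t) is the
likelihood ratio for the test.
• Strictly, the likelihood ratio multiplies odds
rather than probabilities: odds2 = LR(t) * odds1,
where odds = p / (1 - p). The form above is a
close approximation only when both
probabilities are small.
• LR(t) takes one value for a positive result
and another for a negative result.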
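
As an illustration of how a likelihood ratio updates a pre-test probability, here is a minimal sketch. It applies the likelihood ratio on the odds scale, which is the standard form of the calculation. The likelihood ratio values (10 and 0.1) and the pre-test probability (0.2) are hypothetical and not taken from the slides.

```python
def post_test_probability(pretest_prob, likelihood_ratio):
    """Update a pre-test probability with a likelihood ratio via odds."""
    pretest_odds = pretest_prob / (1.0 - pretest_prob)
    posttest_odds = likelihood_ratio * pretest_odds
    # Convert the post-test odds back to a probability.
    return posttest_odds / (1.0 + posttest_odds)

p1 = 0.2  # hypothetical pre-test probability
p2_positive = post_test_probability(p1, 10.0)  # hypothetical LR for a positive result
p2_negative = post_test_probability(p1, 0.1)   # hypothetical LR for a negative result

print(round(p2_positive, 3))  # 0.714
print(round(p2_negative, 3))  # 0.024
```

A likelihood ratio of 1 leaves the probability unchanged, which is one way to recognize a test result that carries no diagnostic information.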

p2 = LR(t) * p1
Fagan nomogram
http://mcmasterevidence.wordpress.com/2013/02/20/what-are-pre-test-probability-post-test-probability-and-likelihood-ratios/
Accessed 2014-11-13

V/Q Scan
• Consider a patient with symptoms
concerning for pulmonary embolism.
• Based on the patient's clinical symptoms, we
can risk stratify them for probability of
pulmonary embolism, corresponding to the
pretest probability (p1)

V/Q Scan
• A V/Q test is performed to better risk
stratify the patient.
• The various patterns of findings on V/Q
scan correlate with the probability of
pulmonary embolism

V/Q Scan
• The post-test probability is derived from
both the pretest probability and the results
of the test.

V/Q Scan
          p(pretest)
p(test)   0.2    0.42   0.8
0.1       0.2    0.06
0.19      0.04   0.16   0.4
0.5       0.16   0.28   0.66
0.8       0.56   0.88   0.96
http://www.auntminnie.com/index.aspx?sec=ser&sub=def&pag=dis&ItemID=54625
Pretest probabilities from Wells scores; post-test probabilities for V/Q
Accessed 2014-11-13

V/Q Scan
J Nucl Med 2013; 54:1–5

p2 = LR(t) * p1
http://www.healthknowledge.org.uk/public-health-textbook/disease-causation-diagnostic/2c-diagnosis-screening/ratios
Accessed 2014-11-13

Probability Distribution
• A probability distribution function gives the
probability of observing a value x as a
function of x
[Figure: p(x) plotted against x]

Probability Distributions
• There are several different probability
distributions
• Different physical and biological
phenomena can be modeled using different
distributions
• One of the most common naturally
occurring distributions is the normal
(Gaussian) distribution

Normal Distribution
[Figure: normal distribution, p(x) from 0 to 0.4 plotted for x from -4 to 4]
• Based on the knowledge of a probability
distribution, it is possible to estimate the
probability of observing a range of values

[Figure: normal distribution, p(x) from 0 to 0.4 plotted for x from -4 to 4]
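
To make the idea of "the probability of observing a range of values" concrete, the following sketch computes the area under a normal curve between two limits. It assumes a standard normal distribution (mean 0, standard deviation 1) by default; the helper names are illustrative.

```python
import math

def normal_cdf(x, mean=0.0, sd=1.0):
    """Cumulative probability P(X <= x) for a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def probability_between(low, high, mean=0.0, sd=1.0):
    """Probability that a normally distributed value falls in [low, high]."""
    return normal_cdf(high, mean, sd) - normal_cdf(low, mean, sd)

p_within_1sd = probability_between(-1.0, 1.0)
p_within_2sd = probability_between(-2.0, 2.0)

print(round(p_within_1sd, 4))  # 0.6827
print(round(p_within_2sd, 4))  # 0.9545
```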

• When performing or evaluating research it
is very important that the data being
modeled can actually be represented by the
proposed distribution
• Graphical displays of data can be helpful to
confirm this is true (frequency polygon,
histogram)

[Figure: distribution plotted for x from -10 to 10, vertical axis from 0 to 0.5]

Statistics
The science that deals with the collection, classification,
analysis, and interpretation of numerical facts or data,
and that, by use of mathematical theories of probability,
imposes order and regularity on aggregates of more or
less disparate elements.
http://dictionary.reference.com/
Accessed 2014-11-13

Why Does Statistics Matter?
• Statistics provides a means of summarizing
a data set and making inferential statements
• Appropriate application can highlight
important aspects of the data
• Incorrect application can be confusing at
best, and misleading at worst
• Statistics do not ‘lie’, but they may be
misleading

Statistic
• A mathematical summary of a data set
• Examples include the mean, median,
mode, and standard deviation
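
A short sketch of these summary statistics, using Python's standard `statistics` module. The data set is hypothetical and deliberately right-skewed, so that the mean, median, and mode differ from one another.

```python
import statistics

# Hypothetical right-skewed sample (not from the slides).
values = [1, 2, 2, 2, 3, 3, 4, 5, 7, 11]

mean = statistics.mean(values)      # arithmetic average
median = statistics.median(values)  # middle value of the sorted data
mode = statistics.mode(values)      # most frequent value
sd = statistics.stdev(values)       # sample standard deviation

print(mean, median, mode)  # 4 3.0 2
print(round(sd, 2))
```

In a skewed sample like this one the long tail pulls the mean above the median, which is one reason the choice of statistic matters.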

Statistic
[Figure: Gamma(2,3) distribution, frequency from 0 to 0.14 plotted for values from 0 to 20, with the mean, median, and mode marked]

Statistic
• The selection of a statistic for representing
data should be based on the nature of the
process underlying the observations
• The statistic should be based on the model
which best represents the data

Statistics
• Qualitative: specific summary measures of
the data (statistics) may provide greater
clarity than the data set as a whole.
• Quantitative: Based on the underlying
theory of the process being measured,
inferential statements may be made
regarding whether the data and theory agree

Example - Qualitative
[Figure: distributions plotted for x from -4 to 4, vertical axis from 0 to 0.09, with mean(x1) and mean(x2) marked]

Quantitative
• Based on known properties of the statistical
test in question and the distribution of the
data, it is possible to make statements about the
significance of a result

P-values
• A p-value is the probability, under the
proposed distribution, of obtaining a value
at least as far from the expected value as
the observed value.

P-values
[Figure: probability density from 0 to 0.4 plotted for x from -4 to 4]

P-values
• The lower the p-value, the less likely that
the observed statistical value can be
explained by the model under H0

P-values
• Assume you want to know if a coin is a fair
coin (equal probability of H/T after
flipping)
• You flip the coin 100 times and get H 60
times. Is the coin fair?

P-values
[Figure: pdf plotted over 0 to 100, with the observed value marked; area under curve = p-value = 0.0176]
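
The coin-flip p-value can be checked with an exact binomial calculation. This is a sketch under the assumption of a one-sided test against a fair coin (H0: P(heads) = 0.5); the function name is illustrative. The value 0.0176 quoted with the figure appears to correspond to strictly more than 60 heads, whereas counting 60 itself as "at least as extreme" gives a somewhat larger value.

```python
from math import comb

def binomial_upper_tail(n, k, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p), summed exactly term by term."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

p_at_least_60 = binomial_upper_tail(100, 60)   # 60 or more heads
p_more_than_60 = binomial_upper_tail(100, 61)  # strictly more than 60 heads

print(round(p_at_least_60, 4))   # 0.0284
print(round(p_more_than_60, 4))  # 0.0176
```

Either way the one-sided p-value falls below 0.05. A two-sided test, which also counts results of 40 heads or fewer, would double these values.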

P-values
• By convention, p-values less than or equal
to 0.05 are generally considered statistically
significant
• Note that other thresholds can be and are used
• Type I error (often denoted by α) is the
probability of rejecting the null hypothesis
based on the result of a test if H0 is in fact
true.

Multiple Comparisons
• P-values give the probability of a value at
least as extreme as the one observed for a
single test.
• What happens if there are multiple tests?
• Does this affect our decision to consider
p-values less than 0.05 statistically
significant?

• Suppose we are evaluating a set of
anti-hypertensive medications for effect on
blood pressure
• A p-value of 0.05 corresponds to a 1/20
probability

• If we examine 20 medications, we would
expect 1 to have a p-value of 0.05 or lower
by chance alone even if there were no
therapeutic effect
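
The arithmetic behind this can be sketched as follows, assuming the 20 tests are independent and that H0 is true for every medication. The variable names are illustrative.

```python
alpha = 0.05   # per-test significance threshold
n_tests = 20   # number of medications examined

# Expected number of tests that reach significance by chance alone.
expected_false_positives = alpha * n_tests

# Probability that at least one test reaches significance by chance alone.
p_at_least_one = 1 - (1 - alpha) ** n_tests

print(round(expected_false_positives, 2))  # 1.0
print(round(p_at_least_one, 3))            # 0.642
```

Under these assumptions, there is roughly a 64% chance of at least one spurious "significant" finding.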

[Figure: 20 panels, each plotted for x from -5 to 5 with a vertical axis from 0 to 1]

Mean T P-value
-0.0996 -0.6684 0.7461
-0.1300 -0.9387 0.8232
-0.1740 -1.0768 0.8560
0.0172 0.1023 0.4595
0.2228 1.4224 0.0813
-0.0330 -0.2339 0.5919
-0.0737 -0.4641 0.6774
0.3357 2.6773 0.0054
0.0493 0.3540 0.3626
0.1828 1.3001 0.1005
-0.0341 -0.1953 0.5769
0.3751 2.5683 0.0070
0.1226 0.6835 0.2491
0.0789 0.6016 0.2754
0.0108 0.0631 0.4750
-0.1832 -1.1043 0.8620
-0.1618 -1.0581 0.8518
0.0209 0.1269 0.4498
-0.1519 -1.1910 0.8797
-0.1685 -1.2920 0.8981
α = 0.05

• It is possible to correct for multiple
comparisons
• There are several ways to perform this
correction
• Several are dependent on knowledge of the
correlation between variables

Bonferroni Correction
• A conservative correction that holds whether
or not the tests are independent
• The threshold for significance is changed to
the overall desired significance (often 0.05)
/ number of comparisons
• New threshold = 0.05/20 = 0.0025

Bonferroni Correction
• This correction adjusts the type I error such
that, if H0 is true for every test, the overall
probability of a positive result on any of the
tests is at most α.

Mean T P-value
-0.0996 -0.6684 0.7461
-0.1300 -0.9387 0.8232
-0.1740 -1.0768 0.8560
0.0172 0.1023 0.4595
0.2228 1.4224 0.0813
-0.0330 -0.2339 0.5919
-0.0737 -0.4641 0.6774
0.3357 2.6773 0.0054
0.0493 0.3540 0.3626
0.1828 1.3001 0.1005
-0.0341 -0.1953 0.5769
0.3751 2.5683 0.0070
0.1226 0.6835 0.2491
0.0789 0.6016 0.2754
0.0108 0.0631 0.4750
-0.1832 -1.1043 0.8620
-0.1618 -1.0581 0.8518
0.0209 0.1269 0.4498
-0.1519 -1.1910 0.8797
-0.1685 -1.2920 0.8981
α = 0.0025
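
Applying both thresholds to the p-values in the table can be sketched as below. The p-values are copied from the table; the variable names are illustrative.

```python
# P-values from the 20-row table, in row order.
p_values = [
    0.7461, 0.8232, 0.8560, 0.4595, 0.0813, 0.5919, 0.6774, 0.0054,
    0.3626, 0.1005, 0.5769, 0.0070, 0.2491, 0.2754, 0.4750, 0.8620,
    0.8518, 0.4498, 0.8797, 0.8981,
]

alpha = 0.05
bonferroni_threshold = alpha / len(p_values)  # 0.05 / 20

# Tests called significant without and with the Bonferroni correction.
uncorrected_hits = [p for p in p_values if p <= alpha]
corrected_hits = [p for p in p_values if p <= bonferroni_threshold]

print(uncorrected_hits)  # [0.0054, 0.007]
print(corrected_hits)    # []
```

Two of the twenty tests pass the uncorrected threshold of 0.05, but none passes the corrected threshold of 0.0025.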

Diagnostic Testing
• Diagnostic tests are designed to answer
specific medical questions.
• When there is concern for a medical
disease, appropriate diagnostic testing can
be used to better risk stratify patients
• Recognize that diagnostic tests are not
perfect, and even the best may misclassify
patients.

Confusion Table
                  Test Prediction Positive   Test Prediction Negative
Actual Positive   TP                         FN
Actual Negative   FP                         TN

Confusion Table Derivations
• Sensitivity = TP / (TP + FN)
• Specificity = TN / (FP + TN)
• Positive Predictive Value
• PPV = TP / (TP + FP)
• Negative Predictive Value
• NPV = TN / (TN + FN)
                  Prediction Positive   Prediction Negative
Actual Positive   TP                    FN
Actual Negative   FP                    TN
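
The four derived measures can be computed directly from the confusion table counts. The sketch below uses hypothetical counts (they are not from the slides), and the function name is illustrative.

```python
def confusion_metrics(tp, fn, fp, tn):
    """Return sensitivity, specificity, PPV, and NPV from confusion table counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (fp + tn),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }

# Hypothetical study: 100 patients with disease, 200 without.
metrics = confusion_metrics(tp=90, fn=10, fp=30, tn=170)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
# sensitivity: 0.900
# specificity: 0.850
# ppv: 0.750
# npv: 0.944
```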

• Sensitivity = the probability of a positive case
being marked positive
• Specificity = the probability of a negative case
being marked negative
• PPV = the probability that a positive test result
is a true positive
• NPV = the probability that a negative test result
is a true negative

• Sensitivity and Specificity: not affected by
prevalence of disease in a population
• PPV and NPV: affected by prevalence of
disease in a population
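
The dependence of PPV and NPV on prevalence follows from Bayes' theorem and can be sketched as below. The sensitivity and specificity of 0.9 and the prevalence values are hypothetical, and the function name is illustrative.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    return true_pos / (true_pos + false_pos), true_neg / (true_neg + false_neg)

# Same hypothetical test (sensitivity 0.9, specificity 0.9) in two populations.
ppv_rare, npv_rare = predictive_values(0.9, 0.9, prevalence=0.01)
ppv_common, npv_common = predictive_values(0.9, 0.9, prevalence=0.5)

print(round(ppv_rare, 3), round(npv_rare, 3))      # 0.083 0.999
print(round(ppv_common, 3), round(npv_common, 3))  # 0.9 0.9
```

With the same test, a positive result is far more likely to be a true positive when the disease is common than when it is rare.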

Sensitivity and Specificity
• Diagnostic testing involves a trade-off
between sensitivity and specificity
• Most tests offer a compromise between
these two measures
• Very often two or more tests may
complement each other (one may be high
sensitivity, the other may be high
specificity)

http://www.medcalc.org/manual/roc-curves.php
Accessed 2014-11-13
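
The trade-off between sensitivity and specificity, which an ROC curve summarizes, can be illustrated by sweeping the decision threshold of a test. The scores below are hypothetical test values for diseased and healthy patients. A result at or above the threshold is treated as positive, which is an assumption of this sketch.

```python
# Hypothetical continuous test results (higher = more suspicious for disease).
diseased = [0.9, 0.8, 0.7, 0.6, 0.4]
healthy = [0.7, 0.5, 0.3, 0.2, 0.1]

def sensitivity_specificity(threshold):
    """Sensitivity and specificity when 'score >= threshold' means positive."""
    true_pos = sum(score >= threshold for score in diseased)
    true_neg = sum(score < threshold for score in healthy)
    return true_pos / len(diseased), true_neg / len(healthy)

# Each threshold gives one operating point; an ROC curve plots
# sensitivity against (1 - specificity) across all thresholds.
roc_points = [(t, *sensitivity_specificity(t)) for t in (0.25, 0.45, 0.65, 0.85)]

for threshold, sens, spec in roc_points:
    print(f"threshold={threshold:.2f} sensitivity={sens:.1f} specificity={spec:.1f}")
# threshold=0.25 sensitivity=1.0 specificity=0.4
# threshold=0.45 sensitivity=0.8 specificity=0.6
# threshold=0.65 sensitivity=0.6 specificity=0.8
# threshold=0.85 sensitivity=0.2 specificity=1.0
```

Raising the threshold lowers sensitivity and raises specificity, so no single threshold maximizes both.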

Sensitivity and Specificity
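• Sensitive tests: useful for screening; the test
is usually positive if disease is present, so a
negative result helps rule disease out
• Specific tests: useful for confirming a
diagnosis; the test is usually negative if
disease is absent, so a positive result helps
rule disease in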

Diagnostic Testing
• It is important to note that there are
instances where diagnostic testing will not
significantly alter the posttest probability
relative to the pretest probability.

Diagnostic Testing
• Diagnostic testing may be less useful in
instances of very low or very high
probability.
• Diagnostic tests may be thought of as most
useful in instances of intermediate
probability.
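
A small numerical sketch of this point: the same positive test result shifts the probability of disease most when the pre-test probability is intermediate. The likelihood ratio of 5 and the three pre-test probabilities are hypothetical, and the update is done on the odds scale.

```python
def post_test_probability(pretest_prob, likelihood_ratio):
    """Update a pre-test probability with a likelihood ratio via odds."""
    pretest_odds = pretest_prob / (1.0 - pretest_prob)
    posttest_odds = likelihood_ratio * pretest_odds
    return posttest_odds / (1.0 + posttest_odds)

likelihood_ratio = 5.0  # hypothetical LR for a positive result

# Absolute change in probability produced by the same positive result.
changes = {}
for p1 in (0.01, 0.5, 0.99):
    p2 = post_test_probability(p1, likelihood_ratio)
    changes[p1] = p2 - p1
    print(f"pretest={p1:.2f} posttest={p2:.3f} change={changes[p1]:+.3f}")
# pretest=0.01 posttest=0.048 change=+0.038
# pretest=0.50 posttest=0.833 change=+0.333
# pretest=0.99 posttest=0.998 change=+0.008
```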
