Statistics for Anaesthesiologists covers basic to intermediate statistics for researchers, focusing on the study designs and tests commonly used in anaesthesiology research.
2. Recommended Software
• RStudio (GUI) with R, R Commander, and R Commander plugins like EZR (free, cross-platform, powerful programming paradigm)
• G*Power (Free, for power analysis)
• SPSS (Commercial, expensive)
• SOFA (Free, basic)
• GraphPad (graphpad.com)
• Spreadsheet software like MS Excel for initial data entry (export as CSV file format)
3. Data Types
• Nominal or Categorical data
• Ordinal data
• Interval data
• Ratio data
4. Data Types
Nominal: categorical data and numbers used simply as identifiers or names. Ex: social security (Aadhar) number
Ordinal: an ordered series of relationships or rank order. Ex: first, second, or third place in a contest; Likert scale
Interval: a scale that represents quantity and has equal units, but for which zero is simply an additional point of measurement. Ex: Fahrenheit scale
Ratio: similar to the interval scale, but also has an absolute zero (no numbers exist below zero). Ex: height, weight
7. Reporting data types
OK to compute                     | Nominal | Ordinal | Interval | Ratio
Frequency distribution            | Yes     | Yes     | Yes      | Yes
Median, percentiles               | No      | Yes     | Yes      | Yes
Mean, SD, SE of mean              | No      | No      | Yes      | Yes
Ratio or coefficient of variation | No      | No      | No       | Yes
8. Tests for normality of data
• Kolmogorov-Smirnov Test – inferior to the others; relies on goodness of fit of a sample to a normal distribution curve; avoid its use!
• Shapiro-Wilk Test – better, more specific, more powerful especially with small sample sizes; available in R Commander and SPSS (under menu Analyze > Descriptive Statistics > Explore)
9. Tests for normality of data
• D'Agostino-Pearson test
• Anderson-Darling test
• Q-Q (Quantile-Quantile) Plot – visual guide
• Histogram – inferior, look for Skew or Kurtosis
• Density Plot – better, look for Skew or Kurtosis
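As a minimal sketch of the Shapiro-Wilk test using Python's SciPy (the deck recommends R/EZR and SPSS, where the equivalent is shapiro.test or the Explore menu; the sample data here are simulated, not from the deck):

```python
# Shapiro-Wilk normality check on a simulated normal and a clearly
# skewed (exponential) sample; p < 0.05 rejects the null hypothesis
# that the data were sampled from a Gaussian distribution.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
normal_sample = rng.normal(loc=100, scale=15, size=25)  # simulated normal data
skewed_sample = rng.exponential(scale=2.0, size=25)     # clearly non-normal

for name, sample in [("normal", normal_sample), ("skewed", skewed_sample)]:
    w, p = stats.shapiro(sample)
    print(f"{name}: W = {w:.3f}, p = {p:.4f}")
```

For small samples (n < 30) it is worth pairing the test with a Q-Q or density plot, as the slides advise, rather than relying on the p-value alone.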
10. Choosing a statistical test
• Make sure you have adequate sample size (power)
to reject null hypothesis (Ho)
• Check whether it is a one-tailed (only < or > μ, one direction) or two-tailed comparison (≠ μ, test significance on both sides) – in general use two-tailed
• Look at your data types – ordinal, interval etc
• Do descriptive statistics testing
11. Choosing a statistical test
• Test normality of data – tests and visual
comparison (especially when n<30)
• Decide between parametric and non-parametric tests
• Look at the number of groups (2 or more) – t-test (if n < 30), z-test (n > 30), or ANOVA (F-test), or their non-parametric equivalents
• For 2 or more groups, check whether the data are paired or independent
14. What is p-value?
• The p-value is a probability computed from the test statistic's sampling distribution under the null hypothesis (the null distribution – we first assume Ho is true!)
• The (left-tailed) p-value is the quantile of the value of the
test statistic, the right-tailed p-value is one minus the
quantile, while the two-tailed p-value is twice whichever of
these is smaller.
• The p-value is NOT the probability that the null hypothesis
is true, nor is it the probability that the alternative
hypothesis is false
15. What is p-value?
• p-value is NOT the same as α !
• p-value is NOT the probability of rejecting the null hypothesis (we reject Ho when the p-value is less than the significance level, α)
• p-value is computed while α is set by experimental
design
• If Ho is true, α is the probability of rejecting null
hypothesis
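The last point can be checked with a small simulation, sketched here in Python/SciPy (the trial counts and sample sizes are arbitrary choices for illustration): when Ho is true, roughly a fraction α of experiments reject it.

```python
# Simulate many experiments where the null hypothesis is TRUE
# (both groups drawn from the same distribution) and count how
# often an unpaired t-test rejects at alpha = 0.05.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
alpha, trials, rejections = 0.05, 2000, 0

for _ in range(trials):
    a = rng.normal(size=20)
    b = rng.normal(size=20)  # same distribution: Ho is true
    _, p = stats.ttest_ind(a, b)
    rejections += p < alpha

print(f"Rejection rate under Ho: {rejections / trials:.3f} (close to {alpha})")
```

This is exactly the sense in which α, not the p-value, is the probability of rejecting Ho when Ho is true.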
21. CHI SQUARE OR FISHER’S EXACT TEST?
• In the days before computers were readily available, people analyzed contingency tables by hand, or using a calculator, using chi-square tests
• The test works by computing the expected values for each cell if the relative risk (or odds ratio) were 1.0. It then combines the discrepancies between observed and expected values into a chi-square statistic from which a p-value is computed
25. CHI SQUARE OR FISHER’S EXACT TEST?
• The chi-square test is only an approximation!
• Yates' continuity correction is designed to make it better, but it overcorrects and so gives a p-value that is too large (too 'conservative')
• With large sample sizes, Yates' correction makes
little difference, and the chi-square test works very
well. With small sample sizes, chi-square is not
accurate, with or without Yates' correction
26. CHI SQUARE OR FISHER’S EXACT TEST?
• Fisher's exact test, as its name implies, always gives an
exact P value and works fine with small sample sizes
• Fisher's test (unlike chi-square) is very hard to calculate by hand (so manual use is generally limited to 2 x 2 or 2 x n tables), but it is easy to compute with a computer
• Advisable to use when any cell of the table has expected
value < 5
27. CHI SQUARE OR FISHER’S EXACT TEST?
• Most statistics books advise using it instead of the chi-square test (especially for small samples; chi-square becomes acceptable for large sample sizes)
• Fisher's exact test can be used for an m x n table
• Some have criticized it as the exact answer to the wrong question!
30. ANOVA (ANALYSIS OF VARIANCE)
• The one-way analysis of variance (ANOVA) is used to
determine whether there are any significant differences
between the means of two or more independent
(unrelated) groups
• For example, to understand if exam performance (dependent variable) differed based on test anxiety levels amongst students, dividing students into three independent groups (e.g., low-, medium- and high-stressed students)
33. ANOVA (ANALYSIS OF VARIANCE)
• It is an omnibus test statistic and cannot tell you which
specific groups were significantly different from each
other; it only tells you that at least two groups were
different.
• Since you may have ≥3 groups in your study design,
determining which of these groups differ from each other
is done using a Post-hoc test (Tukey’s test is preferred)
which gives a Multiple comparisons table.
34. ANOVA (ANALYSIS OF VARIANCE)
• To apply ANOVA, 6 assumptions must be met:
• Assumption #1: Your dependent variable should be measured at the interval or ratio level (i.e., continuous)
• Assumption #2: Your independent variable should consist of two or more categorical, independent groups; it can be used for just two groups (but an independent-samples t-test is more commonly used for two groups)
35. ANOVA (ANALYSIS OF VARIANCE)
• Assumption #3: You should have independence of observations,
which means that there is no relationship between the
observations in each group or between the groups themselves.
• Assumption #4: There should be no significant outliers.
• Assumption #5: Your dependent variable should be approximately
normally distributed for each category of the independent variable
(but it is quite "robust" to violations of normality)
• Assumption #6: There needs to be homogeneity of variances. (in
SPSS using Levene's test for homogeneity of variances)
36. ANOVA (ANALYSIS OF VARIANCE) METHOD
• ANOVA calculates the mean for each of the groups - the
Group Means.
• It calculates the mean for all the groups combined - the
Overall Mean.
• Then it calculates, within each group, the total deviation of each individual's score from the Group Mean – the Within-Group (Error) Variation.
37. ANOVA (ANALYSIS OF VARIANCE) METHOD
• Next, it calculates the deviation of each Group Mean
from the Overall Mean - Between Group Variation.
• Finally, ANOVA produces the F statistic, which is the ratio of the Between-Group Variation to the Within-Group (Error) Variation.
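The method above reduces, in software, to a single call. Here is a minimal sketch with SciPy on three invented groups (e.g., the low/medium/high anxiety example from earlier):

```python
# One-way ANOVA on three illustrative groups (values invented for the sketch).
# F = Between-Group Variation / Within-Group (Error) Variation.
from scipy import stats

low    = [85, 88, 90, 79, 92]
medium = [78, 74, 80, 82, 76]
high   = [65, 70, 60, 72, 68]

f_stat, p_value = stats.f_oneway(low, medium, high)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")
```

Remember the test is omnibus: a significant F only says at least two group means differ, so a post-hoc test (e.g., Tukey's) is still needed to locate the difference.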
39. TWO-WAY ANOVA DESIGN
Treatment/Conditi
on (Independent)
Levels (Independent Variable)
Group3
S6 DV
S11 DV
S2 DV
S7 DV
S12 DV
S3 DV
S8 DV
S13 DV
S4 DV
S9 DV
S14 DV
S5 DV
S10 DV
S15 DV
S16 DV
S21 DV
S26DV
S17 DV
CONDITION2
Group2
S1 DV
CONDITION1
Group1
S22 DV
S27 DV
S18 DV
S23 DV
S28 DV
S19 DV
S24 DV
S29 DV
S20 DV
S25 DV
S30 DV
40. ANCOVA (ANALYSIS OF COVARIANCE)
• An extension of the one-way ANOVA used to determine whether
there are any significant differences between the means of two or
more independent (unrelated) groups (specifically, the adjusted
means) by adjusting for a third or confounding variable
• The third variable (known as a "covariate" or "confounding variable") is one that you want to statistically control because it may be affecting the results of the ANOVA
• In each of the two groups we can compute the correlation coefficient between the third variable and the dependent variable
41. REPEATED MEASURES ANOVA
• A repeated measures ANOVA is used when you have a
single group on which you have measured something a few
times
• For example, you may have a test of understanding of
Classes. You give this test at the beginning of the topic, at
the end of the topic and then at the end of the subject
• You would use a one-way repeated measures ANOVA to see
if student performance on the test changed over time
42. REPEATED MEASURES ANOVA
• Repeated measures ANOVA is the equivalent of the one-way
ANOVA, but for related, not independent groups, and is the
extension of the dependent t-test
• A repeated measures ANOVA is also referred to as a within-subjects ANOVA or an ANOVA for correlated samples
• The major advantage with running a repeated measures ANOVA
over an independent ANOVA is that the test is generally much
more powerful. This particular advantage is achieved by the
reduction in variability (due to differences between subjects) during
the performance of the test
46. Variable type & CHOOSING A Test
Explanatory Variable | Response Variable | Methods
Categorical          | Categorical       | Contingency Tables
Categorical          | Quantitative      | ANOVA
Quantitative         | Quantitative      | Regression
47. ANOVA – WHY NOT JUST USE t-TESTS?
• Multiple t-tests are not the answer because as the number of groups grows, the number of needed pair comparisons grows quickly. For example, with 7 groups there are 21 pairs. If we test 21 pairs we should not be surprised to observe things that happen only 5% of the time. Thus in 21 pairings, a p-value = 0.05 for one pair cannot be considered significant.
• Our level of significance α has to be divided for multiple comparisons (Ex: for the above it becomes α/21)
• ANOVA puts all the data into one number (F) and gives us one p-value for the null hypothesis.
48. ANOVA – WHY NOT JUST USE t-TESTS?
From eBook: Research skills for Psychology Majors by
William Gabrenya
50. Likert ITEM & LIKERT Scale
• A Likert scale consists of multiple Likert-type items
• Likert-type scales (such as "On a scale of 1 to 10, with one being no pain and ten being high pain, how much pain are you in today?")
• These represent ordinal data (order and rank, but no real distance)
51. Likert ITEM & LIKERT Scale
• Fundamentally, these scales do not represent a
measurable quantity
• An individual may respond 8 and be in less pain
than someone else who responded 5
• A person responding 4 may not be in exactly half as much pain as a person responding 8
• Visual Analog Scale is a Likert scale but often
(wrongly) analyzed as if it were continuous data
52. COMPOSITE SCORE & LIKERT Scale
• Composite scores combine multiple Likert item
scales into a single scale
• Composite scores must first be analyzed for
internal consistency and inter-item correlation for
each item and reported (ex: using Cronbach’s
alpha – scale reliability analysis)
• These scores represent ordinal data, so non-parametric tests and descriptive statistics must be used
53. Cronbach’s Alpha For scales
• Checks the internal consistency and overall validity of a multiple Likert-type item scale
• Check α with each item deleted, one at a time
• Based on the number of items and a comparison of their variances
54. Cronbach’s Alpha For scales
• Values of α range from 0 to 1
• Ideally overall α and α for each item (when
deleted from scale) must be > 0.7 to 0.8
• Clinical scores need higher α > 0.8 to 0.9
(Bland-Altman)
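The standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of the total score), is easy to compute directly. A minimal Python/NumPy sketch (the 5-respondent, 4-item scores are invented; statistics packages such as SPSS or R's psych package report the same quantity in their reliability analyses):

```python
# Cronbach's alpha from first principles:
# alpha = k/(k-1) * (1 - sum(item variances) / variance(total score))
import numpy as np

def cronbach_alpha(items):
    """items: 2-D array-like, rows = respondents, columns = Likert items."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)
    total_variance = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

# Invented 5-respondent, 4-item scale for illustration
scores = np.array([[4, 5, 4, 4],
                   [3, 3, 2, 3],
                   [5, 5, 5, 4],
                   [2, 2, 3, 2],
                   [4, 4, 4, 5]])
print(f"alpha = {cronbach_alpha(scores):.3f}")  # aim for > 0.7 to 0.8
```

Perfectly correlated items give alpha = 1; uncorrelated items drive it toward 0, which matches the 0-to-1 range described above.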
55. Power analysis & effect size
• To calculate sample size (n) we must know the type
of statistical test involved in our primary outcome
measure
• We must also know:
• Desired α error (usually taken as 0.05)
• Power (1-β) usually taken 0.8 (80%) or greater
• Two or one-tailed comparison
• Effect size
56. Power analysis & effect size
• Power is the fraction of experiments that you expect to yield a "statistically significant" p-value (e.g., 80% of such experiments would yield a significant p-value)
• Effect size (Cohen’s d for mean) depends on study design,
it is calculated by data from pilot studies or reference
studies
• Effect size depends on a clinically defined level of
significance (ex: more than 20% difference between 2
groups, with difference for proportion or mean ± SD data
etc)
57. Power analysis & effect size
• Cohen's d is usually calculated from pilot studies, but if the effect size is unknown, Jacob Cohen provided 3 rough rule-of-thumb effect sizes (the values vary slightly for different statistical tests):
1. Small effect: d around 0.2 (requires large sample sizes)
2. Medium effect: d around 0.5 (seen with careful observation; use when in doubt)
3. Large effect: d greater than 0.8 (if large, it is obvious)
• This usage of d has been criticized as "T-shirt" effect sizes
58. Power analysis & effect size
• Calculation of the required sample size, with a set target for power, before starting the final study is called a priori analysis (before the fact) – the accepted method, especially important to avoid incorrectly being "blind" to a real difference in a negative study (due to a large β error)
• Calculation of the required sample size at the end of the final study is called post hoc analysis (after the fact) – incorrect, as the computed power is a simple reflection of the p-value!
• G*Power software is a free useful resource
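As a rough cross-check on what G*Power reports, the per-group n for a two-tailed, two-group comparison can be approximated with the normal-approximation formula n ≈ 2 * ((z(1-α/2) + z(1-β)) / d)². This Python/SciPy sketch uses that approximation only (G*Power additionally applies small-sample t-distribution corrections, so its answers are slightly larger):

```python
# A priori sample-size sketch for a two-group, two-tailed comparison
# using the normal approximation: n per group = 2 * ((z_a + z_b) / d)^2
import math
from scipy.stats import norm

def n_per_group(d, alpha=0.05, power=0.80):
    """d is Cohen's d; returns subjects needed per group (rounded up)."""
    z_alpha = norm.ppf(1 - alpha / 2)  # approx. 1.96 for alpha = 0.05
    z_beta = norm.ppf(power)           # approx. 0.84 for 80% power
    return math.ceil(2 * ((z_alpha + z_beta) / d) ** 2)

for d in (0.2, 0.5, 0.8):  # Cohen's small / medium / large
    print(f"d = {d}: about {n_per_group(d)} subjects per group")
```

Note how strongly n depends on d: a small effect (d = 0.2) needs many times the sample of a large one (d = 0.8), which is why pilot-study effect sizes matter so much.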
Editor's notes
Nominal: Categorical data and numbers that are simply used as identifiers or names represent a nominal scale of measurement. Numbers on the back of a football jersey and your social security (Aadhar) number are examples.
Ordinal: An ordinal scale of measurement represents an ordered series of relationships or rank order. Individuals competing in a contest may be fortunate to achieve first, second, or third place.
Likert-type scales (such as "On a scale of 1 to 10, with one being no pain and ten being high pain, how much pain are you in today?") also represent ordinal data. Fundamentally, these scales do not represent a measurable quantity. An individual may respond 8 to this question and be in less pain than someone else who responded 5. A person may not be in exactly half as much pain if they responded 4 than if they responded 8.
Interval: A scale that represents quantity and has equal units but for which zero represents simply an additional point of measurement is an interval scale. The Fahrenheit scale is a clear example of the interval scale of measurement. Thus, 60 degrees Fahrenheit or -10 degrees Fahrenheit represent interval data. Zero does not represent the absolute lowest value. Rather, it is a point on the scale with numbers both above and below it (for example, -10 degrees Fahrenheit).
Ratio: The ratio scale of measurement is similar to the interval scale in that it also represents quantity and has equality of units. However, this scale also has an absolute zero (no numbers exist below zero). Very often, physical measures will represent ratio data (for example, height and weight). If one is measuring the length of a piece of wood in centimeters, there is quantity, equal units, and that measure cannot go below zero centimeters.
Parametric means that it meets certain requirements with respect to parameters of the population (for example, the data will be normal--the distribution parallels the normal or bell curve). In addition, it means that numbers can be added, subtracted, multiplied, and divided. Parametric data are analyzed using statistical techniques identified as Parametric Statistics. As a rule, there are more statistical technique options for the analysis of parametric data and parametric statistics are considered more powerful than nonparametric statistics.
Nonparametric data are lacking those same parameters and cannot be added, subtracted, multiplied, and divided. For example, it does not make sense to add Social Security numbers to get a third person. Nonparametric data are analyzed by using Nonparametric Statistics.
The normality tests all report a P value. To understand any P value, you need to know the null hypothesis. In this case, the null hypothesis is that all the values were sampled from a Gaussian distribution. The P value answers the question:
If that null hypothesis were true, what is the chance that a random sample of data would deviate from the Gaussian ideal as much as these data do?
If n>30 Z-test is better
Because the one-tailed test provides more power to detect an effect, you may be tempted to use a one-tailed test whenever you have a hypothesis about the direction of an effect. Before doing so, consider the consequences of missing an effect in the other direction. Imagine you have developed a new drug that you believe is an improvement over an existing drug. You wish to maximize your ability to detect the improvement, so you opt for a one-tailed test. In doing so, you fail to test for the possibility that the new drug is less effective than the existing drug. The consequences in this example are extreme, but they illustrate a danger of inappropriate use of a one-tailed test.
Degrees of freedom:
For the z-test, degrees of freedom are not required, since z-scores of 1.96 and 2.58 are used for 5% and 1% respectively. For the equal-variance t-test, df = (n1 + n2) - 2 (Welch's unequal-variance t-test uses an adjusted df). For the paired-sample t-test, df = number of pairs - 1.
In the approach of Ronald Fisher, the null hypothesis H0 will be rejected when the p-value of the test statistic is sufficiently extreme (vis-a-vis the test statistic's sampling distribution) and thus judged unlikely to be the result of chance. In a one-tailed test, "extreme" is decided beforehand as either meaning "sufficiently small" or meaning "sufficiently large" – values in the other direction are considered not significant. In a two-tailed test, "extreme" means "either sufficiently small or sufficiently large", and values in either direction are considered significant
The p-value of Karl Pearson's chi-squared test is computed differently (from the chi-square distribution).
When we test quantitative data, we need to see whether the data is normally distributed or non-normally distributed. Use Spearman's correlation for ordinal or non-normally distributed data, and Pearson's correlation for normally distributed data.
When we have a normal distribution, look at the number of groups and whether the data is paired. Paired data is where each patient in one group is matched against a similar patient in the other group. If we have two groups and the data is unpaired, use Student's t-test. If there are two groups and the data is paired, use the paired Student's t-test. If there are > 2 groups, use ANOVA for paired or unpaired data as appropriate.
In a non-normal distribution, if there are two groups and the data is unpaired, use the Mann-Whitney U test, and if the data is paired, use the Wilcoxon Signed Rank Sum test. If there are > 2 groups and the data is unpaired, use the Kruskal-Wallis test, and if the data is paired, use Friedman's test.
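These decision rules map directly onto SciPy calls; a sketch with invented example data (in R these would be wilcox.test, kruskal.test, and friedman.test):

```python
# Non-parametric test choices for non-normal data, per the rules above.
from scipy import stats

# Two unpaired groups -> Mann-Whitney U test
drug = [3, 5, 4, 6, 2, 5]
placebo = [7, 8, 6, 9, 7, 8]
u, p_u = stats.mannwhitneyu(drug, placebo, alternative="two-sided")

# Two paired groups -> Wilcoxon signed-rank test
before = [5, 7, 6, 8, 7, 9]
after  = [4, 6, 5, 7, 5, 8]
w, p_w = stats.wilcoxon(before, after)

# > 2 unpaired groups -> Kruskal-Wallis test
h, p_h = stats.kruskal([1, 2, 3], [4, 5, 6], [7, 8, 9])

print(f"Mann-Whitney p = {p_u:.4f}, Wilcoxon p = {p_w:.4f}, Kruskal-Wallis p = {p_h:.4f}")
```

For > 2 paired groups, stats.friedmanchisquare covers the Friedman test in the same style.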
Chi square distribution for different degree of freedom df = (Row-1) x (Column-1)
Fisher's exact test uses binomial coefficients and factorials to compute the exact probability of the observed table.
If there is more than one factor (more than one-way) for testing means between groups, it is called Factorial ANOVA.
Null hypothesis in ANOVA is means of all the groups are equal
Omnibus (one for all)
Dependent variable depends on the Independent variable (it’s the effect) while Independent variable (the cause being tested) doesn’t depend on any other variable in the experiment but is directly controlled by the researcher.
If the Between Group Variation is significantly greater than the Within Group Variation, then it is likely that there is a statistically significant difference between the groups.
From http://web.utah.edu/stat/introstats/anovaflash.html
S = SUBJECT DV = DEPENDENT VARIABLE
If multiple covariates are present we compute multiple correlation coefficients (Multiple Regression)
If there are two factors & repeated measures are done then a two-way ANOVA of repeated measures is done
The important point is that the same people are being measured more than once on the same dependent variable (i.e., why it is called repeated measures).
Unfortunately, repeated measures ANOVAs are particularly susceptible to violating the assumption of sphericity, which causes the test to become too liberal (i.e., leads to an increase in the Type I error rate; that is, the likelihood of detecting a statistically significant result when there isn't one). Fortunately, SPSS makes it easy to test whether your data has met or failed this assumption with Mauchly's Test of Sphericity
Once an overall significant difference in means is detected we have to do a pairwise comparison with Post hoc Bonferroni test to discover specific means that differ
Dependent variable measured in same subjects at different times (independent variable)
Dependent variable measured in same subjects at different times (independent variable) compared for two factors
There is a within-subjects factor (Time) and a between-subjects factor (group)
There will be increased Type 1 errors
In ANOVA effect size is by Partial Eta squared
Cohen’s d is mean difference divided by pooled standard deviation
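That definition is simple to compute directly; a plain-Python sketch (the two "pilot study" groups are invented numbers for illustration):

```python
# Cohen's d: mean difference divided by the pooled standard deviation.
import math

def cohens_d(group1, group2):
    n1, n2 = len(group1), len(group2)
    m1 = sum(group1) / n1
    m2 = sum(group2) / n2
    # Sample variances (ddof = 1)
    v1 = sum((x - m1) ** 2 for x in group1) / (n1 - 1)
    v2 = sum((x - m2) ** 2 for x in group2) / (n2 - 1)
    pooled_sd = math.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))
    return (m1 - m2) / pooled_sd

# Invented pilot-study data
pilot_a = [12.1, 11.4, 13.0, 12.6, 11.9]
pilot_b = [10.2, 10.9, 9.8, 10.5, 11.1]
print(f"Cohen's d = {cohens_d(pilot_a, pilot_b):.2f}")
```

The resulting d is what feeds the a priori power analysis described in slides 55-58.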