Kappa
Kappa is a correlation-like statistic for measuring agreement on two
or more diagnostic categories between two or more clinicians or methods.
Why not use % agreement?
Because a good deal of agreement can occur by chance alone.
Kappa can be defined as the proportion of agreements after chance
agreement is removed.
A Kappa of 0 occurs when agreement is no better than chance.
A Kappa of 1 indicates perfect agreement.
A negative Kappa means there is less agreement than you'd expect by chance (very rare).
Categories may be ordinal or nominal.
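In symbols, this "chance removed" definition is the standard one, with p_o the observed proportion of agreement and p_e the proportion of agreement expected by chance:

\kappa = \frac{p_o - p_e}{1 - p_e}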
Steps
1. Add the agreements: 2 + 4 + 2 = 8
2. For each category, multiply the number of times each judge
used it (Judge 1's total × Judge 2's total):
(1 × 0) + (2 × 3) + (6 × 5) + (3 × 4)
3. Add the products up: 0 + 6 + 30 + 12 = 48
4. Apply the formula below
\kappa = \frac{N \times \text{agreements} - \text{sum from step 3}}{N^2 - \text{sum from step 3}} = \frac{(12 \times 8) - 48}{144 - 48} = \frac{96 - 48}{96} = \frac{48}{96} = 0.50
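A short Python sketch of the same computation. The full ratings table isn't reproduced in these notes, so the per-category totals for each judge and the agreement count below are taken from the worked numbers above (assumed to come from a 4-category table for 12 subjects):

```python
# Cohen's kappa via the computational formula used above:
# kappa = (N * agreements - sum_i R_i*C_i) / (N**2 - sum_i R_i*C_i)

judge1_totals = [1, 2, 6, 3]   # times Judge 1 used each of the 4 categories
judge2_totals = [0, 3, 5, 4]   # times Judge 2 used each of the 4 categories
agreements = 8                 # step 1: 2 + 4 + 2
N = sum(judge1_totals)         # 12 subjects rated by both judges

# Steps 2-3: multiply the judges' totals per category and add up.
products = sum(r * c for r, c in zip(judge1_totals, judge2_totals))  # 48

# Step 4: apply the formula.
kappa = (N * agreements - products) / (N**2 - products)
print(kappa)  # 0.5
```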
How large should Kappa be?
Landis & Koch (1977) suggested:
0.00 – 0.20 = no or slight agreement
0.21 – 0.40 = fair
0.41 – 0.60 = moderate
0.61 – 0.80 = good
> 0.80 = very good
Weighted Kappa
In ordinary Kappa, all disagreements are treated
equally. Weighted Kappa takes the magnitude of the
discrepancy into account, which is often the more useful
choice for ordinal categories, and it is often higher than
unweighted Kappa; see the sketch below.
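A minimal sketch using scikit-learn's cohen_kappa_score, which implements both plain and weighted Kappa. The two rating vectors here are made-up ordinal data for illustration, not the table from the worked example:

```python
from sklearn.metrics import cohen_kappa_score

# Hypothetical ordinal ratings (e.g. severity 1-4) from two judges.
judge1 = [1, 2, 2, 3, 3, 3, 4, 4, 2, 1, 3, 4]
judge2 = [1, 2, 3, 3, 2, 3, 4, 3, 2, 2, 3, 4]

# Unweighted: every disagreement counts the same.
print(cohen_kappa_score(judge1, judge2))

# Linear weights: a 1-step disagreement (2 vs 3) is penalised less
# than a 3-step one (1 vs 4), so kappa is typically higher.
print(cohen_kappa_score(judge1, judge2, weights="linear"))
```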
N.B. Be careful with Kappa if the prevalence of
one of the categories is very low (< 10%); this
will cause Kappa to underestimate the level of agreement.
Example:
If 2 judges are very accurate (95%), a Kappa of
0.61 with a prevalence of 10% will drop to
• 0.45 if prevalence is 5%
• 0.14 if prevalence is 1%.
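These figures can be reproduced analytically. A sketch, assuming a binary category, two judges who are each correct with probability 0.95, and judges' errors independent of each other:

```python
def expected_kappa(prevalence, accuracy=0.95):
    """Expected Cohen's kappa for two equally accurate, independent
    judges rating a binary category with the given prevalence."""
    p, a = prevalence, accuracy
    # A judge labels a case positive if it is a true positive rated
    # correctly, or a true negative rated incorrectly.
    q = p * a + (1 - p) * (1 - a)
    # The judges agree on a case iff both are right or both are wrong.
    p_o = a**2 + (1 - a)**2
    # Chance agreement from the two (identical) marginals.
    p_e = q**2 + (1 - q)**2
    return (p_o - p_e) / (1 - p_e)

for prev in (0.10, 0.05, 0.01):
    print(f"prevalence {prev:.0%}: kappa = {expected_kappa(prev):.2f}")
# prevalence 10%: kappa = 0.61
# prevalence 5%: kappa = 0.45
# prevalence 1%: kappa = 0.14
```

Note that the observed % agreement is 90.5% in every case here; only the chance correction changes with prevalence, which is exactly why raw % agreement is misleading for rare categories.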