Kappa

A correlation-like statistic for measuring agreement on two or more
diagnostic categories between two or more clinicians or methods.

Why not use % agreement?

Because a substantial amount of agreement can occur by chance alone.

Kappa can be defined as the proportion of agreements remaining after
chance agreement is removed.

A Kappa of 0 occurs when agreement is no better than chance.
A Kappa of 1 indicates perfect agreement.


A negative Kappa means that there is less agreement than you would
expect by chance (very rare).

Categories may be ordinal or nominal.
How is it calculated?

Patient ID   Psychiatrist   Psychologist
1            1              2
2            2              2
3            2              2
4            3              3
5            3              3
6            3              3
7            3              3
8            3              4
9            3              4
10           4              4
11           4              4
12           4              3
Cross-tabulating the ratings (rows = Psychiatrist, columns = Psychologist):

Category    1    2    3    4
    1       0    1    0    0
    2       0    2    0    0
    3       0    0    4    2
    4       0    0    1    2
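To make the tabulation concrete, here is a minimal sketch in plain Python (an illustration, not part of the original slides; the variable names are mine) that rebuilds the table from the two rating lists:

```python
# Rebuild the Psychiatrist x Psychologist cross-tabulation from the raw
# ratings listed in the patient table above.
from collections import Counter

psychiatrist = [1, 2, 2, 3, 3, 3, 3, 3, 3, 4, 4, 4]
psychologist = [2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 3]

table = Counter(zip(psychiatrist, psychologist))  # (row, column) -> count
for row in (1, 2, 3, 4):
    print(row, [table[row, col] for col in (1, 2, 3, 4)])
# 1 [0, 1, 0, 0]
# 2 [0, 2, 0, 0]
# 3 [0, 0, 4, 2]
# 4 [0, 0, 1, 2]
```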
Steps

1. Add the agreements on the diagonal: 2 + 4 + 2 = 8
2. For each category, multiply the number of times one judge used it by
   the number of times the other judge used it:
   (1 × 0) + (2 × 3) + (6 × 5) + (3 × 4)
3. Add these products up: 0 + 6 + 30 + 12 = 48
4. Apply the formula
4. Apply formula
Kappa = (N × agreements − sum from step 3) / (N² − sum from step 3)

      = (12 × 8 − 48) / (12² − 48)
      = (96 − 48) / 96
      = 48 / 96
      = 0.50

(Dividing numerator and denominator by N² shows this is the familiar
κ = (Pₒ − Pₑ) / (1 − Pₑ), with observed agreement Pₒ = 8/12 and
chance-expected agreement Pₑ = 48/144.)
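The same arithmetic, written out as a small Python sketch (plain Python, no libraries; the variable names are my own):

```python
psychiatrist = [1, 2, 2, 3, 3, 3, 3, 3, 3, 4, 4, 4]
psychologist = [2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 3]

n = len(psychiatrist)                          # N = 12
categories = sorted(set(psychiatrist) | set(psychologist))

# Step 1: agreements (the diagonal of the cross-tabulation).
agreements = sum(a == b for a, b in zip(psychiatrist, psychologist))   # 8

# Steps 2-3: sum over categories of
# (times judge 1 used it) x (times judge 2 used it).
chance_sum = sum(psychiatrist.count(c) * psychologist.count(c)
                 for c in categories)          # (1*0)+(2*3)+(6*5)+(3*4) = 48

# Step 4: Kappa = (N * agreements - chance_sum) / (N^2 - chance_sum).
kappa = (n * agreements - chance_sum) / (n ** 2 - chance_sum)
print(kappa)                                   # 0.5
```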
How large should Kappa be?

Landis & Koch (1977) suggested:

0.00 – 0.20 = no or slight agreement
0.21 – 0.40 = fair
0.41 – 0.60 = moderate
0.61 – 0.80 = good
> 0.80 = very good
Weighted Kappa

In ordinary Kappa, all disagreements are treated equally. Weighted Kappa
takes the magnitude of the discrepancy into account (often the most
useful choice when the categories are ordinal) and is often higher than
unweighted Kappa.
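As an illustration, here is weighted Kappa on the same twelve ratings. The slides do not specify a weighting scheme, so one common choice, linear disagreement weights w = |i − j|, is assumed:

```python
# Linearly weighted Kappa: each disagreement is penalised in proportion
# to how far apart the two assigned categories are.
psychiatrist = [1, 2, 2, 3, 3, 3, 3, 3, 3, 4, 4, 4]
psychologist = [2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 3]
n = len(psychiatrist)
cats = sorted(set(psychiatrist) | set(psychologist))

# Observed weighted disagreement: |i - j| summed over the rated pairs.
observed = sum(abs(a - b) for a, b in zip(psychiatrist, psychologist))

# Expected weighted disagreement under chance, from the marginal totals.
expected = sum(abs(i - j) * psychiatrist.count(i) * psychologist.count(j) / n
               for i in cats for j in cats)

kappa_w = 1 - observed / expected
print(round(kappa_w, 2))   # 0.62 -- higher than the unweighted 0.50
```

Consistent with the note above, the weighted value (about 0.62) exceeds the unweighted 0.50, because most of the disagreements here are only one category apart.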
N.B. Be careful with Kappa if the prevalence of one of the categories
is very low (< 10%); this will underestimate the level of agreement.

Example:

If two judges are each very accurate (95%), a Kappa of 0.61 at a
prevalence of 10% will drop to

  • 0.45 if prevalence is 5%
  • 0.14 if prevalence is 1%.
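A small simulation reproduces these figures approximately. The setup is my construction, not taken from the slides: a binary diagnosis, and two raters who each independently label a case correctly 95% of the time.

```python
import random

def kappa(r1, r2):
    """Unweighted Cohen's Kappa for two equal-length lists of ratings."""
    n = len(r1)
    agreements = sum(a == b for a, b in zip(r1, r2))
    chance = sum(r1.count(c) * r2.count(c) for c in set(r1) | set(r2))
    return (n * agreements - chance) / (n ** 2 - chance)

def rate(truth, accuracy=0.95):
    """A rater who labels each case correctly with probability `accuracy`."""
    return [t if random.random() < accuracy else not t for t in truth]

random.seed(1)
n_cases = 100_000
for prevalence in (0.10, 0.05, 0.01):
    truth = [random.random() < prevalence for _ in range(n_cases)]
    print(prevalence, round(kappa(rate(truth), rate(truth)), 2))
# Approximate output: 0.1 0.61 / 0.05 0.45 / 0.01 0.14
```

Percentage agreement barely moves across the three runs (it stays near 90%); it is the chance-expected agreement that rises as one category dominates, and that is what drags Kappa down.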
