Computing Kappa is a Piece of Cake
Kappa is a useful measure of agreement between two raters that corrects for the agreement you would expect by chance alone. Say you have two radiologists looking at X-rays, rating them as normal or abnormal, and you want a quantitative measure of how well they agree. Kappa is your go-to coefficient.
How do you compute it? Well, personally, I use SAS because this is the year 2015 and we have computers.
Let’s take this table as an example, where 100 X-rays were rated by two different raters:
                 Rating by Physician 2
                 Abnormal | Normal
Physician 1
--------------------------------------
Abnormal            40    |   20
Normal              10    |   30
So the first physician rated 60 X-rays as abnormal. Of those 60, the second physician rated 40 abnormal and 20 normal, and so on.
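If you want to know what number to expect before running anything, the arithmetic is short. This is a sketch using the standard definition of Cohen's Kappa. The symbols p_o (observed agreement) and p_e (agreement expected by chance) are notation introduced here for illustration, not labels from SAS output.

```latex
\begin{align*}
p_o &= \frac{40 + 30}{100} = 0.70 \\
p_e &= \frac{60}{100}\cdot\frac{50}{100} + \frac{40}{100}\cdot\frac{50}{100} = 0.50 \\
\kappa &= \frac{p_o - p_e}{1 - p_e} = \frac{0.70 - 0.50}{1 - 0.50} = 0.40
\end{align*}
```

So the raters agree on 70% of the X-rays, but half of that agreement would be expected by chance given how often each one says "abnormal", which leaves a Kappa of 0.40.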
If you received the data as a SAS data set like this, with one record per X-ray and an abnormal rating = 1 and normal = 0, then life is easy and you can just run PROC FREQ.
Rater1 Rater2
1 1
1 1
and so on for 100 lines, one for each X-ray.
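For that one-record-per-X-ray case, a minimal sketch might look like the following. The data set name xrays is made up for illustration. The variable names rater1 and rater2 are assumed to match the columns shown above.

```sas
/* Hypothetical data set "xrays": one record per X-ray,            */
/* with rater1 and rater2 coded 1 = abnormal, 0 = normal.          */
PROC FREQ DATA = xrays ;
  TABLES rater1*rater2 / AGREE ;  /* AGREE requests the Kappa statistics */
RUN ;
```

No WEIGHT statement is needed here, because each record already counts as exactly one X-ray.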
However, I very often get not an actual data set but a table like the one above. In this case, it is still relatively simple to code:
/* One record per cell of the table: rater1, rater2, and the count in that cell */
DATA compk ;
INPUT rater1 rater2 nums ;
DATALINES ;
1 1 40
1 0 20
0 1 10
0 0 30
;
RUN ;
So, there were 40 X-rays coded as abnormal by both rater1 and rater2. When rater1 = 1 (abnormal) and rater2 = 0 (normal), there were 20, and so on.
The next part is easy:
PROC FREQ DATA = compk ;
TABLES rater1*rater2 / AGREE ;
WEIGHT nums ;
RUN ;
That’s it. The WEIGHT statement is necessary in this case because I did not have 100 individual records. I just had a table, so the WEIGHT variable gives the number of X-rays in each cell.
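This will work fine for a 2 x 2 table. If you have a table that is more than 2 x 2, the AGREE option also reports the weighted Kappa coefficient, which gives partial credit for near misses when the rating categories are ordered. If you also want a test of whether the weighted Kappa differs from zero, you can add this statement at the end:
TEST WTKAP ;
If you include this with a 2 x 2 table nothing happens, because the weighted Kappa coefficient and the simple Kappa coefficient are the same in this case.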
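To show where that statement goes, here is a sketch for a larger table. The data set name ratings3 is hypothetical. It is assumed to hold one record per cell of a 3 x 3 table, with rater1 and rater2 on the same ordered scale and nums holding the cell count, just like compk above.

```sas
/* Hypothetical data set "ratings3": one record per cell of a      */
/* 3 x 3 table; nums is the number of cases in that cell.          */
PROC FREQ DATA = ratings3 ;
  TABLES rater1*rater2 / AGREE ;  /* simple and weighted Kappa */
  WEIGHT nums ;                   /* cell counts, as before */
  TEST WTKAP ;                    /* test that weighted Kappa = 0 */
RUN ;
```

Everything except the TEST statement is the same as the 2 x 2 program, so moving up to more categories costs you one extra line.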
See, I told you it was simple.