A SAS Mystery Solved – When FREQ and MEANS disagree
I’m preparing a data set for analysis and since the data are scored by SAS I am double-checking to make sure that I coded it correctly. One check is to select out an item and compare the percentage who answered correctly with the mean score for that item. These should be equal since items are scored 0=wrong, 1=correct.
When I look at the output for my PROC MEANS it says that 31% of the respondents answered this item correctly, that is, mean = .310.
However, the correct answer is D, and when I look at the results from my PROC FREQ, it shows that 35% of the respondents answered ‘D’.
What is going on here? Is my program to score the tests off somewhere? Will I need to score all of these tests by hand?
I am sure those of you who are SAS gurus thought of the answer already (and if you didn’t, you’re going to be slapping your head when you read the simple solution).
By default, PROC FREQ computes percentages out of the non-missing records only. Many students who did not know the answer to the question left it blank, and they were (rightfully) given a zero when the test was automatically scored. So PROC MEANS counts them in its denominator, while PROC FREQ drops them. To get your FREQ and MEANS results to match, use the MISSING option, like so:
PROC FREQ DATA = in.score ;
TABLES item1 / MISSING ;
RUN ;
You will find that 31% of the total (including those who skipped the question) got the answer right.
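A quick way to spot this kind of mismatch is to ask PROC MEANS for its counts along with the mean. This is only a sketch: `item1_score` is a made-up name for the scored 0/1 version of the item, so substitute whatever your scoring program calls it.

```sas
/* item1_score is a hypothetical name for the scored (0/1) item */
PROC MEANS DATA = in.score N NMISS MEAN ;
  VAR item1_score ;
RUN ;
```

If the N reported here is larger than the total frequency in the default PROC FREQ table for the raw answers, the two procedures are working from different denominators.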
Sometimes it’s the simplest things that give you pause.
Ann,
Can you please provide a small SAS code example to illustrate what you mentioned in this blog, i.e., the difference between PROC MEANS and PROC FREQ with respect to missing values? Shouldn’t both exclude missing values?
Thanks.
Both by default exclude missing values. However, if you score your tests like this:
IF answer = "D" THEN correct = 1 ;
ELSE correct = 0 ;
Everyone who is missing an answer will be scored 0, as they should be, since they did not give the correct answer. So, when you compute the mean, those records are no longer missing; their answer on that item has been scored as a 0. The raw answer is still missing, though, so PROC FREQ on the raw answer leaves them out unless you add the MISSING option.
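Here is a minimal sketch with made-up data that reproduces the mismatch. The data set name `toy`, the variable names, and the ten records are all invented for illustration: three students answer D, five pick a wrong option, and two skip the item (a period is read as a missing value in list input).

```sas
/* Toy data: 10 hypothetical students; "." marks a skipped item */
DATA toy ;
  INPUT id answer $ ;
  /* Same scoring rule as above: anything other than D, including blank, scores 0 */
  IF answer = "D" THEN correct = 1 ;
  ELSE correct = 0 ;
  DATALINES ;
1 D
2 D
3 D
4 A
5 B
6 C
7 A
8 B
9 .
10 .
;
RUN ;

/* Mean of the scored item: 3 correct out of all 10 records = 0.30 */
PROC MEANS DATA = toy N MEAN ;
  VAR correct ;
RUN ;

/* Default: the 2 blanks are dropped, so D is 3 of 8 = 37.5% */
PROC FREQ DATA = toy ;
  TABLES answer ;
RUN ;

/* With MISSING: blanks count in the total, so D is 3 of 10 = 30% */
PROC FREQ DATA = toy ;
  TABLES answer / MISSING ;
RUN ;
```

The default PROC FREQ output also prints a "Frequency Missing" line under the table, which is a handy clue that some records were left out of the percentages.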