Minimum Sample Size in Factor Analysis & Other Small Sample Thoughts

Someone handed me a data set on acculturation that they had collected from a small sample of 25 people. There was a good reason the sample was small: think African-American presidents of companies over $100 million in sales or Latina neurosurgeons. Anyway, it’s a small sample, and you can’t reasonably expect to get 500 or 1,000 people.

The first thing I thought about was whether there was a valid argument for a minimum sample size for factor analysis. I came across this very interesting post by Nathan Zhao, where he reviews the research on both a minimum sample size and a minimum subjects-to-variables ratio.

Since I did the public service of reading it so you don’t have to (though seriously, it was an easy read and interesting), I will summarize:

  1. There is no evidence for any absolute minimum number, be it 100, 500 or 1,000.
  2. The minimum sample size depends on the number of variables and the communality estimates for those variables.
  3. “If components possess four or more variables with loadings above .60, the pattern may be interpreted whatever the sample size used.”
  4. There should be at least three measured variables per factor and preferably more.
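
Points 3 and 4 are easy to check mechanically once you have a loading matrix. Here is a minimal Python sketch of that check. The function name and the loadings are made up purely for illustration (they are not from the acculturation data), and the cutoffs are just the ones quoted above.

```python
def count_strong_loadings(loadings, cutoff=0.60, min_items=4):
    """Count, per factor, the variables whose absolute loading exceeds cutoff.

    loadings: list of rows, one row per variable, one column per factor.
    Returns (counts, flags), where flags[j] is True when factor j has at
    least min_items variables loading above the cutoff.
    """
    n_factors = len(loadings[0])
    counts = [
        sum(1 for row in loadings if abs(row[j]) > cutoff)
        for j in range(n_factors)
    ]
    flags = [c >= min_items for c in counts]
    return counts, flags


# Hypothetical loadings for 8 variables on 2 factors (illustration only).
example_loadings = [
    [0.72, 0.10],
    [0.68, 0.05],
    [0.81, -0.12],
    [0.65, 0.20],
    [0.15, 0.70],
    [0.08, 0.66],
    [0.22, 0.74],
    [0.30, 0.55],
]

counts, flags = count_strong_loadings(example_loadings)
print(counts)  # number of loadings above .60 on each factor
print(flags)   # whether each factor meets the "four or more" guideline
```

In this made-up example the first factor has four loadings above .60 and the second has only three, so only the first would meet the guideline in point 3, although both meet the three-variable minimum in point 4.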

This makes a lot of sense if you think about factor loadings in terms of what they are: correlations of an item with a factor. With correlations, if you have a very large correlation in the population, you’re going to find statistical significance even with a small sample size. Your sample correlation may not be precisely as large as the population correlation, but it is still going to be significantly different from zero.
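
To put rough numbers on that, here is a small Python sketch using the Fisher z approximation for an ordinary Pearson correlation. Treat it as an analogy only: factor loadings are estimated rather than observed correlations, so this is not a formal test of a loading. The function names are my own, and the normal approximation is a little loose at n = 25.

```python
from math import atanh, sqrt, tanh
from statistics import NormalDist


def correlation_p_value(r, n):
    """Two-sided p-value for H0: rho = 0, using the Fisher z approximation."""
    z = atanh(r) * sqrt(n - 3)
    return 2 * (1 - NormalDist().cdf(abs(z)))


def critical_r(n, alpha=0.05):
    """Smallest |r| that is significant at the given two-sided alpha."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return tanh(z_crit / sqrt(n - 3))


n = 25
print(round(critical_r(n), 3))                 # approximate cutoff for n = 25
print(correlation_p_value(0.60, n) < 0.05)     # a correlation of .60
print(correlation_p_value(0.30, n) < 0.05)     # a correlation of .30
```

With 25 people the cutoff comes out a bit under .40, so a correlation of .60 clears it comfortably while a correlation of .30 does not.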

So … this data set of 25 respondents that I received originally had 17 items. That seemed clearly too many to me. I thought there were two factors, so I wanted to reduce the number of variables down to 8, if possible. I also suspected the communality estimates would be pretty high, just based on previous research with this measure.

Here is what I did next:

  • Parceling
  • Parallel analysis
  • Factor analysis
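
For anyone who has not met the parallel analysis criterion, the idea is to keep only those factors whose eigenvalues are larger than what you would get from pure noise with the same number of people and variables. Below is a rough Python sketch with NumPy. It is not the PROC FACTOR code, and it makes several assumptions of its own: it uses eigenvalues of the full correlation matrix, random normal comparison data, a 95th percentile cutoff, and simulated data standing in for the real 25 by 8 data set.

```python
import numpy as np


def sorted_eigenvalues(data):
    """Eigenvalues of the correlation matrix, largest first."""
    corr = np.corrcoef(data, rowvar=False)
    return np.sort(np.linalg.eigvalsh(corr))[::-1]


def parallel_analysis(data, n_sims=500, percentile=95, seed=0):
    """Compare observed eigenvalues with those from random normal data.

    Returns (n_retain, observed, thresholds). n_retain counts the leading
    eigenvalues that exceed the chosen percentile of the random eigenvalues.
    """
    rng = np.random.default_rng(seed)
    n, p = data.shape
    observed = sorted_eigenvalues(data)

    random_eigs = np.empty((n_sims, p))
    for i in range(n_sims):
        noise = rng.standard_normal((n, p))
        random_eigs[i] = sorted_eigenvalues(noise)
    thresholds = np.percentile(random_eigs, percentile, axis=0)

    # Stop at the first eigenvalue that fails to beat its random threshold.
    n_retain = 0
    for obs, thr in zip(observed, thresholds):
        if obs <= thr:
            break
        n_retain += 1
    return n_retain, observed, thresholds


# Simulated stand-in data: 25 people, 8 variables, two underlying factors.
rng = np.random.default_rng(1)
factors = rng.standard_normal((25, 2))
loadings = np.zeros((8, 2))
loadings[:4, 0] = 0.8   # first four variables load on factor 1
loadings[4:, 1] = 0.8   # last four variables load on factor 2
data = factors @ loadings.T + 0.6 * rng.standard_normal((25, 8))

n_retain, observed, thresholds = parallel_analysis(data)
print("factors to retain:", n_retain)
print("observed eigenvalues:", np.round(observed, 2))
print("random thresholds:  ", np.round(thresholds, 2))
```

With only 25 rows the random thresholds for the first few eigenvalues sit well above 1, which is one reason the eigenvalue-greater-than-one rule can suggest too many factors in a small sample.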

I can’t believe I haven’t written at all on parceling before, and hardly at all on the parallel analysis criterion, given the length of time I’ve been doing this blog. I will remedy that deficit this week. Not tonight, though. It’s past midnight, so that will have to wait until the next post.

Update: read post on parcels and the PROC FACTOR code here

—-

My day job is making games that make you smarter. Check out our latest game, Forgotten Trail. Runs on Mac or Windows in any browser. Be more than ordinary.
