Simple graphs, not so simple answers
The truth is, what I wanted to be talking about today was data mining, text mining, or mixed models. Those are three things I want to be doing more of, and would be doing more of, except that we have a Kickstarter campaign going on to fund the next six levels of our game that teaches math, which really is awesome. (Seriously, it is.)
So, even though I have been feeling statistically deprived these days, I have to admit that sometimes simple statistics do give you pretty straight answers. Take the test we created to see whether this game really works to improve students' math scores. This is what we used to measure the effectiveness of the game in our pilot phase. Our original idea was to take released items from the state standards test. It turns out that North Dakota, where we piloted the game, is one of the states that never releases the items on its tests. So … we found other states that had identical standards, like, “Solve problems involving division of multi-digit numbers by one-digit numbers.”
Then we took questions released from those tests, like:
6. Valerie has 225 pennies. She divides her pennies into five equal piles. How many pennies are in each pile?
40
45
50
55
and created our test aligned to state standards. That is good for content validity; that is, our test matches the content teachers were supposed to be teaching. When we look at the percentage of each item answered correctly by grade level, we see two things.
First, look at the vertical lines after the third, eighth, and eighteenth questions; those mark the grade-level boundaries. As I wrote previously, this gives us some evidence for construct validity, given that fourth-grade students answered most questions at the second-grade level correctly, and relatively few at the fourth- and fifth-grade level. (Because this was a low-performing school on other criteria, we expected many students to be below grade level.)
Notice the dashed horizontal line I added, though. That is at 25%. If students just randomly guessed on these four-option items, they would get 25% correct. Many of those who got those items “correct,” I would suppose, just guessed. This introduces random error and makes your results less reliable. Now, correcting scores for guessing is not new. Frary, Cross, and Lowry published an interesting article on the topic and how it affects reliability in the Journal of Experimental Education back in 1977, and there has certainly been plenty of discussion since.
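For anyone who has not run into it, the usual correction subtracts the expected number of lucky guesses: with C options per item, corrected score = R − W / (C − 1), where R and W are the counts of right and wrong answers, and omitted items are not penalized. A minimal sketch (the function name and example numbers are mine, not from the paper):

```typescript
// Correction for guessing: subtract the right answers that chance alone
// would be expected to hand a student who guessed on every wrong item.
// correctedScore = right - wrong / (numOptions - 1)
function correctedScore(right: number, wrong: number, numOptions: number = 4): number {
  return right - wrong / (numOptions - 1);
}

// Example: 12 right, 8 wrong on four-option items.
// 12 - 8/3 ≈ 9.33, a rough estimate of what the student actually knew.
console.log(correctedScore(12, 8));
```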
Still, it would be much better NOT to have to correct. Looking at these charts was one thing that really convinced me to switch from a multiple-choice to a fill-in-the-blank format. Since the tests are taken online, this method takes no more effort to score. Of course, that does mean we are not going to be using the exact same items as the state tests, and, in fact, ours will be a bit harder. That’s okay, though, since we are comparing pre-test to post-test and not to any of the multiple-choice tests administered by the states.

Looking at these graphs was also one of the factors that confirmed my decision to rewrite that part of the game using PHP and JavaScript instead of a commercial service, SurveyMonkey. Since we only had eight months total to write the game, pilot it in the schools, and write up the analyses, we went with some pre-fabricated solutions. Now that phase is over, we are going back and revising parts of the design toward what we would like in an ideal situation. That’s really the point of a prototype.
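For what it’s worth, auto-scoring a fill-in-the-blank numeric item takes only a few lines. A hypothetical sketch (the item shape and the normalization rules are my assumptions, not the game’s actual code):

```typescript
// Hypothetical shape for a fill-in-the-blank numeric item.
interface Item {
  id: number;
  answer: string; // the correct numeric answer, e.g. "45"
}

// Score a typed response, forgiving stray spaces and leading zeros.
function scoreResponse(item: Item, response: string): boolean {
  const normalized = response.trim().replace(/^0+(?=\d)/, "");
  return normalized === item.answer;
}

// Example with the pennies item from above: 225 / 5 = 45.
const penniesItem: Item = { id: 6, answer: "45" };
console.log(scoreResponse(penniesItem, " 45 ")); // true
console.log(scoreResponse(penniesItem, "55"));   // false
```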
Also interesting to me: notice how many questions are BELOW that 25% line. Why do you think that is? Are Native American kids just bad guessers? I know the answer to that question, but put your guesses in the comments and I’ll tell you on Friday.
We don’t often have multiple-choice tests in UK maths education – in fact, students are given partial credit for communicating a correct method.
You’ve left me with a super-interesting stats puzzle! I’m not confident there’s an obvious solution, but here are some suggestions 🙂
– Assuming people are guessing randomly, the mean of random guesses should be 25% in the population; however, we expect to see some variation from 25% in the sample you’ve collected. Depending on the number of students you tested, these results might not be unusually low.
– If students are not required to guess, and are not confident, they probably won’t guess at all. I see this even at a much higher level and even if there is no penalty for a bad guess: people don’t like to be seen to be wrong.
– Students could be guessing non-randomly, with some (unhelpful) knowledge, especially if the tester can predict what misconceptions students will have: ‘red herring’ wrong answers could produce percentages below 25%.
I’m not sure if Question 6 is set up like this, but – for example – a student who is not sure might think “it seems fivey” and guess 55.
With a fairly small sample, the observed percentage of correct answers will vary quite a bit around the rate expected from random guessing, leading to questions with correct percentages both above and below 25% (see the sketch below).
I think Ronald’s on to something: the worse-than-random-guessing scores probably come from people who leave questions blank.
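To put rough numbers on the sampling-variation point raised above, here is a back-of-the-envelope sketch (the class size of 30 is an assumption for illustration, not a figure from the post):

```typescript
// How far can percent-correct on a pure-guessing item wander from 25%?
// Standard error of a proportion: sqrt(p * (1 - p) / n).
function guessingRange(n: number, p: number = 0.25): [number, number] {
  const se = Math.sqrt((p * (1 - p)) / n);
  // About 95% of samples land within roughly 2 standard errors.
  return [p - 2 * se, p + 2 * se];
}

// With 30 students, pure guessing on one item plausibly produces
// anywhere from about 9% to 41% correct.
const [low, high] = guessingRange(30);
console.log(`${(low * 100).toFixed(0)}% to ${(high * 100).toFixed(0)}%`);
```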