Statistics is Everywhere: An unexpected use of PROC SURVEYSELECT
Although I tell my students all of the time that statistics is everywhere, even I did not really see where mixed martial arts, free rice and PROC SURVEYSELECT could possibly have anything in common.
Here is what happened ….
Mixed martial arts
Darling daughter #3 after the Olympics decides not to go to college as mom advises and instead takes up a career as a professional mixed martial artist. She quickly becomes the #2 ranked woman at 145 lbs. At some point, the number one ranked woman in the 135 lb division says some uncalled-for things about said daughter, who responds,
“I can beat up you and your boyfriend, too!”
She did not add,
“And my dog can beat up your dog”
but only because she did not think of it. Fast-forward past the Twitter war that ensued and lovely daughter #3 is fighting to take the 135 lb world title on March 3rd. As she said in her Gaspari bio,
“I am greatly motivated by spite.”
That sounds SO much better than,
I hold a grudge.
If you don’t happen to be in Columbus, you can watch it on Showtime.
Enter freerice.com
Well, Ronda won a medal in the Olympics and world championships at 154 lbs, then dropped to 145 lbs for MMA. Now, she has to lose another ten pounds for this fight. She was complaining about being hungry one day while idly playing games on the web and thus the freerice group, rondamma, was born. Here is the group description. I’m not sure why other hunger programs don’t have “Because it sucks to be hungry” as their tagline – oh yeah, because they’re probably run by people over 25.
Because it sucks to be hungry! I’m Ronda Rousey, I’m cutting weight for the fight for the 135 lb world title and I’m hungry. How much more would it suck to be hungry every day? So I’m asking all of my friends and fans to be part of my freerice.com group to donate to the World Food Programme. When we hit 1,000,000 I’ll send a t-shirt to the 3 highest donors. I have special prizes for the 3 highest donors as of when I weigh in on March 2nd. With 20,000 twitter followers plus my friends in judo and MMA, I’ll bet we could donate 1,000,000 grains a day. That’s enough to feed almost 300 people every day. Come on, play free rice, do good, get smarter and I’ll send you swag.
Here is the link to join the group and play. It’s free! Money is donated by sponsors.
http://freerice.com/content-group/rondamma
How PROC SURVEYSELECT got in there
So, a very nice man made t-shirts and sent them to one of the gyms where Ronda trains. We mailed t-shirts to the top three people the first day, who helped kick it off, and then to the top three when the group hit 1,000,000. She is also going to send shirts to the top people when the group hits 10,000,000 grains of rice donated. They’re at about 7,000,000 now after 11 days.
Here’s the problem, though. The group has over 400 members but some of them have a lot more time to spend on the game to donate free rice than others. Some people have jobs, go to school and have kids, while others have fewer responsibilities. SO … she thought during free rice week (next week) she could do some random drawing and give away stuff from Gaspari, her new sponsor.
I’m not sure what kind of business model that is for Gaspari Nutrition to sponsor her and then have her give away stuff for free, but, whatever. I guess it pans out. It’s stuff like t-shirts and water bottles, not the corporate jet. (I’m pretty certain they don’t have a jet. Maybe one of the other athletes they sponsor gave that away.)
So, first, I’ll use PROC SURVEYSELECT to do a simple random sample from the group, so every person who has donated anything has an equal chance to win, because, as she says, she appreciates everybody’s effort, plus all the checking the group and tweeting takes her mind off being hungry.
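Here is a minimal sketch of what that simple random sample might look like. The data set work.members, the variables Name and NumberGrains, the three prizes and the seed are all made up for illustration; the real group data would need to be read in first.

```sas
/* Hypothetical input: work.members, one row per group member,      */
/* with variables Name (character) and NumberGrains (numeric).      */
proc surveyselect
    data=work.members(where=(NumberGrains > 0)) /* anyone who donated anything */
    out=work.winners_srs
    method=srs        /* simple random sample, without replacement */
    sampsize=3        /* number of prizes: an assumption           */
    seed=20120206;    /* fixed seed so the draw can be reproduced  */
run;

/* List the people who were drawn */
proc print data=work.winners_srs noobs;
    var Name NumberGrains;
run;
```

Fixing the seed means anyone who questions the drawing can rerun it and get the same winners.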
Next, I want to do a sample weighted by the amount donated. There are multiple ways to do that in SAS.
One way I can think of is to use a data step and output one record for each value from 1 to NumberGrains for each person, so the person who donated 600,000 grains of rice has 600,000 chances to win and the person who donated 10 grains has ten chances, then do a random sample.
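A rough sketch of the data step idea, again assuming a hypothetical work.members data set with Name and NumberGrains. It draws a single winner, because with more than one prize the same person could be drawn twice from the expanded file. The second step shows an alternative: METHOD=PPS with a SIZE statement, which samples with probability proportional to size without building the big file.

```sas
/* One "ticket" per grain donated. With about 7,000,000 grains this */
/* creates about 7,000,000 rows, which is large but workable.       */
data work.tickets;
    set work.members(where=(NumberGrains > 0));
    do ticket = 1 to NumberGrains;
        output;
    end;
    drop ticket;
run;

/* Draw one ticket at random; its owner wins */
proc surveyselect data=work.tickets
                  out=work.winner_tickets
                  method=srs
                  sampsize=1
                  seed=20120206;   /* seed is an arbitrary choice */
run;

/* Alternative: probability proportional to size, no expansion needed */
proc surveyselect data=work.members(where=(NumberGrains > 0))
                  out=work.winner_pps
                  method=pps
                  sampsize=1
                  seed=20120206;
    size NumberGrains;   /* selection probability proportional to grains donated */
run;
```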
OR, I could stratify by donations and sample by stratum: put the five people who donated over 300,000 in one stratum, the 50 people who donated 50,000 to 299,990 in a second, the 120 people with 20,000 to 49,990 in a third and everyone else in the fourth, then sample one per stratum. Of course, this method guarantees that the prizes don’t all go to the highest donors, while giving them a far better chance and still giving some prizes to the people who only had the time to donate a little.
I would be happy if I could just create a PROC FORMAT and have SAS stratify by the formatted value but I’m almost certain that won’t work because the STRATA statement works like the BY statement and when you sort, you get it sorted by the unformatted values.
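One way to sidestep the question of formatted values is to compute the stratum explicitly in a data step, sort by it, and then name it in the STRATA statement. This is only a sketch: work.members, Name, NumberGrains and the variable stratum are hypothetical names, and the cutpoints are the ones described above.

```sas
/* Assign each donor to one of four strata by amount donated */
data work.members_strata;
    set work.members(where=(NumberGrains > 0));
    if NumberGrains > 300000 then stratum = 1;        /* over 300,000      */
    else if NumberGrains >= 50000 then stratum = 2;   /* 50,000 and up     */
    else if NumberGrains >= 20000 then stratum = 3;   /* 20,000 and up     */
    else stratum = 4;                                 /* everyone else     */
run;

/* The input must be sorted by the STRATA variable */
proc sort data=work.members_strata;
    by stratum;
run;

/* Draw one winner at random within each stratum */
proc surveyselect data=work.members_strata
                  out=work.winners_strat
                  method=srs
                  sampsize=1       /* one prize per stratum         */
                  seed=20120206;   /* seed is an arbitrary choice   */
    strata stratum;
run;
```

With five people in the top stratum and a couple of hundred in the bottom one, a top donor has a 1 in 5 chance while everyone else still has a shot at a prize.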
So, that is how PROC SURVEYSELECT, free rice and mixed martial arts go together. See, statistics IS everywhere. If you want, it’s not too late to join the free rice group and donate. You don’t even have to know anything about MMA. Free rice week is February 6-11 so you have time. I don’t know why it is only six days long or whatever happened to the other day.
Maybe Ronda ate it.
She really is hungry.