Census in Black & White: What I wondered about lately
The census now allows more than one race to be checked. For many years, friends of mine in interracial couples would check the “Other” box for race when they registered their children for school, rather than pick black or white.
Although an individual’s census form responses are confidential, you are certainly free to tell anyone what you put. In response to an inquiry, the White House spokesperson said that President Obama had checked only “African-American or Black”, even though his mother is white.
Now that you can select both black and white as race, I wondered how many people did. Unlike normal people who wonder about these things, I decided to download all 3,030,728 records from the 2009 American Community Survey to find out. Once I downloaded the survey and read it into SAS, I produced the chart below. I was quite surprised to see how few people checked both black and white. As you can see, it was less than 1%.
The SAS code to create this chart is shown below. You might think this is a ridiculous amount of work to create one chart and that you could do it way easier in Excel. You’d be correct except for two things. One, I know that earlier versions of Excel could not read in 3,000,000+ records, and even if you can do it now, I’ll bet it’s painfully slow. Two, most of these options or steps only need to be done once, and I was doing multiple charts. The AXIS and PATTERN statements, for example, only need to be specified once.
If you DO want to create your chart in Excel, you could just do the first part, the PROC FREQ, and then export your output from the frequency procedure to a four-record file and do the rest in Excel. There is no need to get religiously attached to doing everything with one program or package.
* Weighted cross-tabulation of the black and white race flags ;
PROC FREQ DATA = lib.pums9 NOPRINT ;
    TABLES racblk * racwht / OUT = lib.blkwhitmix ;
    WEIGHT pwgtp ;
RUN ;

* Label the four combinations and rescale PERCENT from 0-100 to 0-1 ;
DATA byrace ;
    SET lib.blkwhitmix ;
    IF racblk = 1 AND racwht = 0 THEN Race = "Black" ;
    ELSE IF racblk = 0 AND racwht = 1 THEN Race = "White" ;
    ELSE IF racblk = 0 AND racwht = 0 THEN Race = "Other" ;
    ELSE IF racblk = 1 AND racwht = 1 THEN Race = "Mixed" ;
    PERCENT = PERCENT / 100 ;
RUN ;

* Global statements that only need to be specified once ;
AXIS1 LABEL = ( ANGLE = 90 "Percent" ) ORDER = ( 0 TO 1 BY .1 ) ;
AXIS2 ORDER = ( "White" "Black" "Mixed" "Other" ) ;
PATTERN1 COLOR = BLACK ;
PATTERN2 COLOR = GRAY ;
PATTERN3 COLOR = BROWN ;
PATTERN4 COLOR = WHITE ;

* One bar per Race value with the percent printed above each bar ;
PROC GCHART DATA = byrace ;
    VBAR Race / RAXIS = axis1 MAXIS = axis2
        SUMVAR = percent
        TYPE = SUM
        OUTSIDE = SUM
        PATTERNID = MIDPOINT ;
    LABEL Race = "Race" ;
    FORMAT percent percent8.1 ;
RUN ;
QUIT ;
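If you would rather finish the chart in Excel, the four-record output of the frequency procedure could be written out with something like the sketch below. The output file name blkwhitmix.csv is just a placeholder I made up, so point OUTFILE wherever you like. The exported file should hold one row per combination of racblk and racwht, along with the COUNT and PERCENT columns that PROC FREQ creates.

```sas
* Sketch only: the output file name is a placeholder, not from the original post ;
PROC EXPORT DATA = lib.blkwhitmix
    OUTFILE = "blkwhitmix.csv"
    DBMS = CSV
    REPLACE ;
RUN ;
```

Excel opens a .csv file directly, and with only four rows the size limits mentioned above are not an issue.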
Interesting post. I’ve only used SAS in a couple classes in school, so I was going to try to do this same thing in R. However, I can’t for the life of me figure out the format of these files. I found the file 20091YRSF.zip on the data.gov website, but it contains hundreds of data files which include many numbers. The README I found on the website was no help. Could you tell me how you figured out the format of these files and where the race info is?
I got the data from the census.gov site:
http://www.census.gov/acs/www/data_documentation/pums_data/
You can download it as a SAS file or a .csv file, and all of the documentation is right there on the site. Each state has a household file and a personal file, plus the household file and personal file for the whole country are each split in two, due to size, I guess. I downloaded the two national personal files and just appended one to the other.
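For anyone following along in SAS, appending the two files can be done in a single DATA step, roughly as sketched here. The input data set names psam_pusa and psam_pusb are my assumption about what the two downloaded files are called, so substitute whatever names your download actually has. The KEEP= option is optional and just trims each record to the three variables used in the chart code.

```sas
* Sketch only: psam_pusa and psam_pusb are assumed names for the two downloaded files ;
DATA lib.pums9 ;
    SET lib.psam_pusa ( KEEP = racblk racwht pwgtp )
        lib.psam_pusb ( KEEP = racblk racwht pwgtp ) ;
RUN ;
```

Listing both data sets on one SET statement stacks the second under the first, which is all that "appending" means here.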