CHAPTER 1: AFTER THE DATA STEP
Any person who claims to know all of SAS is either clinically insane or a liar. However, that is not you. YOU are reading this book. Based on this one fact, I can conclude a couple of things about you.
First, you know the basics of SAS. You can code a DATA step. You have mastered the INPUT statement and are familiar with several different formats. You probably are familiar with common procedures like CONTENTS, MEANS and FREQ. Second, you want to expand beyond the basics; that’s why you are reading this book. So … what is your next step?
You have several choices.
- You can decide to learn as much as possible about statistics. That’s not a bad decision. You can become deeply steeped in the lore of complex sampling, eigenvectors and proportional hazards models, leaving simpler statisticians doing logistic regressions and repeated measures ANOVA to brush the dust from your shoes.
- You could become the world’s foremost expert on reporting, using either PROC REPORT or PROC TABULATE. Every large organization needs reports. Don’t forget to throw some ODS in there while you are at it, and you’ll probably learn to customize templates with PROC TEMPLATE.
- You could become an artist with SAS/GRAPH. Personally, I think that, given the difficulty of getting what you want out of SAS/GRAPH, it will be supplanted by other options sooner rather than later, but I failed art in junior high and never tried it again, so you probably shouldn’t be following my advice on this one.
- You could become the macro maven. Actually, I think that term is already taken by a guy named Ron Fehd, who works for the Centers for Disease Control and has something like 11,734,789 posts on SAS-L (the SAS mailing list, a great resource, by the way).
Well, you get the idea: there are a bunch of things you could do. Since it is impossible for you to learn all of SAS, you specialize. Everyone does that to some extent, but we’re not talking about everyone; we’re talking about you. My suggestion, if you ask me, which you kind of did, since you are reading my book, is that you don’t specialize too early in your career.
For the past 29 years, I have been using SAS, but that has been twenty-nine years of roaming all over the map. I’ve written papers on all of those topics above and more – complex survey designs, macros, data visualization, logistic regression, data quality, mixed models. I’ve published articles in scientific journals using factor analysis, repeated measures Analysis of Variance using PROC GLM and even SAS Enterprise Guide. The organizations where I worked ranged from one of the largest corporations in the world to a small consulting company I started with two partners to a university – and almost everything in between. I’ve never worked for organized crime or llama breeders, although I have worked on a couple of projects involving buffalo (photographic evidence of actual buffalo submitted herein).
Given my own experience, I was troubled by the advice that the key to career success is to be the foremost specialist in some obscure application or language. That doesn’t fit with my experience at all. If I’d followed that advice I’d still be trying to program in Foresight, Fortran, BASIC and Tel-A-Graf.
I could have done actual research, but that would have required reading some academic journals and reading them is almost as boring as writing them (trust me, I know). Instead, I sent a shout-out on Twitter, got responses from smart people and quoted them. This is not, contrary to false allegations by my enemies, a lazy way of avoiding actual work – well, it is, but it’s also known as crowd-sourcing, accessing social media and leveraging Web 2.0 capabilities. That sounds so much better.
Responses varied. Statistician Dr. Peter Flom took the compromise view that people could have success either as a specialist or a generalist. Then there was Jon Pelletier, one of the foremost consultants in the nation, pushing the envelope with Excel. He, not surprisingly, thought that whether specialization was a good thing depended on how in demand your specialty is.
The one who best summed up my view on the whole generalist versus specialist question was Evan Stubbs of SAS Institute in Australia, who said:
Fly high, fall far; pay’s good for specializing until you go the way of the buggy whip. Generalists fit anywhere, learn faster
If you want to be a generalist, and at the same time go beyond the DATA step, you’ve come to the right book. The first section is generally useful information to know regardless of where you will be working and what your industry is. The next chapter introduces some statements, formats and functions that could be helpful anywhere. The following chapter shows new procedures and a few new tips on procedures you probably already know. The final chapter in this section is on beautifying your output because there is always someone in every organization who will want to know if your report or graph can be done in the colors of the national flag of Lithuania. You are not allowed to kill those people. There is a rule in the company handbook about it. I checked.
Because this is a full-service book, I have provided an image of the national flag of Lithuania above, for your viewing pleasure. It is my personal opinion that it could be made less dull by the inclusion of – well, anything. I do realize a naked mole rat would be inappropriate since their natural habitat is in Africa, where Lithuania is not. I was going to superimpose a regular Los Angeles city rat on it to see how it would look, but I did not want to take the risk of offended Lithuanians boycotting my book and keeping it off the New York Times bestseller list. So, you will just have to use your imagination to envision how it would look. If you have no imagination, let me tell you, it would look disgusting. The Lithuanians were right not to do it.
Regarding graphs in SAS – SAS 9.2 is a quantum leap. The SGPLOT, SGPANEL and SGSCATTER procedures are excellent, and if you really want to get down into the nitty gritty, there’s SGRENDER which lets you control one heck of a lot.
I wrote about scatter plots here: http://www.statisticalanalysisconsulting.com/using-the-sg-procedures-to-create-and-enhance-scatter-plots/
Peter