Another of my favorite procs: PROC DATASETS
I’m just heading off to the Western Users of SAS Software meeting that starts tomorrow. After the keynote, during which I have promised not to swear even once, I’m doing a SAS Essentials talk on Thursday, where I teach students 10 basic steps that allow them to complete an entire annual report project.
One of these is PROC DATASETS. It is used twice in the project. First, they get a list of all of the datasets in the directory. We’re using SAS Studio, which runs on the SAS server. Since students cannot issue Unix commands directly, and most likely don’t know any, we use PROC DATASETS.
libname mydata "/courses/u_mine.edu1/i_1234/c_7890/wuss14/";
proc datasets library= mydata ;
quit;
This gives me the output below.
# | Name | Member Type | File Size | Last Modified |
1 | SLPOST_SCORED | DATA | 208896 | 26Jun14:04:00:40 |
2 | SLPRE_SCORED | DATA | 487424 | 26Jun14:04:00:41 |
3 | SL_ANSWERS | DATA | 619520 | 26Jun14:03:59:42 |
4 | SL_PRE_POST | DATA | 196608 | 26Jun14:04:00:03 |
Once we have cleaned up the data in every data set, we are not quite ready to start merging them together. A common problem is that data sets have different names, lengths or types for the same variable. You’d be wise to check the variable names, types and lengths of all the variables. So, here is where we use PROC DATASETS a second time.
proc datasets library= work ;
contents data = _all_ ;
run;
quit;
This time, we added another statement. The “contents data = _all_” statement prints the contents of every data set in the library. In perusing the contents, I see that grade is entered as character data in one data set (5th, 4th and so on), while it is numeric in another. This is the sort of thing you never run into in “back of the textbook” data, but it shows up often in real life.
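If you have more than a handful of data sets, reading through the printed contents gets tedious. Here is a rough sketch of one way to line the variables up side by side, using the OUT= and NOPRINT options on the CONTENTS statement. The data set name work.varinfo is just something I made up for this example, and I am assuming the cleaned-up data sets are in the WORK library as above.

```sas
/* Sketch: save the variable metadata to a data set instead of printing it. */
/* work.varinfo is a hypothetical name chosen for this example.             */
proc datasets library=work nolist;
  contents data=_all_ out=work.varinfo(keep=memname name type length) noprint;
run;
quit;

/* Sort by variable name so the same variable in different data sets lines up. */
proc sort data=work.varinfo;
  by name memname;
run;

/* In the OUT= data set, TYPE is 1 for numeric and 2 for character. */
proc print data=work.varinfo noobs;
  var name memname type length;
run;
```

Two caveats: the sort is case-sensitive, so Grade and grade will not land next to each other, and if you run this a second time, work.varinfo will show up in its own listing because it now lives in the WORK library too.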
Those are two super simple steps that allow you to do useful things.
You can do more with PROC DATASETS (append, copy, rename and delete data sets, for example), but my plane is boarding, so more about that some other time.