Whipping your data into shape with SAS: Day 2, Fixing Errors & Identifying Input Datasets
In the last post, we happily uploaded our data, read it into SAS using a combination of SAS utilities and coding, decided all was lovely, and used this code to concatenate the 4 datasets:
DATA allplants ;
set import1-import4 ;
run ;
If you get an error at this point, what should you do?
Let’s say you get the error below.
118
119 Data allplants ;
120 set import1-import4 ;
ERROR: Variable Finance_Commission___Interest_Co has been defined as both character and numeric.
121 run ;
This is one of those examples where you can be too clever. We aren’t going to use this variable in the analysis, so let’s just drop it. Ask yourself: do I need this variable? If the answer is no, as in this case, just drop it.
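If you are curious which dataset is the odd one out before you drop the variable, a quick look at the metadata will show it. The sketch below assumes the four imported datasets live in the WORK library (the post does not say where they were stored), so adjust the libname if yours are elsewhere.

```sas
/* List the type (char or num) of the conflicting variable in each input dataset. */
/* Assumption: import1-import4 are in the WORK library.                           */
proc sql ;
  select memname, name, type, length
    from dictionary.columns
    where libname = 'WORK'
      and memname in ('IMPORT1', 'IMPORT2', 'IMPORT3', 'IMPORT4')
      and upcase(name) = 'FINANCE_COMMISSION___INTEREST_CO' ;
quit ;
```

Library and dataset names are stored in upper case in dictionary.columns, which is why they are written that way in the WHERE clause.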
- The (drop =) option after the dataset name drops the variable you list.
- The (in = a) option creates a temporary variable, a, that is true if the record comes from the dataset import1 and false otherwise.
- Since both options go in parentheses after the dataset name, you include both of them in the same set of parentheses.
- Now that you have the variables denoting the source dataset, you can use them in IF-THEN-ELSE statements like any other variable.
Data allplants ;
set import1 (drop = Finance_Commission___Interest_Co in = a)
import2 (drop = Finance_Commission___Interest_Co in = b)
import3 (drop = Finance_Commission___Interest_Co in = c)
import4 (drop = Finance_Commission___Interest_Co in = d) ;
/* Set the length first; otherwise group takes the length of "student" */
/* (7 characters) and the longer values below are truncated.           */
length group $ 9 ;
if a then group = "student" ;
else if b then group = "control" ;
else if c then group = "developer" ;
else if d then group = "testcase" ;
run ;
Now we’ve dropped the troublesome variable and have a group variable based on the source.
So, this code SEEMS like it should work and the data are all good. We look at the log and see no errors, but maybe we should take some more steps just to be safe.
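One simple check, offered here as a sketch and not necessarily the next step in this series, is to count the records in each group. If the counts match the number of records in each input dataset and no group value is blank or cut short, the concatenation and the IF-THEN-ELSE logic did what we expected.

```sas
/* Count records per group. The MISSING option also reports records */
/* where group is blank, which would point to a problem in the      */
/* IF-THEN-ELSE statements.                                         */
proc freq data = allplants ;
  tables group / missing ;
run ;
```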