Visual Analytics are EVERYWHERE: SAS Global Forum Continued

By AnnMaria De Mars, April 22, 2016

The nice thing about going to SAS Global Forum is that it’s the gift that keeps on giving. Long after I have gone home, there are still points to ponder.

Visual analytics is big, and not just in the sense that there is a product by that name (which I have never used). Every presentation, no matter how ‘tech-y’, now makes very effective use of graphics. If I were the type of person to say I told you so, I would mention that I predicted this six years ago, after I went to SAS Global Forum in 2010.

In my last post, I mentioned the propensity score graphic with mustaches.

Richard Cutler’s presentation on PROC HPSPLIT, which was really excellent, made extensive use of graphics to illustrate fairly complex models.

You can create classification and regression trees (the model you can’t see in this tiny graphic on the left) and you can drill down into sub-trees for further analysis.

Sometimes your classification tree is very easily interpretable. For example, in this case from the same presentation, each split represents a different type of vegetation or land surface: water, two different species of tree, etc.

Speaking of classification, regression and PROC HPSPLIT ….

If you didn’t know, now you know

PROC HPSPLIT is a high-performance procedure for model fitting and classification, now available in SAS/STAT, that is useful for data sets where relationships are non-linear. It produces classification and regression trees, includes options for pruning trees, and a whole lot more. It now runs on a single computer and is not limited to high-performance computing clusters. So, yay!

A regression tree is what you get when your dependent variable is continuous, and a classification tree when it is categorical, as in the vegetation example above.
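To make that distinction concrete, here is a minimal sketch in plain Python (not SAS, and not how PROC HPSPLIT works internally). It assumes the usual convention that a regression tree leaf predicts the mean of its cases and a classification tree leaf predicts the most common category. The data, the split point of 500, and the function names are all made up for illustration.

```python
from collections import Counter
from statistics import mean


def leaf_prediction(targets, target_is_continuous):
    """Return the value a single leaf would predict for its cases."""
    if target_is_continuous:
        return mean(targets)                        # regression tree leaf
    return Counter(targets).most_common(1)[0][0]    # classification tree leaf


def one_split_tree(xs, targets, threshold, target_is_continuous):
    """Split once on x < threshold and return (left leaf, right leaf)."""
    left = [t for x, t in zip(xs, targets) if x < threshold]
    right = [t for x, t in zip(xs, targets) if x >= threshold]
    return (leaf_prediction(left, target_is_continuous),
            leaf_prediction(right, target_is_continuous))


# Made-up data for illustration only.
elevation = [100, 200, 300, 900, 1000, 1100]
land_cover = ["water", "water", "water", "fir", "fir", "pine"]   # categorical
rainfall = [10.0, 12.0, 14.0, 30.0, 32.0, 34.0]                  # continuous

# Same predictor, same split; only the type of dependent variable differs.
classification_leaves = one_split_tree(elevation, land_cover, 500, False)
regression_leaves = one_split_tree(elevation, rainfall, 500, True)
print(classification_leaves)
print(regression_leaves)
```

With the categorical target each leaf reports a category, and with the continuous target each leaf reports an average. A real tree procedure also chooses the split points for you, which this sketch does not attempt.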

On a semi-related note, graphics can even be used to show when a data set is not suited to a linear model, as in the example below, also from Cutler’s presentation. You can see that all of the 1’s are in two quadrants and all of the 0’s are in the other two quadrants. Yes, you COULD fit a regression line to this, but a line is not the best fit for these data.

Also, on the related point that visualizing data, like all of statistics really, is an iterative process: I think the pattern in this graphic would be more obvious if the quadrants were color coded.
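The quadrant pattern described above can be imitated with a small sketch in plain Python. It assumes the 1’s sit in two diagonally opposite quadrants, which is my reading of the description and not something taken from the slides. The points, labels, and function names are invented for illustration. The sketch computes the least-squares slope of the label on each predictor taken on its own, then compares that with a two-level set of splits.

```python
def least_squares_slope(xs, ts):
    """Slope of the least-squares line predicting t from a single x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_t = sum(ts) / n
    numerator = sum((x - mean_x) * (t - mean_t) for x, t in zip(xs, ts))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    return numerator / denominator


def tree_predict(x, y):
    """Two levels of splits: first on x, then on y within each branch."""
    if x > 0:
        return 1 if y > 0 else 0
    return 0 if y > 0 else 1


# Invented grid of points, four in each quadrant.
values = (-2, -1, 1, 2)
points = [(x, y) for x in values for y in values]
# Assumption: 1's in the upper-right and lower-left quadrants, 0's elsewhere.
labels = [1 if x * y > 0 else 0 for x, y in points]

xs = [x for x, _ in points]
ys = [y for _, y in points]
slope_on_x = least_squares_slope(xs, labels)
slope_on_y = least_squares_slope(ys, labels)

hits = sum(tree_predict(x, y) == t for (x, y), t in zip(points, labels))
tree_accuracy = hits / len(points)

print("slope on x:", slope_on_x)
print("slope on y:", slope_on_y)
print("tree accuracy:", tree_accuracy)
```

Both slopes come out as zero, so a straight line tells you nothing about which label a point has, while the splits recover every label. That is the kind of data where a tree is the better tool.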


I have a lot more to say on this but I am in North Dakota speaking at the ND STEM conference this weekend and a kind soul gave me tickets to the hockey game in the president’s box, so, peace, I’m out.

