2011-10-28 11:10:00 Allan Engelhardt wrote in CYBAEA Data and Analysis:
A recent question on one of the LinkedIn groups about the advantages of using R over commercial tools like SAS or IBM SPSS Modeller drew lots of comments for R. We like R a lot and we use it extensively, but I also wanted to balance the discussion. R is great, but looking at commercial organizations near the end of 2011 it is not necessarily the right choice to make.
We have created and managed analytics teams in commercial organizations (mainly telecommunications) across Europe. The teams were using SAS or SPSS. Our company now has a commercial analytics as a service offering and we mainly use R.
The benefit of R is productivity. We want to spend time on the actions from the analytical insights, not the coding, and we choose our tool accordingly. Being a consulting-type organization, it is easier for us to attract and retain talent.
Big corporations have procurement departments that do not have a process for free software. Also, software spend goes on the balance sheet in a way that the CFO prefers to spending on people, whereas something like R will take a little talent to set up initially. (And yes, we know the Revolutions guys well, but they are not really credible in Europe yet.)
This will change as (a) companies become more mature in their procurement and as (b) commercial support for R improves. (On the latter point, Oracle’s R integration to the database is great news.)
There are recognised training programmes for SPSS and (especially) SAS which makes it easier to recruit the technical skills. How do you know what somebody knows when they say they “know R”? How do you even begin to quantify it from a CV? How do you separate the guy who downloaded the tool and just read “An Introduction to R” from the Frank Harrells of this world?
Yes, I would argue (and in fact have argued in Commercial Analytics: The Capabilities) that technical skill is not the most important in an analyst (and can be learned anyhow) but it does help filter the CVs and, you guessed it, fits well with the corporation’s processes.
(One reason we use R internally is that we find that it is, on average, a more interesting type of analyst who is proficient in that tool. It seems to encourage curiosity and love of learning in a way that menu-based tools do not.)
I think the commercial R companies are really missing a trick here to provide recognised certification.
Yes I know I already said that but there is another reason why this is critical.
R takes talent to use. (That is kind of why we like it.) It takes talent to maintain.
My problem as the manager of a commercial analytical insights team is that it is very hard for me to retain that talent. Think about it: what can I offer in terms of career progression? If you are an analyst you might become a senior analyst but you will always be an analyst. There are no examples of a way up the organization (except perhaps out through IT and then up to CIO). [This too will change with time.] And new challenges: yes, some, but we are not a research university and it tends to be the same few problem types that we are always working on. So if you are an analyst looking for new challenges and more pay, the best thing – the logical and rational thing to do – is to get a new job. And your time with Big Corporation will look good on your CV and you will probably land the job easily.
Join the discussion
R Certification
Hey Allan,
We definitely agree with you on point 2.2. There is a need for some sort of certification, if only because hiring managers don't know what questions to ask. Often large SAS shops have been using SAS for quite some time and know what questions are needed to elicit a true representation of a candidate's SAS abilities. The same is not true of R. Often the hiring manager is experienced in the legacy system, but not R.
Soon we should roll out some of our thoughts on certification and then we will be looking for the thoughts of the R community, so if you have anything you would like to see let me know.
Finally, we need some of that credibility in Europe ;)
Cheers,
Derek
Incentives and Risks
Allan,
There are certainly incentives for organizations to maintain the status quo, which you have covered with some good insights.
Another incentive to use SAS, etc. is the availability of support from the vendor. This can reduce the perceived risk to the company. The fact that the company is paying for it (your first reason above) provides grounds to "insist" on adequate support levels (which they will also pay for!). This can be a very important factor to those who are trying to play it safe.
Thanks!
R versus SAS/SPSS in corporation
I use SAS daily and I think you have left out a few reasons both for and against.
You're incorrect in assuming most of the people who use SAS use the menu-driven SAS Enterprise Guide, Enterprise Miner, etc. In 29 years in diverse organizations I see the same thing - the great majority of people who use SAS write code, and often extend SAS by writing code in other languages as well. Assuming R = talented and SAS = pointing and clicking is a bit over-simplistic.
The major problems I have seen with SAS are:
a) It does not run on a Mac.
b) Its installation process is insane - and I say this as someone who is an extremely experienced SAS user.
c) Its free cloud-based service is pathetically slow. I'm holding out hope for that one, though, because it has improved so much from a year ago when it was just useless. Useless to usable and decent but slow is a pretty big leap.
There are advantages, too. First of all, amazing technical support. Second, a huge user group base. Third, and I think you under-emphasized this for corporations, an enormous amount of legacy code out there. You'd need to re-write the code, re-write the documentation and re-train the employees.
I've been meaning to write a blog about this for a while, so thanks for the extra motivation.