Should You Abandon Base R Functions?

You will no doubt fall in love with the R language. You just have to get past its learning curve. Once you do, you will discover just how powerful the language is. There is so much you can accomplish in so few commands. It's well supported, too!

Read more about the benefits of R at Experts-Exchange: "Why R Programming Will Become Your Go To Language."

The fact that it's well supported brings up an interesting question. Should you abandon the functionality that comes standard in R (known as Base R) in favor of packages contributed by the R community, when those packages drastically improve on the language?

The Case For Abandonment

There is something to be said for abandonment. Base R functions can often be slow, especially when dealing with complex structures and large data sets. A great example of a package that improves on an aspect of Base R is dplyr.

dplyr was written by Hadley Wickham of RStudio fame. If you haven't heard his name, then you haven't been working in R for very long. He is to R what Kernighan and Ritchie are to the C language (well, almost; Wickham didn't create the R language). However, he has done much to enhance the language. The dplyr package gives you the ability to slice and dice your data (among other awesome features) and was written to be wicked fast, using C++ as its base.
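
To make the comparison concrete, here is a minimal sketch of the same "slice and dice" task done both ways. It assumes the dplyr package is installed and uses the mtcars data set that ships with R; the 100-horsepower cutoff and the variable names are arbitrary choices for illustration only.

```r
# Illustrative only: the same summary computed with Base R and with dplyr.
# mtcars is a built-in data set; fast_cars, base_result and dplyr_result
# are made-up names for this example.

# Base R: keep cars with more than 100 hp, then average mpg per cylinder count.
fast_cars <- mtcars[mtcars$hp > 100, ]
base_result <- aggregate(mpg ~ cyl, data = fast_cars, FUN = mean)

# dplyr: the same steps written as a pipeline.
library(dplyr)
dplyr_result <- mtcars %>%
  filter(hp > 100) %>%
  group_by(cyl) %>%
  summarise(mpg = mean(mpg))

# Both approaches should agree on the numbers.
all.equal(base_result$mpg, dplyr_result$mpg)
```

Neither version is wrong. The point is that a team trained only on the second style would need to relearn the first if the package ever became unavailable.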

As great as any package may be, should you or your company become dependent on it? Most packages will improve your productivity, and this makes them tempting to use. However, is there a chance that package developers such as RStudio could start charging for their use? Suppose you encouraged your company to switch to a package and sold it on the benefits, one of the biggest being that it is open source (i.e., free!). You train a whole team of people to use the package, to the point where they forget how to do the same thing in Base R. Then, in one fell swoop, the developer starts charging for it, and charging per seat. That could get expensive, quickly!

Disclaimer: I have no idea if RStudio has plans to start charging for dplyr usage. I am only using this package to illustrate my point. It could be any package from any other company or developer! I am not making any implications here!

What If Your Company Says No?

Your company may be faced with this dilemma, and could even prohibit outside libraries due to this possibility. That would be a shame. If you are faced with this restriction, consider the following:

  • Open source probably would not have survived if its members engaged in this scheme, i.e., getting the community hooked on a great package and then pulling the rug out from under them by charging later.
  • The blogosphere runs deep. By this I mean that many would cry foul if anyone tried this, and it could hurt the developers when they introduce new packages in the future. If you knew someone who had pulled that stunt, would you let them do it again? They would be one-trick ponies, basically.
  • There are plenty of smart folks in the open source community. For example, if RStudio did start charging for dplyr (and Wickham is one smart cookie), some other smartypants elsewhere would be all too happy to dethrone the king, so to speak. Imagine the hero status of the person who creates a substitute package!
  • Developers of open source packages often find other ways to monetize their efforts. For example, if you needed training on how to use a package, the developers could create Udemy courses, Skillshare courses, etc.

I don't believe there is much risk that open source developers will start charging after their packages have been unleashed into the community. There is too much downside, and in my opinion not enough benefit, for the developers. That alone would keep most of them from engaging in such practices.

I hope you can use what is written here to convince your company to reverse any restrictions placed on third-party, open source packages. Your company would suffer more by missing out on the productivity gains that come from using the better packages.

About the Author James

James is a data science writer who has several years' experience in writing and technology. He helps others who are trying to break into the technology field like data science. If this is something you've been trying to do, you've come to the right place. You'll find resources to help you accomplish this.
