The battle for the data science language-of-choice is on. The two standout contenders are clearly R and Python. There are other contenders, but they don’t come close to the popularity of the two main languages. Should you choose R or Python to learn as the language for your data science career?
The correct answer is that you should learn both. I have read that you can pick either one and run with that choice. That certainly can work. However, it can limit your career choices.
Suppose you decide to learn R and you put all your efforts into doing this. You’ll gain a command of the R language. However, Python has taken the lead in popularity. This means that companies will be looking for more people with Python proficiency than R proficiency. There are jobs for people who know R, but just not as many.
This information could cause you to choose Python as the language to learn. It’s a good choice, but you still are leaving several opportunities on the table by not knowing R. If you already know R, you understand why the language is popular with data scientists. It’s well-supported and quite powerful.
Resources for Learning
Choosing one language over the other limits not only the jobs available to you but also the learning resources you can use. Suppose you choose to learn R over Python. You will find plenty of resources for R programming. But Python is equally well-supported, and many resources are written for Python instead. If you don’t know Python, you’ll struggle to follow along with the sample code those resources provide. Further, you’ll have to translate the code into R for it to work for you. This is not impossible, but it takes more time.
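To give a rough sense of what translating sample code involves, here is a minimal sketch of the kind of Python snippet a tutorial might contain. The data values and variable names are made up for illustration, and the comments show what I would expect the base R equivalent to be.

```python
import statistics

# Made-up sample values, for illustration only.
heights = [150.0, 160.0, 170.0, 180.0]      # R: heights <- c(150, 160, 170, 180)

mean_height = statistics.mean(heights)      # R: mean(heights)
sd_height = statistics.stdev(heights)       # R: sd(heights)  (both use the sample formula)
first = heights[0]                          # R: heights[1]   (R indexes from 1, Python from 0)

print(mean_height, sd_height, first)
```

Each line maps over easily enough, but small differences such as the index of the first element are exactly the sort of thing that slows you down when you only know one of the two languages.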
Should the Programming Language Matter?
If you are new to the data science field or the computer science field, for that matter, you may be wondering why it should matter which language you choose. In a perfect world, it wouldn’t matter. When I first started in the computer science field, there was a shortage of programmers. Companies were willing to teach you the language they were using if you didn’t already know it going in. With the advent of the internet and globalization, however, that all changed.
Companies suddenly had the upper hand. They required programmers not only to know the languages these companies were using, but to be proficient in them. This constraint still holds true to this day. It is much easier to hire someone who can hit the ground running than to train someone.
The following is a great article on the differences between R and Python. Pay particular attention to the chart describing the differences near the end of the article.
GURU99. “R Vs Python: What's the Difference?” Meet Guru99 - Free Training Tutorials & Video for IT Courses, Guru99.Com, www.guru99.com/r-vs-python.html.
Should You Learn Other Languages Too?
Based on the information above, you may be tempted to go all out and learn as many of the languages used in data science as you can. This is a mistake. The benefits of doing this won’t outweigh the costs, and your skills in the two main languages will suffer as a result. It’s not ideal to be a jack-of-all-trades. You are better off becoming a master of Python and R.
Too Much Emphasis on Languages?
I believe that companies are putting too much emphasis on programming languages. Knowing a language is an important aspect of a data science job, to be sure. But it isn’t the main aspect of the job. Learning a language such as R or Python doesn’t make you an instant data science guru. You can learn these languages without ever using data science libraries or machine learning algorithms. When you learn the techniques of data science, however, you can use any language to apply them. Of course, you’ll need to use some programming language when you set out to learn data science.
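As a small illustration of a technique being separate from the language, here is a sketch of fitting a straight line by ordinary least squares using nothing but basic arithmetic. The function name `fit_line` and the data are my own, chosen for illustration. The same steps would carry over to R or any other language.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least two points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared deviations in x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up data lying exactly on the line y = 2x + 1.
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(slope, intercept)
```

Nothing here depends on Python in particular. What matters is knowing that the slope is the ratio of those two sums, and that knowledge transfers to whatever language an employer happens to use.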
What Languages Will Be Popular in the Future?
No one can say with any certainty what languages will be popular for data science in the future. The field may even mature to the point where languages aren’t needed as much. What this means is you should keep on top of any developments in the field and adjust accordingly. This isn’t easy, and you won’t always get it right. But it is one of those unspoken requirements of technology-related fields.
Unfortunately, this means what you learn now, i.e., Python and R, may not be as popular five years from now. But, these are the two main languages in the field at this point in time.
Start with One, But Commit to Learning Both
There is nothing wrong with getting started with one language and becoming proficient in it to the point where you can get hired. However, don’t become complacent. Once you find a job, stay committed to learning the language you didn’t choose. When developments emerge, be ready to shift your focus. Doing so will help you recognize opportunities as they present themselves.
If you want a good language to start with, you should learn Python first. It's a bit easier to grasp than R. It's not that the R language is difficult, but it wasn't a language that was created with programmers in mind. It was created for statisticians. Personally, I love the R language and it is my preferred choice.