I was reading an article in Forbes, "Learning Data Science Skills Is Easier Than You Think." While I don't think learning data science is on par with rocket science, I think it's a stretch to state that it's easy.
The article was written by Dr. Anant Agarwal, who heads up the website edX.org. If you are unfamiliar with this site, it offers training programs in a variety of fields, including data science.
I have taken classes on edX.org and they are high-quality. I have total respect for the platform. You can audit many of the courses for free.
Python Is Not the Only Data Science Skill
Dr. Agarwal did state that to get started with data science, one should learn Python. It's an easy language to learn and has several libraries that cater to data science. It has become the main language for data science.
I agree that Python is easy. I also agree that if you are considering a career in data science, learning Python is a great first step, one that will serve you well.
What I am struggling with is that this article gives readers the illusion that learning the rest of data science is as easy as learning Python. It's not! You'll need a fair amount of statistics, machine learning, deep learning, and data analysis, plus significant business knowledge.
You'll need to have a good grasp of visual tools and problem solving, too.
I don't state any of this to discourage anyone from learning data science. I just want to set expectations appropriately. The article suggests that the rest of the coursework will be a breeze, the way learning Python will be. It won't. That's setting people up for disappointment, in my opinion.
Check Out Job Listings
Want to know what you'll need for data science jobs? Check out current job listings. Let's take one recent job as an example:
It's interesting that Python is not listed here. I still agree that Python is the most popular language in data science, however. You'll see the language R listed, which is another popular data science language.
As you can see, statistics is the first requirement. I am not suggesting statistics is hard to learn, but I know from experience that most people won't breeze through the way they would with an intro Python course. I have tutored people in stats.
This listing also talks about the business aspects of the job. It's for a finance company, so 2-4 years of quantitative analysis experience is among the requirements. Do you know how to create predictive models? That is another requirement.
Let's try another example. This one has Python as the main requirement:
This job is likely from a company that listed stricter requirements previously and has since toned them down because it had difficulty finding candidates. I don't know that for sure, but the wording of the requirements suggests that is the case.
The libraries they mention are Python libraries (numpy, scipy, pandas, scikit-learn). You can see statistics and data analysis listed as requirements, too. The reason I believe this listing may have been toned down is they are only asking for basic knowledge of these skills.
Once again, business knowledge is listed in the requirements, this time in financial credit and marketing.
I Am Not Trying to Be a Negative Nellie
I want to emphasize that I am not taking exception to this article because I am trying to be negative. I do believe you can learn data science, but let's dispel the notion that it's going to be easy. Also, let's dispel the notion that learning Python is all you need to get a job.
Continue going through job listings on various boards. I used ones from Indeed.com, but you can choose any that list jobs for data scientists.
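If you'd rather tally what the listings ask for than eyeball them, here is a minimal Python sketch of one way to do it. The skill list and the two sample listings are made-up placeholders of mine, not real postings, and `count_skills` is just a name I picked. Paste in the text of the listings you actually find.

```python
import re
from collections import Counter

# Hypothetical skill keywords to look for; change these to suit your search.
SKILLS = ["python", "r", "statistics", "sql", "machine learning", "pandas"]


def count_skills(listings, skills=SKILLS):
    """Count how many listings mention each skill (case-insensitive)."""
    counts = Counter()
    for text in listings:
        lowered = text.lower()
        for skill in skills:
            # \b word boundaries keep "r" from matching inside other words.
            pattern = r"\b" + re.escape(skill) + r"\b"
            if re.search(pattern, lowered):
                counts[skill] += 1
    return counts


if __name__ == "__main__":
    # Made-up sample text standing in for listings you copy from a job board.
    sample_listings = [
        "Strong statistics background; experience with R or SAS.",
        "Basic knowledge of Python (numpy, pandas), statistics.",
    ]
    for skill, n in count_skills(sample_listings).most_common():
        print(f"{skill}: {n}")
```

Run it over a few dozen listings and you will likely see for yourself how often skills beyond Python show up.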
I don't feel that Dr. Agarwal is being deceptive with the information in his article. He is trying to promote Python courses on his edX.org platform, which I am not against. The courses on that platform are high-quality and you will learn. Kudos for that aspect of the article.
But I do believe he missed the boat with the title of his article. I would have been happier if he had said, "Python Is an Easy Skill to Learn for Data Science," or something along those lines. I think he needed more depth in the article to highlight that the rest of data science is going to be more work than Python.
If you are considering learning about data science, set your expectations correctly from the start. It's attainable, but it isn't a piece of cake. You can join an aggressive training program and learn much of what you need in about six months, if you are dedicated and serious.
I do recommend learning Python as a start, just as Dr. Agarwal suggested, so go ahead and sign up for courses at edX.org.