Are you considering a career in data science? It is one of the hottest fields, and it is still growing. Even if you aren’t looking for a full-time career in the area, you may find it helps you glean insight from your data and systems. Whatever the reason, finding the right resources is essential. This article points you to resources that can help you become a data scientist.
The Choices Are Overwhelming
New resources are emerging almost on a daily basis in the data science field. Companies are clamouring for workers and workers are learning about the high pay. But having too many choices is overwhelming, to say the least. Where do you start? Which path do you choose? It's enough to make your head spin.
That's why I wrote this article: to help prospective data scientists find the right resources, ones they can use immediately. I also try to find ones that cost the least amount of money. Great resources at a low price is a winning combination in anyone's book!
My other goal for this resource is to introduce inexpensive and even free resources to get started. No one wants to spend thousands of dollars on learning a field only to realize that it isn’t what they were hoping for.
Oddly, with data science resources, it’s not always about getting what you pay for. I have seen courses offered for introductory Python or R programming that were charging more than $500. These courses don't seem to teach topics that you can't get elsewhere.
The creators of these resources are out of touch. You can find quality resources covering the same introductory material that are either free or cost a minimal amount. Many of these low-cost options will teach you what you need to know to get started in coding.
You have to ask what it is about these high-end course offerings that would justify the cost. The course creators are trying to take advantage of those who are misinformed. If you are one of those course creators, feel free to comment below about what you are offering with these intro-level courses that justifies the huge cost you’re charging.
Here is what I would suggest. Take a free Python course using one of the options listed below. Then, if you happen to find the syllabus of one of the high-end intro Python courses, compare what they are teaching to the free course. You have nothing to lose by taking the free course.
The moral of the story is to make sure you know what you are paying for.
What Skills Do You Need to Become a Data Scientist?
Many data science resources focus much of their efforts on teaching programming.
While this is an important component of data science, someone studying the field must be well rounded. A data scientist needs to know:
- data cleansing,
- business domain knowledge.
Learning programming and statistics is an excellent first step in your data science career. These are foundational skills. The resources in this guide will only focus on these topics. I will follow up with next-step resources in another article.
While it seems like a tall order to learn all the necessary skills for data science, the structure of the field is still forming. As the industry matures, you’ll see specializations in each of the concentrations. However, all data scientists will need to have a base understanding of each of them.
Data Science Programming
The prevalent languages in data science are Python and R. Python is overtaking R as the language of choice due to its ease of learning. Both languages have extensive support, and there seem to be competitive forces at work here.
If you are wondering which to learn, R or Python, the correct answer is to learn both. That may be easier said than done, but doing so accomplishes two objectives. The first is that you have access to more opportunities.
Companies often require specific languages in their job postings. I believe they go overboard with this, because if you are skilled in a few languages, you can pick up others easily. However, that is the way of the world. It is a constraint we need to deal with.
The second reason to learn both languages is that it will help you during your training. If you know R but find a lot of training modules in Python, you will likely continue your search to find R tutorials. While the number of training modules increases frequently, you’ll waste less time when you know both languages. The basics of the language aren’t the issue. It’s when you get into more advanced concepts that knowing both pays off.
DataCamp has created a unique program where all the learning is encompassed within the platform. They were one of the first to implement this strategy, and it's been working. They also keep an eye on the trends and adjust to them accordingly, which is something you want to see in a training company.
While the training is not free, with most of the courses, you can try the first chapter for free. You can learn quite a bit from the first chapters of many of these modules. It also gives you a feel for the training environment that comes with the platform.
DataCamp.com offers an innovative, leading-edge way of learning data science, data engineering, and some financial concepts. It's a resource to consider when learning data science.
Kaggle Python Course
Kaggle is a phenomenal resource that offers competitions to data science professionals. They use real-world problems submitted by companies, and these companies reward the best solution with prizes, often significant cash awards. Some of the contests are meant for practice and won’t earn you prizes. But, many will. Kaggle is a worthy resource to bookmark.
Kaggle also helps beginners with free courses. It currently offers a free intro to Python course. Upon completion, you can continue with further coursework. Current course offerings:
- Intro to Machine Learning
- Data Visualization
- Intermediate Machine Learning
- Deep Learning
- Intro to SQL
- Micro challenges
- Machine Learning Explainability
Udemy offers both paid and free courses. The website has a large selection of courses on many topics, including data science.
The website used to include a filter when searching for free courses, but it appears to be gone. You can still find free courses with a Google search as follows:
free python courses udemy.com
If you want to find other courses offered for free, replace the “python” in the above search with the topic you want to learn. It’s not perfect, but you can usually find something of interest.
Udemy frequently offers significant discounts ($10-$15 per course). If you find a course that you’d like to take that isn’t free, consider waiting for a promotional deal. They happen several times per year, often within weeks of one another.
It can be frustrating, as it’s a waiting game for the courses to become discounted, but it does happen frequently. When dealing with beginner-level data science, there are plenty of alternatives to Udemy.
The Source of the Languages
When dealing with open-source computer languages such as R and Python, there are bound to be resources for learning from the language creators. This is true with both R and Python.
This website falls in the category of Massive Open Online Courses (MOOCs). It was initially a cooperative effort with Google and edX. Many universities and large corporations, such as Microsoft, have joined the effort to offer training and MicroMasters-style, degree-like programs. Many of the courses can be audited for free but require payment for a verified certificate.
The interface is a bit cumbersome when first starting out, but the website gives tutorials on how to use it. It is interactive and often includes a discussion board for questions and comments. It won’t take too long to get used to the interface, however.
There are several courses offered in the data science field on edX. You can learn both R and Python (as well as other courses) from this resource. This resource provides Capstone projects which help you to reinforce your learning from other courses.
While statistics is a necessary part of data science, you don’t need advanced knowledge for it to be useful. You’ll need to understand probability, descriptive statistics, regression, and the fundamentals of hypothesis testing. You should also have a good understanding of Bayesian concepts.
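To give a feel for two of the basics listed above, here is a minimal Python sketch of descriptive statistics and a simple linear regression. It uses only the standard library. The helper names (`describe`, `fit_line`) and the data are made up for illustration and do not come from any of the courses in this article.

```python
import statistics


def describe(values):
    """Return a few basic descriptive statistics for a list of numbers."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
    }


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    mean_x = statistics.mean(xs)
    mean_y = statistics.mean(ys)
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up illustrative data: hours studied vs. quiz score.
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 68]

summary = describe(scores)
slope, intercept = fit_line(hours, scores)

print(summary)
print(f"score ~= {slope:.2f} * hours + {intercept:.2f}")
```

In practice you would likely reach for a library such as pandas or scikit-learn, but writing the arithmetic out once makes it clearer what those libraries compute.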
Too often, you’ll read that you need to be an expert in statistics. That’s not true. It seems like everyone wants data scientists to be experts in every aspect of the field.
Obviously, the more statistics you know, the better. As long as you further your studies, you should not let your lack of “being an expert” in statistics hold you back from applying for jobs. You will need the basics, however, and that is what you’ll find with the following resources.
Khan Academy Statistics Courses
If you have never heard of Khan Academy, now would be a good time to see what they have to offer. This resource is geared towards students, but it offers math and statistics classes that are relevant to data science. If you want to learn or brush up on linear algebra, which is helpful in data science, this resource provides free courses for that as well.
A Great Tutorial on Hypothesis Testing
At some point during your learning, you’ll need to come to grips with hypothesis testing. It is a core concept in data science and machine learning. You may have dreaded learning it in an intro statistics class in high school or college. You likely forgot about it as soon as you were done with the class. I found a great YouTube video that explains the concept for those that want to learn it or need a refresher.
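If you prefer to see the idea in code, the sketch below shows one simple form of hypothesis test, a permutation test for a difference in means. It is not taken from the video. The function name `permutation_test`, its parameters, and the sample data are all hypothetical, chosen only to illustrate the concept with Python's standard library.

```python
import random
import statistics


def permutation_test(group_a, group_b, n_permutations=5000, seed=0):
    """Estimate a two-sided p-value for the difference in group means.

    Null hypothesis: both groups come from the same distribution, so the
    group labels are interchangeable.
    """
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    observed = abs(statistics.mean(group_a) - statistics.mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        # Shuffle the pooled data and split it into two relabeled groups.
        rng.shuffle(pooled)
        diff = abs(statistics.mean(pooled[:n_a]) - statistics.mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # Share of shuffles at least as extreme as what was observed.
    return extreme / n_permutations


# Made-up illustrative data: page load times (ms) for two site designs.
design_a = [210, 198, 225, 240, 205, 218]
design_b = [190, 185, 200, 195, 188, 202]

p_value = permutation_test(design_a, design_b)
print(f"estimated p-value: {p_value:.4f}")
```

A small p-value suggests that a difference this large would be unlikely if the two groups really came from the same distribution. Classical tests such as the t-test answer the same question using a formula instead of shuffling.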
Elite Data Science
How to Learn Statistics the Self-Starter Way
This is not a structured tutorial in the real sense of the word. It is more of a reference to several other websites with explanations on how those websites can help you learn statistics. It also explains why you need to learn the concepts mentioned.
If you had insider knowledge about what employers look for in data science candidates, you could use that information to your advantage. Toptal created a resource to help employers find the right data science candidates. Toptal is a premier resource that matches the world's best talent with great companies. Their data is likely to be spot on with hiring requirements. This is a must-have resource!
Learn more about Toptal here.
The resources in this article are by no means exhaustive. The field is dynamic, and you’ll find new resources popping up almost daily. As mentioned previously, there is more to data science than learning a language or two and applying a few statistical concepts. Machine learning, artificial intelligence, presentation skills, business domain expertise, data analysis, and data cleansing are all skills that will be needed and should be considered the next steps on your journey to data science mastery.