Most people think that SQL is an arcane language. If you’re not familiar with SQL, it stands for Structured Query Language. It is the language of databases. Many companies have years of records stored on databases.
But with popular languages such as R and Python, it’s easy to believe that SQL is on its way out. The question becomes, should you learn SQL?
Purchases made from clicking on the links on this page may result in commissions generated to the site owner.
If you hold the belief that SQL is a dying language, you should realize that nothing could be further from the truth. Python and R have their uses, but they are limited by the memory of the workstations running those programming languages. SQL engines often reside on high-end machines with massive storage and significant RAM.
What About the Cloud?
Suppose you (or your company) have a cloud account with one or more of the big three cloud providers (Amazon, Microsoft, and Google). In that case, you can provision high-end machines for running data analysis or machine learning algorithms. That seems to be the direction many companies are taking. As data requirements increase, the average workstations at companies will struggle to keep up.
However, languages like Python or R and not meant to be replacements for databases. They’ll be used in unison with databases. Unless something kick-ass comes around to replace databases, they are here to stay and will continue to be workhorses.
Distributed Computing to the Rescue
Companies are also provisioning distributed resources, like Spark and Hadoop. But guess what? Spark has the SQL language built into the software. If you’re not familiar with distributed software, it divides work units into chunks and distributes them onto multiple machines. The technology is usually controlled by a lead program that is responsible for the distribution and the reassembly of the work units when all the work is complete.
No matter how you slice it, SQL ain’t going anywhere anytime soon, which makes SQL skills highly desirable. Don’t believe me?
Course Offerings in SQL on the Rise
If you are going down kicking and screaming on this one, know that online training websites such as DataCamp and Udemy are beefing up their SQL and database offerings. When I first started with DataCamp, they were primarily an R training shop. Udemy also taught a lot of R classes in the beginning. But both of these highly sought-after services are adding SQL as part of core offerings. Why would they do this unless there is a demand for these skills?
Related: How to Bucket Data in SQL
Good News – SQL Is Easy to Learn
If you are worried that learning SQL will take too much time, you’ll be happy to know that it’s not difficult to learn the basics. Even the advanced concepts are not hard to learn but are a bit more challenging to master.
If you are convinced about the need to know SQL, why not head over to one of the best resources to learn these skills? The resources for SQL and other data-related skills are growing like crazy on this website!
SQL also is a good language to get your foot in the door. Due to its high demand, you could use it as your stepping-stone language and then learn more challenging skills on-the-job. Many companies offer a stipend for learning, which you could apply to training sites such as DataCamp or Udemy.
SQL Can Provide Answers Quickly
When you become the go-to SQL person, your manager and your team will ask you more business questions more often. Why? Because SQL gives direct access to your data in the database. With other languages, or even with BI tools, like Power BI, you’d have to set up the connections to the database.
Unless you can anticipate the types of questions people ask, with tools other than SQL, ad hoc queries are not easily generated. A SQL pro will have the answers in a few seconds. It’s simply a matter of running a query directly against the database based on the requester's information.
It will take the SQL pro longer to export the results and send them to the requester than it will to run the query. I’m not joking. I have been working with SQL in excess of 15 years, and when you know the data schemas, you can obtain most answers in a short amount of time.
Creating Queries for Others
When you learn SQL, many others won’t. It’s not a sexy language. Your colleagues will spend time learning R or Python. If you do, you’ll be called upon to create queries for other people. You’ll be called upon to create data feeds. You could even head up the initiative of creating distributed processing systems, especially the ones like Spark that contains a SQL engine.
Reconciling Data Becomes Easier When Using SQL
Have you ever tried to reconcile aggregated data? For instance, when you load data into a BI tool, the tool often aggregates much of the data as a default. That’s because most BI analysis relies on aggregated data. It makes sense for these BI tools to default to aggregation mode, right?
Without direct access to the database, reconciling the aggregations is a bit more challenging. That’s because you need to take a sampling of data and perform the same aggregations in some other program, like Excel.
If it ties out, that’s great. But if it doesn’t, you need to find why it’s not tying out. Perhaps you didn’t understand the query properly. You’ll need to check the data for each iteration of problem-solving. This could require requesting database data multiple times if you didn’t solve the problem on the first pass.
Using SQL, though, you’ll likely have access to the source of the data.
You can easily run whatever queries you need to reconcile the data. Often, numbers that don’t tie-out can be a result of forgetting a component of a group – for instance, instead of sales by region, country, and gender, you left out the gender, etc. Having direct access to the database can uncover quickly the reason why the numbers didn’t tie out.
Need to Know SQL for Other Language Connections
If you need to connect to database engines via Python, you may still need to know SQL to get the data you need. Most BI tools (Power BI, Tableau) provide out-of-the-box connectors to the major database engines. However, if you want to run any sophisticated queries, you’ll need to know SQL to handle that.
To run SQL via Power BI (I know this engine better than Tableau), you select the advanced options during the connection and supply the SQL command. You could also use programming languages (R, Python) inside BI tools to access database data. Either way, you’ll need to have some SQL background or find somebody in your organization with knowledge in the skill.
Why Use SQL in BI Tools?
An obvious question you may have after reading the previous paragraph is why even use SQL in the BI tools? Can’t they perform the aggregations you need? The answer is maybe. Remember, database engines house a significantly larger number of records than most other data stores. The out-of-the-box queries that are run for default database connections will likely retrieve all records in the tables you choose.
Your workstation may not have the necessary configuration to handle the workload. By using SQL, you can narrow down the number of records to those that are needed for the analysis.
Of course, you may need to retrieve all the records in the selected tables, in which case you may need a high-end workstation with GPUs or TPUs to handle the work. At that point, you’ll probably need to find a cloud solution. But this presents other issues, like connectivity to your database or creating a new database in the cloud. That usually involves logistical and security challenges to overcome. It also doesn’t negate the need for SQL knowledge.
SQL Differences in Vendor Engines
Most major SQL vendors implement ANSI SQL. However, as it happens with competition, these vendors sometimes add software-specific tools that are not compliant with ANSI SQL. Will this hinder your efforts when you apply for jobs?
Not really. If you know SQL, the variations are not difficult to learn. Companies would prefer if you know their engines already, but it is not likely to be a showstopper. It will help greatly if you know one engine well, even if it’s not the engine of the company you are interviewing with.
Most of the differences are in working with stored procedures, which some companies discourage using. Stored procedures are pre-compiled units of work on SQL engines that are wickedly fast. But their proprietary nature makes them difficult to port to other platforms.
Imagine having thousands of stored procedure calls scattered throughout the organization. When you receive an initiative to change database engines (yes, this does happen!), all those stored procedures will be rendered useless.
One way around this is to layer the code so that the calls are made to the layers rather than directly to the stored procedures. However, you still need to rewrite the stored procedures in the new engines. They are not portable. You would not have to do any of this if you stuck to ANSI SQL coding.
Want to Start
Learning SQL Today?
Get started with your education in SQL. It could be the best move you make to learn a much-needed skill and get your foot in the door.
SQL is not a sexy language. But it is in high demand. It’s not difficult to learn, and it’s a great skill that provides opportunities. You will use SQL if you learn it. Companies have a lot of data stored inside database engines. Knowing SQL is the key to unlock that data for your team or organization.