Have you ever tried to create a web scraping application before? Or, perhaps you have used one online that someone else created?
If you have coded a web scraper before, you know what a pain in the you-know-what this effort can be. In many cases, it requires you to get down to the granular level of each element on the page.
You feel proud when you finally get the scraper working the way you want it, and then BAM! The website changes without warning, and your scraper no longer works. This can (and does!) also happen with the third-party scrapers that you find. Although those third-party developers may eventually fix the problem (or not!), it takes time for them to get around to doing so.
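To make that fragility concrete, here is a minimal sketch using only Python's standard library. The page snippets, the class names `price` and `product-cost`, and the `scrape_price` helper are all made up for illustration; a real scraper would be tied to whatever markup the target site happens to use.

```python
from html.parser import HTMLParser


class PriceScraper(HTMLParser):
    """Collects the text inside the first element whose class is 'price'."""

    def __init__(self):
        super().__init__()
        self._inside_price = False
        self.price = None

    def handle_starttag(self, tag, attrs):
        # The scraper is tied to one exact class name in the page markup.
        if self.price is None and dict(attrs).get("class") == "price":
            self._inside_price = True

    def handle_data(self, data):
        if self._inside_price:
            self.price = data.strip()
            self._inside_price = False


def scrape_price(html):
    """Returns the scraped price text, or None if the element is not found."""
    parser = PriceScraper()
    parser.feed(html)
    return parser.price


# Hypothetical markup before and after the site owner renames a class.
old_page = '<div><span class="price">19.99</span></div>'
new_page = '<div><span class="product-cost">19.99</span></div>'

print(scrape_price(old_page))  # 19.99
print(scrape_price(new_page))  # None
```

The data on the page is the same in both cases. Only the markup changed, and that alone is enough to leave the scraper empty-handed.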
APIs to the Rescue
An API, or Application Programming Interface, is a set of commands published by a website owner. The advantage for the website owner is that access to the data can be controlled. In many cases, the owner can charge for access, but many APIs (or a subset of their commands) are free.
The owner of the website takes care of the internal coding, with the added benefit that if they change anything, they will handle all the details of the change. Users of the API need only make calls.
A good API will not break the interface, even when changes are made behind it. This means that when you write a line of code that gets certain information from the website, you shouldn't have to worry about future changes breaking it.
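As a rough illustration of what "making calls" can look like, the sketch below builds a request URL and reads one field from a JSON response. Everything specific here is an assumption: the base URL `https://api.example.com/v1`, the `weather` endpoint, its parameters, and the `temperature` field are hypothetical, and real providers document their own. The response body is a made-up string so the example runs without a network connection.

```python
import json
from urllib.parse import urlencode

# Hypothetical base URL; a real provider publishes its own.
BASE_URL = "https://api.example.com/v1"


def build_request_url(endpoint, params):
    """Builds the full URL for an API endpoint and its query parameters."""
    return f"{BASE_URL}/{endpoint}?{urlencode(params)}"


def extract_temperature(response_body):
    """Reads one named field from a JSON response body."""
    data = json.loads(response_body)
    return data["temperature"]


url = build_request_url("weather", {"city": "Boston", "units": "metric"})
print(url)

# A made-up response body, standing in for what the server would send back.
sample_body = '{"city": "Boston", "temperature": 21.5, "humidity": 40}'
print(extract_temperature(sample_body))
```

Notice that the code asks for a field by name rather than by its position in a page. If the provider later adds more fields to the response, a lookup like this keeps working, which is the kind of stability a well-maintained interface is meant to offer.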
Where to Find APIs
This article isn't about how to write code for an API. You can find plenty of resources for that with a simple Google search. However, you may not know what APIs are available, and that is what this article is meant to help you with.
Most companies that publish APIs will include a section on the website for developers. After all, you will need to use programming to interface with these APIs. Therefore, the easiest way to find a website that supports an API is to use the word "developer" in a search.
More specifically, you can use the Google advanced operator inurl like this:
You can choose to use one or the other (developer or api) or you can use both. Experiment with it to see what results you get.
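If you want to try several variations quickly, a small helper can generate the query strings for you to paste into the search box. This is only a sketch: the `build_search_queries` function and the topic "weather" are made up for illustration, and the exact way of combining the two keywords is an assumption about how you might want to search.

```python
def build_search_queries(topic, keywords=("developer", "api")):
    """Returns one query per keyword, plus one query that uses all of them."""
    # One query for each keyword on its own, e.g. "weather inurl:developer".
    single = [f"{topic} inurl:{kw}" for kw in keywords]
    # One query with every keyword applied at once.
    combined = f"{topic} " + " ".join(f"inurl:{kw}" for kw in keywords)
    return single + [combined]


for query in build_search_queries("weather"):
    print(query)
```

Swap in your own topic and compare what each variation returns.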
Which Programming Language Should You Use?
Most APIs will support multiple languages. Companies realize that people likely won't learn another computer language just so they can use an API. Of course, there are exceptions to this. But, all else being equal, APIs will support multiple languages.
In certain cases, the language that you know may not be supported by some APIs. You have a few choices if this is the case. One is to outsource your development efforts to someone who knows the APIs and the languages they support. If the information you need is not too complicated, you should be able to find someone who can program what you are looking for at a relatively small price.
The other option is to learn the programming yourself. When I was first thinking up ideas for this article, I thought to myself, what if non-programmers read it? Then, I dismissed this because if you are here, it's because you either are in the data science field already or are looking to get into it. Either way, you will need to learn coding.
Most providers of APIs know that Python is almost the lingua franca of the programming world, if there could be such a thing. It's a good bet, then, that they will include Python among the languages their APIs support. Therefore, you should get up to speed on programming in Python.
Python and R are the two leaders in the data science world. Whether this lead will continue unabated is anyone's guess. But, this is where it stands at the time of this writing.
I have shown you a good way to find APIs. But, this will require coding from you or someone you hire. If you decide to do it yourself, I recommend learning Python, as many API publishers support this language for accessing their command sets. You can learn Python for free using this resource.