Technical API how-to without the headache
The thing that I’m asked to do over and over again is automate pulling data from an API. Despite holding the title “Data Scientist” I’m on a small team, so I’m not only responsible for building models, but also pulling data, cleaning it, and pushing and pulling it wherever it needs to go. Many of you are probably in the same boat.
When I first set out to learn how to make HTTP requests, pull back a JSON string, parse it, and push it into a database, I had a very hard time finding clear, concise articles explaining how to actually do this very important task. If you’ve ever gone down a Google black hole to resolve a technical problem, you’ve probably discovered that very technical people like to use technical language to explain how to perform a given task. The problem is that if you’re self-taught, as I am, you have to learn not only the task you’ve been asked to do but also a new technical language. This can be incredibly frustrating.
If you want to avoid learning technical jargon and just get straight to the point, you’ve come to the right place. In this article, I’m going to show you how to pull data from an API and then automate the task so it re-pulls every 24 hours. For this example, I will use the Microsoft Graph API and demonstrate how to pull text from emails. I will refrain from using pre-prepared API packages and rely on plain HTTP requests made with the Python Requests package. This way you can apply what you learn here to nearly any other RESTful API that you need to work with.
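To make that goal concrete, here is a minimal sketch of what pulling email text with Requests can look like. It assumes you already have a valid Microsoft Graph access token (getting one is a separate step), and the function names `build_headers`, `fetch_messages`, and `extract_previews` are hypothetical helpers I made up for illustration, not part of any library. Treat it as a rough outline, not a finished script.

```python
import requests

# Microsoft Graph endpoint for the signed-in user's messages.
GRAPH_MESSAGES_URL = "https://graph.microsoft.com/v1.0/me/messages"


def build_headers(access_token):
    """Return headers for a bearer-token request (hypothetical helper)."""
    return {
        "Authorization": f"Bearer {access_token}",
        "Accept": "application/json",
    }


def fetch_messages(access_token, top=10):
    """GET up to `top` messages and return the parsed JSON as a dict.

    Assumes `access_token` is a valid token with permission to read mail.
    """
    response = requests.get(
        GRAPH_MESSAGES_URL,
        headers=build_headers(access_token),
        # Ask only for the fields we need, and cap the number of results.
        params={"$select": "subject,bodyPreview", "$top": top},
        timeout=30,
    )
    # Raise an exception on 4xx/5xx instead of silently continuing.
    response.raise_for_status()
    return response.json()


def extract_previews(payload):
    """Pull the preview text out of each message in a Graph list response."""
    # Graph returns list results under the "value" key.
    return [message.get("bodyPreview", "") for message in payload.get("value", [])]
```

With a token in hand, `extract_previews(fetch_messages(token))` would give you a plain Python list of email preview strings. Splitting the request from the parsing keeps each piece easy to test on its own, and the same pattern carries over to most other RESTful APIs.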
If you're having trouble with API requests in this tutorial, here’s a tip: Use Postman. Postman is a fantastic app that allows you to set up and make API calls through a clean interface. The beauty of it is once you get the API call working, you can export the code in Python and then paste it right into your script. It’s amazing.
*Note: This tutorial is meant to be a simple, easy-to-understand method for accessing an API, and that’s it. It will likely not be robust to your exact situation or data needs, but it should set you down the right path. I’ve found the method below to be the simplest way to get to pulling data quickly. If…