API Guide for Data Scientists

Original article can be found here (source): Artificial Intelligence on Medium

How to use APIs and REST + JSON.

APIs are the backbone of software everywhere. They allow us to package up and expose code for other users or software to consume. At Spawner we use JSON for everything from delivering data to accessing NLP, Financial ML, and Computer Vision in 3–5 lines of code. In this guide we’ll provide clear code examples to get you started working with APIs. We’ll also give you a brief overview of REST and JSON so you understand the standards used for data interchange.

Why APIs in Data Science?

In short, code reuse is the reason we need more APIs in Data Science. Code reuse is core across software, but somehow in Data Science we struggle with it. With more capable APIs we can largely solve our issues of code reuse. We need to establish a mindset of deploy and maintain once, reuse everywhere.

APIs fit beautifully into the current data stack. It’s great to be able to put all your models and data behind one clean API that anyone throughout your project or company can access with the right credentials.

With this API, you’re building a way for all your users to concurrently access your data stack. They no longer have to copy models and redeploy them themselves. However, we need some easy way to give our users access to our API “endpoints.” Enter JSON.

Using JSON

JSON is a standard text format used for storing and transmitting data. Almost all APIs use JSON for sending and receiving text and numbers. XML is often used as well for its flexibility. JSON can also carry images and other binary types once the data is encoded as text, for example with Base64.
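
As a minimal sketch of the idea, the snippet below uses only Python's standard library to turn a dictionary into a JSON string and back, and to carry a few raw bytes inside JSON by encoding them as Base64 text. The field names and the sample bytes are made up for illustration and are not part of any real API.

```python
import base64
import json

# Hypothetical payload: a few raw bytes standing in for an image file.
image_bytes = bytes([0x89, 0x50, 0x4E, 0x47])

payload = {
    "text": "What is the p/e ratio of apple?",
    "scores": [0.25, 0.75],
    # JSON has no binary type, so the bytes are encoded as Base64 text.
    "image_b64": base64.b64encode(image_bytes).decode("ascii"),
}

encoded = json.dumps(payload)  # Python dict -> JSON string
decoded = json.loads(encoded)  # JSON string -> Python dict

# Decoding the Base64 text recovers the original bytes.
restored = base64.b64decode(decoded["image_b64"])
print(encoded)
print(restored == image_bytes)
```

The receiving side only needs to know which fields hold Base64 text so it can decode them back into bytes.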

For our code example, we’re going to use the Spawner API to do some ML tasks in very few lines of code. We’ll access the Spawner API documentation to see what endpoints are available for use.

We’ll use Python for making our “requests.”

There are two main HTTP “verbs” used by REST APIs that we’ll focus on: GET and POST. We’ll use GET whenever we’re retrieving data from the API and not sending data. We’ll use POST whenever we’re sending data for inference and expecting a result sent back.
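
To make the difference concrete, here is a rough sketch that builds one GET and one POST request with the standard library's urllib.request.Request and prints what each would carry. Nothing is sent over the network, and the token is a placeholder rather than a real credential.

```python
import json
from urllib.request import Request

token = "YOUR_TOKEN"  # placeholder, not a real credential

# GET: nothing is sent in the body; the token travels in the URL path.
get_req = Request("https://spawnerapi.com/fundamentals/" + token, method="GET")

# POST: the input data is serialized to JSON and sent in the request body.
body = json.dumps({"text": "What is the p/e ratio of apple?"}).encode("utf-8")
post_req = Request(
    "https://spawnerapi.com/answer/" + token,
    data=body,
    headers={"Content-Type": "application/json"},
    method="POST",
)

# These objects are only built and inspected here, never sent.
for req in (get_req, post_req):
    print(req.get_method(), req.full_url, req.data)
```

The GET request has no body at all, while the POST request carries the JSON payload along with a header telling the server how to interpret it.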

To illustrate this further, here’s how a GET request might look: https://spawnerapi.com/fundamentals/<token>

In the above GET request we’re making a request to the API where there’s only one field: <token>. Many APIs require the user to have a token for authentication. We can very simply add our token to our code for this request. Here’s the above request in code:

import requests

# Tokens available at https://spawner.ai
token = 'sp_vnz938vnzjd93nvnz'
url = "https://spawnerapi.com/fundamentals/" + token
# Send the GET request and print the parsed JSON response
response = requests.get(url)
print(response.json())
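
For anything beyond a quick experiment, it may be worth keeping the token out of the source file and guarding against slow or failed requests. The sketch below is one possible way to do that, not the Spawner-recommended approach: the environment variable name SPAWNER_TOKEN and the http_get parameter are assumptions made for this example (the latter is just a hook so the function can be exercised with a stub instead of a live call). The timeout argument and raise_for_status() are standard features of the requests library.

```python
import os


def fetch_fundamentals(http_get=None, timeout=10):
    """GET the fundamentals endpoint and return the parsed JSON body."""
    if http_get is None:
        import requests  # third-party: pip install requests
        http_get = requests.get
    # SPAWNER_TOKEN is a hypothetical variable name chosen for this sketch.
    token = os.environ["SPAWNER_TOKEN"]
    url = "https://spawnerapi.com/fundamentals/" + token
    # A timeout makes the call fail fast instead of hanging indefinitely.
    response = http_get(url, timeout=timeout)
    # Raise an exception on 4xx/5xx responses instead of parsing an error page.
    response.raise_for_status()
    return response.json()
```

With the token exported in your shell, calling fetch_fundamentals() with no arguments performs the same request as the snippet above.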

The above fundamentals endpoint returns ratings for all covered equities in the S&P 500. It’s a Machine Learning model built by training on many examples of historical fundamentals data across cash flows, balance sheets, and basic equity data.

The POST is different from the GET because we’re sending the data in JSON format as {"text": "What is the p/e ratio of apple?"}, which the API will be able to process before sending the result back. Here’s the POST in code:

import json
import requests

# Tokens available at https://spawner.ai
token = 'sp_vnz938vnzjd93nvnz'
text = 'What is the p/e ratio of apple?'
url = 'https://spawnerapi.com/answer/' + token
data = {'text': text}
headers = {'Content-type': 'application/json'}
# Serialize the payload to a JSON string and send it in the request body
x = requests.post(url, data=json.dumps(data), headers=headers)
print(x.text)
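
As a slightly shorter variant, the requests library also accepts a json= keyword that serializes the dictionary and sets the Content-Type header for you. The helper below is a sketch under assumptions: the function name ask and the http_post parameter are invented for this example (the parameter only exists so a stub can stand in for a live call).

```python
def ask(question, token, http_post=None, timeout=10):
    """POST a question as JSON and return the raw response text."""
    if http_post is None:
        import requests  # third-party: pip install requests
        http_post = requests.post
    url = "https://spawnerapi.com/answer/" + token
    # json= serializes the dict and sets Content-Type: application/json.
    response = http_post(url, json={"text": question}, timeout=timeout)
    # Raise an exception on 4xx/5xx responses.
    response.raise_for_status()
    return response.text
```

Calling ask('What is the p/e ratio of apple?', token) would then send the same payload as the code above.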

The /answer endpoint is a Natural Language Processing endpoint that takes any financial question and returns an answer!

You can read about more useful Machine Learning endpoints here.

Example code for getting started with APIs in Python is also available, along with a Jupyter Notebook of API examples for easier use.

Closing

In your own company, you should be building your models and properly exposing them in an API, perhaps with various permutations of the same model. Whether you use the Spawner API or build your own API ecosystem internally, keep things nice and clean, write good documentation, and encourage others to make good use of your existing code!