Original article can be found here (source): Artificial Intelligence on Medium
Writing the Code
I highly recommend following along using Repl. It is a simple yet powerful online IDE that requires no setup, which makes it a good fit for a one-time data extraction or dataset creation project.
Let’s start by importing the libraries that we need. We’ll be using pandas to create a dataframe and save our dataset, and time to pause the execution of the loop.
Connect to the API
Next, we need to authenticate and connect to Spotify’s API. To do so, we need our “Client ID” and “Client Secret”.
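As a rough sketch of this step, the snippet below assumes the spotipy client library and its client-credentials flow. The helper name connect and the placeholder strings are illustrative assumptions, not something defined by Spotify.

```python
# Placeholders (assumptions): replace with the values from your Spotify developer dashboard.
client_id = "your_client_id"
client_secret = "your_client_secret"


def connect(client_id, client_secret):
    """Build a Spotify client using the client-credentials flow (hypothetical helper)."""
    if not client_id or not client_secret:
        raise ValueError("Both the Client ID and the Client Secret are required.")
    # Imported inside the function so that spotipy is only needed
    # once a client is actually created.
    import spotipy
    from spotipy.oauth2 import SpotifyClientCredentials

    auth_manager = SpotifyClientCredentials(
        client_id=client_id, client_secret=client_secret
    )
    return spotipy.Spotify(auth_manager=auth_manager)


# Once the placeholders are replaced with real credentials:
# sp = connect(client_id, client_secret)
```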
In the code above, replace the Client ID and Client Secret variables with your own and make sure they are inside quotes.
Retrieve IDs for each track
As I mentioned earlier, I’ll be extracting Kehlani’s albums and singles from a playlist I created. It is a collection of 54 songs (~3 hours) containing every single and every album track she has released that is currently on Spotify, and it is updated with her new singles.
Now we’ll write a function to get the IDs for each track of this playlist.
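A minimal sketch of such a function is shown here, assuming a spotipy client and its user_playlist call (newer spotipy versions also offer playlist, which takes only the playlist ID). Unlike a version that relies on a global client, this sketch takes the client as its first argument, and the username and playlist ID in the example call are placeholders.

```python
def getPlaylistTrackIDs(sp, user, playlist_id):
    """Return the track IDs found in the first page of a playlist."""
    ids = []
    playlist = sp.user_playlist(user, playlist_id)
    for item in playlist["tracks"]["items"]:
        track = item.get("track")
        # Skip entries without a usable track ID (for example, local files).
        if track and track.get("id"):
            ids.append(track["id"])
    return ids


# Example call with placeholder values (assumes sp is an authenticated client):
# ids = getPlaylistTrackIDs(sp, "your_username", "your_playlist_id")
```

The playlist response is paginated, so a playlist much longer than the one used here would need extra requests to collect every track.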
At the bottom of this code, where it says ids = getPlaylistTrackIDs, the two quoted values separated by a comma are the username (found in the URL) of the person who created the playlist, followed by the playlist URI, which you can find by hitting the settings button on the playlist, where the share link is.
Now, let’s check what we have so far by printing the number of ids we grabbed, along with the ids themselves, to see if the count matches the 54 songs we have on the playlist.
This is the result:
54
['7AiMnJSODcJoKDejQ3mnoJ', '73C4vh7W8u41Vll5HvBqv7', '5kYZbBLAGrrhFKNbOs6D95', '18z6OV5lknJmKnZi7aA1zH', '1gHtbcRP4tz1O1NsxPpBea', '3kJudfRjZMItdFYVCCaSi6', '6mzaCRuLTRiz1caGOum3zT', '5h4Uqkh9RpRZwm5ADLh5uj', '3rGew9pmFEmGD9nZ12F1tN', '0dYDmow4l5hbPs5E6QLMSC', '1B3jkf6CyuiF8CQcKlUx9y', '6DkmFhzJrkVhDlcgcEy7Pc', '5dKy6Cgv6xwiRY3j3AJ7Uq', '0oz4ZqHuUaz3uEkP2vD0u8', '389hKTL3ZBPPWP3VuXfEyv', '5cw9s2zGrbny2M2p3WRmGm', '3YaMX9Cf68dxiG6RKo0pSY', '1cAL4sFzXXRMbpZnTPa7Zi', '4w5BVeKJFCj2rrrEy31s0n', '4ta2AWru6ldjg1aHzww0aK', '45DJ0PbKPdbslnyrcM80HN', '7yNu82yd6dYmGQ0H1q0jKo', '4v4HwTfMPslhWAnJxIXchn', '6XptjfnUvLfejptpjPRhCT', '1y2SK8EjL3WSnJvJEMWOoq', '4UMp46x46Zmu9OEr8m3Gl2', '7nb50hgKYhnHJLHKZ7qiKO', '5y5OzukBTl0yTRMEdNmApJ', '3Hdl3BEFb1IEbL0Jq53enx', '0lsC0OkBgiLYbSsoHOzMnr', '0Pm1BZp4MpoMKkNxIXCfAu', '2droOB3xlZkhgfUM0owDTq', '2Nd2HLWrIq1DcNMiYPTQUC', '4j644tViOFAf4i0BYT12R8', '4k6hX9RKD096K1NCjjJZLc', '0aSW5EMeNnQSMJQ8QN3zIW', '0tkmYNfaEaH9HpR59ApRtE', '0kas95RruYRVqrOb07rgkh', '4BOikd4oZjOYMde9AXfrTo', '6ZRuF2n1CQxyxxAAWsKJOy', '1EGrDTfEuAiRzRdxlblpET', '32s2Dn9EVvO2f85MrpRoBV', '4jM3c9KLTO9iZPm9A7neiL', '6GCRnf1W9OKxok9fvNp3pz', '1Zm9qGPQkTAOBiVpGSnJUq', '3ucRKbRlikYHyoI17gfR0c', '6HUO25AttZZCoKAY0vUVtc', '5Qr7StTFbXhgHt9JlqJx0I', '1QzC4y8h6WFxHE4KlokhVr', '1xz905v9g71heS0BQQM9re', '5QTdOvIF2ehBMZpSIIGzIo', '7IJiDYPZy2AIJn3YVHhvD4', '23wuZgeX1oyJ43QYOTo9s7', '0tBBihoEWiWKqsO5ZlCbwS']
Create a function to grab all track info from IDs
Now that we have a list of all the track ids we need, we’ll write a function that grabs each track’s information, such as track name, album, release date, length, and popularity, along with the audio features that Spotify’s API has to offer for each track.
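One way such a function might look is sketched below, assuming a spotipy client with its track and audio_features calls. The client is passed in as an argument, and the selection of audio features in FEATURE_KEYS is an illustrative choice rather than the full set the API returns.

```python
# Audio feature fields to keep (an illustrative subset of what the API returns).
FEATURE_KEYS = [
    "acousticness", "danceability", "energy", "instrumentalness",
    "liveness", "loudness", "speechiness", "tempo", "time_signature",
]


def getTrackFeatures(sp, track_id):
    """Return one row of metadata and audio features for a single track."""
    meta = sp.track(track_id)
    # audio_features returns a list with one entry per requested track.
    features = sp.audio_features([track_id])[0]

    row = [
        meta["name"],
        meta["album"]["name"],
        meta["album"]["release_date"],
        meta["duration_ms"],  # track length in milliseconds
        meta["popularity"],
    ]
    row += [features[key] for key in FEATURE_KEYS]
    return row
```

Each call returns a flat list, so the rows can later be stacked directly into a dataframe with one column per field.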
Loop over tracks and apply the function
We’ll now loop over the tracks, applying the function we created to each one, and save the dataset to a .csv file using pandas.
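The loop and the save step could be sketched roughly as follows. The helper names collect_tracks and save_dataset, the half-second pause, the column names, and the file name spotify.csv are all assumptions made for illustration. The get_features argument stands for whatever function returns one row per track ID.

```python
import time

# Column names (assumed) matching the order of the values in each row.
COLUMNS = [
    "name", "album", "release_date", "length", "popularity",
    "acousticness", "danceability", "energy", "instrumentalness",
    "liveness", "loudness", "speechiness", "tempo", "time_signature",
]


def collect_tracks(ids, get_features, pause=0.5):
    """Apply get_features to every track ID, pausing between requests."""
    tracks = []
    for track_id in ids:
        time.sleep(pause)  # brief pause so requests are not sent back to back
        tracks.append(get_features(track_id))
    return tracks


def save_dataset(tracks, path="spotify.csv"):
    """Build a dataframe from the collected rows and write it to a .csv file."""
    # pandas is only needed for this final step, so it is imported here.
    import pandas as pd

    df = pd.DataFrame(tracks, columns=COLUMNS)
    df.to_csv(path, sep=",", index=False)
    return df


# Example usage (assumes ids, sp, and a getTrackFeatures function already exist):
# tracks = collect_tracks(ids, lambda track_id: getTrackFeatures(sp, track_id))
# df = save_dataset(tracks)
```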
My raw dataset looks like this: