How to use Python and AI in qualitative research… in Excel!

Original article was published by stefanie lai on Artificial Intelligence on Medium

As a researcher and strategist, I like to do my content analysis in Excel. One day I wondered: what if I could just plug Python into Excel? That way I’d be able to use all the great natural language processing (NLP) and machine learning (ML) libraries.

Purists would say to just run the analysis in Python in a Jupyter Notebook. But honestly, for someone who’s not yet fluent in code, this can feel daunting and difficult to integrate into an accustomed workflow. Sometimes it’s just nice to have access to Excel’s UI to sort, copy, paste and so on.

For organizations, incorporating pre-written and custom Python functions into Excel can be a way of getting more functionality to your teams without extensive technical onboarding. This solution strikes the balance between being able to customize algorithms and still feeling simple and intuitive.

The best approach to incorporating algorithmic and machine learning driven tools into qualitative analysis is to see them as just that — tools. We can’t expect AI to come up with answers and strategies unaided; this will only lead to disappointment and feeling that these add-ons are gimmicky.

So why use ML / AI at all if humans are still the experts? I find that especially when I am working with a larger amount of data, these tools can help objectively look for trends as well as act as a check against impression biases.

Using the xlwings library, I’ve created ‘control panels’ that call these functions from an Excel workbook, which offers both ease of use and flexibility. Below I will demonstrate a few NLP functions that can aid in qualitative analysis. I’ve used a simple dataset of Amazon Echo reviews sorted by their ratings.
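As a rough illustration of the wiring, here is a minimal sketch of a Python function that a control-panel button could call through xlwings. It is not the workbook described in this article: the sheet names (`Data`, `Results`), the cell addresses and both function names are assumptions made up for the example. The xlwings import is guarded so that the pure-Python part still runs on a machine without xlwings or Excel.

```python
# A sketch only: sheet names, cell addresses and function names are hypothetical.
try:
    import xlwings as xw  # third-party library; needs Excel to do anything useful
except ImportError:       # allows the pure-Python helper below to run without it
    xw = None


def word_counts_per_review(cells):
    """Return [review, word_count] rows, skipping empty cells."""
    rows = []
    for cell in cells:
        if cell is None or not str(cell).strip():
            continue
        text = str(cell)
        rows.append([text, len(text.split())])
    return rows


def run_from_control_panel():
    """Entry point meant to be triggered from Excel, not run directly."""
    wb = xw.Book.caller()  # the workbook that made the call
    # Read column A of the (hypothetical) "Data" sheet from A2 downwards.
    # ndim=1 forces a flat list even when only one cell is filled.
    cells = wb.sheets["Data"].range("A2").expand("down").options(ndim=1).value
    rows = word_counts_per_review(cells)
    # Write a header plus the results to the (hypothetical) "Results" sheet.
    wb.sheets["Results"].range("A1").value = [["review", "words"]] + rows
```

If this file were saved as a module called `panel` (again, a made-up name), a button macro in the workbook could call it with xlwings’ `RunPython "import panel; panel.run_from_control_panel()"`. Keeping the analysis in a plain function such as `word_counts_per_review` means it can be tested outside Excel.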

Keep in mind that the greatest miracle of plugging in Python is being able to access APIs (application programming interfaces), which means you can access everything from AI-generated text to complex mathematical formulas to the current moon phase.

Counting the most common words

Top N words demonstrated on a simple dataset of Amazon Echo reviews. This simple exercise suggests that the people who love the Echo probably also use it to control their lights.

Have you ever wondered as you read through a block of text data if your impression that a word keeps coming up is real? And wouldn’t it be great if there was the data to back it up?

Word counting is one of those things that an algorithm can do super easily, so why not use it, even if it’s just as a gut check?

The custom range allows you to run the function across different groups of data, and exclude the most common words.
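A minimal sketch of a top-N word counter, using only Python’s standard library. The stop-word list, the function name `top_words` and the three sample reviews are all made up for illustration; a real stop-word list would be much longer.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption, not a standard list).
STOP_WORDS = {"the", "a", "an", "and", "to", "it", "i", "my", "is", "of", "for", "in", "this", "with"}


def top_words(texts, n=10, exclude=STOP_WORDS):
    """Count words across all texts and return the n most common as (word, count)."""
    counts = Counter()
    for text in texts:
        words = re.findall(r"[a-z']+", text.lower())  # lowercase, letters and apostrophes only
        counts.update(word for word in words if word not in exclude)
    return counts.most_common(n)


# Made-up sample reviews, for illustration only.
reviews = [
    "I love it. I use it to control my lights.",
    "Great speaker and the lights feature is great.",
    "Love the music and the lights.",
]
print(top_words(reviews, n=3))
```

Passing a different list of texts (for example, only the five-star reviews) plays the role of the custom range, and the `exclude` argument controls which common words are left out.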

Levelling up on counting words — TF-IDF finds the most important words or phrases

Most important 3-word phrases from top and lowest scored reviews compared against all reviews. The results back up the hypothesis that the reviewers giving good ratings are using Echo with Philips Hue lights.

TF-IDF stands for term frequency–inverse document frequency — it reflects how important certain words are in a block of text compared to a collection of texts. In content analysis, this becomes a powerful tool when looking at what emerges as the most important words or phrases used by a certain segment when compared to the sample as a whole.
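To make the idea concrete, here is a small standard-library sketch of TF-IDF over word n-grams. It uses one common smoothed form of the formula, term frequency multiplied by ln((1 + N) / (1 + df)) + 1, where N is the number of documents and df is the number of documents containing the term. Other variants exist, and this is not necessarily the one behind the screenshots. The function names and the sample segments are invented for the example.

```python
import math
import re
from collections import Counter


def ngrams(text, n=1):
    """Split text into lowercase words and return all n-word phrases."""
    words = re.findall(r"[a-z']+", text.lower())
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def tfidf(documents, n=1):
    """documents: dict of name -> text. Returns dict of name -> {term: score}."""
    term_counts = {name: Counter(ngrams(text, n)) for name, text in documents.items()}
    doc_freq = Counter()  # in how many documents each term appears
    for counts in term_counts.values():
        doc_freq.update(counts.keys())
    total_docs = len(documents)
    scores = {}
    for name, counts in term_counts.items():
        total_terms = sum(counts.values()) or 1
        scores[name] = {
            term: (count / total_terms)
            * (math.log((1 + total_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        }
    return scores


def top_terms(scores, name, k=3):
    """Highest-scoring terms for one document (segment)."""
    return sorted(scores[name].items(), key=lambda item: item[1], reverse=True)[:k]


# Made-up segments: each segment's reviews are joined into one block of text.
segments = {
    "high": "love the hue lights love the hue lights",
    "mid": "the sound is fine",
    "low": "the speaker stopped working",
}
scores = tfidf(segments, n=1)
print(top_terms(scores, "high"))
```

Here ‘the’ appears in every segment, so it scores lower than ‘love’, ‘hue’ or ‘lights’ even though it is just as frequent in the ‘high’ segment. Calling `tfidf(segments, n=3)` would rank three-word phrases instead of single words.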

The control panel allows for running the function on different segments of data for easy comparison. A user can call the function using different parameters, and this process of tweaking, testing and comparing lets you get acquainted with how the algorithm works. This is extremely important for understanding both the methodology and your dataset, which in turn improves the quality of your analysis and insights.

To improve the quality and robustness of your analysis, you can even custom train a TF-IDF model. For example, a large dataset that covers many types of consumer feedback can serve as a generic reference corpus. This reference corpus then becomes the base against which to compare your target data, helping to identify and pull out the words and phrases that are distinctive to the target.
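One way to picture ‘custom training’ is to learn the inverse document frequencies from the broad reference corpus and then reuse them to score the target text. The sketch below does this with the standard library only. The function names, the smoothed IDF formula and the sample texts are assumptions for illustration, and terms never seen in the reference corpus are given the highest possible IDF.

```python
import math
import re
from collections import Counter


def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())


def fit_idf(reference_texts):
    """Learn IDF weights from a reference corpus (one string per document)."""
    doc_freq = Counter()
    for text in reference_texts:
        doc_freq.update(set(tokenize(text)))  # count each term once per document
    n_docs = len(reference_texts)
    idf = {term: math.log((1 + n_docs) / (1 + df)) + 1 for term, df in doc_freq.items()}
    unseen_idf = math.log(1 + n_docs) + 1  # weight for terms absent from the reference
    return idf, unseen_idf


def score_target(target_text, idf, unseen_idf, k=5):
    """Rank the target's terms by term frequency times the reference IDF."""
    counts = Counter(tokenize(target_text))
    total = sum(counts.values()) or 1
    scored = {term: (count / total) * idf.get(term, unseen_idf) for term, count in counts.items()}
    return sorted(scored.items(), key=lambda item: item[1], reverse=True)[:k]


# Made-up reference corpus of generic consumer feedback.
reference = ["the delivery was late", "the price was fair", "the app was slow"]
idf, unseen_idf = fit_idf(reference)
ranked = score_target("the lights the lights alexa", idf, unseen_idf)
print(ranked)
```

Words that are common everywhere in the reference corpus, such as ‘the’, are pushed down the ranking, while words specific to the target rise to the top.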

Getting themes using K-means — an unsupervised machine learning method

The K-means algorithm identifies themes such as using the Echo for TV and music, using it as an alarm clock / timepiece, using voice control around the apartment, and controlling Philips Hue lights. It’s not perfect: I would throw out the first cluster, for example, as too generic, but that’s where the researcher’s instincts come in.

K-means is an algorithm which groups pieces of data by similarity across a number of dimensions. Effectively, in content analysis this returns themes from a set of text data.
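The sketch below shows the mechanics on a toy scale, using only the standard library. Each review becomes a normalized word-count vector, K-means alternates between assigning vectors to the nearest centroid and recomputing centroids, and the heaviest words in each centroid hint at a theme. To keep the example reproducible it seeds the centroids deterministically (the first review, then the review farthest from the centroids chosen so far) instead of the usual random start. That seeding, the function names and the sample reviews are all assumptions for illustration.

```python
import math
import re
from collections import Counter


def vectorize(texts, stop_words=frozenset()):
    """Turn texts into L2-normalized word-count vectors over a shared vocabulary."""
    tokenized = [[w for w in re.findall(r"[a-z']+", t.lower()) if w not in stop_words]
                 for t in texts]
    vocab = sorted({w for words in tokenized for w in words})
    vectors = []
    for words in tokenized:
        counts = Counter(words)
        vec = [float(counts[w]) for w in vocab]
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0
        vectors.append([v / norm for v in vec])
    return vocab, vectors


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, max_iter=100):
    # Deterministic farthest-point seeding (a simplification of random seeding).
    centroids = [vectors[0][:]]
    while len(centroids) < k:
        farthest = max(vectors, key=lambda v: min(sq_dist(v, c) for c in centroids))
        centroids.append(farthest[:])
    labels = [None] * len(vectors)
    for _ in range(max_iter):
        # Assignment step: each vector joins its nearest centroid.
        new_labels = [min(range(k), key=lambda j: sq_dist(v, centroids[j])) for v in vectors]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def top_terms(vocab, centroid, n=3):
    """Words with the largest weight in a centroid, as a rough theme label."""
    ranked = sorted(zip(vocab, centroid), key=lambda pair: pair[1], reverse=True)
    return [term for term, weight in ranked[:n] if weight > 0]


# Made-up reviews, for illustration only.
reviews = ["play music and radio", "music radio playlist",
           "turn on hue lights", "hue lights dim lights"]
vocab, vectors = vectorize(reviews, stop_words={"and", "on"})
labels, centroids = kmeans(vectors, k=2)
for j, centroid in enumerate(centroids):
    print(j, top_terms(vocab, centroid))
```

On real data you would try several values of `k` and judge the clusters yourself, which is exactly where the researcher’s instincts come in.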

I find this method exciting because it is an unsupervised machine learning model, meaning the algorithm is not trained on previously labelled data, so the themes it returns are not pre-determined (although you do choose how many clusters to look for). This closely approximates and complements the qualitative research process.

Using this method natively in Python allows you to thoroughly explore the text and to probe and triangulate the output against the rest of the data, and the results can be used to create themes and even customer personas. Incorporated into a simple Excel function, however, it still offers a quick and powerful hint at some avenues to explore.

Sentiment analysis

Unsurprisingly, a largely positive sentiment is detected for a batch of reviews that also have a high rating.

Sentiment scoring is perhaps the NLP function that I find most easily falls into the ‘gimmicky’ trap. Within this simple implementation, however, it’s another ‘why not’: you might get another data point to support an insight.

The sentiment analysis function in my demonstration above is algorithmic (rule-based) rather than machine-learning driven. It can be applied with relatively good accuracy across any type of text data. It returns negative, neutral and positive percentages as well as an overall sentiment score ranging from very negative to very positive. It can be especially useful for comparing overall sentiment and tone between two contrasting segments.
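For a feel of what an algorithmic approach involves, here is a deliberately tiny rule-based scorer built on the standard library. It is not the function used in the demonstration: the word list, the weights, the one-word negation rule and the function names are all invented for this sketch, and real rule-based tools rely on far larger lexicons and more rules.

```python
import re

# Toy sentiment lexicon (made up for illustration): word -> weight.
LEXICON = {"love": 2, "great": 2, "good": 1, "easy": 1,
           "bad": -1, "poor": -1, "broken": -2, "hate": -2}
NEGATIONS = {"not", "no", "never"}


def score_text(text):
    """Sum the lexicon weights, flipping a word's sign if it follows a negation."""
    words = re.findall(r"[a-z']+", text.lower())
    total = 0
    for i, word in enumerate(words):
        value = LEXICON.get(word, 0)
        if value and i > 0 and words[i - 1] in NEGATIONS:
            value = -value  # "not good" counts as negative
        total += value
    return total


def label(score):
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def summarize(texts):
    """Return the percentage of texts per label and the mean score."""
    scores = [score_text(text) for text in texts]
    labels = [label(score) for score in scores]
    n = len(texts) or 1
    shares = {name: 100 * labels.count(name) / n
              for name in ("negative", "neutral", "positive")}
    return shares, sum(scores) / n


# Made-up sample reviews.
sample = ["I love it, great sound", "not good", "it arrived on Tuesday", "easy to set up"]
shares, mean_score = summarize(sample)
print(shares, mean_score)
```

Running `summarize` separately on two segments, such as the highest and lowest rated reviews, gives a quick side-by-side comparison. Note that the mean score here is unbounded, whereas a production tool would typically normalize it to a fixed range.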

Of course, there are trained ML sentiment analysis models — meaning the models ‘learn’ and are tested and tweaked for accuracy using sets of coded data. If done right, these models can be more accurate and sensitive, and would be especially valuable if a lot of the data you are dealing with is about a certain topic or of a certain style. Text processing such as breaking down parts of speech and identifying common phrases can also help boost a sentiment analysis algorithm’s performance.

All of these features can be built into a custom function and then called the same way via a simple control panel in Excel.

Just to wrap up…

I have demonstrated a few functions that I like to use when doing qualitative content analysis. But the takeaway here is that the possibilities are endless! If there’s any data point you feel might be useful, or an ‘if only’ that has been in the back of your mind, this could be a relatively simple and inexpensive way to dip a toe into the world of Python-driven analysis.

Some other easy and potentially very valuable applications I can think of are using Python to sort and highlight emails your organization receives, or to sort through reviews of your business or products. Of course, some cool dataviz via the matplotlib and Seaborn libraries is also a possibility, but that’s an article for another day.

Stef is a freelance qualitative researcher and strategist and Python developer. Get in touch at