Source: Deep Learning on Medium
For our specific sentiment analysis use case, we fine-tune a BERT model to perform a three-class classification: positive, negative, and neutral. What this means is that we need to take the pre-trained BERT model from Google and teach it how to analyze the sentiment of sentences.
In short, the model needs to do the following steps:
- take in a sentence (in our case, an English sentence) and produce contextual word embeddings for it (specifically, for the [CLS] token)
- pass the [CLS] contextual embedding vector through a linear layer to produce a 3-dimensional output vector (one dimension per class)
- perform a softmax on the 3-dimensional output vector to determine whether the sentiment is positive, negative, or neutral.
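The last two steps above can be sketched in a few lines of PyTorch. This is a minimal illustration, not the actual model: the [CLS] vector here is random stand-in data (in the real model it comes from BERT's final layer), the 768 hidden size is BERT-base's usual dimension, and the label ordering is an assumption.

```python
import torch
import torch.nn as nn

hidden_size, num_classes = 768, 3
cls_embedding = torch.randn(1, hidden_size)  # stand-in for BERT's [CLS] output

# Classification head: project the 768-dim [CLS] vector down to 3 class scores
classifier = nn.Linear(hidden_size, num_classes)
logits = classifier(cls_embedding)           # shape: (1, 3)

# Softmax turns the raw scores into probabilities over the 3 sentiments
probs = torch.softmax(logits, dim=-1)
labels = ["negative", "neutral", "positive"]  # assumed ordering
prediction = labels[probs.argmax(dim=-1).item()]
```

Because the head here is randomly initialized, the prediction is meaningless; fine-tuning is what teaches the head (and BERT itself) to score sentiments sensibly.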
In other words, BERT takes a sentence and converts it into a mathematical representation that machines can understand and work with.
Here’s one way to think about it:
BERT is an English-as-a-second-language student trying to understand English. He could read a sentence in English directly, but it’s much easier if he first translates it into a language he’s fluent in, let’s say, BERT-lish.
So BERT does just that — he translates the input sentence to BERT-lish. And finally, BERT determines if the BERT-lish sentence is positive, negative, or neutral.
This is of course a gross simplification of the BERT model, but hey, I don’t want to get more complaints about how long the article would have been if I tried to fully explain it. Here are some good resources if you want to learn more about BERT.
Fine-tuning BERT for sentiment analysis
I fine-tuned BERT (specifically, BertForSequenceClassification) using the bert-base-uncased model from the huggingface transformers library.
The dataset contains 14,640 tweets that were manually classified as negative, positive, or neutral. I created a training set with 80% of the dataset and kept the other 20% for validation.
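The 80/20 split can be done with scikit-learn's `train_test_split`. This is a sketch with a toy list of (text, label) pairs standing in for the 14,640-tweet dataset; the example tweets are invented.

```python
from sklearn.model_selection import train_test_split

# Toy stand-in for the airline tweets dataset
tweets = [
    ("the flight was great", "positive"),
    ("lost my luggage again", "negative"),
    ("flight departs at 9am", "neutral"),
    ("best crew ever", "positive"),
    ("delayed three hours", "negative"),
]
texts = [t for t, _ in tweets]
labels = [l for _, l in tweets]

# 80% for training, 20% held out for validation
train_texts, val_texts, train_labels, val_labels = train_test_split(
    texts, labels, test_size=0.2, random_state=42
)
```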
I ran the training on a Google Colab notebook with a GPU and obtained the following validation results. The entire training process took approximately 20 minutes to complete.
The results show that the model achieves 83% accuracy; that is, given an input sentence, it predicts the correct sentiment 83% of the time.
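For clarity, accuracy here is just the fraction of validation examples where the predicted class matches the manual label. A tiny sketch with made-up predictions:

```python
# Toy predicted vs. actual labels (invented for illustration)
predicted = ["positive", "negative", "neutral", "positive", "negative"]
actual    = ["positive", "negative", "positive", "positive", "neutral"]

# Accuracy = share of examples where prediction matches the label
correct = sum(p == a for p, a in zip(predicted, actual))
accuracy = correct / len(actual)  # 3 correct out of 5 -> 0.6
```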
It also appears that the model is better at classifying positive comments than negative or neutral ones.
Great, now that we have a decent model that can predict sentiments, let’s test it out on a bunch of comments made by Singaporeans!
A random Sesame Street aside. I’ve been obsessively binging the incredible podcast The Weirdest Thing I Learned This Week and discovered that according to folklore, vampires are obsessive counters.
That is, they feel compelled to count every single object around them — which is why people threw poppy seeds on the burial grounds of suspected vampires so that they would be so preoccupied with counting them that they wouldn’t attack humans. So perhaps Sesame Street was actually on point with the characterization of Count Dracula. Anyway, on to the results.
Results: Sentiment analysis of comments on The Straits Times articles
The Straits Times is my favorite national newspaper. And judging from how fervently Singaporeans from all walks of life comment on the news articles, I’m sure it is also the beloved source of #realnews for many Singaporeans.
And so, to answer the two questions I posed earlier about (i) exactly how negative are Singaporeans, and (ii) do we dislike all topics equally, I collected a bunch of Facebook comments from 5 randomly selected articles on The Straits Times to run the analysis on.
The 5 articles I chose to study were:
- Husband runs off when wife is pregnant and defaults on paying maintenance
- China launches gigantic telescope in hunt for life beyond earth
- Tsai Ing-wen re-elected Taiwan President; KMT’s Han Kuo-yu concedes defeat
- Forum: Promote plant-based diet to cut Singaporeans’ carbon footprint
- Menacing mynas, pigeons and crows: Complaints about nuisance birds rising
There were 257 comments in total across the 5 articles.
How good is BERT at analyzing the sentiments of Straits Times comments?
In the previous step, we fine-tuned the BERT model on the Twitter airline sentiments dataset, and verified that it works well. Typically, a good computer scientist would verify that this model also works well on a test dataset of interest — in our case the Straits Times comments.
What this means is that we should collect a bunch of comments from The Straits Times’ posts and manually label them as positive, negative, or neutral. However, I’m doing this in my spare time on a Sunday, so ain’t nobody got time for that.
Instead, I’m going to take the short-cut of using the model to predict the sentiments in The Straits Times’ comments and then looking through the predictions to see if they make sense.
Let’s take a look at the comments on any article — for example this one: Forum: Promote plant-based diet to cut Singaporeans’ carbon footprint. The article is an opinion letter that asks the government to intervene or “nudge” Singaporeans towards plant-based diets.
Correctly classified comments
The model does a pretty good job identifying the correct classes of many comments, despite some comments having spelling and grammar issues.