Review of Automatic Sarcasm Detection: Survey



Sentiment analysis grows more important each day as companies use it to improve their services by tracking customers' satisfaction, behavior, and needs over time, which gives them insight into their own customers and products. Yet detecting sentiment polarity accurately still faces several challenges, ranging from negation and sarcasm to spam and fake content.

Automatic sarcasm detection is an important step toward correctly predicting the sentiment of a text. “Automatic Sarcasm Detection: A Survey” by Joshi, Bhattacharyya, and Carman, published in ACM Computing Surveys in 2017, compiles past work on the topic, from the datasets and approaches used to open issues and trends.

According to the survey, the most common formulation casts the problem as a classification task: labeling a text as sarcastic or not. The most common approaches are:

  1. Rule-based Approaches: these detect sarcasm using specific rules. For example, a hashtag-sentiment rule checks whether the sentiment of the hashtags contradicts the sentiment of the rest of the tweet; a parse-based lexicon generation algorithm looks for a negative phrase occurring in a positive sentence; and another classifier looks for a positive verb occurring in a negative situation.
  2. Statistical Approaches: SVM, Logistic Regression, Naive Bayes, Decision Trees, and Fuzzy Clustering have been used to classify texts as sarcastic or not, but their performance depends heavily on the features extracted. Most use bag-of-words (BoW) features; others add features based on semantic similarity, emoticons, and counterfactuality, as well as ellipsis, hyperbole, and imbalance.
  3. Deep Learning-based Approaches: neural networks are used either as feature extractors, for example through the similarity between word embeddings, or as classifiers that combine several network architectures.
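To make the first two families more concrete, here is a minimal Python sketch, not taken from the survey. It assumes two tiny hand-made lexicons (`POSITIVE_VERBS` and `NEGATIVE_SITUATIONS`, both hypothetical) for a "positive verb in a negative situation" rule, and a from-scratch multinomial Naive Bayes over bag-of-words counts as a stand-in for the statistical classifiers listed above. The names `rule_based_is_sarcastic` and `BagOfWordsNaiveBayes` are illustrative only.

```python
import math
import re
from collections import Counter

# Hypothetical toy lexicons, for illustration only (not from the survey).
POSITIVE_VERBS = {"love", "enjoy", "adore"}
NEGATIVE_SITUATIONS = {"waiting", "ignored", "stuck", "mondays"}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def rule_based_is_sarcastic(text):
    """Rule: flag a positive verb that is followed by a negative-situation word."""
    tokens = tokenize(text)
    for i, token in enumerate(tokens):
        if token in POSITIVE_VERBS:
            if any(t in NEGATIVE_SITUATIONS for t in tokens[i + 1:]):
                return True
    return False


class BagOfWordsNaiveBayes:
    """Multinomial Naive Bayes over bag-of-words counts with add-one smoothing."""

    def fit(self, texts, labels):
        # Count documents per class and word occurrences per class.
        self.class_counts = Counter(labels)
        self.word_counts = {label: Counter() for label in self.class_counts}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = set()
        for counts in self.word_counts.values():
            self.vocab.update(counts)
        return self

    def predict(self, text):
        # Words never seen in training are ignored.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            total_words = sum(self.word_counts[label].values())
            score = math.log(doc_count / total_docs)  # log prior
            for token in tokens:
                count = self.word_counts[label][token]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label
```

With these toy lexicons, `rule_based_is_sarcastic("I love being ignored")` fires the rule while `rule_based_is_sarcastic("I love sunny days")` does not. The Naive Bayes class is trained with `fit(texts, labels)` on labeled examples and queried with `predict(text)`; as the list above notes, how well such a model works depends heavily on the features it is given.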

According to the survey, the main problems in automatic sarcasm detection range from the accuracy and quality of data annotations and skewed datasets to the complex relationship between sarcasm and sentiment as a feature.
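A small Python sketch can illustrate why skewed datasets are a problem. It assumes a hypothetical label list in which sarcastic texts (label `1`) are rare, and shows that a classifier that always predicts the majority class scores high accuracy while finding no sarcasm at all. The function name `majority_baseline_report` and the 90/10 split are assumptions for illustration, not figures from the survey.

```python
from collections import Counter


def majority_baseline_report(labels, positive_label=1):
    """Score a baseline that always predicts the most frequent label."""
    counts = Counter(labels)
    majority_label, majority_count = counts.most_common(1)[0]
    # Accuracy equals the share of examples carrying the majority label.
    accuracy = majority_count / len(labels)
    # Recall on the positive class is all-or-nothing for a constant predictor.
    positive_recall = 1.0 if majority_label == positive_label else 0.0
    return {
        "majority_label": majority_label,
        "accuracy": accuracy,
        "positive_recall": positive_recall,
    }


# Hypothetical skewed dataset: 90 non-sarcastic texts, 10 sarcastic ones.
report = majority_baseline_report([0] * 90 + [1] * 10)
print(report)
```

On such a split the baseline's accuracy is high even though it never detects a sarcastic text, which is why accuracy alone can mislead on skewed data.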

For further reading, please refer to Joshi, A., Bhattacharyya, P., and Carman, M. J., 2017, “Automatic Sarcasm Detection: A Survey”, ACM Computing Surveys (CSUR), Volume 50, Issue 5, Article 73.

Source: Deep Learning on Medium