The Accuracy of Sentiment Analysis: Four Pitfalls

The widespread use of platforms like forums, social networks, and blogs has led to an explosion of user-generated data. Consumers, seeking guidance on purchases and entertainment choices, often turn to reviews to inform their decisions.

However, manually processing this vast sea of data to understand user sentiment is incredibly time-consuming. This is why an increasing number of companies and organizations are turning to automated sentiment analysis methods to help them understand it.

What Is Sentiment Analysis?

Sentiment analysis involves examining people’s opinions and emotions, usually by analyzing language patterns. While it may appear to be a simple text classification task, a deeper dive reveals several significant challenges that can impact the accuracy of sentiment analysis. Let’s explore some of these common pitfalls:

  1. Irony and sarcasm
  2. Different types of negations
  3. Word ambiguity
  4. Multipolarity

We’ll delve into each of these challenges to understand how they affect the performance of sentiment classifiers and discuss potential solutions.

Sentiment Analysis Challenge No. 1: Sarcasm Detection

Sarcasm poses a unique challenge for sentiment analysis because it involves expressing negative sentiments using positive language. Sentiment analysis models that aren’t equipped to detect sarcasm can be easily misled.

Sarcasm is particularly prevalent in user-generated content like social media posts and comments. Detecting sarcasm accurately relies heavily on understanding the context, topic, and environment in which the text appears, making it tricky for both humans and machines.

The ever-changing nature of language used in sarcastic expressions makes it difficult to train effective sentiment analysis models. Shared knowledge of common topics, interests, and historical context between speaker and audience is crucial for grasping the intended sarcasm.

Looking at sarcasm from a linguistic perspective, where it has been studied extensively, Elisabeth Camp, in one of the most-cited pieces of research in the field, outlines four primary types of sarcasm:

  • Propositional: The sarcasm appears to be a neutral statement but carries an implicit sentiment (“This looks like a perfect plan!”).
  • Embedded: Sentiment incongruity is woven into the words and phrases themselves (“I love being ignored.”).
  • Like-prefixed: The use of “like” implies a denial of the statement being made (“Like those guys believe a word they say.”).
  • Illocutionary: Nonverbal cues like body language and gestures contribute to the sarcastic meaning (“(shrugs shoulders) Very helpful indeed!”).

Building on Camp’s 2012 research, Stanford University researchers in 2017 presented their work “Having 2 hours to write a paper is fun!”: Detecting Sarcasm in Numerical Portions of Text on a newly identified type of sarcasm called numerical sarcasm. This type of sarcasm is widespread on social networks and involves altering numerical values to convey sarcasm. Here are some examples:

  1. “This phone has an awesome battery back-up of 38 hours.” (Non-sarcastic)

  2. “This phone has an awesome battery back-up of 2 hours.” (Sarcastic)

  3. “It’s +25 outside and I am so hot.” (Non-sarcastic)

  4. “It’s -25 outside and I am so hot.” (Sarcastic)

  5. “We drove so slowly—only 20 km/h.” (Non-sarcastic)

  6. “We drove so slowly—only 160 km/h.” (Sarcastic)

As illustrated, the difference in meaning hinges solely on the numerical value—hence the term numerical sarcasm.
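A simple way to see why numerical sarcasm is even detectable by machines is that the number itself falls outside a plausible range for its unit. The sketch below is a toy rule-based check, not the method from the paper; the "expected" ranges and the unit list are illustrative assumptions.

```python
import re

# Illustrative, hand-picked ranges of plausible values per unit.
# These are assumptions for the sketch, not values from the paper.
EXPECTED_RANGES = {
    "hours": (10, 72),   # plausible battery back-up, in hours
    "km/h": (0, 130),    # plausible driving speeds someone would call "slow"
}

def looks_numerically_sarcastic(text: str) -> bool:
    """Flag text whose number falls outside the expected range for its unit."""
    for unit, (low, high) in EXPECTED_RANGES.items():
        match = re.search(r"(-?\d+(?:\.\d+)?)\s*" + re.escape(unit), text)
        if match:
            value = float(match.group(1))
            if not (low <= value <= high):
                return True
    return False

print(looks_numerically_sarcastic("awesome battery back-up of 38 hours"))  # False
print(looks_numerically_sarcastic("awesome battery back-up of 2 hours"))   # True
```

A real system would of course learn such ranges from data rather than hard-code them, which is exactly what the deep learning approaches discussed below do implicitly.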

Various approaches are used for automatic sarcasm detection, such as:

  1. Rule-based methods
  2. Statistical methods
  3. Machine learning algorithms
  4. Deep learning techniques

Deep learning-based methods are gaining traction due to their effectiveness. A 2017 study by Kumar, Somani, and Bhattacharyya demonstrated that a specific deep learning model, the CNN-LSTM-FF architecture, surpassed previous approaches, achieving the highest accuracy in numerical sarcasm detection.

The prowess of deep neural networks (DNNs) extends beyond numerical sarcasm. Ghosh and Veale in their 2016 paper showcased the effectiveness of a combined convolutional neural network (CNN), long short-term memory (LSTM) network, and DNN for general sarcasm detection. Their research showed that this deep learning architecture outperformed traditional approaches like recursive support vector machines (SVMs).

Sentiment Analysis Challenge No. 2: Negation Detection

Negation in linguistics refers to the reversal of polarity for words, phrases, or even entire sentences. Linguistic rules help identify instances of negation, but it’s crucial to determine the scope of words affected by negation cues.

The scope of negation can be variable. For instance, in “The show was not interesting,” the scope only extends to the word “interesting.” However, in “I do not call this film a comedy movie,” the negation spans the entire latter part of the sentence. Words within the scope of negation take on the opposite polarity, altering their original meaning.

A common approach to handle negation in sentiment analysis, employed by many advanced techniques, is to label all words between a negation cue and the following punctuation mark as negated. However, the effectiveness of this method can be influenced by language nuances in different contexts.
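The heuristic described above is easy to sketch: every token between a negation cue and the next punctuation mark gets a NOT_ prefix, so a downstream classifier can treat “interesting” and “NOT_interesting” as distinct features. This is a minimal illustration of the idea (NLTK ships a similar utility, mark_negation), with a deliberately small cue list.

```python
import re

NEGATION_CUES = {"not", "no", "never"}

def mark_negation_scope(sentence: str) -> list[str]:
    """Prefix every token between a negation cue and the next
    punctuation mark with NOT_, per the common scope heuristic."""
    tokens = re.findall(r"[\w']+|[.,!?;]", sentence.lower())
    marked, in_scope = [], False
    for token in tokens:
        if token in NEGATION_CUES or token.endswith("n't"):
            in_scope = True
            marked.append(token)
        elif token in ".,!?;":
            in_scope = False          # punctuation closes the negation scope
            marked.append(token)
        else:
            marked.append("NOT_" + token if in_scope else token)
    return marked

print(mark_negation_scope("The show was not interesting."))
# ['the', 'show', 'was', 'not', 'NOT_interesting', '.']
```

Note how the same heuristic handles the wider scope in “I do not call this film a comedy movie” automatically, since every word up to the sentence-final period gets marked.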

Sentences can express a negative opinion in several forms:

  • Morphological negation: Denoted by prefixes like “dis-” and “non-” or suffixes like “-less.”
  • Implicit negation: Conveying negativity without explicit negative words, like in “With this act, it will be his first and last movie.”
  • Explicit negation: Directly using negative words like “not,” as in “This is not good.”
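Two of these three forms can be spotted with surface patterns alone; implicit negation requires world knowledge and is precisely what makes datasets with diverse negation examples valuable. A rough sketch, using only the prefixes and suffix named above (the cue lists are illustrative, not exhaustive):

```python
import re

# Morphological negation: "dis-"/"non-" prefixes or the "-less" suffix.
MORPH_PATTERN = re.compile(r"\b(?:dis|non)\w+|\w+less\b")
# Explicit negation: direct negative function words.
EXPLICIT_CUES = {"not", "no", "never", "neither", "nor"}

def negation_types(sentence: str) -> set[str]:
    """Label which surface forms of negation a sentence contains.
    Implicit negation needs world knowledge, so it is omitted here."""
    tokens = [t.strip(".,!?") for t in sentence.lower().split()]
    found = set()
    if any(t in EXPLICIT_CUES or t.endswith("n't") for t in tokens):
        found.add("explicit")
    if MORPH_PATTERN.search(sentence.lower()):
        found.add("morphological")
    return found

print(negation_types("This is not good."))        # {'explicit'}
print(negation_types("The plot was pointless."))  # {'morphological'}
```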

Datasets used for training and testing sentiment classification models benefit from incorporating diverse examples of these negations to enhance the model’s ability to handle them. Recent research on recurrent neural networks (RNNs) suggests that different LSTM model architectures excel at detecting various types of negations in sentences.

The paper Effect of Negation in Sentiment Analysis, which analyzed 500 reviews from Amazon and Trustedreviews.com, found that incorporating negation detection significantly improved the accuracy of sentiment analysis models.

Sentiment Analysis Challenge No. 3: Word Ambiguity

Word ambiguity presents another hurdle in sentiment analysis. The challenge lies in the fact that the sentiment polarity of some words is heavily dependent on context, making it impossible to predefine their polarity.

Lexicon-based approaches are popular in sentiment analysis, utilizing opinion lexicons that contain sentiment values for various words. Publicly available lexicons like SentiWordNet, General Inquirer, and SenticNet exist. However, due to the context-dependent nature of word polarity, developing a universal lexicon that covers every word’s polarity in all situations is impossible. Consider these examples:

  1. “The story is unpredictable.”
  2. “The steering wheel is unpredictable.”

These examples highlight how context influences sentiment. In the first sentence, “unpredictable” carries a positive connotation, while in the second sentence, it implies a negative sentiment.
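The limitation is easy to demonstrate with a bare-bones lexicon scorer: because the lexicon assigns one fixed prior polarity per word, both sentences receive the identical score. The tiny lexicon below is a made-up assumption for illustration, not an excerpt from SentiWordNet or any real resource.

```python
# Illustrative mini-lexicon of prior word polarities (assumed values).
LEXICON = {"unpredictable": -0.5, "awesome": 0.8, "good": 0.6, "bad": -0.6}

def lexicon_score(sentence: str) -> float:
    """Sum the prior polarities of all known words in the sentence."""
    return sum(LEXICON.get(w.strip(".,!?"), 0.0)
               for w in sentence.lower().split())

# Identical scores, even though "unpredictable" is praise for a story
# and a serious flaw for a steering wheel.
print(lexicon_score("The story is unpredictable."))           # -0.5
print(lexicon_score("The steering wheel is unpredictable."))  # -0.5
```

Context-aware models (or domain-specific lexicons) are needed to tell the two readings apart, which is exactly the gap this pitfall describes.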

Sentiment Analysis Challenge No. 4: Multipolarity

Sentiment analysis can be further complicated when a single piece of text expresses multiple sentiments or opinions. Relying solely on an overall sentiment score in such cases can be misleading, similar to how an average can obscure the individual values it represents.

For instance, an article or review might discuss multiple entities like people, products, or companies, offering both praise and criticism within the same text.

An overall sentiment score wouldn’t capture these nuances. Therefore, it’s essential to identify and extract individual entities or aspects within the text, assign sentiment labels to each, and only calculate the overall polarity if necessary.

Take this example: “The audio quality of my new laptop is so cool but the display colors are not too good.”

Some sentiment analysis models might assign a negative or neutral polarity to this sentence. However, a more nuanced approach would involve recognizing “audio” as a positive aspect and “display” as a negative aspect.
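The aspect-level idea can be sketched in a few lines: split the sentence at contrastive conjunctions, then score each clause independently against the aspect it mentions. The aspect and sentiment vocabularies below are illustrative assumptions; real aspect-based systems extract aspects and their polarities with learned models rather than word lists.

```python
import re

# Illustrative vocabularies (assumptions for the sketch).
ASPECTS = {"audio", "display", "battery"}
SENTIMENT = {"cool": 1, "good": 1, "awesome": 1, "bad": -1, "poor": -1}

def aspect_sentiments(sentence: str) -> dict[str, int]:
    """Return one polarity per aspect, flipping it when the clause is negated."""
    results = {}
    for clause in re.split(r"\bbut\b|;", sentence.lower()):
        words = [w.strip(".,!?") for w in clause.split()]
        aspect = next((w for w in words if w in ASPECTS), None)
        score = sum(SENTIMENT.get(w, 0) for w in words)
        if "not" in words:
            score = -score            # crude clause-level negation handling
        if aspect is not None:
            results[aspect] = 1 if score > 0 else -1 if score < 0 else 0
    return results

print(aspect_sentiments(
    "The audio quality of my new laptop is so cool "
    "but the display colors are not too good."))
# {'audio': 1, 'display': -1}
```

Aggregating these per-aspect labels into one number would throw away precisely the information a buyer comparing laptops cares about.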

For a more comprehensive understanding of this approach, I recommend the insightful paper Deep Learning for Aspect-based Sentiment Analysis.

Improving Sentiment Analysis Accuracy: These Aren’t Edge Cases

This article delved into common sentiment analysis challenges: sarcasm, negations, word ambiguity, and multipolarity. Being aware of these pitfalls is crucial for developing robust sentiment analysis models. Addressing these challenges significantly enhances the accuracy of sentiment classification models. I hope this article has served as a valuable introduction to the complexities of sentiment analysis.

Licensed under CC BY-NC-SA 4.0