State-of-the-art technologies in NLP allow us to analyze natural languages at different levels: from simple segmentation of textual information to more sophisticated methods of sentiment categorization.
However, you don't need to be an advanced programmer to implement high-level tasks such as sentiment analysis in Python.
Sentiment analysis algorithms mostly focus on identifying opinions, attitudes, and even emoticons in a corpus of texts. The range of recognized sentiments varies significantly from one method to another. While a standard analyzer distinguishes up to three basic polarities (positive, negative, neutral), more advanced models cover a broader range.
Consequently, they can look beyond polarity and determine six "universal" emotions: anger, disgust, fear, happiness, sadness, and surprise.
[Figure omitted. Source: Spectrum Mental Health]
Moreover, depending on the task you're working on, it's also possible to collect extra information from the context, such as the author or the topic, which in further analysis can help with a more complex problem than common polarity classification: subjectivity/objectivity identification.
For example, this sentence from Business Insider: "In March, Elon Musk described concern over the coronavirus outbreak as a 'panic' and 'dumb,' and he's since tweeted incorrect information, such as his theory that children are 'essentially immune' to the virus." expresses subjectivity through the personal opinion of Elon Musk, as well as that of the author of the text.
Sentiment Analysis in Python with TextBlob
The approach that the TextBlob package applies to sentiment analysis is rule-based and therefore requires a pre-defined set of categorized words. These words can, for example, be loaded from NLTK. Sentiments are then defined based on semantic relations and the frequency of each word in the input sentence, which allows for a more precise output.
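To make the rule-based idea concrete, here is a minimal sketch of a lexicon lookup that averages the scores of the known words in a text. It is not TextBlob's actual implementation: the TOY_LEXICON entries and the lexicon_polarity helper are hypothetical and exist only for illustration.

```python
import re

# Hypothetical mini-lexicon: word -> polarity in [-1.0, 1.0].
# These values are made up for illustration, not taken from TextBlob.
TOY_LEXICON = {
    "best": 1.0,
    "great": 0.8,
    "top": 0.5,
    "bad": -0.7,
    "worst": -1.0,
}


def lexicon_polarity(text, lexicon=TOY_LEXICON):
    """Average the polarity of every lexicon word found in the text.

    Returns 0.0 (neutral) when no word of the text is in the lexicon.
    """
    words = re.findall(r"[a-z']+", text.lower())
    scores = [lexicon[w] for w in words if w in lexicon]
    if not scores:
        return 0.0
    return sum(scores) / len(scores)


print(lexicon_polarity("The best courses from top universities"))  # 0.75
print(lexicon_polarity("The meeting starts at noon"))              # 0.0
```

A real rule-based analyzer also has to deal with negation, intensifiers, and similar context, which this sketch deliberately leaves out.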
Once the first step is accomplished and the Python model has been fed the necessary input data, a user can obtain sentiment scores in the form of polarity and subjectivity, as discussed in the previous section. This process is illustrated in a paper by Forum Kapadia.
TextBlob's output for a polarity task is a float within the range [-1.0, 1.0], where -1.0 is a negative polarity and 1.0 is positive. This score can also be equal to 0, which stands for a neutral evaluation of a statement, for example when it doesn't contain any words from the pre-defined word set.
A subjectivity/objectivity identification task, by contrast, reports a float within the range [0.0, 1.0], where 0.0 is a very objective sentence and 1.0 is very subjective.
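As a rough illustration of how these two ranges can be read, the sketch below turns a pair of scores into human-readable labels. The describe_scores helper, the neutral_band width, and the 0.5 subjectivity cut-off are assumptions made for this example; TextBlob itself only returns the raw floats.

```python
def describe_scores(polarity, subjectivity, neutral_band=0.05):
    """Turn raw scores into readable labels.

    neutral_band and the 0.5 subjectivity cut-off are assumed thresholds,
    not values defined by TextBlob.
    """
    if not -1.0 <= polarity <= 1.0:
        raise ValueError("polarity must be within [-1.0, 1.0]")
    if not 0.0 <= subjectivity <= 1.0:
        raise ValueError("subjectivity must be within [0.0, 1.0]")

    if polarity > neutral_band:
        tone = "positive"
    elif polarity < -neutral_band:
        tone = "negative"
    else:
        tone = "neutral"

    stance = "subjective" if subjectivity >= 0.5 else "objective"
    return tone, stance


print(describe_scores(0.5, 0.27))   # ('positive', 'objective')
print(describe_scores(0.0, 0.0))    # ('neutral', 'objective')
print(describe_scores(-0.8, 0.9))   # ('negative', 'subjective')
```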
There are various examples of using the TextBlob sentiment analyzer from Python, ranging from models based on different Kaggle datasets (e.g. movie reviews) to calculating tweet sentiments through the Twitter API.
But let's look at a simple analyzer that we could apply to a particular sentence or a short text. We start by importing the TextBlob library:
    # Importing TextBlob
    from textblob import TextBlob
Once imported, we'll load in a sentence for analysis and instantiate a TextBlob object, as well as assigning the sentiment property to our own analysis variable:
    # Preparing an input sentence
    sentence = '''The platform provides universal access to the world's best education, partnering with top universities and organizations to offer courses online.'''

    # Creating a textblob object and assigning the sentiment property
    analysis = TextBlob(sentence).sentiment
    print(analysis)
The sentiment property is a namedtuple of the form Sentiment(polarity, subjectivity). Running the code prints this tuple with the polarity and subjectivity scores of the input sentence.
Moreover, it’s also possible to go for polarity or subjectivity results separately by simply running the following:
    from textblob import TextBlob

    # Preparing an input sentence
    sentence = '''The platform provides universal access to the world's best education, partnering with top universities and organizations to offer courses online.'''

    analysisPol = TextBlob(sentence).polarity
    analysisSub = TextBlob(sentence).subjectivity

    print(analysisPol)
    print(analysisSub)
This prints the polarity score first and the subjectivity score second.
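Because the result is a namedtuple, its values can be read by field name, by position, or by unpacking. The sketch below uses a stand-in Sentiment type built with the standard library so that it runs without TextBlob installed; the scores in it are placeholders, not the output of a real analysis.

```python
from collections import namedtuple

# Stand-in with the same field names as TextBlob's result;
# it is defined here, not imported from TextBlob.
Sentiment = namedtuple("Sentiment", ["polarity", "subjectivity"])

# Placeholder scores chosen for illustration only.
result = Sentiment(polarity=0.25, subjectivity=0.5)

# Access by field name ...
print(result.polarity)         # 0.25
print(result.subjectivity)     # 0.5

# ... by position ...
print(result[0])               # 0.25

# ... or by unpacking both values at once.
polarity, subjectivity = result
print(polarity, subjectivity)  # 0.25 0.5
```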
One of the great things about TextBlob is that it allows the user to choose an algorithm for implementation of the high-level NLP tasks:
- PatternAnalyzer: a default classifier that is built on the pattern library
- NaiveBayesAnalyzer: an NLTK model trained on a movie reviews corpus
To change the default settings, we'll simply specify a NaiveBayesAnalyzer in the code. Let's run sentiment analysis on tweets directly from Twitter:
    from textblob import TextBlob

    # For parsing tweets
    import tweepy

    # Importing the NLTK-based NaiveBayesAnalyzer classifier from TextBlob
    from textblob.sentiments import NaiveBayesAnalyzer
After that, we need to establish a connection with the Twitter API via API keys (that you can get through a developer account):
    # Loading API keys and tokens
    api_key = 'XXXXXXXXXXXXXXX'
    api_secret = 'XXXXXXXXXXXXXXX'
    access_token = 'XXXXXXXXXXXXXXX'
    access_secret = 'XXXXXXXXXXXXXXX'

    # Establishing the connection
    twitter = tweepy.OAuthHandler(api_key, api_secret)
    # Attaching the access token, which was defined above but never used
    twitter.set_access_token(access_token, access_secret)
    api = tweepy.API(twitter)
Now, we can perform the analysis of tweets on any topic. The search query (e.g. lockdown) can consist of one word or several. Since the number of matching tweets can be enormous, this task can be time-consuming, so it's recommended to limit the output:
    # This call returns 5 tweets on the "lockdown" topic
    # (in Tweepy 4.0 and later, api.search was renamed to api.search_tweets)
    corpus_tweets = api.search("lockdown", count=5)
    for tweet in corpus_tweets:
        print(tweet.text)
This last piece of code returns five tweets that mention your search term, in the following form:
[email protected]: How Asia's densest slum contained the virus and the economic catastrophe that stares at the hardworking slum population...
The last step in this example is switching the default model to the NLTK analyzer, which returns its results as a namedtuple of the form Sentiment(classification, p_pos, p_neg):
    # Applying the NaiveBayesAnalyzer
    blob_object = TextBlob(tweet.text, analyzer=NaiveBayesAnalyzer())

    # Running sentiment analysis
    analysis = blob_object.sentiment
    print(analysis)
Finally, our Python model will get us the following sentiment evaluation:
Sentiment(classification='pos', p_pos=0.5057908299783777, p_neg=0.49420917002162196)
Here, the tweet is classified as a positive sentiment, though only narrowly: the p_pos and p_neg values are ~0.51 and ~0.49 respectively.
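To give a feel for where a (classification, p_pos, p_neg) triple can come from, here is a toy multinomial Naive Bayes classifier written with the standard library only. It is a sketch under simplifying assumptions, not the NLTK or TextBlob implementation: the TRAIN examples are made up, and the train and classify helpers are hypothetical names.

```python
import math
from collections import Counter

# Tiny made-up training set; a real analyzer is trained on a much larger corpus.
TRAIN = [
    ("great fun and a brilliant cast", "pos"),
    ("a great and moving story", "pos"),
    ("boring plot and terrible acting", "neg"),
    ("a terrible and dull film", "neg"),
]


def train(examples):
    """Count word occurrences per class and documents per class."""
    word_counts = {"pos": Counter(), "neg": Counter()}
    doc_counts = Counter()
    for text, label in examples:
        doc_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocab = set(word_counts["pos"]) | set(word_counts["neg"])
    return word_counts, doc_counts, vocab


def classify(text, model):
    """Return (classification, p_pos, p_neg) using add-one smoothing."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    log_scores = {}
    for label in ("pos", "neg"):
        # Start from the log prior of the class.
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocab:
                continue  # ignore words never seen in training
            score += math.log(
                (word_counts[label][word] + 1) / (total_words + len(vocab))
            )
        log_scores[label] = score
    # Normalize the two log scores into probabilities that sum to 1.
    top = max(log_scores.values())
    exp_scores = {k: math.exp(v - top) for k, v in log_scores.items()}
    norm = sum(exp_scores.values())
    p_pos = exp_scores["pos"] / norm
    p_neg = exp_scores["neg"] / norm
    return ("pos" if p_pos >= p_neg else "neg"), p_pos, p_neg


model = train(TRAIN)
label, p_pos, p_neg = classify("a great story", model)
print(label, round(p_pos, 2), round(p_neg, 2))
```

When a text contains no words the model has seen, both probabilities fall back to the class priors, which is one way a near 50/50 split like the one above can arise.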
In this article, we've covered what Sentiment Analysis is, after which we've used the TextBlob library to perform Sentiment Analysis on imported sentences as well as tweets.