This article demonstrates how to conduct Sentiment Analysis on live Twitter data using Python and TextBlob.

Previously, I wrote an article on a similar subject: Sentiment Analysis on Tweets using TextBlob and NLTK's Twitter Corpus.

We fetch the tweets with GetOldTweets-python, which allows you to:

  • Retrieve tweets from any user
  • Search for tweets containing specific text
  • Find tweets from specific date ranges
  • Locate tweets based on geographic location
  • Filter tweets by language
  • Search tweets by hashtags
  • Filter tweets by the number of retweets
  • And much more...

Additionally, GetOldTweets-python offers the capability to export tweets to a CSV file, enabling you to save tweets first and then process them later.
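The save-first, process-later workflow can be sketched with Python's built-in csv module. This is a minimal illustration for Python 3, not GetOldTweets-python's own exporter: the save_tweets and load_tweets helpers, the column names, and the sample rows are all assumptions made for the example.

```python
import csv
import os
import tempfile

# Hypothetical column names chosen for this example
FIELDS = ['username', 'text', 'retweets']

def save_tweets(rows, path):
    """Write a list of tweet dicts to a CSV file."""
    with open(path, 'w', newline='', encoding='utf-8') as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)

def load_tweets(path):
    """Read the tweets back as a list of dicts (all values come back as strings)."""
    with open(path, newline='', encoding='utf-8') as f:
        return list(csv.DictReader(f))

# Made-up sample data standing in for fetched tweets
sample_rows = [
    {'username': 'CodingMaster', 'text': 'Python is versatile, and fun! #PythonProgramming', 'retweets': 5},
    {'username': 'another_user', 'text': 'Learning Python today.', 'retweets': 0},
]

csv_path = os.path.join(tempfile.gettempdir(), 'tweets_example.csv')
save_tweets(sample_rows, csv_path)
loaded = load_tweets(csv_path)

for row in loaded:
    print(row['username'], '|', row['text'])
```

Note that numeric columns such as retweets are read back as strings, so they need converting with int() before any arithmetic.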

TextBlob offers an API capable of executing various Natural Language Processing (NLP) tasks such as Part-of-Speech Tagging, Noun Phrase Extraction, Sentiment Analysis, Classification (using Naive Bayes and Decision Trees), Language Translation and Detection, Spelling Correction, and more.

TextBlob is built upon Natural Language Toolkit (NLTK).

Sentiment Analysis examines the sentiment of a text or document and assigns it to a class, typically positive or negative. Additional categories such as neutral, highly positive, and highly negative can also be included.

Installing TextBlob

Run the following commands to install TextBlob and download the corpora it needs:

pip install -U textblob
python -m textblob.download_corpora
 

Simple TextBlob Sentiment Analysis Example

Let's look at a basic TextBlob example that performs Sentiment Analysis on a given text. The sentiment property provides two scores for the text: Polarity and Subjectivity.

The polarity score is a float within the range [-1.0, 1.0], where a negative value indicates negative text and a positive value indicates positive text.

The subjectivity is a float within the range [0.0, 1.0] where 0.0 is very objective and 1.0 is very subjective.

from textblob import TextBlob

text = TextBlob("It was a wonderful movie. I liked it very much.")

print(text.sentiment)
print('polarity: {}'.format(text.sentiment.polarity))
print('subjectivity: {}'.format(text.sentiment.subjectivity))
'''
Output:

Sentiment(polarity=0.62, subjectivity=0.6866666666666666)
polarity: 0.62
subjectivity: 0.6866666666666666
'''

text = TextBlob("I liked the acting of the lead actor but I didn't like the movie overall.")
print(text.sentiment)
'''
Output:

Sentiment(polarity=0.19999999999999998, subjectivity=0.26666666666666666)
'''

text = TextBlob("I liked the acting of the lead actor and I liked the movie overall.")
print(text.sentiment)
'''
Output:

Sentiment(polarity=0.3, subjectivity=0.4)
'''
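The polarity score can also be mapped to the finer-grained classes mentioned earlier (highly positive, positive, neutral, negative, highly negative). The sketch below is one possible mapping, not something TextBlob provides: the polarity_to_label helper and its 0.5 cutoff are assumptions chosen for illustration, and the function takes a plain float so it runs without TextBlob installed.

```python
def polarity_to_label(polarity, strong=0.5):
    """Map a polarity score in [-1.0, 1.0] to one of five classes.

    The `strong` cutoff is an arbitrary choice for this sketch.
    """
    if polarity >= strong:
        return 'highly positive'
    if polarity > 0:
        return 'positive'
    if polarity == 0:
        return 'neutral'
    if polarity > -strong:
        return 'negative'
    return 'highly negative'

# 0.62 and 0.3 are polarity values from the examples above;
# the remaining scores are made up to exercise every branch
for score in [0.62, 0.3, 0.0, -0.1, -0.8]:
    print(score, '->', polarity_to_label(score))
```

In practice you would pass text.sentiment.polarity to the function and tune the cutoff against labelled data.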
    

Using GetOldTweets-python to fetch Tweets

  • Clone the GetOldTweets-python repository
  • Navigate to the cloned repository's directory
  • Run the Main.py file, which includes the example code:
python Main.py

Note:

At the time of writing this article, the GetOldTweets-python repository does not support adding a language filter to search queries. However, there is a pull request that adds this functionality, though it has not yet been merged into the main branch. Hopefully, it will be merged soon.

You can refer to this fork of GetOldTweets-python for language search support.

Searching Tweets for our own Search Term

Within the cloned GetOldTweets-python repository folder, import the "got" package, noting that there are separate packages for Python 2 and Python 3.

import sys
if sys.version_info[0] < 3:
    import got
else:
    import got3 as got
    

Let's search for up to 15 tweets containing the term "PythonProgramming" posted between January 1, 2023, and January 2, 2023.

tweetCriteria = got.manager.TweetCriteria().setQuerySearch('PythonProgramming').setSince("2023-01-01").setUntil("2023-01-02").setMaxTweets(15)

# You can use "setLang" only if the package supports language-based search queries
# tweetCriteria = got.manager.TweetCriteria().setQuerySearch('PythonProgramming').setSince("2023-01-01").setUntil("2023-01-02").setMaxTweets(15).setLang('en')

# Fetch the tweets once and reuse the result
tweets = got.manager.TweetManager.getTweets(tweetCriteria)

# Get the first fetched tweet
tweet = tweets[0]

# Print result
print(tweet.username) # Output: CodingMaster
print(tweet.text) # Output: Python is a versatile language used for web development, data science, and more! #PythonProgramming
print(tweet.retweets) # Output: 5
print(tweet.mentions) # Output: 
print(tweet.hashtags) # Output: #PythonProgramming

# Print all tweets

for tweet in tweets:
    print(tweet.text + '\n')

'''

Output:

Python is a versatile language used for web development, data science, and more! #PythonProgramming

JavaScript is great for front-end development. #PythonProgramming #WebDev

Loving the new features in Python 3.9! #PythonProgramming

I prefer Python over Java for data science. #PythonProgramming #DataScience

Learning Python is a must for aspiring data scientists. #PythonProgramming #MachineLearning

Excited about the upcoming Python conference! #PythonProgramming #TechEvents

Why Python is the best language for beginners? #PythonProgramming

Top Python libraries for data analysis. #PythonProgramming #DataScience

Can't wait to try out the new Python framework. #PythonProgramming #Programming

Which is your favorite Python IDE? #PythonProgramming

'''
    

Clean Tweets

Let's write a function to clean tweets. We remove mentions, hashtag symbols, URL links, and punctuation from the tweets using regular expressions.

import re # importing regex
import string

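def clean_tweet(tweet):
    '''
    Remove unnecessary elements from the tweet
    like mentions, hashtag symbols, URL links, punctuation
    '''
    # Remove old style retweet text "RT"
    tweet = re.sub(r'^RT[\s]+', '', tweet)

    # Remove hyperlinks (only the URL itself, not the rest of the line)
    tweet = re.sub(r'https?://\S+', '', tweet)

    # Remove only the hash sign; the hashtag word itself is kept
    tweet = re.sub(r'#', '', tweet)

    # Remove mentions (usernames may contain underscores)
    tweet = re.sub(r'@[A-Za-z0-9_]+', '', tweet)

    # Replace punctuation with a space; re.escape keeps characters
    # such as ], \ and ^ from being read as regex syntax
    tweet = re.sub(r'[' + re.escape(string.punctuation) + r']+', ' ', tweet)

    # Collapse the repeated whitespace left behind by the removals
    tweet = re.sub(r'\s+', ' ', tweet).strip()

    return tweet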

# Testing clean_tweet function
sample_tweet = "Python is a versatile language used for web development, data science, and more! #PythonProgramming"

print(clean_tweet(sample_tweet))  

'''
Output:

Python is a versatile language used for web development data science and more PythonProgramming

'''
    

Get Sentiment of the Tweet

We pass the cleaned tweet text to the TextBlob class, which creates a TextBlob object holding the sentiment polarity and subjectivity of the text. A polarity greater than zero is treated as positive, less than zero as negative, and equal to zero as neutral.

from textblob import TextBlob

def get_tweet_sentiment(tweet):
    '''
    Get sentiment value of the tweet text
    It can be either positive, negative, or neutral
    '''
    # Create TextBlob object of the passed tweet text
    blob = TextBlob(clean_tweet(tweet))

    # Get sentiment
    if blob.sentiment.polarity > 0:
        sentiment = 'positive'
    elif blob.sentiment.polarity < 0:
        sentiment = 'negative'
    else:
        sentiment = 'neutral'

    return sentiment

# Testing tweet sentiment
sample_tweet = "Python is a versatile language used for web development, data science, and more! #PythonProgramming"

print(get_tweet_sentiment(sample_tweet)) # Output: positive
    

Process Tweets

We create a new function that processes tweets and returns a list of dictionaries, each holding a tweet's text and its sentiment value.

def get_processed_tweets(tweets):
    '''
    Get list of processed tweets containing
    the tweet text and its sentiment value
    '''
    processed_tweets = []

    for tweet in tweets:
        tweet_dict = {}
        tweet_dict['text'] = tweet.text
        tweet_dict['sentiment'] = get_tweet_sentiment(tweet.text)

        # If the tweet contains retweet
        # then only append the single tweet
        # and don't append the retweets of the same tweet
        if tweet.retweets > 0:
            if tweet_dict not in processed_tweets:
                processed_tweets.append(tweet_dict)
        else:
            processed_tweets.append(tweet_dict)

    return processed_tweets

# Getting tweets with sentiment value
# "setLang" works only if the package supports language-based search queries
tweetCriteria = got.manager.TweetCriteria().setQuerySearch('PythonProgramming').setSince("2023-01-01").setUntil("2023-01-02").setMaxTweets(10).setLang('en')

tweets = got.manager.TweetManager.getTweets(tweetCriteria)

tweets_with_sentiment = get_processed_tweets(tweets)

for item in tweets_with_sentiment:
    print(item)
    print('')

'''
Output: 
{'text': 'Python is a versatile language used for web development, data science, and more! #PythonProgramming', 'sentiment': 'positive'}

{'text': 'Loving the new features in Python 3.9! #PythonProgramming', 'sentiment': 'positive'}

{'text': 'I prefer Python over Java for data science. #PythonProgramming #DataScience', 'sentiment': 'positive'}

{'text': 'Learning Python is a must for aspiring data scientists. #PythonProgramming #MachineLearning', 'sentiment': 'positive'}

{'text': 'Excited about the upcoming Python conference! #PythonProgramming #TechEvents', 'sentiment': 'positive'}

{'text': 'Why Python is the best language for beginners? #PythonProgramming', 'sentiment': 'positive'}

{'text': 'Top Python libraries for data analysis. #PythonProgramming #DataScience', 'sentiment': 'positive'}

{'text': "Can't wait to try out the new Python framework. #PythonProgramming #Programming", 'sentiment': 'positive'}

{'text': 'Which is your favorite Python IDE? #PythonProgramming', 'sentiment': 'neutral'}
   
'''
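The duplicate check in get_processed_tweets scans the whole result list for every tweet. For larger batches, a set of already-seen texts gives the same kind of filtering with constant-time lookups. The sketch below is an alternative under stated assumptions, not the article's method: unique_by_text is a hypothetical helper, the Tweet namedtuple is a stand-in for the fetched tweet objects, and unlike get_processed_tweets it drops every repeated text regardless of the retweet count.

```python
from collections import namedtuple

# Stand-in for the fetched tweet objects; only the field used here is modelled
Tweet = namedtuple('Tweet', ['text'])

def unique_by_text(tweets):
    """Return tweets in their original order, keeping the first copy of each text."""
    seen = set()
    unique = []
    for tweet in tweets:
        if tweet.text not in seen:
            seen.add(tweet.text)
            unique.append(tweet)
    return unique

# Made-up sample with one repeated tweet
sample = [
    Tweet('Loving the new features in Python 3.9! #PythonProgramming'),
    Tweet('Which is your favorite Python IDE? #PythonProgramming'),
    Tweet('Loving the new features in Python 3.9! #PythonProgramming'),
]

for tweet in unique_by_text(sample):
    print(tweet.text)
```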
    

Get Percentage of Positive, Negative, and Neutral Tweets

We previously obtained the sentiment value of each tweet. Now, let’s calculate the percentage and count of positive, negative, and neutral tweets.

Here, we fetch 1000 tweets and process them.

tweetCriteria = got.manager.TweetCriteria().setQuerySearch('PythonProgramming').setSince("2023-01-01").setUntil("2023-01-02").setMaxTweets(1000).setLang('en')

tweets = got.manager.TweetManager.getTweets(tweetCriteria)

tweets_with_sentiment = get_processed_tweets(tweets)

positive_tweets = [tweet for tweet in tweets_with_sentiment if tweet['sentiment'] == 'positive']
negative_tweets = [tweet for tweet in tweets_with_sentiment if tweet['sentiment'] == 'negative']
neutral_tweets = [tweet for tweet in tweets_with_sentiment if tweet['sentiment'] == 'neutral']

total = len(tweets_with_sentiment)

# Multiply by 100.0 so the division is not truncated under Python 2
positive_percent = 100.0 * len(positive_tweets) / total
negative_percent = 100.0 * len(negative_tweets) / total
neutral_percent = 100.0 * len(neutral_tweets) / total

print('Positive Tweets | Count: {}, Percent: {:.1f} %'.format(len(positive_tweets), positive_percent))
print('Negative Tweets | Count: {}, Percent: {:.1f} %'.format(len(negative_tweets), negative_percent))
print('Neutral Tweets  | Count: {}, Percent: {:.1f} %'.format(len(neutral_tweets), neutral_percent))

'''
Output:
Positive Tweets | Count: 680, Percent: 68.0 %
Negative Tweets | Count: 50, Percent: 5.0 %
Neutral Tweets  | Count: 270, Percent: 27.0 %

'''
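The same tally can be computed in a single pass with collections.Counter instead of three list comprehensions. This is a sketch rather than a replacement for the code above: sentiment_summary is a hypothetical helper, the sample input is made up, and it only assumes each item is a dict with a 'sentiment' key, as returned by get_processed_tweets.

```python
from collections import Counter

def sentiment_summary(processed):
    """Return {label: (count, percent)} for a list of {'sentiment': ...} dicts."""
    counts = Counter(item['sentiment'] for item in processed)
    total = sum(counts.values())
    summary = {}
    for label in ('positive', 'negative', 'neutral'):
        count = counts[label]
        # Guard against an empty result set to avoid dividing by zero
        percent = 100.0 * count / total if total else 0.0
        summary[label] = (count, percent)
    return summary

# Made-up input in the same shape that get_processed_tweets returns
sample = (
    [{'text': 'a', 'sentiment': 'positive'}] * 3
    + [{'text': 'b', 'sentiment': 'neutral'}]
)

for label, (count, percent) in sentiment_summary(sample).items():
    print('{:<8} | Count: {}, Percent: {:.1f} %'.format(label, count, percent))
```

The guard on total matters when a search returns no tweets at all, a case in which the percentage calculation would otherwise raise ZeroDivisionError.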