Building a Twitter Bot with Python
May 5, 2020 • 10 Minute Read
Introduction
Twitter needs no introduction. The popular social media platform uses hashtags to represent topics, and anyone can create their own. A significant amount of discussion happens every second using these hashtags. Industries such as marketing regularly use Twitter to gauge the general sentiment toward specific hashtags and build services for their clients on that basis.
Tweets and hashtags need to be collected in order to analyze them. Copying each tweet that mentions a particular hashtag is not an option because hundreds of tweets are published each minute. A smart computer program, or bot, is required to fetch the trending hashtags and tweets and save them for further analysis.
In this guide, you will learn how to make a custom Twitter bot that can fetch hashtags and tweets for you.
Getting the Twitter API
To enable your bot to interact with Twitter, you first have to sign up for a Twitter developer account, which gives you access to the API keys.
In this guide, you will make a GET request to fetch the tweets and hashtags, and for that, you need "CONSUMER_KEY", "CONSUMER_SECRET", "ACCESS_KEY", and "ACCESS_SECRET" keys.
Before jumping into the code, it's important to understand the rate limits on Twitter requests. Twitter has free and premium developer accounts that vary in how many requests you can make in a 15-minute window. Rate limiting is subject to change, so to get the current specifications, read the official doc.
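To make the idea of a 15-minute window concrete, here is a minimal sketch of a client-side request counter. The RequestBudget class and its parameters are hypothetical names made up for this illustration, and the request limit you pass in is a placeholder: take the real number for your account from the official documentation.

```python
import time


class RequestBudget:
    """Hypothetical helper: counts requests inside a fixed time window."""

    def __init__(self, max_requests, window_seconds=15 * 60, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock
        self.window_start = clock()
        self.used = 0

    def _roll_window(self):
        # Start a fresh window once the current one has fully elapsed.
        if self.clock() - self.window_start >= self.window_seconds:
            self.window_start = self.clock()
            self.used = 0

    def try_acquire(self):
        """Return True and count the request if the budget allows it."""
        self._roll_window()
        if self.used < self.max_requests:
            self.used += 1
            return True
        return False

    def seconds_until_reset(self):
        """Seconds left before the current window ends (0 if already over)."""
        remaining = self.window_seconds - (self.clock() - self.window_start)
        return max(0.0, remaining)
```

Before each request, you could call try_acquire() and, if it returns False, sleep for seconds_until_reset() seconds. This only tracks requests made by your own process; Twitter's servers remain the authority on what is actually allowed.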
Required Configurations and Python Libraries
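Implement the Twitter bot using the Tweepy library, whose API documentation is well organized. The code in this guide uses the Tweepy 3.x method names (api.search, api.trends_place, and api.trends_available), which were renamed in Tweepy 4, so pin the version below 4. The scheduling step also needs the schedule package.
Install both with pip.
pip install "tweepy<4" schedule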
Create the config.json file and put in all the aforementioned API keys.
{
    "CONSUMER_KEY": "<KEY>",
    "CONSUMER_SECRET": "<KEY>",
    "ACCESS_KEY": "<KEY>",
    "ACCESS_SECRET": "<KEY>"
}
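Since a missing or empty key only shows up later as an authentication failure, it can help to check the file up front. The following is a minimal sketch, assuming the four key names shown above; load_config and REQUIRED_KEYS are hypothetical names introduced for this example and are not part of Tweepy.

```python
import json

# The four key names used in config.json in this guide.
REQUIRED_KEYS = ("CONSUMER_KEY", "CONSUMER_SECRET", "ACCESS_KEY", "ACCESS_SECRET")


def load_config(path="config.json"):
    """Hypothetical helper: load the JSON config and check the four keys exist."""
    with open(path, "r") as f:
        config = json.load(f)
    # Treat absent and empty values the same way.
    missing = [key for key in REQUIRED_KEYS if not config.get(key)]
    if missing:
        raise KeyError("config is missing: " + ", ".join(missing))
    return config
```

Calling load_config() before creating the API object gives you an error message that names the missing keys.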
Coding the Bot
Twitter keeps hashtags and tweets separated by locations. To refer to different geolocations, the term WOEID (Where On Earth IDentifier) is used. You can find the WOEID for available countries here.
Importing Required Libraries
You'll work with these libraries throughout the guide. Import them into your file.
import tweepy
import json
import schedule
import time
import datetime
import os
import csv
Initiating API
Access the config.json and initiate the API with all the access keys. Establishing a connection with the server can sometimes be problematic due to internet connectivity, server response time, etc. It's always good to handle the errors for better clarity.
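def initiate_api():
    try:
        with open('config.json', 'r') as f:
            config = json.load(f)
        auth = tweepy.OAuthHandler(config["CONSUMER_KEY"], config["CONSUMER_SECRET"])
        auth.set_access_token(config["ACCESS_KEY"], config["ACCESS_SECRET"])
        # wait_on_rate_limit makes Tweepy pause until the rate-limit window resets
        api = tweepy.API(auth, wait_on_rate_limit=True)
        return api
    except (OSError, KeyError, ValueError) as err:
        # Missing file, missing key, or malformed JSON
        print("Problems with config.json:", err)
        return None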
Accepting Only English Tweets
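For this guide, you'll focus only on English tweets. The function below uses a simple heuristic: it treats text as English only if every character is ASCII. Tweets in languages such as Chinese, Arabic, or Hindi are therefore rejected, and so is any English tweet that contains emoji or accented characters. You can extend this function to include other languages without affecting other parts of the application.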
def isEnglish(text):
    try:
        text.encode(encoding='utf-8').decode('ascii')
    except UnicodeDecodeError:
        return False
    else:
        return True
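To see what this check accepts and rejects, the short sketch below repeats the function and runs it on a few made-up sample strings. Note that it is an ASCII test rather than true language detection, so a French sentence without accents passes.

```python
def isEnglish(text):
    # Same check as above: True only when every character is ASCII.
    try:
        text.encode(encoding='utf-8').decode('ascii')
    except UnicodeDecodeError:
        return False
    else:
        return True


# Made-up samples: plain English, accent-free French, accented text, Hindi.
samples = ["Good morning #Monday", "Bonjour tout le monde", "café", "नमस्ते"]
results = {text: isEnglish(text) for text in samples}
for text, ok in results.items():
    print(repr(text), ok)
```

The first two samples return True and the last two return False. On Python 3.7 and later, text.isascii() performs the same test in a single call.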
Getting the WOEID of Countries
This function looks up the WOEIDs for a list of locations. It takes the api object and a list of location names as arguments and returns the matching WOEIDs. Twitter's place names are converted to lowercase before matching, so pass the location names in lowercase.
def get_woeid(api, locations):
    twitter_world = api.trends_available()
    places = {loc['name'].lower(): loc['woeid'] for loc in twitter_world}
    woeids = []
    for location in locations:
        if location in places:
            woeids.append(places[location])
        else:
            print("err: ", location, " woeid does not exist in trending topics")
    return woeids
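You can try the lookup without spending any API requests by passing in a stand-in object. In the sketch below, FakeAPI is a hypothetical class written for this example, and the place names and WOEID values are made up; only the name and woeid fields that the function reads are modeled.

```python
class FakeAPI:
    """Hypothetical stand-in for the Tweepy API object, used only for illustration."""

    def trends_available(self):
        # Made-up entries shaped like the name/woeid pairs the function reads.
        return [
            {"name": "Springfield", "woeid": 111},
            {"name": "Shelbyville", "woeid": 222},
        ]


def get_woeid(api, locations):
    twitter_world = api.trends_available()
    places = {loc['name'].lower(): loc['woeid'] for loc in twitter_world}
    woeids = []
    for location in locations:
        if location in places:
            woeids.append(places[location])
        else:
            print("err: ", location, " woeid does not exist in trending topics")
    return woeids


# "atlantis" is not in the fake data, so it is reported and skipped.
found = get_woeid(FakeAPI(), ["springfield", "atlantis"])
print(found)
```

The unknown location produces an error message and is left out of the result, so the returned list can be shorter than the input list.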
Fetching the Tweets
This function will get the tweets for the given hashtag. The api object will fetch the tweets for the given query. This code is only fetching English tweets, but you can manipulate this to get other languages as well.
def get_tweets(api, query):
    '''
    Get up to 1000 popular English tweets for the given hashtag.
    '''
    tweets = []
    # count is the page size (the standard search API returns at most 100 per page);
    # items(1000) caps the total. wait_on_rate_limit is an option of the tweepy.API
    # constructor, not of search, so it is not passed here.
    for status in tweepy.Cursor(api.search,
                                q=query,
                                count=100,
                                result_type='popular',
                                include_entities=True,
                                lang="en").items(1000):
        # Keep only tweets that pass the ASCII-based English check
        if isEnglish(status.text):
            tweets.append([status.id_str, query, status.created_at.strftime('%d-%m-%Y %H:%M'), status.user.screen_name, status.text])
    return tweets
It will return the list of tweets for the given query. The tweet will include the ID, hashtag, creation time, user handle, and tweet body.
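The shape of one row can be illustrated without calling the API. This sketch builds a made-up status-like object with only the attributes the row uses; to_row is a hypothetical helper that mirrors the list built inside get_tweets, and the values are invented.

```python
import datetime
from types import SimpleNamespace


def to_row(status, query):
    """Hypothetical helper mirroring the row built in get_tweets."""
    return [
        status.id_str,
        query,
        status.created_at.strftime('%d-%m-%Y %H:%M'),
        status.user.screen_name,
        status.text,
    ]


# A made-up object carrying only the attributes the row needs.
fake_status = SimpleNamespace(
    id_str="1",
    created_at=datetime.datetime(2020, 5, 5, 9, 30),
    user=SimpleNamespace(screen_name="example_user"),
    text="Hello #python",
)
row = to_row(fake_status, "#python")
print(row)  # ['1', '#python', '05-05-2020 09:30', 'example_user', 'Hello #python']
```

Every field is already a string, so the row can be passed directly to a CSV writer.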
In this function, you'll fetch the trending hashtags for a given location.
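Note: If you are working with a free developer account, you have a very limited number of requests. This code is structured so that, when a request fails because the limit is exhausted, the bot can wait for about an hour and then retry. The time.sleep(3605) call that performs the wait is commented out in the code; uncomment it to enable the wait.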
def get_trending_hashtags(api, location):
    woeids = get_woeid(api, location)
    trending = set()
    for woeid in woeids:
        try:
            trends = api.trends_place(woeid)
        except Exception:
            print("API limit exceeded. Waiting for next hour")
            #time.sleep(3605) # change to 5 for testing
            trends = api.trends_place(woeid)
        # Keep ASCII-only hashtags and store them without the leading '#'
        topics = [trend['name'][1:] for trend in trends[0]['trends']
                  if trend['name'].startswith('#') and isEnglish(trend['name'])]
        trending.update(topics)
    return trending
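The filtering step is easier to follow in isolation. The sketch below applies the same two conditions (the name starts with '#' and contains only ASCII characters) to a made-up payload that has the same nesting the function indexes into. extract_hashtags is a hypothetical helper, and it uses str.isascii(), which assumes Python 3.7 or later, in place of isEnglish.

```python
def extract_hashtags(trends):
    """Hypothetical helper: keep ASCII-only trend names that start with '#'."""
    names = [trend['name'] for trend in trends[0]['trends']]
    return {name[1:] for name in names if name.startswith('#') and name.isascii()}


# Made-up payload with the same nesting the function above indexes into.
sample = [{"trends": [
    {"name": "#Python"},
    {"name": "World Cup"},
    {"name": "#日本"},
    {"name": "#MondayMotivation"},
]}]
print(sorted(extract_hashtags(sample)))  # ['MondayMotivation', 'Python']
```

Trends that are plain phrases rather than hashtags are dropped, as are hashtags containing non-ASCII characters.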
Getting Everything Together
This function pulls all the previous functions together. All the fetched tweets are saved in the trending_tweets directory. Every time the bot runs, it saves the trending hashtags and the tweets in separate CSV files whose names include a timestamp.
def twitter_bot(api, locations):
    # "%s" is a non-portable strftime code, so build the epoch suffix explicitly
    now = datetime.datetime.today()
    today = now.strftime("%d-%m-%Y") + "-" + str(int(now.timestamp()))
    os.makedirs("trending_tweets", exist_ok=True)
    hashtags = get_trending_hashtags(api, locations)
    with open("trending_tweets/" + today + "-hashtags.csv", "w") as file_hashtags:
        file_hashtags.write("\n".join(hashtags))
    print("Hashtags written to file.")
    # newline="" lets the csv module control line endings
    with open("trending_tweets/" + today + "-tweets.csv", "a", newline="") as file_tweets:
        writer = csv.writer(file_tweets)
        for hashtag in hashtags:
            try:
                print("Getting Tweets for the hashtag: ", hashtag)
                tweets = get_tweets(api, "#" + hashtag)
            except Exception:
                print("API limit exceeded. Waiting for next hour")
                #time.sleep(3605) # change to 0.2 sec for testing
                tweets = get_tweets(api, "#" + hashtag)
            for tweet in tweets:
                writer.writerow(tweet)
The time.sleep() call in this function is commented out as well. Uncomment it for real runs, or set a much shorter delay while you experiment with the file-saving process.
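The try, wait, retry pattern appears twice in the bot, so it could be factored into one place. The following is a minimal sketch of that idea; call_with_retry and its parameters are hypothetical names for this example. The sleep function is passed in as an argument so that you can swap in a short delay, or no delay at all, while testing.

```python
import time


def call_with_retry(func, wait_seconds=3605, sleep=time.sleep):
    """Hypothetical helper: run func, and on failure wait once and try again."""
    try:
        return func()
    except Exception as err:
        print("Request failed (" + str(err) + "). Waiting before one retry.")
        sleep(wait_seconds)
        # A second failure is not caught and propagates to the caller.
        return func()
```

With this helper, fetching tweets for one hashtag could look like tweets = call_with_retry(lambda: get_tweets(api, "#" + hashtag)). It retries only once, which matches the behavior of the code in this guide.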
Main Function
Finally, the main function starts the bot. Because of the limited number of requests, only one location is used in the locations list. You can add any number of supported locations to it, and the code will handle them.
The schedule package runs the bot on a timetable, and the while loop keeps the program running so that pending jobs get executed. Currently, the bot fetches data at 00:00 every day; however, you can change the schedule according to your needs. There's a commented-out line that schedules the bot every 10 seconds. In practice, you would need a very large request quota and substantial hardware to run the bot every 10 seconds.
def main():
    '''
    Use the locations list to get trending tags from different places.
    Only one location is used here because of the limited number of requests.
    '''
    #locations = ['new york', 'los angeles', 'philadelphia', 'barcelona', 'canada', 'united kingdom', 'india']
    locations = ['new york']
    api = initiate_api()
    if api is None:
        # initiate_api already reported the problem with config.json
        return
    schedule.every().day.at("00:00").do(twitter_bot, api, locations)
    #schedule.every(10).seconds.do(twitter_bot, api, locations)
    while True:
        schedule.run_pending()
        time.sleep(1)
if __name__ == "__main__":
    main()
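If you want to know how long the program will idle before its first run, you can compute the time until the next scheduled slot yourself. This sketch uses only the standard datetime module rather than the schedule package, and seconds_until is a hypothetical helper written for this example. It works with naive local times and ignores daylight saving changes.

```python
import datetime


def seconds_until(hour, minute, now=None):
    """Hypothetical helper: seconds from now until the next hour:minute."""
    now = now or datetime.datetime.now()
    target = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if target <= now:
        # Today's slot has already passed, so aim for tomorrow's.
        target += datetime.timedelta(days=1)
    return (target - now).total_seconds()


# One hour before midnight, the 00:00 job is 3600 seconds away.
print(seconds_until(0, 0, now=datetime.datetime(2020, 5, 5, 23, 0)))  # 3600.0
```

Printing seconds_until(0, 0) at startup is a quick way to confirm that the bot is waiting for midnight rather than stuck.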
Conclusion
In this guide, you have learned how to make a Twitter bot that can fetch trending hashtags and tweets of different geographic locations. With a little more modification to this code, you can use this for any business use case.
Twitter is one of the most popular ways to study human behavior and attitudes towards a topic. To study tweets, we must first collect a lot of them, and fetching them automatically is always the best solution. In the next guide, we'll build a machine learning model that will do the sentiment analysis for these tweets. Click here to read Twitter Sentiment Analysis in Python.