I’d wanted to make a Reddit bot for a little while, really just as a toy project, and finding the excellent PRAW library for Python (as well as finally having some time off) gave me the kick I needed to get it done. It’s also April Fool’s Day, so I thought I’d make something a bit stupid.
I decided to make a bot that would simulate a conversation with the wonderful raconteur that is DJ Khaled. If you don’t know what I’m talking about, watch this video (the second half is the important bit):
Cool guy, huh? My idea was that the bot would monitor the top X threads on /r/all, and look through the comments to see if anyone mentioned our pal Khaled’s name. If so, it would engage them in the conversation from the video.
If our player got through the conversation, they’d be rewarded with a small amount of money, so they could “go buy your mom a house”. By small I mean small - I’m using ChangeTip to send 5 bits, which at the time of writing is equal to £0.00083. Someone’s mom might not get to move for a little bit longer.
I’m going to dive into how I made the thing, but if your time’s a bit limited, you can see the source code on my GitHub, and here’s a screenshot of what the finished conversation looks like.
The bot is built in Python, using PRAW, which provides a very nice wrapper for the Reddit API. We start out by initializing the Reddit object and logging in, like so:
The string given to the Reddit constructor is the User-Agent field for the Reddit API. The API eschews API tokens and the like - instead, individual users are tracked by the User-Agent string. Reddit are quite strict about applications being honest with this parameter - have a look at their documentation if you’re interested.
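Reddit's API rules suggest a User-Agent of the form platform:app ID:version (by /u/username). Here is a minimal sketch of a helper that builds one. The app ID, version and username are placeholders for illustration, not the values this bot uses.

```python
def build_user_agent(platform, app_id, version, username):
    """Return a User-Agent string in the format Reddit's API rules recommend."""
    return "{}:{}:{} (by /u/{})".format(platform, app_id, version, username)


# Placeholder values for illustration only.
USER_AGENT = build_user_agent(
    "python", "com.example.khaledbot", "v0.1", "example_user"
)
print(USER_AGENT)
```

The resulting string is what would be handed to the Reddit constructor as its user agent.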
The actual crawling of Reddit comments is quite simple using PRAW. Here’s a stripped-down example:
This code will fetch the subreddit /r/all. It then goes through the top 100 submissions in that subreddit, and for each one looks through the top comments. If it finds a case-insensitive reference to DJ Khaled in a comment, it replies to it and stops looking through that submission.
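The matching logic described above can be sketched without touching the network. This is an illustration rather than the bot's actual code: it assumes each submission exposes a comments iterable, and that each comment has a body string and a reply() method, which is how PRAW's objects behave. The function names are made up for this example.

```python
NEEDLE = "dj khaled"


def first_mention(comments, needle=NEEDLE):
    """Return the first comment that mentions `needle` (case-insensitive), or None."""
    for comment in comments:
        if needle in comment.body.lower():
            return comment
    return None


def scan(submissions, reply_text):
    """Reply to at most one matching comment per submission.

    Returns the list of reply objects created, one per submission that matched.
    """
    replies = []
    for submission in submissions:
        match = first_mention(submission.comments)
        if match is not None:
            # Only the first match is answered; the rest of the thread is skipped.
            replies.append(match.reply(reply_text))
    return replies
```

Lower-casing the body before the substring check is what makes the match case-insensitive, and returning on the first hit is what stops the search within a submission.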
At most 200 comments are returned, and not all of them are usable: some may be MoreComments objects, which are placeholders used to fetch further comments. We ignore these. Reddit’s API is quite strict with rate limiting - they quote around 30 requests a minute, which is not very many - so it makes sense to only fetch what we need.
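Skipping the placeholders amounts to a type check. In this sketch the placeholder class is passed in as a parameter, because the import path for PRAW's MoreComments class differs between PRAW versions. The two classes below are stand-ins invented for the demo.

```python
def usable_comments(items, placeholder_type):
    """Keep real comments and drop placeholders without expanding them.

    Expanding a placeholder costs an extra API request, so ignoring it
    keeps the bot within the request budget.
    """
    return [item for item in items if not isinstance(item, placeholder_type)]


# Stand-in classes for the demo; a real bot would pass PRAW's own class.
class FakeMoreComments:
    pass


class FakeComment:
    def __init__(self, body):
        self.body = body


mixed = [FakeComment("first"), FakeMoreComments(), FakeComment("second")]
kept = usable_comments(mixed, FakeMoreComments)
print([comment.body for comment in kept])
```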
Our bot runs in an infinite loop, which looks like this:
The bot will run once a minute. An HTTPError exception will be caught if it fails to connect to Reddit - this could be because Reddit is down, or because their servers are busy (the 503 error we all know and love). If this happens, we simply give up on the current iteration and start again.
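A loop of this shape might look roughly like the sketch below. The function name and its parameters are assumptions made for illustration. The exception types to treat as transient are passed in by the caller, since which HTTPError class is raised depends on the HTTP library in use. The sleep and max_iterations parameters exist only so the loop can be exercised without waiting.

```python
import time


def run_forever(step, transient_errors, interval=60,
                sleep=time.sleep, max_iterations=None):
    """Call `step` repeatedly, pausing `interval` seconds between calls.

    Any exception listed in `transient_errors` abandons the current
    iteration; the loop then carries on with the next one.
    """
    done = 0
    while max_iterations is None or done < max_iterations:
        try:
            step()
        except transient_errors as exc:
            print("Skipping this iteration:", exc)
        sleep(interval)
        done += 1
```

Leaving max_iterations as None gives the infinite loop. Anything not listed in transient_errors still propagates, so genuine bugs are not silently swallowed.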
The process_new() method does pretty much what I outlined above, with a few additions:
As well as replying, we add our reply comment to a queue, which is processed by the process_queue() method. There are actually three queues - one for each stage of the conversation.
The process_new() method also keeps track of seen submissions. This ensures that we only start one conversation on each submission - we need to stop people abusing our bot for free money, after all. Even DJ Khaled isn’t infinitely rich. This list is also written to a file, so that it can persist if the bot crashes or needs to be restarted for some reason.
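One simple way to persist the seen list is a text file with one submission ID per line, appended to as new submissions are handled. The file layout and function names here are assumptions for illustration, not necessarily how the bot stores its state.

```python
import os
import tempfile


def load_seen(path):
    """Read previously seen submission IDs, or an empty set on first run."""
    if not os.path.exists(path):
        return set()
    with open(path) as handle:
        return {line.strip() for line in handle if line.strip()}


def mark_seen(path, seen, submission_id):
    """Record a submission; return False if it had already been seen."""
    if submission_id in seen:
        return False
    seen.add(submission_id)
    # Append straight away so the record survives a crash.
    with open(path, "a") as handle:
        handle.write(submission_id + "\n")
    return True


# Demo against a throwaway file with a made-up ID.
demo_path = os.path.join(tempfile.mkdtemp(), "seen.txt")
seen = load_seen(demo_path)
first = mark_seen(demo_path, seen, "abc123")
second = mark_seen(demo_path, seen, "abc123")
reloaded = load_seen(demo_path)
```

Appending on every new ID, instead of rewriting the whole file at shutdown, means a crash loses at most the submission being processed at the time.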
A RateLimitExceeded exception is raised if we try to comment too often. How often a Reddit account is allowed to comment depends on the amount of karma it has - to start with, one can only comment once every five minutes. This makes our bot a bit slow, but for our purposes this doesn’t really matter, and hopefully it will speed up if people upvote the stuff it comes out with.
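One way to live with the limit is to remember when commenting is allowed again, and skip reply attempts until then. In the sketch below, RateLimited is a hypothetical stand-in for PRAW's RateLimitExceeded. Its sleep_time attribute (seconds to wait) is an assumption about what the real exception carries, and try_reply is a name invented for this example.

```python
import time


class RateLimited(Exception):
    """Hypothetical stand-in for the library's rate-limit exception."""

    def __init__(self, sleep_time):
        super().__init__("retry in {}s".format(sleep_time))
        self.sleep_time = sleep_time


def try_reply(comment, text, not_before, now=time.time):
    """Attempt a reply unless a known cool-down is still running.

    Returns (reply_or_None, not_before), where not_before is the earliest
    timestamp at which the next attempt should be made.
    """
    current = now()
    if current < not_before:
        return None, not_before  # still cooling down; do not spend a request
    try:
        return comment.reply(text), not_before
    except RateLimited as exc:
        # Remember when commenting is allowed again (e.g. 300s for a new account).
        return None, current + exc.sleep_time
```

A caller that gets None back can leave the comment where it is and try again on a later pass, so nothing is lost while the account waits out its five minutes.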
The process_queue() method looks like this:
This code will process five items from each queue - this is to keep our number of requests down. The code fetches the comment using PRAW, and checks it to see if any of the replies contain DJ Khaled’s name. If so, we reply to it with the appropriate response, and move the reply comment into the next queue if there is one, removing the old comment from its queue. In this way we gradually move conversations along.
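The queue-advancing step can be sketched as follows. This is an illustration under stated assumptions, not the bot's own listing. The get_replies parameter is a caller-supplied function that returns the replies to one of our comments. The responses list holds the text to send when leaving each stage. Comments that have had no answer yet are rotated to the back of their queue, which is a choice made for this sketch.

```python
from collections import deque


def mentions(text, needle="dj khaled"):
    """Case-insensitive check for the trigger phrase."""
    return needle in text.lower()


def process_queues(queues, responses, get_replies, batch=5):
    """Advance up to `batch` conversations per queue by one stage."""
    # Walk the stages last-to-first so a comment promoted during this
    # pass is not processed a second time in the same pass.
    for stage in reversed(range(len(queues))):
        queue = queues[stage]
        for _ in range(min(batch, len(queue))):
            ours = queue.popleft()
            hit = next((r for r in get_replies(ours) if mentions(r.body)), None)
            if hit is None:
                queue.append(ours)  # no answer yet; retry on a later pass
                continue
            new_reply = hit.reply(responses[stage])
            if stage + 1 < len(queues):
                queues[stage + 1].append(new_reply)
```

With three queues, a conversation that reaches the last stage gets its final response and is then dropped, since there is no further queue to move it into.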
The responses are stored in their own Python file:
I did this rather than putting them in a plain text file so as to use Python’s multiline strings - I had some issues with getting newlines to behave properly simply using \n, which I suspect has to do with Reddit’s Markdown processing.
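One plausible explanation for the newline trouble is that Markdown treats a single newline as a soft wrap, and only starts a new paragraph after a blank line. Triple-quoted strings make that blank line easy to see. The text below is placeholder content, not the bot's actual lines, and the variable names are made up for this sketch.

```python
# responses.py - placeholder text for illustration only.

OPENING = """\
First paragraph of the opening reply.

Second paragraph, separated by a blank line so Markdown keeps the break."""

FOLLOW_UP = """\
A later reply in the conversation.

*Markdown emphasis* passes through unchanged."""

# One entry per stage of the conversation, in order.
RESPONSES = [OPENING, FOLLOW_UP]
```

The backslash after the opening quotes stops the string from starting with an empty line, so the text can begin on its own line in the source without changing what gets posted.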
When we put it all together, the completed bot looks like this:
And that’s pretty much it - it doesn’t take much! If you’re thinking of building a Reddit bot, here are some links that might be useful:
- Reddit API docs
- The source code for this bot
- DJ Khaled approving the most powerful servers
- The actual bot on Reddit
Any questions, don’t hesitate to drop a comment in the box below. Go buy your mom a house.