Collecting tweets with Twibot and ActiveRecord
I recently launched a website that, among other things, displays "tweets" (status messages on Twitter) from a predefined set of Twitter users. In this article I'll show you how to create a stateful Twibot bot with a custom handler that puts tweets in your database using ActiveRecord from a Rails application.
Creating a Twibot bot
Back in March when I first announced Twibot, I showed several examples of how you can use the provided DSL to create a Twitter bot, like this one:
require 'twibot'

# Respond to @replies if they come from the right crowd
#
reply :from => [:cjno, :irbno] do |message, params|
  post_reply message, "I agree"
end
Run ruby bot.rb --login myaccount and you've got a bot up and running.
While the DSL is awesome for many cases, it may not always be enough. Let's say you're running a website where users can register their Twitter account to appear on your site. To fetch the tweets with Twibot, you'd need a bot that changes over time. In the example above, we've defined two users to accept tweets from, but in our website example we really want to fetch the list of users from the database.
Stateful bots
When you use the Twibot DSL, at the very least a Twibot::Bot and a Twibot::Handler object are created for you behind the scenes. Bypassing the DSL, you can create these yourself. Rehashing the above example, take a look at this:
require "twibot"

class MyBotHandler < Twibot::Handler
  def initialize
    # The two parameters are the same as those accepted by the DSL:
    # pattern and options
    super(nil, :from => [:cjno, :irbno])
  end

  def handle(message, params)
    post_reply message, "I agree"
  end
end

# Create a bot instance, and hook it up with our handler.
# The bot should use the default configuration, but override it with any
# settings found in the configuration file.
# The default location of the file is config/bot.yml
# The second parameter tells Twibot to prompt for credentials, if none are
# provided through configuration
bot = Twibot::Bot.new(Twibot::Config.default << Twibot::FileConfig.new, true)
bot.add_handler(:tweet, MyBotHandler.new)
bot.run!
This yields a few more lines of code, but the extended possibilities should be quite clear. The handler is now a stateful object, which can alter its state while the bot is running. This is particularly interesting for our website example: we can alter the internal :from option whenever there are new users available in the database. This way, the bot will respond to tweets (and store them in the database) from new users as soon as they are available.
Creating the tweet consuming bot
To solve the task at hand, all we really need is a custom Twibot::Handler (like the one above), a Tweet model of some kind (I won't go into the specifics of that here, it's entirely up to you how you want to record data), and a small script that creates a bot instance, attaches the handler and runs it (like above). This could be solved by a rake task.
The handler: storing tweets
To get a basic handler going that can store tweets, we need a handler that:
- Recognizes Twitter users stored in the database
- Saves Tweet objects in its handle method
class TweetCollector < Twibot::Handler
  def initialize
    super(nil, :from => screen_names)
  end

  # Store tweets in the database
  #
  def handle(message, params)
    Tweet.from_status(message).save
  end

  protected

  # Return array of users' screen names
  #
  def screen_names
    users.collect { |user| user.screen_name.downcase }
  end

  # Array of Twitter users to store tweets from
  #
  def users
    @users ||= TwitterUser.all
  end
end
The Tweet.from_status method is assumed to convert a Twitter::Status object (from Twitter4R) to a local Tweet object (which is an ActiveRecord model). The resulting object is saved to the database.
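Since the Tweet model is left up to you, here is a minimal sketch of the mapping that a method like Tweet.from_status might perform. Everything in it is an assumption made for illustration: the Status and StatusUser structs are stand-ins for the Twitter4R objects, and the attribute names (twitter_id, text, screen_name, tweeted_at) are hypothetical column names, not something Twibot or Twitter4R prescribes.

```ruby
# Stand-ins for the Twitter4R objects; only the fields used below are modelled.
StatusUser = Struct.new(:screen_name)
Status     = Struct.new(:id, :text, :created_at, :user)

# Map a status onto a hash of attributes for a (hypothetical) tweets table.
def tweet_attributes(status)
  {
    :twitter_id  => status.id,
    :text        => status.text,
    # Downcased so it compares cleanly with the handler's screen_names list
    :screen_name => status.user.screen_name.downcase,
    :tweeted_at  => status.created_at
  }
end

status = Status.new(42, "Hello from Twibot", Time.at(0), StatusUser.new("CJNO"))
attrs  = tweet_attributes(status)
puts attrs[:screen_name]
```

In a Rails model, from_status could then be a class method that passes such a hash to new, leaving the save to the handler as in the code above.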
So far, so good. Unfortunately, the current implementation does not pick up new users after the bot has started, since it only ever loads the user list once, inside the constructor. There are several ways to fix this. I'll show you one.
Picking up new users after the bot has started
Twitter is a real-time medium. For this reason, the bot probably needs to poll the Twitter service fairly often to be useful. This means that we should probably avoid loading a new list of users on each handle. We'll give the bot an option, users_ttl: the number of seconds the list of users stays valid. When this many seconds have passed, we'll reload the users from the database.
class TweetCollector < Twibot::Handler
  def initialize(users_ttl = 30.minutes)
    super(nil, :from => screen_names)
    @users_ttl = users_ttl
    @users_updated = Time.now
  end

  # Store tweets in the database
  #
  def handle(message, params)
    Tweet.from_status(message).save
    update_users
  end

  protected

  # Update internal user list, if it has not been updated the last @users_ttl seconds
  def update_users
    return if @users_updated > @users_ttl.ago

    @users = nil
    @options[:from] = screen_names
    @users_updated = Time.now
  end

  # Return array of users' screen names
  #
  def screen_names
    users.collect { |user| user.screen_name.downcase }
  end

  # Array of Twitter users to store tweets from
  #
  def users
    @users ||= TwitterUser.all
  end
end
Making the bot come alive is a simple addition: we add an internal timestamp to keep track of when the users list was last updated. Then, every time we handle a tweet, we check if we need to update the list. If the list is updated, the new users will be in the loop from there on.
I set the default timeout to 30 minutes. How often you need to update the users depends on how often people register Twitter users at your site, and how long a delay you're willing to allow between a user registering and their tweets appearing on the site.
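The expiry logic can also be looked at in isolation. The following is a sketch of the same idea in plain Ruby, without Rails or Twibot: a list that reloads itself from a block once it is older than its TTL. The ExpiringList class and its injectable clock are hypothetical names introduced here so the behaviour can be tried without waiting half an hour; they are not part of Twibot.

```ruby
# Caches the result of a loader block and reloads it once it is older
# than ttl seconds. The clock is injectable so the expiry can be exercised.
class ExpiringList
  def initialize(ttl, clock = lambda { Time.now }, &loader)
    @ttl = ttl
    @clock = clock
    @loader = loader
    @loaded_at = nil
  end

  # Return the cached items, reloading them first if they have expired
  def items
    reload if stale?
    @items
  end

  private

  def stale?
    @loaded_at.nil? || @clock.call - @loaded_at >= @ttl
  end

  def reload
    @items = @loader.call
    @loaded_at = @clock.call
  end
end

# Usage with a fake clock and a fake "database"
now   = Time.at(0)
names = ["cjno"]
list  = ExpiringList.new(30 * 60, lambda { now }) { names.dup }

list.items        # first call loads the list
names << "irbno"  # a new user registers
list.items        # within the TTL, so the cached list is returned
now += 30 * 60    # half an hour passes
list.items        # TTL has expired, so the list is reloaded
```

Unlike the handler, this variant checks the age on every read rather than after each handled tweet. Either placement works; the handler's approach has the benefit that the :from option is only touched in one place.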
Making it stable
The bot will work as intended as is, but we're still missing the one piece that will allow your bot to run continuously. If you keep the process running for a long time, you're bound to lose your database connection at some point. Now, ActiveRecord provides ActiveRecord.with_connection to help out with this (among other things). Unfortunately, it's not currently very useful unless you're executing raw SQL. Luckily, it will be soon.
Until it's fixed, we'll just roll our own. All we really need is a method that takes a block, executes it, and retries if it fails due to a dead database connection. Something like this:
def monitor_connection
  begin
    yield
  rescue ActiveRecord::StatementInvalid => e
    if e.to_s =~ /away/
      ActiveRecord::Base.establish_connection and retry
    else
      raise e
    end
  end
end
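One thing to be aware of is that monitor_connection retries for as long as the error keeps coming back. If that worries you, a bounded variant is easy to write. The sketch below is an illustration in plain Ruby, not part of Twibot or ActiveRecord: StatementInvalid is a local stand-in for ActiveRecord::StatementInvalid, and with_reconnect, max_retries and the reconnect lambda are hypothetical names. In a Rails process the lambda would call ActiveRecord::Base.establish_connection.

```ruby
# Stand-in for ActiveRecord::StatementInvalid so the sketch runs without Rails.
class StatementInvalid < StandardError; end

# Run the block, reconnecting and retrying at most max_retries times when the
# error message indicates a dropped connection. Other errors are re-raised.
def with_reconnect(max_retries = 3, reconnect = lambda { nil })
  attempts = 0
  begin
    yield
  rescue StatementInvalid => e
    attempts += 1
    # Give up on unrelated errors, or once the retry budget is spent
    raise unless e.message =~ /away/ && attempts <= max_retries
    reconnect.call
    retry
  end
end

# Usage: the block fails twice with a "gone away" error, then succeeds
calls = 0
result = with_reconnect do
  calls += 1
  raise StatementInvalid, "MySQL server has gone away" if calls < 3
  :saved
end
```

The rest of this article sticks with monitor_connection as defined above.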
Using monitor_connection, we can now present the final version of our bot:
class TweetCollector < Twibot::Handler
  def initialize(users_ttl = 30.minutes)
    super(nil, :from => screen_names)
    @users_ttl = users_ttl
    @users_updated = Time.now
  end

  # Store tweets in the database
  #
  def handle(message, params)
    monitor_connection { Tweet.from_status(message).save }
    update_users
  end

  protected

  # Update internal user list, if it has not been updated the last @users_ttl seconds
  def update_users
    return if @users_updated > @users_ttl.ago

    @users = nil
    monitor_connection { @options[:from] = screen_names }
    @users_updated = Time.now
  end

  # Return array of users' screen names
  #
  def screen_names
    users.collect { |user| user.screen_name.downcase }
  end

  # Array of Twitter users to store tweets from
  #
  def users
    @users ||= TwitterUser.all
  end
end
Put this in lib/twitter/tweet_collector.rb.
You can put it all together in a rake task. Create lib/tasks/twitter/bot.rake with this:
require 'twibot'
require 'twitter/tweet_collector'

namespace :twitter do
  desc "Run the tweet collecting Twitter bot"
  task :bot => :environment do
    bot = Twibot::Bot.new(Twibot::Config.default << Twibot::FileConfig.new, true)
    bot.add_handler :tweet, TweetCollector.new
    bot.run!
  end
end
Running the bot is a simple matter of
rake twitter:bot
Extension ideas
Remember that you can use the first parameter to Twibot::Handler#initialize to provide a string or regex pattern to only match certain tweets. Refer to the Readme on how to use these.
The bot I'm currently using also employs a similar system to update locally cached Twitter user data. Every few hours, when the bot receives a tweet, it looks up the locally cached user data and updates its attributes from the remote Twitter::User object.
Hopefully, this inspires you to create some stateful bots. If you have any trouble with Twibot, this example, or want to help out in any way, do get in touch. I'd also love to hear back from anyone using Twibot - let me know what you're doing with it!