Bootstrapping Development Databases
At Ooga Labs, we strive to spin up new
developers quickly. One problematic area in the past has been creating developer
sandbox databases. Models and migrations, despite our
best efforts, are fragile. As
models get refactored, methods are inevitably removed or renamed, and early
migrations dependent on the old model behaviors are left to rot
until that ill-advised db:migrate months later unleashes calamity
and ruin.
It's a Rails best practice to treat your db/schema.rb
definition as the authoritative source for your database schema. To a quote a
comment in that file:
If you need to create the application database on another system, you should be using db:schema:load, not running all the migrations from scratch. The latter is a flawed and unsustainable approach (the more migrations you'll amass, the slower it'll run and the greater likelihood for issues).
Unfortunately, that only works if you are database agnostic and/or limiting your
db usage to features that ActiveRecord::SchemaDumper can handle. Don't
get me wrong, it can do an impressive amount. But in a high-volume production
environment, there's a lot of incentive to actually take advantage of
some of the SQL extensions your RDBMS supports. In MySQL, these might include:
-
Custom data types like
BIGINT,VARBINARY, andMEDIUMBLOB. - Foreign keys.
- Index prefix lengths.
- Compound primary keys.
To tell Rails to dump your database-specific schema using SQL statements instead of
an ActiveRecord::Schema file, you need to specify your schema format
as :sql:
config/environment.rb
Rails::Initializer.run do |config| ... config.active_record.schema_format = :sql ... end
Now you can use the Rake task db:structure:dump to create
db/development_structure.sql. This file can be used to create tables
in a developer's sandbox database (using the mysql client,
for example).
To make this process easier, I've created a couple simple Rake tasks:
lib/tasks/db.rake — Dump schema & Bootstrap tasks
namespace :z do namespace :db do desc 'Dump schema structure to db/bootstrap/YYYYMMDDhhmmss.sql' task :dump_schema => 'db:structure:dump' do dir = Rails.root.join('db') snapshot = dir.join('bootstrap', "#{Time.now.utc.strftime('%Y%m%d%H%M%S')}.sql") FileUtils.mkdir_p(File.dirname(snapshot)) FileUtils.mv(dir.join("#{Rails.env}_structure.sql"), snapshot) puts "Created snapshot: #{snapshot}" end desc 'Bootstrap a new db from db/bootstrap/YYYYMMDDhhmmss.sql' task :bootstrap => 'db:create' do snapshot = Dir.glob(Rails.root.join('db', 'bootstrap', '*.sql')).last puts "Bootstrapping snapshot: #{snapshot}" conn = ActiveRecord::Base.establish_connection.connection File.read(snapshot).split("\n\n").each { |command| conn.execute(command) } puts 'Running recent migrations...' Rake::Task['db:migrate'].invoke puts 'Done!' end end end
The first task, z:db:dump_schema, dumps the schema (using
db:structure:dump) and then copies that file under
db/bootstrap/YYYYMMDDhhmmss.sql. Check this snapshot into your
git/subversion repository periodically.
The second task, z:db:bootstrap,
looks in that same directory for the latest snapshot, creates the
database, loads the snapshot, and finally processes any recent migrations after
the snapshot time. The snapshot file includes the timestamps of the migrations
used to create it, so migrations will not run twice even if the files still exist
in db/migrate/*.rb. Also note that this will not destroy any existing
tables if the database already exists — it will just fail. You'll need to
rake db:drop it yourself if this isn't a new installation.
Incidentally, these tasks are namespaced under the z: prefix because
I prefer to see my custom tasks grouped together at the bottom of the
rake -T output.