Backup MySQL Production Data using Capistrano and SSHKit

I wanted to automate the following tasks for rubyplus.com:

  1. Backup mysql production data
  2. Download the production data dump to laptop
  3. Import the production data dump to development database on laptop
  4. Backup the data by uploading the compressed production data to S3

First Attempt

I am using Capistrano 2.15.9 and SSHKit 1.11.1. My first attempt, which automates most of the steps, is as follows. For step 1, here is the Capistrano task:

namespace :myserverbkup do
  desc 'This is mysql backup capistrano task'
  task :backup, :roles => :db, :only => { :primary => true } do
    filename = "./cool-app.dump.#{Time.now.to_f}.sql.bz2"
    text = capture "cat #{deploy_to}/current/config/database.yml"
    yaml = YAML::load(text)

    run "mysqldump -u #{yaml['production']['username']} -p #{yaml['production']['database']} | bzip2 -c > #{filename}" do |ch, stream, out|
      ch.send_data "#{yaml['production']['password']}\n" if out =~ /^Enter password:/
    end
  end    
end

Run:

cap myserverbkup:backup

on your laptop where you deploy your Rails app from, to create the backup file named by the variable filename. I stole this Capistrano task from somewhere online. It uses the production MySQL credentials defined in database.yml to create the data dump. It does not require you to enter any password, as long as you can deploy your Rails app from the machine where you run this Capistrano task.

For step 2, I wrote a small Ruby program using SSHKit.

require 'sshkit/dsl'
require 'sshkit'

include SSHKit::DSL

SSHKit::Backend::Netssh.configure do |ssh|
  ssh.ssh_options = {
      user: 'deploy-user-name',
      auth_methods: ['publickey']
  }
end

on 'www.example.com' do
  puts 'Downloading database backup file'
  download! '/home/user/cool-app.dump.148213442.628791.sql.bz2', '.'
  puts 'Done'
end

I saved this under lib/tasks/download.rb, which can be run like this (again, you must be able to deploy from this machine, which means you have set up SSH keys):

ruby lib/tasks/download.rb

This downloads the data dump into the home directory of the Rails project.

Downloading database backup file
  INFO Downloading ./cool-app.dump.1486427812.612312.sql.bz2 10.14%
  INFO Downloading ./cool-app.dump.1486427812.612312.sql.bz2 20.27%
  INFO Downloading ./cool-app.dump.1486427812.612312.sql.bz2 30.06%
  INFO Downloading ./cool-app.dump.1486427812.612312.sql.bz2 40.2%
  INFO Downloading ./cool-app.dump.1486427812.612312.sql.bz2 50.33%
  INFO Downloading ./cool-app.dump.1486427812.612312.sql.bz2 60.12%
  INFO Downloading ./cool-app.dump.1486427812.612312.sql.bz2 70.26%
  INFO Downloading ./cool-app.dump.1486427812.612312.sql.bz2 80.05%
  INFO Downloading ./cool-app.dump.1486427812.612312.sql.bz2 90.18%
  INFO Downloading ./cool-app.dump.1486427812.612312.sql.bz2 100.0%
Done

You can see that SSHKit shows download progress without requiring you to write any code. Now, I unzip the bz2 file manually (this step could be automated by running bunzip2 from Ruby). To import the data into my local database, I run:

mysql -u root -p -h localhost your-dev-db-name < my-cool-app.sql
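The manual decompress-and-import steps could be scripted in Ruby along these lines. This is only a sketch: the dump filename and the your-dev-db-name database are placeholders, and it assumes a local root user that does not need a password (as in the second attempt below).

```ruby
# Sketch of automating the decompress-and-import step.
# Filenames and the database name below are placeholders.

# Build the shell commands as strings so they are easy to inspect (or test)
# before handing them to the shell.
def decompress_command(bz2_file)
  "bzip2 -d #{bz2_file}"
end

def import_command(sql_file, database, user: "root", host: "localhost")
  "mysql -u #{user} -h #{host} #{database} < #{sql_file}"
end

# Usage (placeholders):
#   system(decompress_command("cool-app.dump.sql.bz2")) or abort "bzip2 failed"
#   system(import_command("cool-app.dump.sql", "your-dev-db-name")) or abort "import failed"
```

Using system with an explicit check on its return value gives a clear failure point for each step, instead of silently continuing after a bad shell command.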

Second Attempt

After a good night's sleep, the next day I had the mental energy to improve the quick-and-dirty Capistrano task. I combined it with another Capistrano task that I found online to roll everything into one task:

# Run cap myserverbkup:backup to create the backup file named filename
namespace :myserverbkup do
  desc 'This is mysql backup capistrano task'
  task :backup, :roles => :db, :only => { :primary => true } do
    filename = "./prod-db.dump.#{Time.now.to_f}.sql.bz2"
    text = capture "cat #{deploy_to}/current/config/database.yml"
    yaml = YAML::load(text)

    run "mysqldump -u #{yaml['production']['username']} -p #{yaml['production']['database']} | bzip2 -c > #{filename}" do |ch, stream, out|
      ch.send_data "#{yaml['production']['password']}\n" if out =~ /^Enter password:/
    end

    puts "Fetching remote data"
    get "#{filename}", "prod-db.dump.sql.bz2"

    # unzip the file. Can't use exec() for some reason so backticks will do
    puts "Uncompressing dump"
    `bzip2 -d prod-db.dump.sql.bz2`
    cmd = "mysql -u root --database=your-dev-db-name < prod-db.dump.sql"
    puts "Importing production data : #{cmd}"
    `#{cmd}`
    puts "Deleting prod-db.dump.sql"
    `rm -f prod-db.dump.sql`

    puts "Be sure to run rake db:migrate to ensure your database schema is up to date!"
  end    
end

The only customization in this Capistrano task is that I use bzip2 for compression and bzip2 -d to uncompress the downloaded file. The only pending task is uploading the data dump to S3 before deleting it from the laptop. The Amazon S3 file upload is discussed in the article Uploading Files to S3 using AWS SDK gem.
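The pending upload step could be sketched with the aws-sdk-s3 gem along these lines. The bucket name and key prefix here are assumptions, and credentials are assumed to come from the usual AWS environment variables or ~/.aws/credentials.

```ruby
# Sketch of the pending S3 upload step, using the aws-sdk-s3 gem.
# The bucket name and key prefix are placeholders.

# Pure helper: derive the S3 object key from the local dump filename.
def backup_key(filename, prefix: "db-backups")
  "#{prefix}/#{File.basename(filename)}"
end

def upload_backup(filename, bucket: "my-cool-app-backups")
  require "aws-sdk-s3"  # loaded lazily, only when an upload is requested
  s3 = Aws::S3::Resource.new
  obj = s3.bucket(bucket).object(backup_key(filename))
  obj.upload_file(filename)  # upload_file switches to multipart for large dumps
end

# Usage (placeholder filename):
#   upload_backup("prod-db.dump.sql.bz2")
```

With the dump safely in S3, the rm -f at the end of the Capistrano task becomes safe to keep.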

