How to Use Rake Tasks to Generate Raw SQL From ActiveRecord Migrations

Recently at the NYT on the product I’m working on, we came across the problem of having to convert ActiveRecord migrations into raw SQL in order to be processed on our production database. Our infrastructure/systems team needs SQL in order to generate migrations; since not all teams at the NYT use Rails, ActiveRecord’s migrations cannot be automatically applied to the production database.

So the senior developer I work closely with wrote a rake task that parsed the SQL generated by ActiveRecord migration tasks into a .sql file, which our systems team could then use to generate the necessary database migrations. Although I had no role in actually writing the code, I thought that the rake task touched on a few interesting concepts in Ruby, so I felt compelled to post it and annotate it a little bit.

Here’s the code in its entirety. Its saved in a .rake file in the tasks folder in the lib directory.

namespace :db do
  [ :migrate, :rollback ].each do |n|
    namespace n do |migration_task|

      original_task = migration_task.instance_variable_get("@scope").join ":"

      [:with_sql, :to_sql ].each do |t|

        desc "Run migration, and generated SQL" if t == :with_sql
        desc "Generate migration SQL" if t == :to_sql
        task t => :environment do |sql_task|

          case original_task
          when "db:migrate"
            filename = 'upgrade.sql'
          when "db:rollback"
            filename = 'rollback.sql'
          else
            raise "unkown migration type #{original_task}"
          end

          ActiveRecord::Base.connection.class.class_eval do
            # alias the adapter's execute for later use
            alias :old_execute :execute

            SQL_FILENAME = filename
            RUN_SQL = sql_task.name.ends_with?("with_sql")

            # define our own execute
            def execute(sql, name = nil)
              # check for some DDL and DML statements
              if /^(create|alter|drop|insert|delete|update)/i.match sql
                File.open(SQL_FILENAME, 'a') { |f| f.puts "#{sql};\n" }
                old_execute sql, name if RUN_SQL
              else
                # pass everything else to the aliased execute
                old_execute sql, name
              end
            end

          end

          # create or delete content of migration.sql
          File.open(SQL_FILENAME, 'w') { |f| f.puts "-- Script created @ #{Time.now}" }

          # invoke the normal migration procedure now
          Rake::Task[original_task].invoke

          puts "Ran #{original_task} and wrote sql to #{filename}"
        end
      end
    end
  end

end

At a high level, this code does the following: * Open up the db namespace and add a sub rake task atop the db:migrate and db:rollback commands * Hijacks the execute method and writes the SQL otherwise executed by Rails to a file. * Permit users to specify whether they want Rails to actually execute the SQL code or just to write it to a file for later use

namespace :db do
  [ :migrate, :rollback ].each do |n|
    namespace n do |migration_task|

      original_task = migration_task.instance_variable_get("@scope").join ":"

This code opens up the :db namespace and executes the same block of code for the :migrate and :rollback sub name spaces. The :with_sql and :to_sql commands get appended to both namespaces, for a total of 4 new rake commands. The last line here just reverse engineers the original command; the original_task local variable returns ‘db:migrate’ or ‘db:rollback’.

[:with_sql, :to_sql ].each do |t|

desc "Run migration, and generated SQL" if t == :with_sql
desc "Generate migration SQL" if t == :to_sql
task t => :environment do |sql_task|

  case original_task
  when "db:migrate"
    filename = 'upgrade.sql'
  when "db:rollback"
    filename = 'rollback.sql'
  else
    raise "unkown migration type #{original_task}"
  end

This sets up the creation of a file based on the type of migration task that is being considered. ‘upgrade.sql’ will execute a migration while ‘rollback.sql’ will reverse one. The idea is that adding these .sql files will make it as easy as possible for a database architect to execute a migration on a staging or production server, rather than deciphering the Rails migration herself. I generally take for granted the fact that I work in Rails; now that I’m no longer at Flatiron, I need to remember that not all databases uses the Rails ORM to connect to an application, but SQL remains a common language that most databases understand. Although ActiveRecord seems to abstract away so much of the underlying database connections, at the end of the day all its doing is executing SQL, which you can write directly into your Ruby code. In fact, a common complaint I’ve heard about Rails is that for more complex queries, you can’t rely just on ActiveRecord to generate efficient queries and instead have to fall back on executing raw SQL.

ActiveRecord::Base.connection.class.class_eval do
  # alias the adapter's execute for later use
  alias :old_execute :execute

  SQL_FILENAME = filename
  RUN_SQL = sql_task.name.ends_with?("with_sql")

  # define our own execute
  def execute(sql, name = nil)
    # check for some DDL and DML statements
    if /^(create|alter|drop|insert|delete|update)/i.match sql
      File.open(SQL_FILENAME, 'a') { |f| f.puts "#{sql};\n" }
      old_execute sql, name if RUN_SQL
    else
      # pass everything else to the aliased execute
      old_execute sql, name
    end
  end

end

This is the core of the rake task, and it includes some nifty metaprogramming components, including class_eval and the alias method. The code overrides ActiveRecord’s default #execute method. When the interpreter hits the class_eval method, it executes the following code as if that code was included in the class. In this case, execute is an instance method, since class_eval is a method in the Module class and so the code in the enclosed methods is executed in the context of that class (e.g., as an instance method). Confused? Don’t worry, I’m still trying to fully grasp how some of these metaprogramming methods work.

Anyways, the use of class_eval enables the rake task to extend the functionality of the standard #execute method and have it write out to a text file. The default method takes in raw sql as an argument and executes it on the connection’s class. By writing our own method, this allows us to insert logic before the execution occurs. The alias stores the Rails execute method as :old_execute, which is called only if the rake task includes the :with_sql suffix. If the sql that Rails would otherwise execute includes a migration task (e.g., CREATE, ALTER, DROP, INSERT, DELETE or UPDATE), then this sql is written to a file saved to the root Rails directory.

If you’re interested in the concepts covered in this code, here are a few links that are helpful:

Class Eval vs. Instance Eval

More about class_eval

Ruby’s alias and alias_method

Comments