Rails 5.2: Migrating from Paperclip to ActiveStorage

Published by マギルゥーベルベット on

I recently had to solve a problem in a Rails app: migrating file uploads away from the deprecated Paperclip gem. Now that ActiveStorage ships with Rails 5.2 and needs no 3rd party libraries, we wanted to use it instead of Paperclip. There was one big problem though: the migration guide in the GitHub repo requires that the paperclip gem stays installed, and you basically need to hack on the model files *after* the migration. We wanted a migration like every other, one that doesn’t involve modifying files in production. So I looked for a way to remove the paperclip gem first and only then migrate the database and files.

First we need a migration that creates the tables ActiveStorage requires.

# This migration comes from active_storage (originally 20170806125915)
class CreateActiveStorageTables < ActiveRecord::Migration[5.2]
  def change
    create_table :active_storage_blobs do |t|
      t.string   :key,        null: false
      t.string   :filename,   null: false
      t.string   :content_type
      t.text     :metadata
      t.bigint   :byte_size,  null: false
      t.string   :checksum,   null: false
      t.datetime :created_at, null: false

      t.index [ :key ], unique: true
    end

    create_table :active_storage_attachments do |t|
      t.string     :name,     null: false
      t.references :record,   null: false, polymorphic: true, index: false
      t.references :blob,     null: false

      t.datetime :created_at, null: false

      t.index [ :record_type, :record_id, :name, :blob_id ], name: "index_active_storage_attachments_uniqueness", unique: true
    end
  end
end

Then comes a separate migration for the ActiveStorage conversion. This is the challenging part.

# rubocop:disable Rails/Output
# rubocop:disable Rails/FilePath
class ConvertToActiveStorage < ActiveRecord::Migration[5.2]
  require "find"
  require_relative "../../config/deploy/config"

  # determine paperclip file location
  if File.exist? DeployConfig::path #Rails.env.production?
    @shared_photo_path = File.join(DeployConfig::path, "shared/public/system/photos").to_s
  else
    @shared_photo_path = Rails.root.join("public/system/photos").to_s
  end

  puts "ConvertToActiveStorage: Using -> #{@shared_photo_path}"

  def up
    get_blob_id = "LAST_INSERT_ID()"

    @active_storage_blob_statement = ActiveRecord::Base.connection.raw_connection.prepare(<<-SQL)
      INSERT INTO active_storage_blobs (
        `key`, `filename`, `content_type`, `metadata`, `byte_size`, `checksum`, `created_at`
      ) VALUES (?, ?, ?, '{}', ?, ?, ?);
    SQL

    @active_storage_attachment_statement = ActiveRecord::Base.connection.raw_connection.prepare(<<-SQL)
      INSERT INTO active_storage_attachments (
        `name`, `record_type`, `record_id`, `blob_id`, `created_at`
      ) VALUES (?, ?, ?, #{get_blob_id}, ?);
    SQL

    ActiveRecord::Base.transaction do
      Product.find_each do |product|
        next if product.photo_file_name.blank?

        make_active_storage_records(product)
      end
    end

    puts "Conversion to ActiveStorage complete!"
    puts "Please run ./paperclip-to-activestorage-migration.rb to migrate your files."
    puts "When all went fine, you can remove the 'public/system' directory."
    puts "In case anything goes wrong, the original paperclip columns still exist."
    puts "Just revert this migration and try again."
  end

  def down
    blobs = ActiveRecord::Base.connection.raw_connection.prepare(<<-SQL)
      TRUNCATE TABLE active_storage_blobs
    SQL
    blobs.execute

    attachments = ActiveRecord::Base.connection.raw_connection.prepare(<<-SQL)
      TRUNCATE TABLE active_storage_attachments
    SQL
    attachments.execute
  end

  # find file path without using the original paperclip gem
  # to have a future proof migration without hacking on the model
  # files in production
  # paperclip integration was already removed from the models
  # so this ain't working -> "product.photo.path"
  def self.get_photo_path(filename)
    photo_path = nil
    Find.find(@shared_photo_path) do |path|
      if path.include?("original") && path.end_with?(filename)
        photo_path = path
      end
    end
    return photo_path
  end


  private

  def make_active_storage_records(product)
    blob_key = key(product)
    filename = product.photo_file_name
    content_type = product.photo_content_type
    file_size = product.photo_file_size
    file_checksum = checksum(product)
    created_at = DateTime.strptime(product.updated_at.iso8601)

    @active_storage_blob_statement.
      execute(blob_key, filename, content_type, file_size, file_checksum, created_at)

    # This will allow `product.photo` calls to return an asset.
    blob_name = "photo"
    record_type = "Product"
    record_id = product.id

    @active_storage_attachment_statement.
      execute(blob_name, record_type, record_id, created_at)
  end

  def key(product)
    # use filename as key
    product[:photo_file_name]
  end

  def checksum(product)
    return "" if product[:photo_file_name].nil?
    photo_path = ConvertToActiveStorage::get_photo_path product[:photo_file_name]
    return "" if photo_path.nil?
    Digest::MD5.base64digest(File.read(photo_path))
  end
end
# rubocop:enable Rails/FilePath
# rubocop:enable Rails/Output
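
The checksum helper in the migration reads every photo fully into memory with File.read before hashing it. For large files, a streaming digest may be gentler on memory. The following is only a minimal sketch using Ruby's standard Digest library; the method name streaming_checksum is made up for illustration and is not part of the migration above.

```ruby
require "digest"
require "tempfile"

# Base64-encoded MD5 of a file. Digest::MD5.file reads the file in chunks,
# so the whole photo never has to sit in memory at once.
# NOTE: `streaming_checksum` is a hypothetical helper name.
def streaming_checksum(path)
  Digest::MD5.file(path).base64digest
end

# Small demonstration with a temporary file.
Tempfile.create("photo") do |file|
  file.binmode
  file.write("hello")
  file.flush
  puts streaming_checksum(file.path)
end
```

It should produce the same value as Digest::MD5.base64digest(File.read(path)), so it could stand in for the last line of the checksum method.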

We have to locate the old Paperclip files with a recursive file search, which is very slow if you have many files. With Paperclip this would be just Model.photo.path, but since we already removed Paperclip this is no longer possible. The good part is that the migration no longer depends on Paperclip, which makes it behave like any other migration. Hacking on model files in production is just nonsense, so the time cost is acceptable.
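
If the search turns out to be too slow, one possible way to cut the cost is to walk the directory tree once and keep a file name to path index, instead of walking it again for every product. This is a sketch under assumptions, not what the migration above does: the helper name build_photo_index is hypothetical, and it assumes Paperclip's default layout, where the full-size file sits in a directory literally named "original". That match is slightly stricter than the include?/end_with? check in get_photo_path.

```ruby
require "find"
require "tmpdir"
require "fileutils"

# Walk `root` a single time and remember where each original file lives.
# NOTE: `build_photo_index` is a hypothetical helper, not part of the migration.
def build_photo_index(root)
  index = {}
  Find.find(root) do |path|
    next unless File.file?(path)
    # Only keep files whose parent directory is named "original".
    next unless File.basename(File.dirname(path)) == "original"
    # If the same file name occurs twice, the later one wins.
    index[File.basename(path)] = path
  end
  index
end

# Demonstration with a throwaway directory that mimics the Paperclip layout.
Dir.mktmpdir do |root|
  original = File.join(root, "000/000/001/original")
  thumb    = File.join(root, "000/000/001/thumb")
  FileUtils.mkdir_p([original, thumb])
  File.write(File.join(original, "a.jpg"), "full size")
  File.write(File.join(thumb, "a.jpg"), "thumbnail")

  index = build_photo_index(root)
  puts index.fetch("a.jpg")          # the path inside .../original/
  puts index.key?("missing.jpg")     # false
end
```

Each lookup is then a hash access, at the price of holding one entry per original file in memory.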

Now that the database migration is complete, the next step is to move the files themselves over to ActiveStorage. This is done by a separate script that doesn’t require Paperclip either, but it pays the same time cost for the recursive file search.

#!bin/rails runner

# NOTE: use "bundle exec rails runner -e production ./paperclip-to-activestorage-migration.rb" in production

# rubocop:disable Rails/ApplicationRecord

require "find"

class ActiveStorageBlob < ActiveRecord::Base
end

class ActiveStorageAttachment < ActiveRecord::Base
  belongs_to :blob, class_name: "ActiveStorageBlob"
  belongs_to :record, polymorphic: true
end

# import migration for the #get_photo_path method
require_relative "db/migrate/20180801152358_convert_to_active_storage"

count = ActiveStorageAttachment.count
puts "#{count} files need to be queried and copied from 'public/system'..."
puts "This may take a long time depending on disk I/O speed."

ActiveStorageAttachment.find_each do |attachment|
  name = attachment[:name]
  product = attachment.record.send(name).record

  source = ConvertToActiveStorage::get_photo_path(product.photo_file_name)
  key = attachment.blob&.key
  # Skip records whose original file or blob could not be found.
  next if source.nil? || key.nil?

  # Files go to storage/<first two chars of key>/<next two chars>/<key>.
  dest_dir = File.join("storage", key.first(2), key.first(4).last(2))
  dest = File.join(dest_dir, key)

  FileUtils.mkdir_p(dest_dir)
  # Synced console output slows down the operation heavily!
  # puts "Copying #{source} to #{dest}"
  FileUtils.cp(source, dest)
end

puts "All files copied. Don't remove 'public/system' until you have tested that everything still works!"

# rubocop:enable Rails/ApplicationRecord
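
The destination path in the script is built from the blob key: the first two characters form one directory level, the next two the second, and the key itself is the file name. Below is a small plain-Ruby sketch of that mapping. The method name disk_path_for and the root: parameter are hypothetical, and the layout is taken from the script above rather than from the ActiveStorage source.

```ruby
# Build the on-disk location for a blob key:
# <root>/<first two chars of key>/<next two chars>/<key>
# NOTE: `disk_path_for` and `root:` are illustrative names only.
def disk_path_for(key, root: "storage")
  File.join(root, key[0, 2], key[2, 2], key)
end

puts disk_path_for("sunset.jpg") # storage/su/ns/sunset.jpg
```

Because the migration uses the file name as key, a photo called sunset.jpg ends up at storage/su/ns/sunset.jpg.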

That’s it: a flexible migration from Paperclip to ActiveStorage without using the paperclip gem at all. I can’t promise that it works for your database and files too. It was written for a single model across the entire application, and I can’t promise it will work for multiple models. The controllers and views were changed exactly as described in the official Paperclip migration guide, and everything is working well. Rails finds all the files without problems, and file uploads now work better than ever. With Paperclip we had many problems where files, at random, were not picked up.
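
One case worth checking before running this on your own data: the migration uses the file name as the blob key, and active_storage_blobs has a unique index on key. If two records share a file name, the second insert would presumably fail, and the copy script would write both files to the same destination. A quick pre-flight check could look like the sketch below. The helper duplicate_file_names is hypothetical; in the app the list could come from something like Product.pluck(:photo_file_name).

```ruby
# Return the file names that occur more than once, with their counts.
# NOTE: `duplicate_file_names` is a hypothetical helper for a pre-flight check.
def duplicate_file_names(file_names)
  counts = Hash.new(0)
  file_names.each do |name|
    next if name.nil? || name.empty?
    counts[name] += 1
  end
  counts.select { |_name, count| count > 1 }
end

# Sample data standing in for the photo_file_name column.
names = ["cat.jpg", "dog.jpg", "cat.jpg", nil]
p duplicate_file_names(names)
```

An empty result means the file names are safe to use as keys as they are.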