If you ever need to restore exact records from one database to another, Marshal
might come in handy.
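Marshal.dump is part of the Ruby core and is available in every Ruby version without installing anything. It serializes complete Ruby objects with all of their internal state; for a record, that includes its attributes such as id. The object_id is not preserved: loading the dump creates a new object.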
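To make the round trip concrete, here is a minimal sketch in plain Ruby. The Point struct is a hypothetical stand-in for a record and is not part of the use case described here.

```ruby
# Hypothetical stand-in for a record (assumption, not from the original post).
Point = Struct.new(:id, :name)

original = Point.new(42, 'home')
dumped   = Marshal.dump(original)   # binary String holding the object's state
restored = Marshal.load(dumped)     # a new object built from that state

puts restored == original                      # same state
puts restored.equal?(original)                 # not the same object
puts restored.object_id == original.object_id  # hence a different object_id
```

The restored struct compares equal to the original because its members match, yet it is a separate object in memory.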
Marshal.load deserializes a string back into an object. A deserialized record cannot be saved to the database directly: the dumped object was not marked dirty, so Rails sees no need to save it, even if the record is not present in the database. This is why I create a new object from the attributes of the deserialized one.
The marshalled string is Base64-encoded. Marshal produces binary data, and without this step you might run into a mismatch between the encoding of the dumped string and the file encoding.
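The following sketch illustrates why the Base64 step helps. The sample array is an arbitrary assumption chosen for illustration.

```ruby
require 'base64'

raw     = Marshal.dump([1, 'two', :three])
encoded = Base64.encode64(raw)

puts raw.encoding == Encoding::BINARY  # Marshal output is a binary string
puts encoded.ascii_only?               # the Base64 text is plain ASCII

# Decoding restores the exact bytes, so Marshal.load sees the original dump.
decoded = Base64.decode64(encoded)
puts Marshal.load(decoded) == [1, 'two', :three]
```

An alternative that is not used here would be to skip Base64 and read and write the file in binary mode, for example with the 'wb' mode and File.binread.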
Below is a use case where I had to export records from a local database, transfer them to a server, and import them into the server database.
1. export
require 'base64'

# Marshal the records and write them Base64-encoded to marshalled.dat.
def dump_to_file(list)
  content = Base64.encode64(Marshal.dump(list.to_a))
  file_path = File.expand_path('marshalled.dat')
  File.open(file_path, 'w') do |file|
    file.puts content
  end
end
# usage:
dump_to_file Site.where(user: ...)
2. transfer
scp marshalled.dat myuser@myserver:/tmp/
3. import
require 'base64'

# Read the Base64 file, rebuild the objects and re-create them in the database.
def import_from_file
  string = File.read("/tmp/marshalled.dat")
  list = Marshal.load(Base64.decode64(string))
  puts "#{list.length} objects about to import"
  restore_list(list)
end

def restore_list(unmarshalled)
  unmarshalled.map { |obj| restore(obj) }
end

# An unmarshalled object is not saved to the DB as it's not marked dirty.
# Also `.dup` does not help as this will clear the ID.
def restore(obj)
  obj.class.create!(obj.attributes)
end
# usage
import_from_file
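Before transferring a dump, it can be worth checking that it reads back correctly. The following is a rough sketch of such a dry run, using a temporary file and plain hashes as hypothetical sample data instead of real records.

```ruby
require 'base64'
require 'tempfile'

# Hypothetical sample data standing in for the exported records.
records = [{ id: 1, name: 'first' }, { id: 2, name: 'second' }]

restored = Tempfile.create('marshalled') do |file|
  # Same encoding steps as in the export.
  file.puts Base64.encode64(Marshal.dump(records))
  file.flush

  # Same decoding steps as in the import.
  Marshal.load(Base64.decode64(File.read(file.path)))
end

puts "#{restored.length} objects read back"
puts restored == records
```

This only verifies the file round trip; it does not touch the database or exercise the restore step.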
Note
When importing objects including their IDs, the database's primary key sequence is not advanced and might be incorrect afterwards. In this case you should reset the sequence (e.g. with ActiveRecord::Base.connection.reset_pk_sequence!('sites') on PostgreSQL).