Main > Diary > Development

« Beauty Out of Muck | Main | Ira the Chihuahua »

May 13, 2006

Recollating MySQL, How I Broke the latin_swedish_ci Blues...

I got mad. When I get mad sometimes I write ruby. Tonight I wrote ruby.

A couple of weeks ago the server that hosts this site melted down. I replaced the box and migrated all of the data into a freshly compiled copy of MySQL. Everything seemed to work peachy until I noticed that all of my UTF-8 characters were coming out as questions marks. It's very annoying when you want to say ???????? and you get ????????. ????!

What was worse was seeing exacly how much work it was going to take to change the collation in all of the columns in all of the tables within all of the databases on my system. Bleh.

I wrote a script in Ruby to change all of the collations en masse for a particular database. This is useful for me because I really don't use any other collation other than utf8_bin, save for the odd column set to utf8_general_ci.

If you're interested in such progmatic solution, read on...

The below 100 lines took me about an hour to write, so be careful if you're running it on a production system. Take care to know if you need any other collations other than the TARGET_COLLATION defined in the top of the class.

#!/usr/bin/ruby18

require 'mysql'

class RecollateMysqlDatabase
  HOST = "localhost"
  USERNAME = "username"
  PASSWORD = "password"
  DATABASE = "mt_ablog"

  TARGET_CHARSET = "utf8"
  TARGET_COLLATION = "utf8_bin"

  def initialize
    @dbh = nil

    begin
      @dbh = Mysql.real_connect(HOST, USERNAME, PASSWORD, DATABASE)
      puts "Connected to MySQL Server Version" +
        @dbh.get_server_info
    rescue MysqlError => e
      mysql_exception_handler(e)
    end
  end

  def mysql_exception_handler(e)
      print "Error code: ", e.errno, "\n"
      print "Error message: ", e.error, "\n"

    begin @dbh.close rescue nil end

    exit 255
  end

  def recollate_tables(preview = true)

    puts "Fetching tables..." if preview

    # get a list of tables
    tables = @dbh.list_tables
    # A place to store our operations
    table_column_hash = Hash.new
    alter_table_query_list = [ ]

    # for each table get a list of columns:
    tables.each do |table|
      puts "Have table: #{table}" if preview
      res = @dbh.query("SHOW FULL COLUMNS FROM `#{table}`")
      while row_hash = res.fetch_hash do
          if row_hash["Collation"] != "NULL" and
            row_hash["Collation"] != TARGET_COLLATION
              column_hash = { "Field" => row_hash["Field"],
                              "Type" => row_hash["Type"] }
              table_column_hash[table] = [ ] if table_column_hash[table].nil?
              table_column_hash[table].push(column_hash)
          end
      end
      table_column_hash.each_key do |table|
        puts "\nIn #{table}, these columns need re-collating:\n" if preview
        table_column_hash[table].inject(false) do |comma,column_hash|
          printf "#{((comma) ? ", " : "\t")}" + column_hash["Field"] if preview
          comma = true
          alter_table_query_list.push <<-EOQ
            ALTER TABLE `#{table}` CHANGE `#{column_hash["Field"]}`
              `#{column_hash["Field"]}` #{column_hash["Type"]}
              CHARACTER SET #{TARGET_CHARSET}
              COLLATE `#{TARGET_COLLATION}`
            EOQ
        end
        printf ".\n" if preview
      end
      printf "\n" if preview
    end

    puts "-= PREVIEW OF SQL, NO CHANGES HAVE BEEN MADE =-" if preview
    alter_table_query_list.inject(0) do |i,query|
      puts query if preview
      if not preview
        printf "Executing query ##{i}...\r"
        STDOUT.flush
        begin
          @dbh.query(query)
        rescue MysqlError => e
          mysql_exception_handler(e)
        end
      end
      i += 1
    end
    printf "\n"
    puts "-= PREVIEW OF SQL, NO CHANGES HAVE BEEN MADE =-" if preview
  end
end

rmd = RecollateMysqlDatabase.new
rmd.recollate_tables(preview = true)

printf "Okay to proceed? (YES) -> "
response = STDIN.gets.chomp
if response != "YES"
  puts 'I need the answer "YES" in all caps to proceed! Exiting...'
  exit 1
end

rmd.recollate_tables(preview = false)
puts "Done!"
exit 0

Instructions: modify the top of the class connect with the username and password for the host and database you are interested in recollating. Run the script and if you are satisfied that the ALTER TABLE queries look sane, answer YES to let the script do the work.

Good luck!

Posted by jordanh at May 13, 2006 1:32 AM |

Trackback Pings

TrackBack URL for this entry:
http://jordan.husney.com/mt/mt-tb.cgi/261.

Comments

Post a comment




Remember Me?