Friday, January 23, 2009

Converting MySQL to UTF-8 the easy way

This is just a quick note to myself.

So you're running on the latest version of INSERT NAME OF FANCY WEB FRAMEWORK HERE? Think character set problems are a relic of the past? Not so with MySQL. The default configuration is using Latin-1. When you install MySQL, the first you should do is ensure MySQL is using UTF-8. This problem will go away at some point when distributions change the defaults, but until then.

Meanwhile, if you're like me, you might have created some tables before discovering the problem. It is, after all, difficult to see before you create the tables. So here's a recipe for converting a whole database (idea stolen from Wordpress).

First type in these commands, replacing mydb with the name of your database:

USE information_schema;
SELECT CONCAT('ALTER TABLE ', table_name, ' MODIFY ', column_name, ' ', column_type, ' CHARACTER SET utf8;') FROM columns WHERE table_schema = 'mydb' and data_type LIKE '%char%';
SELECT CONCAT('ALTER TABLE ', table_name, ' MODIFY ', column_name, ' ', column_type, ' CHARACTER SET utf8;') FROM columns WHERE table_schema = 'mydb' and data_type LIKE '%text%';
SELECT CONCAT('ALTER TABLE ', table_name, ' CHARACTER SET utf8;') FROM tables WHERE table_schema = 'mydb';

This should output the commands you need to feed into MySQL to do the change. If you start the MySQL shell with -s, it's easier to copy-paste. Then type

USE mydb;
ALTER DATABASE mydb CHARACTER SET utf8;
[... pasted commands ...]

The problem here is that the character set is stored on multiple levels. Both column, table and database level. The second line fixes the character set of the database, and the pasted in commands fixes the columns and tables.

This works for Django. If you have used a framework that allows you to put UTF-8 characters into the Latin-1 columns, you need to do something else. The Wordpress link has the details.

1 comment:

  1. This does not work on all the tables, I have tried several times.

    ReplyDelete