In this scenario, the key is to understand that the data base contains UNICODE characters and also contains non-ascii characters, say LATIN1. This mix is a problem.
Best solution I found on Linux:
Precondition: Perform a dump
pg_dump -h localhost MyASCIIDataBase > /tmp/db.sql
1- Gnu iconv
Let iconv do the magic:
iconv -f iso-8859-1 -t utf-8 /tmp/db.sql -o /tmp/iconv.sql
Create an empty Unicode(UTF-8) database and run:
psql -h localhost MyIconvUTFDataBase < iconv.sql
Note: This solution will work for non-ascii characters, such as latin1 or windows charsets, but you will have trouble with UTF-8 characters.
2- Gnu recode (Under Ubuntu, run apt-get install recode)
Lets make recode work on the dump file:
recode -f UTF-8 /tmp/db.sql
Next, create another empty Unicode(UTF-8) database and run:
psql -h localhost MyUTFDataBase < db.sql
Note: This solution will work for all UTF-8 characters, but non-ascii characters will be deleted.
Finally, you can combine these two solutions to get what you want, one database with all UTF-8 characters.
The way I did it was to dump each single table from MyIconvUTFDataBase that was updated from windows (with latin1), and then restore each single table on MyUTFDataBase (restored from recode):
Steps to do it for a single table:
1- Remove table form MyUTFDataBase
2- Dump the table from the iconv data base:
pg_dump -h localhost -t '"schema"."tableName"' MyIconvUTFDataBase -f /tmp/single_table.sql
3- Restore the table to the "recode" data base (MyUTFDataBase):
psql -f single_table.sql MyUTFDataBase
Hope this post might be helpful for you.
psql -f single_table.sql MyUTFDataBase
Hope this post might be helpful for you.
No comments:
Post a Comment