官术网_书友最值得收藏!

Preventing encoding-related issues

Encoding-related problems are some of the most frequent problems that occur when people start with a fresh PostgreSQL setup. In PostgreSQL, every database in your instance has a specific encoding. One database might be en_US@UTF-8, while some other database might have been created as de_AT@UTF-8 (which denotes German as it is used in Austria).

To figure out which encodings your database system is using, try to run psql -l from your Unix shell. What you will get is a list of all databases in the instance that include those encodings.

So where can we actually expect trouble? Once a database has been created, many people would want to load data into the system. Let's assume that you are loading data into the aUTF-8 database. However, the data you are loading contains some ASCII characters such as ?, ?, and so on. The ASCII code for ? is 148. Binary 148 is not a valid Unicode character. In Unicode, U+00F6 is needed. Boom! Your import will fail and PostgreSQL will error out.

If you are planning to load data into a new database, ensure that the encoding or character set of the data is the same as that of the database. Otherwise, you may face ugly surprises.

To create a database using the correct locale, check out the syntax of CREATE DATABASE:

test=# \h CREATE DATABASE
Command:     CREATE DATABASE
Description: create a new database
Syntax:
CREATE DATABASE name
    [ [ WITH ] [ OWNER [=] user_name ]
           [ TEMPLATE [=] template ]
           [ ENCODING [=] encoding ]
           [ LC_COLLATE [=] lc_collate ]
           [ LC_CTYPE [=] lc_ctype ]
           [ TABLESPACE [=] tablespace_name ]
           [ CONNECTION LIMIT [=] connlimit ] ]

ENCODING and the LC* settings are used here to define the proper encoding for your new database.

主站蜘蛛池模板: 衢州市| 根河市| 贡嘎县| 界首市| 龙岩市| 静安区| 金湖县| 正定县| 剑河县| 宁津县| 江津市| 永福县| 松阳县| 朝阳区| 广元市| 富阳市| 陵川县| 兴文县| 甘孜| 正蓝旗| 台中县| 安丘市| 扎囊县| 通辽市| 抚顺市| 竹山县| 县级市| 滁州市| 资源县| 喜德县| 拉孜县| 阿勒泰市| 哈尔滨市| 灵寿县| 永登县| 哈巴河县| 上杭县| 文安县| 巨鹿县| 丰城市| 商都县|