When importing a CSV file that contains strings with special characters into a varchar column in a PostgreSQL table, I notice that registered trademark symbols (®) and em dashes (—) both get stored as �. The � is also what gets exported from the database.
How can I get the database to recognize/accept/store the ® and — symbols?
Thanks in advance for your help!
I imported the CSV using the Import Data wizard in DBeaver. The data was imported "successfully", but the ® and — symbols were stored as � symbols. I expected the special characters to be accepted in a varchar column.
2 Answers
Storing such characters is no problem, as long as the database encoding can encode these characters (UTF8 is almost always the correct choice).

All you have to do is set client_encoding to the encoding of the CSV file when you import the data. DBeaver doesn’t seem to allow you to select the encoding of the CSV file, so you should use a different tool. If you use COPY (or psql's \copy), you can add the ENCODING option to specify the encoding of the input file.

https://www.postgresql.org/docs/current/multibyte.html
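To make the encoding mismatch concrete, here is a minimal Python sketch of how � can appear and what a COPY command with the ENCODING option might look like. It assumes the CSV was saved as Windows-1252, which the question does not state. The sample string, the `products` table, and the `copy_statement` helper are all made up for illustration.

```python
# Assumption: the CSV was saved as Windows-1252; the question does not say.
sample = "Acme® Widgets — Deluxe"
raw = sample.encode("cp1252")  # ® becomes byte 0xAE, — becomes byte 0x97

# Reading those bytes as UTF-8 fails for both symbols; with errors="replace"
# each undecodable byte turns into U+FFFD, which is displayed as �.
wrong = raw.decode("utf-8", errors="replace")

# Reading them with the file's actual encoding restores the text.
right = raw.decode("cp1252")


def copy_statement(table: str, encoding: str) -> str:
    """Build a COPY command that tells the server how the input is encoded.

    Hypothetical helper; `table` and `encoding` are interpolated as-is, so
    only pass trusted values.
    """
    return (
        f"COPY {table} FROM STDIN "
        f"WITH (FORMAT csv, HEADER true, ENCODING '{encoding}')"
    )


print(wrong.count("\ufffd"))  # number of symbols that were replaced
print(right == sample)        # True when the right encoding is used
# WIN1252 is PostgreSQL's name for the Windows-1252 encoding.
print(copy_statement("products", "WIN1252"))
```

The generated statement could be run through a client that supports COPY from standard input. With psql, the same WITH (...) options can be given to \copy along with a file name.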
PostgreSQL supports various character sets, including UTF-8, which should cover all bases. The encoding of your database is unfortunately set during initdb, and the default depends on your system. The command would be:

The page documents no way to change it afterwards or on a per-table basis, so you probably need to recreate the database and migrate your data (using pg_dump, for example).
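The recreate-and-migrate route could look roughly like the sketch below, which only builds the command lines and does not run them. It assumes the new database is created with createdb and given a UTF8 encoding from template0, which only works if the cluster's locale is compatible with UTF8. The database names, the dump file name, and the `migration_commands` helper are hypothetical.

```python
import shlex


def migration_commands(old_db: str, new_db: str, dump_file: str = "dump.sql") -> list[str]:
    """Return, in order, the shell commands for moving old_db into a new UTF8 database.

    Hypothetical helper: it only formats the commands so they can be reviewed
    before anything is executed.
    """
    steps = [
        # 1. Dump the existing database to a plain SQL file.
        ["pg_dump", "--file", dump_file, old_db],
        # 2. Create the new database with UTF8 encoding, copying template0.
        ["createdb", "--encoding", "UTF8", "--template", "template0", new_db],
        # 3. Load the dump into the new database.
        ["psql", "--dbname", new_db, "--file", dump_file],
    ]
    return [shlex.join(step) for step in steps]


for command in migration_commands("olddb", "newdb"):
    print(command)
```

Running SHOW server_encoding; in the existing database first shows whether it is already UTF8, in which case no migration is needed. Characters that were already stored as � are not restored by a migration and have to be imported again from the original file.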