
When importing a CSV file that contains strings with special characters into a varchar column of a table in a PostgreSQL database, I notice that the registered trademark symbols (®) and em dashes (—) both get stored as �. The � is also what gets exported from the database.

How can I get the database to recognize/accept/store the ® and — symbols?

Thanks in advance for your help!

I imported the CSV using the Import Data wizard in DBeaver. The data was imported "successfully", but the ® and — symbols were stored as � symbols. I expected the special characters to be accepted in a varchar column.

2 Answers


  1. Storing such characters is no problem, as long as the database encoding can encode these characters (UTF8 is almost always the correct choice).

    All you have to do is set client_encoding to the encoding of the CSV file when you import the data. DBeaver doesn't seem to allow you to select the encoding of the CSV file, so you should use a different tool. If you use COPY (or psql's \copy), you can add the ENCODING option to specify the encoding of the input file.

  2. https://www.postgresql.org/docs/current/multibyte.html

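    PostgreSQL supports various character sets, including UTF-8, which should cover all bases. The default encoding for databases in a cluster is set during initdb and, if not given explicitly, depends on your system's locale. An individual database can also be given its own encoding when it is created with CREATE DATABASE (ENCODING option, typically together with TEMPLATE template0). To set the default at initdb, the command would be: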

    initdb -E UTF8
    

    The page documents no way to change the encoding of an existing database afterwards or on a per-table basis, so you probably need to recreate the database and migrate your data (using pg_dump, for example).
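Both answers come down to a mismatch between the bytes in the CSV file and the encoding the importer assumes. The following is a minimal Python sketch of that mismatch, not a statement about what DBeaver does internally. It assumes the CSV was saved as Windows-1252 (`cp1252`), which the question does not confirm, and the helper name `transcode_to_utf8` is hypothetical. It shows how ® and the em dash turn into � when such bytes are read as UTF-8, and how the file could be rewritten as UTF-8 before import.

```python
from pathlib import Path

REGISTERED = "\u00ae"  # registered trademark sign
EM_DASH = "\u2014"     # em dash


def transcode_to_utf8(src, dst, source_encoding="cp1252"):
    """Rewrite src as UTF-8 in dst.

    source_encoding is an assumption about how the CSV was saved;
    adjust it to the file's real encoding.
    """
    # Work on raw bytes so line endings are preserved exactly.
    text = Path(src).read_bytes().decode(source_encoding)
    Path(dst).write_bytes(text.encode("utf-8"))


# Reproduce the symptom: cp1252 bytes misread as UTF-8.
sample = f"Acme{REGISTERED} Widget {EM_DASH} deluxe"
raw = sample.encode("cp1252")
garbled = raw.decode("utf-8", errors="replace")
print(garbled)  # both special characters appear as U+FFFD
```

If the assumption about the source encoding holds, a file rewritten this way can be imported with the client encoding set to UTF8. Alternatively, the original file can be left untouched and its real encoding passed through the ENCODING option of COPY, as the first answer suggests.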
