What happens if I send UTF-8 data to a MySQL database with a character set of latin1?

Kropotkin
May 22, 2023
214 views
0 votes
2 Answers

Please explain this situation: I have a MySQL database whose connection is set to latin1, and whose tables and columns also use the latin1 character set. I send it UTF-8 encoded data (e.g. from a web form which encodes as UTF-8 via PHP). Later I retrieve that data and display it on a web page set to use UTF-8 encoding.

Will I see what I put in? I know I will for ASCII, but how about for e.g. ü – a German umlaut? And for a more exotic non-latin1 character?

In fact, my simple test of sending UTF-8 encoded Japanese characters to the database from WordPress and then viewing them on a web page suggests that there are no problems. I suspect that there is no conversion and that the database just stores the bytes it gets. But if this is the case, what is the significance of setting a character set? It is not a collation, which (I think) has to do with sorting.

Thank you

Tags: mysql

Answers

- GuillaumeF
- May 17, 2023 at 11:21 pm
- 0 votes
The database will attempt to convert your latin1 connection data to its internal encoding (typically utf8mb4, which is determined during installation). This process is likely to corrupt your string.

Even if the conversion were successful, searching or ordering this column would not be possible. If no alternatives are available, it would be preferable to store your UTF-8 string as a varbinary type.


- RickJames
- May 22, 2023 at 8:44 pm
- 0 votes
See Trouble with UTF-8 characters; what I see is not what I stored

latin1 can handle umlaut-u and other Western European characters. But you must tell MySQL that the client is talking latin1, and declare the columns to be utf8mb4. (Or whatever combination you have.)
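To see why the client setting matters, here is a minimal sketch at the byte level. It uses Python's codecs as a stand-in for the connection setting (an assumption for illustration; it does not talk to MySQL at all) and shows that "ü" is a different byte sequence in each encoding, and what a receiver sees if UTF-8 bytes are labelled as latin1.

```python
# Illustration only: Python's codecs stand in for the client/connection setting.
text = "ü"

utf8_bytes = text.encode("utf-8")      # what a UTF-8 web form sends
latin1_bytes = text.encode("latin-1")  # what a latin1 client would send

print(utf8_bytes.hex())    # c3bc (two bytes)
print(latin1_bytes.hex())  # fc (one byte)

# If the receiver is told "these bytes are latin1" while they are really
# UTF-8, each byte is read as a separate character.
misread = utf8_bytes.decode("latin-1")
print([hex(ord(ch)) for ch in misread])  # ['0xc3', '0xbc']
print(len(misread))                      # 2
```

So the one-character "ü" becomes two latin1 characters if the declared client encoding does not match the bytes actually sent.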

Latin1 cannot handle any Asian character set.
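As a rough sketch of why the Japanese test in the question can still appear to work, the snippet below again uses Python's latin-1 codec as a stand-in. That is an assumption for illustration: MySQL's latin1 is based on cp1252 and is not byte-for-byte the same codec. It shows that the text cannot genuinely be converted to latin1, yet the raw UTF-8 bytes survive a round trip when both the write and the read mislabel them in the same way.

```python
text = "日本語"

# latin1 has no code points for these characters, so a real conversion fails.
try:
    text.encode("latin-1")
    convertible = True
except UnicodeEncodeError:
    convertible = False
print(convertible)  # False

# With no real conversion, the UTF-8 bytes are kept as-is but labelled latin1.
stored = text.encode("utf-8").decode("latin-1")
print(len(text), len(stored))  # 3 9

# Reading back through the same mislabelled path returns the original text.
round_trip = stored.encode("latin-1").decode("utf-8")
print(round_trip == text)  # True
```

In this sketch the stored value is 9 latin1 characters rather than 3 Japanese ones, so anything that works on characters (length, comparison, sorting) would be operating on the wrong units even though the page displays correctly.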

