I seem to have some problem with encoding, but I can’t pinpoint it.
PHPMyAdmin says:
Server type: MariaDB
Server version: 10.3.39-MariaDB-log - MariaDB Server
Server charset: ISO 8859-2 Central European (latin2)
Server connection collation: utf8mb4_unicode_ci
I can’t change the SQL server in any way, this is provided by my website hosting provider.
All my dbs, tables and columns use utf8mb4_unicode_ci
. All files are encoded as UTF-8
. The values display properly both in PHPMyAdmin and in MySQLWorkbench. Other scripts on my website work fine, displaying english, russian, chinese, etc. Just this one is not complying for some reason. I tried inserting the data through PHPMA, Workbench, and even from the very same script.
I connect using PDO, with charset specified, via an included file:
<?php
if (!isset($pdo))
{
$DBHOST = 'localhost';
$DBNAME = '***';
$DBUSER = '***';
$DBPASS = '***';
$DBCHRS = 'utf8mb4';
$dsn = 'mysql:host='.$DBHOST.';dbname='.$DBNAME.';charset='.$DBCHRS;
$options = [PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION, PDO::ATTR_DEFAULT_FETCH_MODE => PDO::FETCH_ASSOC, PDO::ATTR_EMULATE_PREPARES => false, PDO::MYSQL_ATTR_USE_BUFFERED_QUERY => false];
try
{
$pdo = new PDO($dsn, $DBUSER, $DBPASS, $options);
}
catch (PDOException $e)
{
echo 'Could not connect to the database!<br>Message: ', $e->getMessage(), '<br>Code: ', $e->getCode();
exit();
}
}
?>
and then
$json = [];
$json['people'] = [];
$json['relations'] = [];
$stmt = $pdo->prepare('SELECT * FROM `tree_people`;');
$stmt->execute();
while($row = $stmt->fetch(PDO::FETCH_ASSOC))
$json['people'][] = $row;
$stmt = $pdo->prepare('SELECT * FROM `tree_relations`;');
$stmt->execute();
while($row = $stmt->fetch(PDO::FETCH_ASSOC))
$json['relations'][] = $row;
/*/
header('Access-Control-Allow-Origin: *');
header('Content-Type: application/json; charset=utf-8');
/*/
header('Content-Type: text/html; charset=utf-8');
//*/
//*/
echo '<pre>';
print_r($json);
echo '</pre>';
//*/
echo '<pre>';
echo json_encode($json, JSON_UNESCAPED_UNICODE);
echo '</pre>';
JSON fails to generate (empty string?) and the content displayed with print_r
has all the special characters showing as �.
I’m going crazy, what is going on?
Edit: I can properly read data from other tables. However it is only possible when I write the data to the DB through my own editor, not through PMA. See https://herhor.net/news/?id=1 when I first inserted it via PMA, it was full of unknown characters.
However, now when I read it in PMA or Workbench, it is full of scrambled characters.
It seems that there is some mismatch between the encoding used by PMA/Workbench and the one used by all my scripts. Shouldn’t the DB deal with it automatically?
Edit2:
As requested, SHOW VARIABLES LIKE 'char%';
for Workbench:
character_set_client utf8mb4
character_set_connection utf8mb4
character_set_database utf8mb4
character_set_filesystem binary
character_set_results utf8mb4
character_set_server latin2
character_set_system utf8
character_sets_dir /usr/share/mysql/charsets/
For PMA:
character_set_client utf8mb4
character_set_connection utf8mb4
character_set_database utf8mb4
character_set_filesystem binary
character_set_results utf8mb4
character_set_server latin2
character_set_system utf8
character_sets_dir /usr/share/mysql/charsets/
For PHP PDO:
Array
(
[0] => stdClass Object
(
[Variable_name] => character_set_client
[Value] => latin2
)
[1] => stdClass Object
(
[Variable_name] => character_set_connection
[Value] => latin2
)
[2] => stdClass Object
(
[Variable_name] => character_set_database
[Value] => utf8mb4
)
[3] => stdClass Object
(
[Variable_name] => character_set_filesystem
[Value] => binary
)
[4] => stdClass Object
(
[Variable_name] => character_set_results
[Value] => latin2
)
[5] => stdClass Object
(
[Variable_name] => character_set_server
[Value] => latin2
)
[6] => stdClass Object
(
[Variable_name] => character_set_system
[Value] => utf8
)
[7] => stdClass Object
(
[Variable_name] => character_sets_dir
[Value] => /usr/share/mysql/charsets/
)
)
Also here is the SQL import/export file for both tables: https://pastebin.com/33Ap4Vje
2
Answers
It would be much better if you provided the exact string you’re having trouble with, but I’m guessing you’re dealing with Non-UTF8 data.
json_encode()
in PHP will return empty string if it gets characters (bytes) outside UTF8 range I guess. Try to re-encode your string before passing it tojson_encode()
.You can use
utf8_encode()
on your string, or in case your string is encoded with Windows-1252/ANSI you can use the following approach:Take a look at this answer, which specifically discusses this problem on PDO:
Update:
According to your update, seems that your data was scrambled in the first place (When you inserted that using PHPMYADMIN), but if it is working properly when using website editor (Which works with PDO right?), problem might be caused by
charset collation
set in your DB, One possible solution is to:.sql
filecharset collation
and make sure it isutf8_general_ci
and your data is ok(not messed up by encoding), save the file and add it as a new DB and use that one.SET NAMES UTF8
query every time you interact with your DB, like in PDO you can use something like This.This good answer might help you too.
As your edit shows, the problem is definitely with the PDO connection as it is showing
latin2
in the variouscharacter_set_*
variables.Does adding
SET NAMES utf8mb4
to your$options
array help: