skip to Main Content

I seem to have some problem with encoding, but I can’t pinpoint it.

PHPMyAdmin says:

Server type: MariaDB
Server version: 10.3.39-MariaDB-log - MariaDB Server
Server charset: ISO 8859-2 Central European (latin2)
Server connection collation: utf8mb4_unicode_ci

I can’t change the SQL server in any way, this is provided by my website hosting provider.

All my dbs, tables and columns use utf8mb4_unicode_ci. All files are encoded as UTF-8. The values display properly both in PHPMyAdmin and in MySQLWorkbench. Other scripts on my website work fine, displaying english, russian, chinese, etc. Just this one is not complying for some reason. I tried inserting the data through PHPMA, Workbench, and even from the very same script.

I connect using PDO, with charset specified, via an included file:

<?php
if (!isset($pdo))
{
    $DBHOST = 'localhost';
    $DBNAME = '***';
    $DBUSER = '***';
    $DBPASS = '***';
    $DBCHRS = 'utf8mb4';

    $dsn = 'mysql:host='.$DBHOST.';dbname='.$DBNAME.';charset='.$DBCHRS;
    $options = [PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION, PDO::ATTR_DEFAULT_FETCH_MODE => PDO::FETCH_ASSOC, PDO::ATTR_EMULATE_PREPARES => false, PDO::MYSQL_ATTR_USE_BUFFERED_QUERY => false];
    try
    {
        $pdo = new PDO($dsn, $DBUSER, $DBPASS, $options);
    }
    catch (PDOException $e)
    {
        echo 'Could not connect to the database!<br>Message: ', $e->getMessage(), '<br>Code: ', $e->getCode();
        exit();
    }
}
?>

and then

$json = [];
$json['people'] = [];
$json['relations'] = [];

$stmt = $pdo->prepare('SELECT * FROM `tree_people`;');
$stmt->execute();
while($row = $stmt->fetch(PDO::FETCH_ASSOC))
    $json['people'][] = $row;

$stmt = $pdo->prepare('SELECT * FROM `tree_relations`;');
$stmt->execute();
while($row = $stmt->fetch(PDO::FETCH_ASSOC))
    $json['relations'][] = $row;

/*/
header('Access-Control-Allow-Origin: *');
header('Content-Type: application/json; charset=utf-8');
/*/
header('Content-Type: text/html; charset=utf-8');
//*/

//*/
echo '<pre>';
print_r($json);
echo '</pre>';
//*/

echo '<pre>';
echo json_encode($json, JSON_UNESCAPED_UNICODE);
echo '</pre>';

JSON fails to generate (empty string?) and the content displayed with print_r has all the special characters showing as �.

I’m going crazy, what is going on?

Edit: I can properly read data from other tables. However it is only possible when I write the data to the DB through my own editor, not through PMA. See https://herhor.net/news/?id=1 when I first inserted it via PMA, it was full of unknown characters.
However, now when I read it in PMA or Workbench, it is full of scrambled characters.

It seems that there is some mismatch between the encoding used by PMA/Workbench and the one used by all my scripts. Shouldn’t the DB deal with it automatically?

Edit2:
As requested, SHOW VARIABLES LIKE 'char%'; for Workbench:

character_set_client    utf8mb4
character_set_connection    utf8mb4
character_set_database  utf8mb4
character_set_filesystem    binary
character_set_results   utf8mb4
character_set_server    latin2
character_set_system    utf8
character_sets_dir  /usr/share/mysql/charsets/

For PMA:

character_set_client    utf8mb4
character_set_connection    utf8mb4
character_set_database  utf8mb4
character_set_filesystem    binary
character_set_results   utf8mb4
character_set_server    latin2
character_set_system    utf8
character_sets_dir  /usr/share/mysql/charsets/

For PHP PDO:

Array
(
    [0] => stdClass Object
        (
            [Variable_name] => character_set_client
            [Value] => latin2
        )

    [1] => stdClass Object
        (
            [Variable_name] => character_set_connection
            [Value] => latin2
        )

    [2] => stdClass Object
        (
            [Variable_name] => character_set_database
            [Value] => utf8mb4
        )

    [3] => stdClass Object
        (
            [Variable_name] => character_set_filesystem
            [Value] => binary
        )

    [4] => stdClass Object
        (
            [Variable_name] => character_set_results
            [Value] => latin2
        )

    [5] => stdClass Object
        (
            [Variable_name] => character_set_server
            [Value] => latin2
        )

    [6] => stdClass Object
        (
            [Variable_name] => character_set_system
            [Value] => utf8
        )

    [7] => stdClass Object
        (
            [Variable_name] => character_sets_dir
            [Value] => /usr/share/mysql/charsets/
        )

)

Also here is the SQL import/export file for both tables: https://pastebin.com/33Ap4Vje

2

Answers


  1. It would be much better if you provided the exact string you’re having trouble with, but I’m guessing you’re dealing with Non-UTF8 data. json_encode() in PHP will return empty string if it gets characters (bytes) outside UTF8 range I guess. Try to re-encode your string before passing it to json_encode().

    You can use utf8_encode() on your string, or in case your string is encoded with Windows-1252/ANSI you can use the following approach:

    $str = mb_convert_encoding($str, "UTF-8", "Windows-1252");
    

    Take a look at this answer, which specifically discusses this problem on PDO:

    while($row = $stmt->fetch(PDO::FETCH_ASSOC))
    {
      foreach($row as &$value)
      {
        $value = mb_convert_encoding($value, "UTF-8", "Windows-1252");
      }
      unset($value);
      $json['relations'][] = $row;
    }
    

    Update:
    According to your update, seems that your data was scrambled in the first place (When you inserted that using PHPMYADMIN), but if it is working properly when using website editor (Which works with PDO right?), problem might be caused by charset collation set in your DB, One possible solution is to:

    • Export your DB as a .sql file
    • Read the file till you find charset collation and make sure it is utf8_general_ci and your data is ok(not messed up by encoding), save the file and add it as a new DB and use that one.
    • Make sure to use SET NAMES UTF8 query every time you interact with your DB, like in PDO you can use something like This.

    This good answer might help you too.

    Login or Signup to reply.
  2. As your edit shows, the problem is definitely with the PDO connection as it is showing latin2 in the various character_set_* variables.

    Does adding SET NAMES utf8mb4 to your $options array help:

    $options = [
        PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
        PDO::ATTR_DEFAULT_FETCH_MODE => PDO::FETCH_ASSOC,
        PDO::ATTR_EMULATE_PREPARES => false,
        PDO::MYSQL_ATTR_USE_BUFFERED_QUERY => false,
        PDO::MYSQL_ATTR_INIT_COMMAND => 'SET NAMES utf8mb4'
    ];
    
    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search