I want replace in a column a string of characters with UTF8 characters, using REGEXP_REPLACE and/or REPLACE.
$search = mysqli_real_escape_string($id_mysql, $_POST['search']);
$new = mysqli_real_escape_string($id_mysql, $_POST['new']);
mysqli_query ($id_mysql,
"UPDATE table_name
SET column_name = REGEXP_REPLACE(column_name, '$search', '$new')
WHERE column_name REGEXP '$search'"
) or die(mysqli_error($id_mysql));
With :
$_POST['search'] = 'blabla';
$_POST['new'] = 'x{26A1}';
This inserts the string x{26A1}, not the UTF8 character.
x{26A1} works in $POST[‘search’] but not in $POST[‘new’].
I tried u{26A1}, u26A1, x26A1.
Edit :
To better understand the problem, I have strip mysqli_real_escape_string.
Sorry, x{26A1} works in the search with PHP REGEX, not MySQL REGEX. MySQL use u{26A1}.
(Forget the hexadecimal code, it’s an example.)
With :
$search = "u{264B}";
$new = "u{264C}";
It works.
BUT: if i sent u{264B} or "u{264B}" in the search with the form, it does not work.
$search = $_POST['search'];
$new = "u{264C}";
It seems that with the form and the $_POST, u{264B} becomes a string.
I know double quotes are important. PHP needs double quotes for understand it’s a unicode character. With single quotes, PHP understand it’s a string. I think it’s the key of the problem. But i don’t know how use double quotes with the $_POST. I tried differents ways with no success.
And it’s a simple example. I want use more complex searchs and replacements.
@stackoverflow your related question has nothing to do with my question.
2
Answers
Convert the hex string to unicode character in PHP:
Or you can use PHP’s Unicode character escaping (assuming you are trying to insert the high voltage character:
We need to know more about the dataflow.
Assuming the HTML is expecting UTF-8:
<form> <input>
.CHARACTER SET utf8mb4
. TheCOLLATION
does not matter (at least for this discussion).INSERT
statement.For example, entering
⚡
should showE29AA1
when doingSELECT HEX(col)...
to see the hex for the character in the table.If you want to be "nice" to the user, you could allow them to enter
0xE29AA1
orU+26A1
. The former is the UTF-8 (utf8mb4) hex with0x
as a common indicator of such. The latter is the common way to spell the Unicode "codepoint" for that same character. The Ox and U+ would require your PHP code to discover such, convert it to the character, etc.In some contexts,
u{26A1}
will be automatically converted to that single character.HTML "entities" —
⚡
will show as the character as the page is being rendered. Yes, you could store those 8 characters in the database, with the assumption that the only use for the text is to render it via HTML.⚡
is the decimal equivalent of⚡
— either works equally well on HTML pages.See also https://unicode.scarfboy.com/?s=%E2%9A%A1
Note:
%E2%9A%A1
is the "URL encoding" for the character.