BACKGROUND — I’m creating an Arabic-English dictionary that uses transliterations as the unique identifiers for terms (e.g. to distinguish between أكل /ʔakl/ & أكل /ʔakal/). Many Arabic letters don’t have Latin-script equivalents, so I use certain special characters like āēīōū, ṣḍṭẓ & so on.
I have the following query:
$asInflection = Inflection::where('translit', $term->slug);
I’ve just noticed that my use of special characters returns incorrect query results.
In one case, the $term->slug is ʔamal & there is an Inflection::where('translit', 'ʔāmāl'). Laravel returns a match, which it should not; it’s absolutely imperative for the dictionary’s proper functioning that these characters not be treated the same. I’m not sure if the issue is with PHP or with MySQL; I’m pretty sure the issue has nothing to do with Laravel itself, but I imagine there is something I can do via Laravel to solve it.
Any advice is appreciated.
3 Answers
I think this is a character-recognition problem on the database side. If you want to match special characters, you should use the utf8_unicode_ci collation for the translit column instead of utf8_general_ci. So you need to change the column's collation from utf8_general_ci to utf8_unicode_ci.
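As far as I know, utf8_unicode_ci is accent-insensitive just like utf8_general_ci, so a and ā may still compare equal after that change. A fallback that does not depend on the column's collation is to force a byte-by-byte comparison with MySQL's BINARY operator. The sketch below is only an illustration: exactMatchClause is a hypothetical helper name, and the Laravel usage in the trailing comment assumes the Inflection model and translit column from the question.

```php
<?php
// Hypothetical helper (not part of Laravel): builds a WHERE fragment that
// compares a column byte-by-byte, so 'ʔamal' and 'ʔāmāl' are never equal.
function exactMatchClause(string $column): string
{
    // Backtick-quote the identifier; the value itself is bound via "?".
    $quoted = '`' . str_replace('`', '``', $column) . '`';

    // BINARY makes MySQL compare raw bytes instead of applying the
    // column's (case- and accent-insensitive) collation.
    return 'BINARY ' . $quoted . ' = ?';
}

// Assumed usage inside Laravel, replacing the query from the question:
// $asInflection = Inflection::whereRaw(exactMatchClause('translit'), [$term->slug]);
```

If every lookup on translit must be exact, giving the column a binary collation such as utf8mb4_bin in a migration is probably the cleaner long-term fix, since then the plain where('translit', ...) call works unchanged.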
You can try this:
āēīōū, ṣḍṭẓ can all be represented in utf8mb4, which is the MySQL CHARACTER SET that you should be using. latin1 should not be used for Arabic.
The comparison returns false (0), and that is only because of the last a.
أكل /ʔakal/ is this in utf8mb4; here is the HEX() of it:
so I don’t see the need for āēīōū, ṣḍṭẓ. Maybe it is the keyboard entry that is lacking?
Unicode is the controlling organization. PHP, MySQL, Laravel, etc. simply follow its rules. (I don’t know about Inflection.)
Run this to see what collations you have. I don’t see any that are specific to Arabic:
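For the HEX() check mentioned above, the same inspection can be done on the PHP side before a value ever reaches MySQL. This is a minimal sketch, assuming PHP 7 or later (for the \u{...} escapes) and UTF-8 strings; the variable names are illustrative only.

```php
<?php
// Code points are spelled out so the bytes do not depend on how this
// file was typed or saved.
$slug       = "\u{0294}amal";                   // ʔamal
$inflection = "\u{0294}\u{0101}m\u{0101}l";     // ʔāmāl
$arabic     = "\u{0623}\u{0643}\u{0644}";       // أكل

// PHP compares strings byte-by-byte, so the two slugs are distinct here;
// any false match therefore comes from the database collation.
var_dump($slug === $inflection); // bool(false)

// bin2hex() shows the UTF-8 bytes, comparable to MySQL's HEX() output.
echo bin2hex($slug), PHP_EOL;       // ca94616d616c
echo bin2hex($inflection), PHP_EOL; // ca94c4816dc4816c
echo bin2hex($arabic), PHP_EOL;     // d8a3d983d984
```

If the hex printed here matches what HEX(translit) returns in MySQL (apart from letter case), the data is stored correctly and only the comparison rules need changing.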