
I'm running this query:

SELECT id FROM posts WHERE title LIKE '%CERTIFIED INSTALLER%';

The text in the database is stored as ‘ᴄᴇʀᴛɪꜰɪᴇᴅ ɪɴꜱᴛᴀʟʟᴇʀꜱ’, which appears to be in a distinctive font.

The above query returns 0 results, but when I change the ‘ᴄᴇʀᴛɪꜰɪᴇᴅ ɪɴꜱᴛᴀʟʟᴇʀꜱ’ text in the database to something like a sans-serif font, it returns the expected results.

Why is this?

2 Answers


  1. It isn’t a different font; the text is made of different Unicode characters.

    If you paste the text into https://www.babelstone.co.uk/Unicode/whatisit.html, you’ll see it informs you of what the characters actually are:

    U+1D04 : LATIN LETTER SMALL CAPITAL C
    U+1D07 : LATIN LETTER SMALL CAPITAL E
    U+0280 : LATIN LETTER SMALL CAPITAL R
    U+1D1B : LATIN LETTER SMALL CAPITAL T
    U+026A : LATIN LETTER SMALL CAPITAL I
    U+A730 : LATIN LETTER SMALL CAPITAL F
    U+026A : LATIN LETTER SMALL CAPITAL I
    U+1D07 : LATIN LETTER SMALL CAPITAL E
    U+1D05 : LATIN LETTER SMALL CAPITAL D
    U+0020 : SPACE [SP]
    U+026A : LATIN LETTER SMALL CAPITAL I
    U+0274 : LATIN LETTER SMALL CAPITAL N
    U+A731 : LATIN LETTER SMALL CAPITAL S
    U+1D1B : LATIN LETTER SMALL CAPITAL T
    U+1D00 : LATIN LETTER SMALL CAPITAL A
    U+029F : LATIN LETTER SMALL CAPITAL L
    U+029F : LATIN LETTER SMALL CAPITAL L
    U+1D07 : LATIN LETTER SMALL CAPITAL E
    U+0280 : LATIN LETTER SMALL CAPITAL R
    U+A731 : LATIN LETTER SMALL CAPITAL S
    

    Taking the "ᴄ" as an example, you can look it up elsewhere, such as at https://symbl.cc/en/1D04/, which informs us that:

    Latin Letter Small Capital C. Phonetic Extensions.
    The symbol "Latin Letter Small Capital C" is included in the "Latin letters" sub block of the "Phonetic Extensions" block and was approved as part of Unicode version 4.0 in 2003.

    By contrast, a standard capital C is a different Unicode character, named "Latin Capital Letter C" (U+0043), https://symbl.cc/en/0043/:

    Latin Capital Letter C. Basic Latin.
    The symbol "Latin Capital Letter C" is included in the "Uppercase Latin alphabet" sub block of the "Basic Latin" block and was approved as part of Unicode version 1.1 in 1993.

    It also means that your database (and table) uses a character set that supports the Unicode characters you’ve shown here.

  2. When a search is performed using the LIKE operator, the database compares the characters in the search string with the characters in the column data. However, character encoding and collation settings determine how the comparison is done.

    You can specify a collation for your query to ensure proper character comparison. Here’s an example of how you can modify your query to use a specific collation:

    SELECT id FROM posts WHERE title COLLATE utf8mb4_bin LIKE '%CERTIFIED INSTALLER%';
    

    Note: adjust the collation to match your database’s character encoding and collation settings. utf8mb4_bin compares characters by code point, so the comparison is also case-sensitive.
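
The point made in the first answer can be checked outside the database. The following is a minimal Python sketch, not a MySQL feature: it prints the Unicode names of the first few stored characters and shows that the plain-ASCII search string only matches after the small capitals are folded to ordinary capitals. The `SMALL_CAPS` table and the `fold_small_caps` helper are hypothetical names introduced for illustration, and the table covers only the letters that occur in this title.

```python
import unicodedata

# The stored title from the question, spelled out as code points
# (small-capital letters, not ASCII).
stored = (
    "\u1d04\u1d07\u0280\u1d1b\u026a\ua730\u026a\u1d07\u1d05 "
    "\u026a\u0274\ua731\u1d1b\u1d00\u029f\u029f\u1d07\u0280\ua731"
)
needle = "CERTIFIED INSTALLER"

# Each character carries its own Unicode name, distinct from the ASCII letters.
for ch in stored[:3]:
    print(f"U+{ord(ch):04X} : {unicodedata.name(ch)}")

# Hypothetical folding table: only the small capitals seen in this title.
SMALL_CAPS = str.maketrans({
    "\u1d00": "A", "\u1d04": "C", "\u1d05": "D", "\u1d07": "E",
    "\ua730": "F", "\u026a": "I", "\u029f": "L", "\u0274": "N",
    "\u0280": "R", "\ua731": "S", "\u1d1b": "T",
})


def fold_small_caps(text: str) -> str:
    """Replace the small-capital letters listed above with ASCII capitals."""
    return text.translate(SMALL_CAPS)


print(needle in stored)                   # False: the code points differ
print(needle in fold_small_caps(stored))  # True: matches after folding
```

If the small-capital text has to stay in the table, one possible approach (an assumption, not something either answer proposes) is to store a folded copy of the title in a separate column and run the LIKE search against that copy.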
