skip to Main Content

I’m using MySQL 8.2 and I’ve got a table called records for which I added the following index in order to perform text search on it:

CREATE FULLTEXT INDEX all_records_idx ON records (`title`, `label`, `description`, `catalog_number`,`barcode`);

Now, I’ve got the following data (copied it as JSON):

[
  {
    "barcode": "79027066021",
    "title": "White Blood Cells",
    "slug": "White-blood-cells",
    "image": null,
    "description": "rev",
    "artist_id": 3,
    "genre_id": 1,
    "num_of_records": 1,
    "original_release_year": 2001,
    "original_release_date": null,
    "reissue_release_date": null,
    "label": null,
    "sleeve_side_color": "ff1500",
    "vinyl_color": "ff1500",
    "size": 12,
    "barcode_in_record": null,
    "catalog_number": null,
    "owned_copies": 1,
    "active": 1,
    "created_at": "2023-06-11 21:32:17.000",
    "updated_at": null,
    "deleted_at": null
  },
  {
    "barcode": "889854036119",
    "title": "Nilsson Schmilsson",
    "slug": "nilsson-schmilsson",
    "image": null,
    "description": null,
    "artist_id": 6,
    "genre_id": 1,
    "num_of_records": 1,
    "original_release_year": 1971,
    "original_release_date": null,
    "reissue_release_date": null,
    "label": null,
    "sleeve_side_color": "490d57",
    "vinyl_color": "000000",
    "size": 12,
    "barcode_in_record": null,
    "catalog_number": null,
    "owned_copies": 1,
    "active": 1,
    "created_at": "2023-06-11 21:32:17.000",
    "updated_at": null,
    "deleted_at": null
  },
  {
    "barcode": "C10777774644617",
    "title": "Abbey Road",
    "slug": "abbey-road",
    "image": null,
    "description": null,
    "artist_id": 2,
    "genre_id": 1,
    "num_of_records": 1,
    "original_release_year": 1969,
    "original_release_date": null,
    "reissue_release_date": null,
    "label": null,
    "sleeve_side_color": "46518f",
    "vinyl_color": "000000",
    "size": 12,
    "barcode_in_record": null,
    "catalog_number": null,
    "owned_copies": 1,
    "active": 1,
    "created_at": "2023-06-11 21:32:17.000",
    "updated_at": null,
    "deleted_at": null
  },
  {
    "barcode": "C10777774644618",
    "title": "Revolver",
    "slug": "revolver",
    "image": null,
    "description": null,
    "artist_id": 2,
    "genre_id": 1,
    "num_of_records": 1,
    "original_release_year": 1966,
    "original_release_date": null,
    "reissue_release_date": null,
    "label": null,
    "sleeve_side_color": "46518f",
    "vinyl_color": "000000",
    "size": 12,
    "barcode_in_record": null,
    "catalog_number": null,
    "owned_copies": 1,
    "active": 1,
    "created_at": "2023-06-11 21:32:17.000",
    "updated_at": null,
    "deleted_at": null
  },
  {
    "barcode": "C10777774644619",
    "title": "arevolat",
    "slug": "aurevolat",
    "image": null,
    "description": "this is a test",
    "artist_id": 2,
    "genre_id": 1,
    "num_of_records": 1,
    "original_release_year": 1999,
    "original_release_date": null,
    "reissue_release_date": null,
    "label": null,
    "sleeve_side_color": null,
    "vinyl_color": null,
    "size": 12,
    "barcode_in_record": null,
    "catalog_number": null,
    "owned_copies": 1,
    "active": 1,
    "created_at": null,
    "updated_at": null,
    "deleted_at": null
  }
]

So, there are currently 5 records in the table. In order to test the search functionality, I ran the following query:

SELECT * FROM records WHERE MATCH(`title`, `label`, `description`, `catalog_number`,`barcode`)
AGAINST ('*rev*' IN BOOLEAN MODE)

I was expecting to get 3 records back:

  • 1 for barcode 79027066021 where it should match to the description "description": "rev".
  • 1 for barcode C10777774644618 where it should match to the title "title": "Revolver".
  • 1 for barcode C10777774644619 where it should match to the title "title": "arevolat".

But I’m only getting back the first two records:

  • 1 for barcode 79027066021 where it should match to the description "description": "rev".
  • 1 for barcode C10777774644618 where it should match to the title "title": "Revolver".

This one is not present:

  • 1 for barcode C10777774644619 where it should match to the title "title": "arevolat".

Even though I’m using * as part of the search.

Any idea what’s going on?

Thanks

2

Answers


  1. In MySQL, the wildcard character is % rather than *.

    Try AGAINST ('%rev%' IN BOOLEAN MODE)

    Sometimes, * acts as a modifier to match ‘zero or more instances of the thing preceding it.’

    See https://dev.mysql.com/doc/refman/8.0/en/pattern-matching.html

    However, the above Pattern Matching applies to the extended regular expressions (such as when using RLIKE).

    For a Boolean Full-text Search, the * is a truncation operator. The * at the end of your expression means that rev must be present at the start of one or more words. This explains the missing record in your results.

    I suspect the first * is ignored.

    More info: https://dev.mysql.com/doc/refman/8.0/en/fulltext-boolean.html

    Login or Signup to reply.
  2. "%", "*", and other punctuation marks have specific and different meanings for AGAINST, LIKE, and RLIKE.

    MATCH(title, ...) AGAINST('*rev*')  -- ignores the leading punctuation mark
    title LIKE '%rev%'  -- finds the desired 3 rows, but only in `title`; and slow
    title RLIKE 'rev'  -- finds the desired 3 rows, but only in `title`; and slow
    title LIKE '%rev%' OR label LIKE '%rev%' OR ... -- even slower
    

    FULLTEXT, when appropriate, is almost always faster than LIKE or RLIKE; often much faster. I suggest you change the requirements and/or expectations. As you see from above, there is no syntax that is both simple and fast.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search