skip to Main Content

My table looks like this:

CREATE TABLE `character_unique` (
        `id` int(11) NOT NULL,
        `name` varchar(256) NOT NULL,
        `category` varchar(64) DEFAULT NULL,
        `name_without_stop_words` varchar(320) DEFAULT NULL,
        `name_first_word_with_exception` varchar(320) DEFAULT NULL,
        `master` varchar(320) DEFAULT NULL,
        `nb_letter` int(11) DEFAULT NULL,
        `count` int(11) DEFAULT NULL,
        PRIMARY KEY (`id`),
        KEY `character_unique_count_index` 
(`count`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

I need to group the data by the columns name_first_word_with_exception and category. In the grouped records, I want to display the name_without_stop_words column with the shortest string length in the group.

In general, it works fine, except for situations like this. The category "Harry Potter" with "name_first_word_with_exception" = "Patronus" contains the following values for "name_without_stop_words":

Patronus d'Harry Potter - Translucide
Patronus Ron Weasley - Translucide
Patronus Hermione Granger - Translucide
Patronus Remus Lupin
Patronus Albus Dumbledore
Patronus Severus
Patronus Minerva McGonagall
Patronus Remus Lupin
Patronus Severus Snape

But when grouping, the shortest name turns out to be "Patronus Albus Dumbledore," although the actual shortest name here is "Patronus Severus"

I tried using a subquery, in this case, everything works correctly, but verrrry slowly!

SELECT *, 
    (SELECT name_without_stop_words 
     FROM character_unique 
     WHERE category LIKE CONCAT('%', 'Harry Potter', '%') 
        AND name_first_word_with_exception = character_name 
     ORDER BY nb_letter LIMIT 1) as name_without_stop_words 
FROM (
    SELECT id as unque_character_id,
       null as id_character_added_manually,
       category,
       name_first_word_with_exception as character_name,
       master,
       count
    FROM character_unique 
    WHERE category LIKE CONCAT('%', 'Harry Potter', '%')
    GROUP BY character_name, category
) as t1;

Here is how the correct version of the grouped records looks like:

enter image description here

How can I find the shortest string in a group?

2

Answers


  1. Chosen as BEST ANSWER

    I find solution:

    SET SESSION sql_mode = '';
    
    SELECT id                             AS unque_character_id,
           NULL                           AS id_character_added_manually,
           category,
           name_first_word_with_exception,
           name_without_stop_words        AS character_name,
           master,
           count
    FROM (
        select character_unique.name_first_word_with_exception, character_unique.category, name_without_stop_words, master, count, id
        from character_unique
        join (
            SELECT cu.name_first_word_with_exception, category, MIN(LENGTH(name_without_stop_words)) as min_lngt
            FROM character_unique AS cu
            WHERE cu.category LIKE CONCAT('%', 'Harry Potter', '%')
            GROUP BY name_first_word_with_exception, category
        ) as join_query on character_unique.name_first_word_with_exception = join_query.name_first_word_with_exception
        where character_unique.category like concat('%', 'Harry Potter', '%')
            and length(character_unique.name_without_stop_words) = join_query.min_lngt
        group by name_first_word_with_exception, category
    ) AS t1;
    

  2. I need to group the data by the columns name_first_word_with_exception and category. In the grouped records, I want to display the name_without_stop_words column with the shortest string length in the group.

    Assuming you’re not stuck in a 5.7 release, the solution here is a windowing function:

    SELECT *
    FROM (
        select id as unque_character_id,
           null as id_character_added_manually,
           category,
           name_first_word_with_exception as character_name,
           master,
           count,
           row_number() over (partition by character_name, category 
                              order by length(name_without_stop_words) ) rn
        from character_unique 
    ) t
    WHERE rn = 1
    
    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search