My table looks like this:
CREATE TABLE `character_unique` (
`id` int(11) NOT NULL,
`name` varchar(256) NOT NULL,
`category` varchar(64) DEFAULT NULL,
`name_without_stop_words` varchar(320) DEFAULT NULL,
`name_first_word_with_exception` varchar(320) DEFAULT NULL,
`master` varchar(320) DEFAULT NULL,
`nb_letter` int(11) DEFAULT NULL,
`count` int(11) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `character_unique_count_index`
(`count`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
I need to group the data by the columns name_first_word_with_exception
and category
. In the grouped records, I want to display the name_without_stop_words
column with the shortest string length in the group.
In general, it works fine, except for situations like this. The category "Harry Potter" with "name_first_word_with_exception" = "Patronus" contains the following values for "name_without_stop_words":
Patronus d'Harry Potter - Translucide Patronus Ron Weasley - Translucide Patronus Hermione Granger - Translucide Patronus Remus Lupin Patronus Albus Dumbledore Patronus Severus Patronus Minerva McGonagall Patronus Remus Lupin Patronus Severus Snape
But when grouping, the shortest name turns out to be "Patronus Albus Dumbledore," although the actual shortest name here is "Patronus Severus"
I tried using a subquery, in this case, everything works correctly, but verrrry slowly!
SELECT *,
(SELECT name_without_stop_words
FROM character_unique
WHERE category LIKE CONCAT('%', 'Harry Potter', '%')
AND name_first_word_with_exception = character_name
ORDER BY nb_letter LIMIT 1) as name_without_stop_words
FROM (
SELECT id as unque_character_id,
null as id_character_added_manually,
category,
name_first_word_with_exception as character_name,
master,
count
FROM character_unique
WHERE category LIKE CONCAT('%', 'Harry Potter', '%')
GROUP BY character_name, category
) as t1;
Here is how the correct version of the grouped records looks like:
How can I find the shortest string in a group?
2
Answers
I find solution:
Assuming you’re not stuck in a 5.7 release, the solution here is a windowing function: