skip to Main Content

I have a table as follows:

CREATE TABLE `table` (
    `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
    `cc` int(3) unsigned NOT NULL,
    `number` int(10) NOT NULL,
    `name` varchar(64) NOT NULL,
    `datetime` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (`id`)
) ENGINE=InnoDB

The DBMS is MariaDB 10.1.26 on Debian 9.1. I had been trying to get it to list ranges of consecutive numbers. With the following query, I am able to accomplish that:

SELECT min(number) first_number, max(number) last_number, count(*) AS no_records FROM (
    SELECT c.*, @rn := @rn + 1 rn
    from (SELECT number FROM table WHERE cc = 1 GROUP BY number ORDER BY number) AS c
    CROSS JOIN (SELECT @rn := 0) r
) c
GROUP BY number - rn ORDER BY number ASC

But if I wanted items bunched together based on the value in an additional column, this doesn’t work. Say I want the items grouped only when their values for name are all the same. Say this is my data:

INSERT INTO `table` (`id`, `cc`, `number`, `name`) VALUES
(1, 1, 12, 'Hello'),
(2, 1, 2, 'Apple'),
(3, 1, 3, 'Bean'),
(4, 1, 10, 'Hello'),
(5, 1, 11, 'Hello'),
(6, 1, 1, 'Apple'),
(7, 1, 14, 'Deer'),
(8, 1, 14, 'Door'),
(9, 1, 15, 'Hello'),
(10, 1, 17, 'Hello'),

I’d like to get a report like this:

first  last   count  name
1      2      2      Apple
3      3      1      Bean
10     12     3      Hello
14     14     1      Deer
14     14     1      Door
15     15     1      Hello
17     17     1      Hello

In other words, in addition to grouping items that are consecutive, these groups are split up into separate groups when their values of name differ. (In other words, items are only in an island together if they are all consecutive AND have the same exact name). The closest (and it’s not very close) that I have come, is doing this:

SELECT min(number) first_number, max(number) last_number, count(*) AS no_records FROM (
    SELECT c.*, @rn := @rn + 1 rn
    from (SELECT number FROM table WHERE cc = 1 GROUP BY number, name ORDER BY number) AS c
    CROSS JOIN (SELECT @rn := 0) r
) c
GROUP BY number - rn, name ORDER BY number ASC

This doesn’t work, though, and what happens is it seems to return the first appearance of a name as first and the last appearance as last, with no_records being the difference in number between them, which is certainly not right at all.

I feel like this question might be related, but I’ve not been able to make much sense of it, and when I tried tweaking it to my table, it just did the equivalent of a simple SELECT * more or less. What modifications to my query do I need to make to get it to work?

Keep in mind:

  • items can be inserted in any order
  • Numbers can be repeated
  • Names can be repeated, not necessarily consecutively

2

Answers


  1. Your example is not a gaps-and-islands problem. If it is representative of your actual problem, you can just use aggregation:

    select min(number), max(number), count(*), name
    from t
    group by name;
    

    I say this because gaps-and-islands is much more challenging without window functions. And that begs the question of why you are not using a more recent version of MariaDB. The end-of-life for 10.1 is October of this year, anyway.

    EDIT:

    As a gaps-and-islands, this is a little tricky because each name has to be handled separately. The trick is to use row_number() with partitioning:

    select name, min(number), max(number), count(*)
    from (select t.*,
                 row_number() over (partition by name order by number) as seqnum
          from t
         ) t
    group by name, (number - seqnum);
    

    If you have adjacent numbers for a name and subtract a sequential value, then the result is constant. For instance:

    Name   Number Seq  Diff
    Hello    10    1    9
    Hello    11    2    9
    Hello    12    3    9
    Hello    15    4   11
    

    The diff identifies the groups for aggregation.

    Oops, I forget this was for a soon-to-be-obsolete version of MariaDB:

    select name, min(number), max(number), count(*)
    from (select t.*,
                 (select count(*)
                   from `table` t2
                   where t2.name = t.name and t2.number <= t.number
                  ) as seqnum
          from `table` t
         ) t
    group by name, (number - seqnum);
    

    For performance, you want an index on (name, number). The performance should be reasonable, unless names have more than a few hundred rows.

    Here is a db<>fiddle.

    Login or Signup to reply.
  2. There is not much to change in your query. You basically need to select name and number in the subquery and sort in the same order. Then you can group by name, number - rn in the outer query.

    SELECT
        min(number) first_number,
        max(number) last_number,
        count(*) AS no_records,
        name
    FROM (
        SELECT c.*, @rn := @rn + 1 rn
        from (
            SELECT name, number
            FROM `table`
            WHERE cc = 1
            ORDER BY name, number
            LIMIT 99999999999999999
        ) AS c
        CROSS JOIN (SELECT @rn := 0) r
    ) c
    GROUP BY name, number - rn 
    ORDER BY first_number ASC, name ASC;
    

    Result:

    first_number  last_number  no_records  name
               1            2           2  Apple
               3            3           1  Bean
              10           12           3  Hello
              14           14           1  Deer
              14           14           1  Door
              15           15           1  Hello
              17           17           1  Hello
    

    db<>fiddle

    I usually advocate against the use of session variables in this way. The reason is that such solutions depend on internal implementation, and can be broken by version updates or settings changes. For example: Once MariaDB decided to ignore the ORDER BY clause in subqueries without LIMIT. This is why I included a huge LIMIT.

    I also replaced number with first_number in the outer ORDER BY clause to avoid problems with ONLY_FULL_GROUP_BY mode.

    A more stable way to generate row numbers is to use an AOTO_INCREMENT column in a temporary table:

    drop temporary table if exists tmp_tbl;
    
    create temporary table tmp_tbl (
      rn int unsigned auto_increment primary key,
      name varchar(64) not null,
      number int not null
    );
    
    insert into tmp_tbl (name, number)
      select name, number
      from `table`
      order by name, number;
    

    The final SELECT query is identical with the outer query above:

    SELECT
        min(number) first_number,
        max(number) last_number,
        count(*) AS no_records,
        name
    FROM tmp_tbl
    GROUP BY name, number - rn 
    ORDER BY first_number ASC, name ASC;
    

    db<>fiddle

    In a more recent version (starting from MariaDB 10.2) you can use ROW_NUMBER() window function instead:

    SELECT
        min(number) first_number,
        max(number) last_number,
        count(*) AS no_records,
        name
    FROM (
        SELECT
            name,
            number,
            row_number() OVER (ORDER BY name, number) as rn
        FROM `table`
        WHERE cc = 1
    ) c
    GROUP BY name, number - rn 
    ORDER BY first_number ASC, name ASC;
    

    db<>fiddle

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search