skip to Main Content

I have a table like this

userId   story  novel
1        a      b
1        a      b
1        a      c
1        b      c
1        b      c
2        x      x
2        x      y
2        y      y
3        m      n
4        NULL   NULL

How do I find the most story and novel count per user?

What I am looking for is the highest distinct count of story and novel for each user. So if a user has no story then story_count should be 0.

Desired output looks like this

userId   story  story_count  novel novel_count
1        a      3            c     3
2        x      2            y     2
3        m      1            n     1
4        NULL   0            NULL  0

This is my faulty current attempt

SELECT userId, story, COUNT(story) as story_count, novel, COUNT(novel) as novel_count  
FROM logs WHERE user = (SELECT DISTINCT(user)) GROUP BY story, novel;

4

Answers


  1. I hope this works:

    
    select story.userid, story, story_count, novel, novel_count 
    from
        (
            select userid, story, count(*) story_count, 
            row_number() over (partition by userid order by count(*) desc) rown_num
            from logs
            where row_num=1
        )story
    left join 
        (
            select userid, novel, count(*) novel_count, 
            row_number() over (partition by userid order by count(*) desc) rown_num
            from logs
            where row_num=1
        )novel on novel.userid=story.userid
    
    Login or Signup to reply.
  2. Based on Tim’s answer,I provide an upgrade solution

    create table `logs`(
     userId int,
     story varchar(10),
     novel varchar(10)
    );
    
    insert into `logs`(userId,story,novel) values
    (1,'a',' b'),
    (1 ,'a','c'),
    (1 ,'b','c'),
    (2 ,'x','x'),
    (2 ,'x','y'),
    (2 ,'y','y');
    
    SELECT t1.userId,t1.story,t1.story_count,t2.novel,t2.novel_count
    FROM
    (
    SELECT userId, story, COUNT(*) AS story_count,
           RANK() OVER (PARTITION BY userId ORDER BY COUNT(*) DESC) rn
    FROM logs
    GROUP BY userId, story
    ) as t1
    join
    (
    SELECT userId, novel, COUNT(*) AS novel_count,
           RANK() OVER (PARTITION BY userId ORDER BY COUNT(*) DESC) rn
    FROM logs
    GROUP BY userId, novel
    )as t2
    ON t1.userId = t2.userId and t1.rn = t2.rn
    WHERE t1.rn =1
    

    DB Fiddle Demo

    Login or Signup to reply.
  3. You should really use a window function if this is possible because thus, you can keep your query much shorter and simpler. If this is not possible and you really need to do it without them, you can also create two subqueries for both the story data and the novel data according to your conditions and then join them. Something like this:

    SELECT storydata.userId, storydata.story, storydata.counter AS story_count,
    noveldata.novel, noveldata.counter AS novel_count  
    FROM
    (SELECT DISTINCT l.userId, l.story, sub.counter
    FROM logs l
    JOIN
    (SELECT userId, COALESCE(MAX(story_count),0) AS counter
    FROM
    (SELECT userId, story,
    COUNT(story) as story_count  
    FROM logs 
    GROUP BY userId, story) sub
    GROUP BY userId) sub
    ON l.userId = sub.userId
    GROUP BY l.userId, l.story
    HAVING COUNT(l.userId) = sub.counter OR l.story IS NULL) AS storydata
    JOIN
    (SELECT noveldata.userId, noveldata.novel, noveldata.counter AS counter
    FROM
    (SELECT DISTINCT l.userId, l.novel, sub.counter
    FROM logs l
    JOIN
    (SELECT userId, COALESCE(MAX(novel_count),0) AS counter
    FROM
    (SELECT userId, novel,
    COUNT(novel) as novel_count  
    FROM logs 
    GROUP BY userId, novel) sub
    GROUP BY userId) sub
    ON l.userId = sub.userId
    GROUP BY l.userId, l.novel
    HAVING COUNT(l.userId) = sub.counter OR l.novel IS NULL) AS noveldata) AS noveldata
    ON storydata.userId = noveldata.userId;
    

    But as you can see, this will become very complicated although it will produce the correct outcome. See here:
    db<>fiddle

    Therefore, I once again highly recommend to use a DB version that provides window functions.

    Login or Signup to reply.
  4. Try the following for MySQL 5.7:

    set @u=0;
    set @tot=0;
    set @rn=1;
    select userid, story, story_count, novel, novel_count
    from
    (
      select userid, story, story_count, novel, novel_count,  
             if(@u<>userid or (@rn=1 and @tot=tot_cnt), @rn:=1, @rn:= @rn+1) rnk,
             @u:=userid, @tot:=tot_cnt
      from
      (
        select T.userid, T.story, T.story_count, D.novel, D.novel_count, 
               T.story_count+D.novel_count tot_cnt
        from
        (
          select userid, story, count(story) story_count  
          from table_name group by userid, story
        ) T 
        join
        (
          select userid, novel, count(novel) novel_count  
          from table_name group by userid, novel
        ) D
        on T.userid=D.userid
      ) joined_counts
      order by userid, tot_cnt desc
    ) ranked_counts
    where rnk = 1
    order by userid
    

    This query will select all the equal highest counts of story and novel for each user, if you want to select only one highest count (from the multiple equal highest counts) then replace the if statement with this if(@u<>userid, @rn:=1, @rn:= @rn+1) rnk.

    See a demo.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search