skip to Main Content

I have the table below:

create table tablex (fileID int, file varchar(10), name varchar(10));
insert into tablex (fileID, file, name) values (1, 'file1' , 'AAA'),(1, 'file1' , 'AAA'),(1, 'file1' , 'AAA'),(2, 'file2' , 'AAA'),(2, 'file2' , 'AAA'),(2, 'file2' , 'AAA'),(1, 'file1' , 'BBB'),(1, 'file1' , 'BBB'),(2, 'file2' , 'BBB'),(2, 'file2' , 'BBB'),(3, 'file3' , 'BBB'),(3, 'file3' , 'BBB'),(1, 'file1' , 'CCC'),(1, 'file1' , 'CCC');

tablex

fileID | file | name
1 | file1 | AAA
1 | file1 | AAA
1 | file1 | AAA
2 | file2 | AAA
2 | file2 | AAA
2 | file2 | AAA
1 | file1 | BBB
1 | file1 | BBB
2 | file2 | BBB
2 | file2 | BBB
3 | file3 | BBB
3 | file3 | BBB
1 | file1 | CCC
1 | file1 | CCC

I expect to get only the names that appears in more than one file, and then count the number of occurrences in the file with max id, and also the number of occurrences in the other files. Ex.

AAA – 3 occurrences in file2, 3 occurrences in other files

BBB – 2 occurrences in file3, 4 occurrences in other files

I’m trying to use window function (still learning), but not sure if this approach is the best path to go as i had to add a distinct to make it work

select distinct t.name, count(t.name) over (partition by t.name) countAll,
       count(t.name) over (partition by t.name, fileId) countLatestFile,
       count(t.name) over (partition by t.name) - count(t.name) over (partition by t.name, fileId) countOthers
  from tablex t
  join (select name from tablex group by name having count(distinct fileId)  > 1) tdups
   on t.name = tdups.name;

Any other ideas?

3

Answers


  1. Using the window function dense_rank, do it like this:

    First cte was used to arrange names by rank according to fileID. Get the latest file from each name using cte2 (having max rank).

    Then, we join both CTEs to produce the desired result:

    with cte as (
      select *, dense_rank() over (partition by name order by fileID) rn,
               count(name) over (partition by name) countAll
      from tablex
    ),
    cte2 as (
      select name, max(rn) max_rn
      from cte
      group by name
      having max(rn) > 1
    )
    select c.name, max(c.countAll) as countAll, max(c.file) as latestFile, count(c.name) as countLatestFile, 
           max(c.countAll) - count(c.name) as countOthers
    from cte c
    inner join cte2 c2 on c.rn = c2.max_rn and c.name = c2.name
    group by c.name
    

    Demo here

    Login or Signup to reply.
  2. You may use the max window function and conditional aggregation as the following:

    with max_ids as
    (
      select *,
        max(fileID) over (partition by name) maxid
      from tablex
    )
    select name,
           count(*) countAll,
           sum(fileID = maxid) countLatestFile,
           sum(fileID <> maxid) countOthers
    from max_ids 
    group by name
    having count(distinct fileId)  > 1
    

    See demo

    Login or Signup to reply.
  3. Possible to do without using a window function:

    select name,
    (select max(fileId) 
    from tablex
    where name=t.name) as max_fileid,
    count(case when fileid= (select max(fileId) 
                            from tablex
                            where name=t.name)
                then 1
            end) as count_for_max_fileid,
    count(case when fileid != (select max(fileId) 
                            from tablex
                            where name=t.name)
                then 1
            end)as count_for_other
    from tablex t
    where name in (select name from tablex group by name having count(distinct fileid)>1)
    group by name
    ;
    
    -- result set:
    +------+------------+----------------------+-----------------+
    | name | max_fileid | count_for_max_fileid | count_for_other |
    +------+------------+----------------------+-----------------+
    | AAA  |          2 |                    3 |               3 |
    | BBB  |          3 |                    2 |               4 |
    +------+------------+----------------------+-----------------+
    
    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search