skip to Main Content

I was asked this question in a job interview:

enter image description here

There is a table with vehicle names mentioned in a column. Output when we check for name=car we must get as 4 i.e the maximum count of continuous occurrence of ‘car’ in the column.

Can somebody help me get this solution in PostgreSQL?

I tried using rank and dense rank but it didn’t work.

2

Answers


  1. Disclaimer: As already stated by @Mureinik in the comments: This can only work if there is a dedicated column for ordering (e.g. an increasing ID or a timestamp). The reason is, that in the database the column order is not determined. So you always need a sort order criterion if you’d deal with specific orders. In the example below I added an id column.


    step-by-step demo: db<>fiddle

    SELECT
        name,
        COUNT(*) OVER (PARTITION BY group_id)                        -- 2
    FROM (
        SELECT                                                       -- 1
            *,
            SUM(eq) OVER (ORDER BY id) as group_id
        FROM (
            SELECT
                *,
                (name != lag(name) OVER (ORDER BY id))::int as eq
            FROM mytable
        ) s
    ) s
    ORDER BY count DESC                                              -- 3
    LIMIT 1                                                          -- 4
    
    1. First part is similar to what I already described here, for example. You can find this in the subqueries. It is about window function partitioning while keeping the original sort order. It creates unique group ids for each specific name group.
    2. Using COUNT() window function to count the records of each group.
    3. Afterwards order the biggest count first.
    4. Return only the first record (which contains the highest count number)
    Login or Signup to reply.
  2. This is a gaps and islands problem, you could try the difference between two row numbers to solve it.

    This can be done only if there is a column you can order by (e.g. an increasing ID or a timestamp) as mentioned by @Mureinik and @S-Man :

    with cte as (
      SELECT
        *, row_number() over(order by id) - row_number() over (partition by name order by id) as grp
      FROM mytable
    )
    select name, count(1)
    from cte
    where name = 'car' 
    group by name, grp
    order by count(1) desc
    limit 1
    

    To get the maximum count of continuous occurrence for all names, just remove the where clause

    Demo here

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search