skip to Main Content

I am trying to retrieve the latest few records from the table based on grouping one of the column values which is not a primary key. This means there will be more than one row for a specific column value which I want the results grouped on. The data base is MySQL based.

Table Metrics schema

TestID BuildId TestResult  TestMetrics
a123     b345     Pass       metric1
a234     b234     Fail       metric2
a345     b345     Fail       metric3
a456     b123     Fail       metric4
a567     b234     Pass       metric5

E.g. I want to retrieve the latest two build data from the above table.

Expected Result:

TestID BuildId TestResult  TestMetrics
a123     b345     Pass       metric1
a345     b345     Fail       metric3
a234     b234     Fail       metric2
a567     b234     Pass       metric5

TestID is the primary key. Now I would like to retrieve the latest 2 build information from the table and display the result ordered by buildId in the descending order (latest).

What I have right now is

select * from Metrics as m inner join 
   (select n.BuildId from Metrics as n group by n.BuildId order by n.BuildId desc limit 2)
on m.BuildId = n.BuildId
order by BuildId desc

This looks highly inefficient as the table contains huge amount of metrics data. There are where clauses where I can filter on based on a specific dev branch the test ran.

Is there a better way to achieve the expected results?

3

Answers


  1. Chosen as BEST ANSWER

    Ervis's answer pointed me in the right direction despite not being exactly what I was looking for. So posting this post as the answer as the correct solution in the context of the question.

    I wanted the window function to generate the same number for the rows with the same buildId and that is where DENSE_RANK() comes into picture. Below is the exact solution I wanted.

    with mm as 
      (select *, DENSE_RANK() OVER(order by BuildId desc) as nrow from metrics) 
    select * 
    from mm 
    where nrow <=2 
    order by buildId, nrow;
    
    

  2. One way to do this is to use a subquery with the MAX function to find the latest BuildId for all the TestMetrics values, and then join the result with the Metrics table to get the corresponding TestID and TestResult values.

    In code, you should have

    SELECT m.TestID, m.BuildId, m.TestResult, m.TestMetrics
    FROM Metrics m
    JOIN (
      SELECT TestMetrics, MAX(BuildId) AS latestBuildId
      FROM Metrics
      GROUP BY TestMetrics
      ORDER BY latestBuildId DESC
      LIMIT 2
    ) latest ON m.TestMetrics = latest.TestMetrics AND m.BuildId = latest.latestBuildId
    ORDER BY m.BuildId DESC;
    
    Login or Signup to reply.
  3. I believe you can achieve that by using the window function row_number

    e.g

    with mm as 
      (select *, ROW_NUMBER() OVER(partition by buildid order by TestID desc) as nrow from metrics) 
    select * 
    from mm 
    where nrow <=2 
    order by buildId, nrow;
    
    
    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search