I know similar Qs have been answered before here, but I’m unable to get the desired output based on answers I’ve reviewd.
I am trying to join two tables:
The objective is to join only records in Table A with the most recent date (A.last_run_date) for a given combination of dp_id AND pm_id, with records from Table B.
Desired Output
Here’s what I came up with, but this is unfortunately not working. The query is returning all the rows, instead of the most recent date (A.last_run_date) for a given combination of dp_id AND pm_id:
SELECT b.a_i_id,
b.a_e_id,
a.last_run_date,
a.cm_id,
a.dp_id
FROM (SELECT dp_id,
cm_id,
Max (last_run_date) AS latest_run
FROM a
GROUP BY dp_id,
cm_id) A1
INNER JOIN a
ON a.dp_id = A1.dp_id
AND a.cm_id = A1.cm_id
INNER JOIN b
ON b.dp_id = a.dp_id
AND b.cm_id = a.dp_id
WHERE Cast (a.last_run_date AS DATE) BETWEEN '2023-04-28' AND '2023-04-30'
Can someone please advise what I’m doing wrong?
Thanks
2
Answers
Try this. I just wrote the query out of mind and not tested. You may need to tweak it, but I hope you get the idea.
Your friend to get a most recent row is the analytic function
ROW_NUMBER
.Define the
partition by
with the columns that you will be joining and theorder by
on the column defining the order usingDESC
, so that the candidate (highest) row will get the number 1.Finally filter only row with
rn = 1
. The join is than a trivial case (leaving it out)Note you should always check in this aproach that the columns from
partition by
andorder by
are unique in the table (i.e. the last row is defined stable). Otherwise the query will work, but you can get in each execution a different result.