I am having a table as follows.
entity_id|effective_date|value|
A |2023-09-09 |234 |
A |2023-09-06 |345 |
B |2023-09-02 |341 |
C |2023-09-01 |347 |
I want to find all unique entity IDs with the maximum effective date and their respective value. I am trying the following query.
select distinct entity_id, value, max(effective_date) start_date
from refdata.investment_raw ir
where attribute_id = 232
and entity_id in (select invest.val as investment_id
from refdata.ved soi
inner join refdata.ved invest
on soi.entity_id = invest.entity_id
and current_date between invest.start_date and invest.end_date
and invest.attribute_code = 'IssuerId'
and soi.attribute_code = 'SO'
and soi.val in ('1','2')
and current_date between soi.start_date and soi.end_date)
group by entity_id, value
With this I am getting following in the result set.
entity_id|effective_date|value|
A |2023-09-09 |234 |
A |2023-09-06 |345 |
B |2023-09-02 |341 |
C |2023-09-01 |347 |
My expected result set is
entity_id|effective_date|value|
A |2023-09-09 |234 |
B |2023-09-02 |341 |
C |2023-09-01 |347 |
In the expected, you can see the maximum effective date and unique entity IDs with their respective values. In the actual result set, I am getting entity ID A as duplicate record. When I remove the distinct value from the query and group by clause, I get my expected result but without the value column. I want the respective distinct entity ID value as well without duplication. What is wrong with my query?
3
Answers
You can use ROW_NUMBER-filtering approach
We produce first a list of entities with their max effective date using
group by
, then we join the table with this list :Demo here
It seems like you want the PostgreSQL-specific DISTINCT ON, which is not the same thing as DISTINCT.