Ex.
SELECT *
FROM A
JOIN B ON A.idx = B.idx
JOIN C ON A.idx = C.idx
WHERE A.create_dt BETWEEN '2023-05-01' AND '2023-05-31'
OR A.last_dt BETWEEN '2023-05-01' AND '2023-05-31'
OR B.create_dt BETWEEN '2023-05-01' AND '2023-05-31'
OR B.last_dt BETWEEN '2023-05-01' AND '2023-05-31'
OR C.create_dt BETWEEN '2023-05-01' AND '2023-05-31'
OR C.last_dt BETWEEN '2023-05-01' AND '2023-05-31';
It is a PostgreSQL DB
While multi-joining the tables, I want to SELECT
only the data where the create_dt
or last_dt
of each table is the latest.
However, there is a problem that the speed is too slow. How to solve in this case?
Performance is fine when I leave only the A table WHERE
condition.
2
Answers
You might try adding the following indices to the three tables:
These indices, if used, should speed up the joins in your query.
Then you certainly cannot join on
idx
since the latest per table won’t share the sameidx
value.Your query does not do what you say. This one does:
All parentheses required.
A bit verbose. But it’s as fast as this gets – provided you have these indexes:
It will be two index seeks per table, directly picking the one qualifying row every time.
I am joining with an unconditional
CROSS JOIN
, since each subquery returns exactly one row, provided at least one qualifies.If one of the subqueries finds no row, the result is empty. Maybe you really want a
FULL OUTER JOIN
to preserve results from the other tables if one comes up empty. Or just 3 result rows.Then again, I wouldn’t be surprised if you didn’t exactly say what you really need.