I have an archive table in which only the changes to a record are stored with a validity date. An example would be a price archive for products.
And if you now want to select the status at a given date, I have the following SELECT.
The problem is that this gets slower and slower the more changes there are, because in the first step the database first reads all records and then in the next step appends those that have a greater validity date. In the end, only the ones with the largest validity date remain on the far left.
Is there any way to make this more effective?
SELECT d1.*
FROM data d1
LEFT JOIN data d2
ON d1.id = d2.id AND
d1.valid_from_date < d2.valid_from_date
WHERE
d2.id IS NULL AND
d1.valid_from_date <= '2023-01-01';
CREATE TABLE data (
id int,
valid_from_date DATE,
foo varchar(255),
bar varchar(255),
PRIMARY KEY (id, valid_from_date)
);
2
Answers
Try adding this index on the date and id columns.
EDIT According to your comment you already have an index (your PK) on
(id, valid_from_date)
.EDIT If you’re looking for the latest row for each
id
on or before a particular date, this query might help.The subquery gets the date you want for every id, then the JOIN retrieves the details. The MariaDB / MySQL query planner is quite clever about MAX … GROUP BY queries given appropriate indexes.
Read Markus Winand’s excellent ebook to learn a lot more about this topic.
You can try to use
NOT EXISTS
insteadLEFT JOIN
andNOT NULL
condition like: