This is an example of the initial table data:
| id | reservation |
| -------- | ----------- |
| 1 | 15 |
| 2 | 0 |
| 3 | 0 |
| 4 | 16 |
| 5 | 16 |
| 6 | 0 |
| 7 | 0 |
| 8 | 0 |
| 9 | 17 |
| 10 | 17 |
| 11 | 17 |
| 12 | 17 |
| 13 | 17 |
| 14 | 17 |
| 15 | 0 |
| 16 | 0 |
| 17 | 0 |
| 18 | 0 |
...
I would like to set reservation = 18 only on the first 3 consecutive rows without a reservation, that is, with reservation = 0.
After the update, the resulting table would be:
| id | reservation |
| -------- | ----------- |
| 1 | 15 |
| 2 | 0 |
| 3 | 0 |
| 4 | 16 |
| 5 | 16 |
| 6 | 18 |
| 7 | 18 |
| 8 | 18 |
| 9 | 17 |
| 10 | 17 |
| 11 | 17 |
| 12 | 17 |
| 13 | 17 |
| 14 | 17 |
| 15 | 0 |
| 16 | 0 |
| 17 | 0 |
| 18 | 0 |
...
Is it possible to do this using recursive common table expressions?
How can I do it?
Thanks!
3 Answers
This is a gaps-and-islands problem. You’re trying to find islands of consecutive `id`s where `reservation` is `0`. Problems like this aren’t typically solved with recursive CTEs. Instead, they’re solved using windowing and ranking functions like `LAG()` or `ROW_NUMBER()`. Such functions were added to MySQL in version 8.0, which means that most articles on gaps-and-islands refer to different databases. The functionality is the same though.
One way to solve the problem is to take advantage of the consecutive IDs. The `ROW_NUMBER()` ranking function returns the row number within the subset of rows without reservations. The difference between the ID and that row number is the same for every row inside an island, so it can serve as an island identifier (`island_id`).
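The island identifier can be tried out with a small sketch. It is not the original query from this answer: it assumes a table named `reservations` with the question's `id` and `reservation` columns, and it runs against an in-memory SQLite database through Python's `sqlite3` module as a stand-in for MySQL 8.0 (window functions need SQLite 3.25 or later).

```python
import sqlite3

# Stand-in data: the table from the question, ids 1..18 (table name is an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reservations (id INTEGER PRIMARY KEY, reservation INTEGER)")
values = [15, 0, 0, 16, 16, 0, 0, 0, 17, 17, 17, 17, 17, 17, 0, 0, 0, 0]
conn.executemany("INSERT INTO reservations VALUES (?, ?)", enumerate(values, start=1))

# WHERE is applied before the window function, so ROW_NUMBER() counts free rows only.
island_rows = conn.execute("""
    SELECT id, id - ROW_NUMBER() OVER (ORDER BY id) AS island_id
    FROM reservations
    WHERE reservation = 0
    ORDER BY id
""").fetchall()

for row in island_rows:
    print(row)  # (id, island_id)
```

With the question's data, ids 2-3 share one `island_id`, ids 6-8 another, and ids 15-18 a third.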
Once we have a way to identify the islands, we can group by `island_id` and keep only the islands with exactly 3 members.
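A rough sketch of that grouping step, under the same assumptions as before (a `reservations` table holding the question's data, SQLite via Python's `sqlite3` standing in for MySQL 8.0). The CTE name `islands` and the output column names are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reservations (id INTEGER PRIMARY KEY, reservation INTEGER)")
values = [15, 0, 0, 16, 16, 0, 0, 0, 17, 17, 17, 17, 17, 17, 0, 0, 0, 0]
conn.executemany("INSERT INTO reservations VALUES (?, ?)", enumerate(values, start=1))

# Group the free rows by island and keep only islands with exactly 3 members.
three_member_islands = conn.execute("""
    WITH islands AS (
        SELECT id, id - ROW_NUMBER() OVER (ORDER BY id) AS island_id
        FROM reservations
        WHERE reservation = 0
    )
    SELECT island_id, MIN(id) AS start_id, MAX(id) AS end_id, COUNT(*) AS members
    FROM islands
    GROUP BY island_id
    HAVING COUNT(*) = 3
    ORDER BY island_id
""").fetchall()

print(three_member_islands)
```

For the question's data only the island covering ids 6 to 8 survives the `HAVING` filter.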
Adding `ROW_NUMBER() OVER (ORDER BY island_id) AS Range_ID` gives each range an identifier, which lets us update only the IDs in the first (or last) range. The query can itself be turned into a CTE and used to update the Reservations table.
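Here is one way the CTE-plus-update could look. This is a sketch, not the answer's original statement: it runs on SQLite through Python's `sqlite3`, so it filters with `WHERE id IN (...)` where MySQL would use `UPDATE ... JOIN`, and the CTE names (`islands`, `grouped`, `ranges`) are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reservations (id INTEGER PRIMARY KEY, reservation INTEGER)")
values = [15, 0, 0, 16, 16, 0, 0, 0, 17, 17, 17, 17, 17, 17, 0, 0, 0, 0]
conn.executemany("INSERT INTO reservations VALUES (?, ?)", enumerate(values, start=1))

conn.execute("""
    WITH islands AS (
        SELECT id, id - ROW_NUMBER() OVER (ORDER BY id) AS island_id
        FROM reservations
        WHERE reservation = 0
    ),
    grouped AS (
        SELECT island_id, MIN(id) AS start_id, MAX(id) AS end_id
        FROM islands
        GROUP BY island_id
        HAVING COUNT(*) = 3
    ),
    ranges AS (
        -- range_id = 1 is the first 3-row island in id order
        SELECT start_id, end_id,
               ROW_NUMBER() OVER (ORDER BY island_id) AS range_id
        FROM grouped
    )
    UPDATE reservations
    SET reservation = 18
    WHERE id IN (
        SELECT r.id
        FROM reservations AS r
        JOIN ranges ON r.id BETWEEN ranges.start_id AND ranges.end_id
        WHERE ranges.range_id = 1
    )
""")

after = [r[0] for r in conn.execute("SELECT reservation FROM reservations ORDER BY id")]
print(after)
```

Rows 6, 7 and 8 end up with reservation 18, which matches the table the question asks for.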
`UPDATE ... JOIN` is a MySQL peculiarity. In other databases (PostgreSQL, SQL Server) you’d write `UPDATE ... FROM` instead.
Why `COUNT(*) = 3`?
This avoids fragmentation of the remaining islands. If there were three ranges with 5, 3, and 6 rooms free, selecting the first available range would result in a new 2-room range that would be harder to use. Selecting the exact match would create no new range, while selecting the largest would create a 3-room range that would be more useful.
This is a common problem in disk and RAM allocation – or any allocation scenario.
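The arithmetic in the 5, 3, 6 example can be checked with a few lines of plain Python. The helper name `leftover` is made up for this illustration.

```python
def leftover(free_sizes, request, choose):
    """Size of the fragment left after taking `request` rooms from the chosen range."""
    candidates = [size for size in free_sizes if size >= request]
    return choose(candidates) - request

free_sizes = [5, 3, 6]  # the three free ranges from the example above

first_fit = leftover(free_sizes, 3, lambda c: c[0])  # takes the 5-room range
exact_fit = leftover(free_sizes, 3, min)             # takes the 3-room range
largest_fit = leftover(free_sizes, 3, max)           # takes the 6-room range

print(first_fit, exact_fit, largest_fit)  # 2 0 3
```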
To take advantage of larger areas, one would need a strategy that reduces fragmentation. This can be done by changing the `Range_ID` order.
The first strategy, using `(ORDER BY island_id)`, will keep picking the same ranges first.
A best match strategy, which picks the smallest range that fits, can be specified with `ORDER BY MAX(ID)-MIN(ID)`. Surprisingly, this is not a good strategy as it results in a lot of small islands. That’s why file systems try to avoid it for larger files.
A worst match strategy, which picks the largest range with `ORDER BY MAX(ID)-MIN(ID) DESC`, results in less fragmentation and larger ranges.
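To see how the ordering changes which range gets `Range_ID = 1`, here is a sketch with hypothetical data: free islands of 5, 3 and 6 rows. It assumes the filter is relaxed to `COUNT(*) >= 3`, uses SQLite via Python's `sqlite3` in place of MySQL, and the names `ORDERINGS` and `pick_range` are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reservations (id INTEGER PRIMARY KEY, reservation INTEGER)")
# Hypothetical data: free islands of 5, 3 and 6 rows, separated by reserved rows.
values = [0] * 5 + [1] + [0] * 3 + [1] + [0] * 6
conn.executemany("INSERT INTO reservations VALUES (?, ?)", enumerate(values, start=1))

# span is MAX(id) - MIN(id), i.e. island size minus one.
ORDERINGS = {
    "first": "island_id",
    "best": "span, island_id",         # smallest fitting island first
    "worst": "span DESC, island_id",   # largest island first
}

def pick_range(strategy, wanted=3):
    """Return (start_id, size) of the island that gets range_id = 1."""
    query = f"""
        WITH islands AS (
            SELECT id, id - ROW_NUMBER() OVER (ORDER BY id) AS island_id
            FROM reservations
            WHERE reservation = 0
        ),
        grouped AS (
            SELECT island_id, MIN(id) AS start_id,
                   MAX(id) - MIN(id) AS span, COUNT(*) AS size
            FROM islands
            GROUP BY island_id
            HAVING COUNT(*) >= ?
        ),
        ranges AS (
            SELECT start_id, size,
                   ROW_NUMBER() OVER (ORDER BY {ORDERINGS[strategy]}) AS range_id
            FROM grouped
        )
        SELECT start_id, size FROM ranges WHERE range_id = 1
    """
    return conn.execute(query, (wanted,)).fetchone()

for name in ORDERINGS:
    print(name, pick_range(name))
```

With this data the first-fit ordering picks the 5-row island starting at id 1, best fit picks the 3-row island at id 7, and worst fit picks the 6-row island at id 11.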
You can use `ROW_NUMBER()` to generate a continuous sequence and subtract it from your `id` column to find groups of consecutive 0 values. The innermost subquery (t1) gives us the groups of consecutive ids, and the next subquery (t2) gives us the count within each group.
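A guess at what the nested t1/t2 shape might look like, since the original query is not shown here. The column names `grp` and `cnt`, and the use of `COUNT(*) OVER (PARTITION BY grp)` for the per-group count, are assumptions. It runs on SQLite via Python's `sqlite3` rather than MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reservations (id INTEGER PRIMARY KEY, reservation INTEGER)")
values = [15, 0, 0, 16, 16, 0, 0, 0, 17, 17, 17, 17, 17, 17, 0, 0, 0, 0]
conn.executemany("INSERT INTO reservations VALUES (?, ?)", enumerate(values, start=1))

grouped_rows = conn.execute("""
    SELECT id, grp, cnt
    FROM (
        -- t2: attach the size of its group to every free row
        SELECT id, grp, COUNT(*) OVER (PARTITION BY grp) AS cnt
        FROM (
            -- t1: id minus a running row number is constant within a group
            SELECT id, id - ROW_NUMBER() OVER (ORDER BY id) AS grp
            FROM reservations
            WHERE reservation = 0
        ) AS t1
    ) AS t2
    WHERE cnt >= 3
    ORDER BY id
""").fetchall()

print(grouped_rows)
```

Only the free rows that belong to a group of at least 3 remain: ids 6-8 (count 3) and ids 15-18 (count 4).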
If you are concerned about fragmentation, which you should be as pointed out by Panagiotis, you can change the ordering so that you get an exact match on the gap size if one is available (by replacing its size with a high out-of-range value); otherwise it takes the largest gap, to reduce the number of tiny gaps created.
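The "exact match first, otherwise largest gap" ordering could be sketched like this. It is an illustration rather than the answer's query: the helper `pick_gap`, the sentinel value `1000000` and the second data set are assumptions, and SQLite via Python's `sqlite3` stands in for MySQL.

```python
import sqlite3

def pick_gap(values, wanted=3):
    """Return (start_id, size) of the gap to use, or None if nothing fits."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE reservations (id INTEGER PRIMARY KEY, reservation INTEGER)")
    conn.executemany("INSERT INTO reservations VALUES (?, ?)", enumerate(values, start=1))
    row = conn.execute("""
        SELECT MIN(id) AS start_id, COUNT(*) AS cnt
        FROM (
            SELECT id, id - ROW_NUMBER() OVER (ORDER BY id) AS grp
            FROM reservations
            WHERE reservation = 0
        ) AS t1
        GROUP BY grp
        HAVING COUNT(*) >= :wanted
        -- an exact match sorts as a huge out-of-range size, so it always wins
        ORDER BY CASE WHEN COUNT(*) = :wanted THEN 1000000 ELSE COUNT(*) END DESC,
                 MIN(id)
        LIMIT 1
    """, {"wanted": wanted}).fetchone()
    conn.close()
    return row

question_data = [15, 0, 0, 16, 16, 0, 0, 0, 17, 17, 17, 17, 17, 17, 0, 0, 0, 0]
no_exact_match = [0] * 5 + [1] + [0] * 4 + [1] + [0] * 6  # gaps of 5, 4 and 6

print(pick_gap(question_data))   # exact 3-row gap exists
print(pick_gap(no_exact_match))  # falls back to the largest gap
```

On the question's data the exact 3-row gap at id 6 is chosen even though a 4-row gap exists; with gaps of 5, 4 and 6 the 6-row gap starting at id 12 is chosen.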
Here’s a db<>fiddle showing the steps.
For MySQL 8.0 or earlier versions, it’s possible to do this without window functions or `ROW_NUMBER()` sequencing (assuming all your id values are consecutive). Since you only need 3 consecutive 0s, a self-join of the table is enough to find the first id of such a run.
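A minimal sketch of the self-join idea, assuming a `reservations` table with the question's data; the aliases `a`, `b`, `c` are illustrative and the query is run on SQLite through Python's `sqlite3` instead of MySQL. Note that, unlike the exact-size filter discussed earlier, this finds any run of at least 3 free rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reservations (id INTEGER PRIMARY KEY, reservation INTEGER)")
values = [15, 0, 0, 16, 16, 0, 0, 0, 17, 17, 17, 17, 17, 17, 0, 0, 0, 0]
conn.executemany("INSERT INTO reservations VALUES (?, ?)", enumerate(values, start=1))

# Every id that is free and whose next two ids are free as well.
candidate_starts = [row[0] for row in conn.execute("""
    SELECT a.id
    FROM reservations AS a
    JOIN reservations AS b ON b.id = a.id + 1
    JOIN reservations AS c ON c.id = a.id + 2
    WHERE a.reservation = 0 AND b.reservation = 0 AND c.reservation = 0
    ORDER BY a.id
""")]

first_start = min(candidate_starts) if candidate_starts else None
print(candidate_starts, first_start)
```

For the question's data the candidates are ids 6, 15 and 16, so the first run starts at id 6.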
Then use `CASE` to replace the reservation value of the rows whose id lies between the calculated initial id and the two ids right after it, using `BETWEEN`.
By the way, if ID is indeed consecutive and unique, please make sure it’s the table’s PRIMARY KEY to improve performance.
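The `CASE` ... `BETWEEN` update described above might look roughly like this. It is a sketch on SQLite via Python's `sqlite3`, done in two steps (find the initial id, then update) to keep it simple; in MySQL the same logic would more likely be a single `UPDATE` joined to a derived table. The variable names are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reservations (id INTEGER PRIMARY KEY, reservation INTEGER)")
values = [15, 0, 0, 16, 16, 0, 0, 0, 17, 17, 17, 17, 17, 17, 0, 0, 0, 0]
conn.executemany("INSERT INTO reservations VALUES (?, ?)", enumerate(values, start=1))

# Step 1: first id that starts a run of three free rows (self-join, no window functions).
start_id = conn.execute("""
    SELECT MIN(a.id)
    FROM reservations AS a
    JOIN reservations AS b ON b.id = a.id + 1
    JOIN reservations AS c ON c.id = a.id + 2
    WHERE a.reservation = 0 AND b.reservation = 0 AND c.reservation = 0
""").fetchone()[0]

# Step 2: CASE keeps every other row unchanged.
if start_id is not None:
    conn.execute("""
        UPDATE reservations
        SET reservation = CASE
            WHEN id BETWEEN ? AND ? THEN 18
            ELSE reservation
        END
    """, (start_id, start_id + 2))

updated = [r[0] for r in conn.execute("SELECT reservation FROM reservations ORDER BY id")]
print(updated)
```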