skip to Main Content

I have booking sample data as follows:

-----------------------------------------------------------------------------------
hotel_id | roomtype_id | status   |  booking_timestamp  | checkin_date | checkout_date
-----------------------------------------------------------------------------------
1        | 1           | new      | 2023-02-02 00:15:30 | 2023-02-03   | 2023-02-05
1        | 1           | new      | 2023-02-04 09:10:15 | 2023-02-04   | 2023-02-04
1        | 1           | new      | 2023-02-07 03:00:10 | 2023-02-09   | 2023-02-10
1        | 1           | new      | 2023-02-08 04:10:10 | 2023-02-09   | 2023-02-10
1        | 1           | cancelled| 2023-02-08 10:15:30 | 2023-02-09   | 2023-02-10
2        | 2           | new      | 2023-02-03 06:40:14 | 2023-02-06   | 2023-02-07
2        | 2           | new      | 2023-02-05 08:10:14 | 2023-02-06   | 2023-02-08

Each row represent 1 room of a particular room type. I’m looking for effective way to calculate the total number of room sold exclude cancellation for each date in a particular date range of each room type in each hotel. Let’s say the date range that I want is started from 2023-02-01 to 2023-02-12. The expected output and explanation should be the following.

--------------------------------------------------------------
hotel_id | roomtype_id | date       | room_sold
--------------------------------------------------------------
1        | 1           | 2023-02-01 | 0          No new check-in yet
1        | 1           | 2023-02-02 | 0          No new check-in yet
1        | 1           | 2023-02-03 | 1          one room got checked-in (1st row of data)
1        | 1           | 2023-02-04 | 2          another room got check-in (2nd row of data)
1        | 1           | 2023-02-05 | 0          all rooms of this room type got check-out
1        | 1           | 2023-02-06 | 0          No new check-in yet
1        | 1           | 2023-02-07 | 0          No new check-in yet
1        | 1           | 2023-02-09 | 1          2 checked-in but 1 cancelled (row 3 to 5)
1        | 1           | 2023-02-10 | 0          all rooms of this room type got check-out
2        | 2           | 2023-02-01 | 0          No new check-in yet
2        | 2           | 2023-02-02 | 0          No new check-in yet
2        | 2           | 2023-02-03 | 0          No new check-in yet
2        | 2           | 2023-02-04 | 0          No new check-in yet
2        | 2           | 2023-02-05 | 0          No new check-in yet
2        | 2           | 2023-02-06 | 2          2 rooms got checked-in (row 6 and 7)
2        | 2           | 2023-02-07 | 1          1 room checked-out (row 6)
2        | 2           | 2023-02-08 | 0          1 room checked-out (row 7)
2        | 2           | 2023-02-09 | 0          No new check-in yet
2        | 2           | 2023-02-10 | 0          No new check-in yet
2        | 2           | 2023-02-11 | 0          No new check-in yet
2        | 2           | 2023-02-12 | 0          No new check-in yet

May I have your advise how to solve this problem?

3

Answers


  1. WITH RECURSIVE
    cte1 AS (
      SELECT DATE_FORMAT(MIN(checkin_date), '%Y-%m-01') the_date, MAX(checkout_date) last_date
      FROM test
      UNION ALL
      SELECT the_date + INTERVAL 1 DAY, last_date
      FROM cte1
      WHERE the_date < last_date
    ),
    cte2 AS (
      SELECT DISTINCT hotel_id, roomtype_id
      FROM test
    )
    SELECT cte2.hotel_id,
           cte2.roomtype_id,
           cte1.the_date, 
           SUM(CASE status WHEN 'new' THEN 1
                           WHEN 'cancelled' THEN -1
                           ELSE 0 
                           END) amount
    FROM cte1
    CROSS JOIN cte2
    LEFT JOIN test ON the_date BETWEEN checkin_date AND checkout_date
                  AND (cte2.hotel_id, cte2.roomtype_id) = (test.hotel_id, test.roomtype_id)
    GROUP BY 1,2,3
    ORDER BY 1,2,3
    
    hotel_id roomtype_id the_date amount
    1 1 2023-02-01 0
    1 1 2023-02-02 0
    1 1 2023-02-03 1
    1 1 2023-02-04 2
    1 1 2023-02-05 1
    1 1 2023-02-06 0
    1 1 2023-02-07 0
    1 1 2023-02-08 0
    1 1 2023-02-09 1
    1 1 2023-02-10 1
    2 2 2023-02-01 0
    2 2 2023-02-02 0
    2 2 2023-02-03 0
    2 2 2023-02-04 0
    2 2 2023-02-05 0
    2 2 2023-02-06 2
    2 2 2023-02-07 2
    2 2 2023-02-08 1
    2 2 2023-02-09 0
    2 2 2023-02-10 0

    fiddle

    If there exists a calendar table which contains the dates range which you need the output to obtain for then use it instead of cte1 (maybe with according WHERE). Accordingly if you have a table with hotels and rooms identifiers then use it instead of cte2.

    Login or Signup to reply.
  2. You should generate a series of dates for the given date range using a RECURSIVE and join this series of dates with your booking data. After that, group the joined data by hotel_id, roomtype_id, and date. Lastly, use conditional aggregation to count the number of rooms sold for each date, excluding cancellations:

    WITH RECURSIVE dates AS (
     SELECT DATE('2023-02-01') AS date
     UNION ALL
     SELECT DATE(date + INTERVAL 1 DAY)
     FROM dates
     WHERE date < '2023-02-12'
    )
    SELECT
     hotel_id,
     roomtype_id,
     date,
     COUNT(CASE WHEN bookings.status = 'new' THEN 1 END) AS room_sold
    FROM dates
    LEFT JOIN bookings ON dates.date BETWEEN bookings.checkin_date AND bookings.checkout_date
    GROUP BY hotel_id, roomtype_id, date;
    
    Login or Signup to reply.
  3. Generate a date range list using recursive first, and then join them with the check-in date.

    Notes: According to your expected output, I don’t calculate the check-out date to the amount, unless the check-in date and check-out date are on the same date.

    SET @startDate = '2023-02-01',
        @endDate = '2023-02-12';
    
    WITH RECURSIVE cteDate AS (
        SELECT @startDate AS `Date`
        UNION ALL
        SELECT DATE_ADD(`Date`, INTERVAL 1 DAY) AS `Date` FROM cteDate
        WHERE DATE_ADD(`Date`, INTERVAL 1 DAY) <= @endDate
    ),
    cteRoom AS (
        SELECT DISTINCT hotel_id, roomtype_id
        FROM tblData
    )
    SELECT cte.`Date`,
        cteRoom.hotel_id,
        cteRoom.roomtype_id,
        SUM(CASE WHEN tbl.status = 'new' THEN 1
                 WHEN tbl.status = 'cancelled' THEN -1
                 ELSE 0 END) AS TotalCheckedInRoom
    FROM cteDate cte
    CROSS JOIN cteRoom cteRoom
    LEFT JOIN tblData tbl
    ON ((cte.`Date` >= tbl.checkin_date AND cte.`Date` < tbl.checkout_date)             -- if date = checkout date, don't count
        OR (cte.`Date` = tbl.checkin_date AND tbl.checkin_date = tbl.checkout_date))    -- if checkin = checkout date, count as checked in
        AND cteRoom.hotel_id = tbl.hotel_id
        AND cteRoom.roomtype_id = tbl.roomtype_id
    GROUP BY 1,2,3
    ORDER BY 2,1,3
    
    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search