The users to submit an interval of starting and ending pages that he/she read in a specific
book. Note that the user can submit multiple intervals for the same book.
I need the query announce the most recommended five books in the system, which are picked based on how many unique pages have been read for
all the users that submitted their intervals in the first operation (sorted by books with the most read pages to books with the least read pages).
The book_user table is the pivot table which I need the query on so how to get the result below for the inserted records which are:
Reading intervals:
User 1 read from page 10 to page 30 in Book 1
User 2 read from page 2 to page 25 in Book 1
User 1 read from page 40 to page 50 in Book 2
User 3 read from page 1 to page 10 in Book 2
The most read books results:
Book 1 -> 28 pages
Book 2 -> 20 pages
I try this query:
select 'book_id',books.name as book_name,SUM(end_page - start_page) AS num_of_read_pages FROM book_user JOIN books ON books.id=book_user.book_id GROUP BY book_id ORDER BY num_of_read_pages DESC;
But it’s not get the unique page for overlap intervals
And when I ask chatgpt it gives me this query which is recursive cte but it not work it just looping
WITH RECURSIVE cte AS (
SELECT book_id, MIN(start_page) AS start_page, MAX(end_page) AS end_page
FROM book_user
GROUP BY book_id, start_page
UNION ALL
SELECT cte.book_id, cte.start_page, cte.end_page
FROM cte
JOIN book_user ON cte.book_id = book_user.book_id AND cte.start_page <= book_user.start_page AND cte.end_page >= book_user.end_page
)
SELECT book_id, SUM(end_page - start_page + 1) AS total_pages
FROM cte
GROUP BY book_id
ORDER BY total_pages DESC;
2
Answers
See example.
2.1. Anchor. Take ranges, where start_page not belongs any other range.
From set (3-10),(5-12),(4-7) take (3-10)
2.2. Join ranges
Details here demo
Output
Test data
See query for task type
gaps and islands
Demo