Postgresql - Return rows where column value is final superset of preceding rows

AlphaeTrading
September 2, 2024
157 views
0 votes
2 Answers

I am using PostgreSQL DB and have use case where I need to return all the rows from table whose column value is superset of same column value of other preceding rows.

Example – the table has following rows:

-------------------------------------------
id   |   name       |   datetime
-------------------------------------------
1    |   N          |   2024-08-17 04:42:57 
2    |   NEW        |   2024-08-17 04:42:58    
3    |   NEW YOR    |   2024-08-17 04:42:58    
4    |   NEW YORK   |   2024-08-17 04:42:59    
5    |   W          |   2024-08-17 04:56:07    
6    |   WA         |   2024-08-17 04:56:07    
7    |   WASHI      |   2024-08-17 04:56:07    
8    |   WASHINGT   |   2024-08-17 04:56:08   
9    |   WASHINGTON |   2024-08-17 04:56:08   
10   |   NEW Y      |   2024-08-17 05:12:18   
11   |   NEW YORK   |   2024-08-17 05:12:18

I need SQL which can return following result against above table:

-------------------------------------------
id   |   name       |   datetime
-------------------------------------------
4    |   NEW YORK   |   2024-08-17 04:42:59    
9    |   WASHINGTON |   2024-08-17 04:56:08   
11   |   NEW YORK   |   2024-08-17 05:12:18

Tags: postgresql sql

Answers

- ErwinBrandstetter
- September 2, 2024 at 2:08 am
- 0 votes
0
Your requirement can be translated to:

"Take rows where the name in the next row does not start with the name of the current row."

So:
```
SELECT id, name, datetime 
FROM  (
   SELECT *, lead(name, 1, '') OVER (ORDER BY id) AS next_name
   FROM   tbl
   ORDER  BY id
   ) sub
WHERE  NOT (next_name ^@ name);
```
fiddle

Assuming rows are indeed sorted by id as you make it seem, and involved columns are defined NOT NULL.

Two steps:
1. In subquery sub, take the name of the next row according to your sort order with the window function lead(), let’s call it next_name.
  To cover the corner case of the last row, default to the empty string (''), when there is no next row.
2. In the outer SELECT, filter rows where next_name does not start with name. Should be exactly what you ask for.
^@ is the "starts with" operator added with Postgres 11. See:
- PostgreSQL LIKE query performance variations
(Substitute with LIKE or ~ in older versions.)

Other possible (more verbose) solutions:
- a recursive CTEs
- a procedural solution in a PL/pgSQL function.
Login or Signup to reply.

- Amadan
- September 2, 2024 at 2:33 am
- 0 votes
0
If the table is not sorted according to name, this query can do it, as it does not rely on window. It is fairly readable, though not very fast, as it needs two subqueries per row:
```
SELECT * FROM t
WHERE EXISTS (
  SELECT * FROM t t2
  WHERE t.name ^@ t2.name AND t.name != t2.name
)
AND NOT EXISTS (
  SELECT * FROM t t3
  WHERE t3.name ^@ t.name AND t3.name != t.name
);
```
Select all rows where there exists a smaller name elsewhere (starting with the same string), and there is no larger name elsewhere (starting with the same string).
Login or Signup to reply.

Please signup or login to give your own answer.

Click here to cancel reply.

Postgresql – Return rows where column value is final superset of preceding rows

Answers