Postgresql - Selecting lines with different values in a specific column

YagmurG
September 4, 2024
198 views
2 votes
3 Answers

Lets say there is a table with three columns and a million lines. How can I write a SQL statement where I get 20.000 lines, but with the restriction that a column (the second column) never has the same values?

Lets say this is the table:

Name – bpchar(17)	Bday – date	Wedding – date
Paul	1990-01-01	2010-01-01
Susi	1995-06-01	2020-01-01
Luis	1995-06-01	2022-06-15

Now lets assume there are a million rows in this table.

What I want to have now is:

up to 20.000 examples out of the million entries
but make sure that Bday is always unique in the result
values in the first and third column can appear multiple times in the result

Tags: postgresql sql

Answers

- SPYDER
- September 3, 2024 at 2:43 pm
- 0 votes
0
To get 20,000 rows from a table where each row has a unique value in a specific column (let’s call it column2).
First, we give each row a unique number based on the values in column2. This ensures that each value in column2 gets a number, starting from 1.
From these numbered rows, we keep only the first row for each unique value in column2. Finally, we select 20,000 rows from the results, making sure each row has a unique value in column2.

SQL:
```
WITH RankedRows AS (SELECT *, ROW_NUMBER() OVER (PARTITION BY column2 
ORDER BY (SELECT NULL)) AS rn 
FROM your_table), UniqueRows AS (SELECT * FROM RankedRows WHERE rn = 1) 
SELECT * FROM UniqueRows LIMIT 20000;
```
Login or Signup to reply.

- dr
- September 4, 2024 at 4:37 pm
- 0 votes
0
You could use Row_Number() Over() analytic function to enumerate rows per Bday(s) then select just the rows having Bday_rn = 1 and LIMIT number of rows returned:
```
SELECT   Name, Bday, Wedding
FROM   ( Select Name, Bday, Wedding, 
                Row_Number() Over(Partition By Bday Order By Wedding) as Bday_rn 
         From   tbl
       )
WHERE    Bday_rn = 1
LIMIT    200000
```
… OR using another Row_Number() Over() analytic function to control number of rows returned instead of LIMIT.
```
SELECT   Name, Bday, Wedding
FROM   ( SELECT   Row_Number() Over(Order By Bday, Wedding) as rn,
                  Name, Bday, Wedding
         FROM   ( Select Name, Bday, Wedding, 
                         Row_Number() Over(Partition By Bday Order By Wedding) as Bday_rn 
                  From   tbl
                )
         WHERE    Bday_rn = 1
       )
WHERE  rn <= 20000
```
See the fiddle here.
Login or Signup to reply.

- Belayer
- September 4, 2024 at 8:14 pm
- 0 votes
0
Postgres provides a distinct on() (see documentation) operation for specifically this type request. Something like: (see demo)
```
select distinct on(bdate)
       *
  from <table>
 order by bdate;
```
NOTE: You should avoid the type bpchar. This type specification stands for blank padded character. Using this type a query containing the predicate
```
where  'Paul' = name
```
would not find a match in he above data. Typically just use text or if you need to limit size varchar(17).
Login or Signup to reply.

Please signup or login to give your own answer.

Click here to cancel reply.

Postgresql – Selecting lines with different values in a specific column

Answers