Postgresql - How to query cases with multiple pattern matching values in the same column

louieriksson
October 4, 2024
88 views
1 vote
2 Answers

Ex Table:

Name	Location
John Smith	NewYork1131413431
John Smith	NewYork2222213
Marc White	Boston111118174124712
Sarah Phillips	NewYork2222123
Sarah Phillips	Boston2222918

I’m trying to query only the names in which they exist in 2 or more cities (for the case of this dataset, it will almost always be 2 cities max). So from this table, only Sarah Philips should return.

I wrote a query with the ‘like NewYork% or like Boston%’ operators with having count(location) > 1. But doing that would return John Smith as well because the NewYork string has multiple variations. The characters after the city name are also random in length.

Tags: postgresql sql

Answers

- JonasMetzler
- October 4, 2024 at 7:11 am
- 0 votes
0
If the data is always in this form, meaning the location consists of a town followed by some digits, you can use SUBSTRING with a regex check that fetches the substring before the first digit.

Then put this condition in the HAVING clause.
```
SELECT
  name
FROM yourtable
GROUP BY
  name
HAVING 
  COUNT(DISTINCT SUBSTRING(Location,'D+')) > 1;
```
This will as requested only return those people who occur in different locations, see this sample fiddle.

Note: It would be better to save the city and the rest of the string (the digits) in separate columns, this would make things easier.

Another note: You need to extend this solution if you have more possible cases to cover.

For example, if the town NewYork is sometimes written as New York, you likely want to treat both as the same town.

Then you would add a REPLACE to get rid of the space and change the condition to ...HAVING COUNT (DISTINCT SUBSTRING(REPLACE(Location,' ',''),'D+')) > 1

In case this is not sufficient and you need further assistance, please explain all possible cases you need to cover.
Login or Signup to reply.

- Florin
- October 4, 2024 at 4:07 pm
- 0 votes
0
Hello another way to solve this in PostgreSql is by using Regexp_Peplace build in function:

This is one way with the help of having clause:
```
with data as (SELECT
    name
, regexp_replace(location, '[0-9]', '', 'g') AS location
FROM
    people_location
  )
select name 
  from data
group by name
having min(location)<>max(location);
```
This is another way using Exists operator:
```
with data as (SELECT
    name
, regexp_replace(location, '[0-9]', '', 'g') AS location
FROM
    people_location
  )
select distinct name 
from data pl
where exists (select * from data where pl.name=name and pl.location<>location);
```
You may also check the fiddle created for this case.
https://dbfiddle.uk/_cIOdooQ

Hope it helps.
Login or Signup to reply.

Please signup or login to give your own answer.

Click here to cancel reply.

Postgresql – How to query cases with multiple pattern matching values in the same column

Answers