PostgreSQL select self referencing with two columns, part 2

Cris
March 23, 2023
260 views
0 votes
2 Answers

Based on the initial question where I needed the counter
PostgreSQL select self referencing with two columns

in addition to the counter I need data from the client and consultation itself, here is the better question to what I need

I have a clients, relations (which define relations and their type between clients) and consultations. What I wan’t to do is get the consultations of type new for partners and count their age at the time of the consultation. The problem is, it can happen that both clients have a consultation of type ‘New’ but then I only want to count one because I’m only interested in how many partners I have had in consultations.

Here’s a simplified sql of my database

CREATE TABLE clients (
  id INT NOT NULL,
  name VARCHAR NOT NULL,
  birthday DATE NOT NULL,
  PRIMARY KEY(id)
);

CREATE TABLE relations (
  id INT NOT NULL,
  from_client_id INT NOT NULL,
  to_client_id INT NOT NULL,
  relation_type VARCHAR NOT NULL,
  PRIMARY KEY(id)
);

CREATE TABLE consultations (
  id INT NOT NULL,
  client_id INT NOT NULL,
  type_of VARCHAR NOT NULL,
  "date" timestamp without time zone,
  PRIMARY KEY(id)
);

some seed data

insert into clients (id, name, birthday)
VALUES
(1, 'John Doe', '1979-05-01'),
(2, 'Jane Doe', '1980-06-14'),
(3, 'Fiddle Blaab', '1992-03-08'), 
(4, 'Agnes Blaab', '1993-04-08'),
(5, 'Kid Blaab', '2002-04-08');

insert into relations (id, from_client_id, to_client_id, relation_type)
VALUES
(1, 1, 2, 'partner'),
(2, 3, 4, 'partner'),
(3, 3, 5, 'child'),
(4, 4, 5, 'child');

insert into consultations (id, client_id, type_of, "date")
VALUES
(1, 1, 'New', '2022-03-01 12:00'),
(2, 2, 'New', '2022-03-01 12:01'),
(3, 2, 'Follow Up', '2022-04-01 14:39'),
(4, 3, 'New', '2022-08-01 12:59'),
(5, 4, 'Follow Up', '2022-09-12 14:00');

and selecting data gives me three consultations:

SELECT 
CASE 
  WHEN date_part('year', AGE(consultations.date, clients.birthday)) BETWEEN 0 AND 17 THEN '0 - 18'
  WHEN date_part('year', AGE(consultations.date, clients.birthday)) BETWEEN 18 AND 25 THEN '18-25'
  WHEN date_part('year', AGE(consultations.date, clients.birthday)) BETWEEN 26 AND 40 THEN '26-40'
  WHEN date_part('year', AGE(consultations.date, clients.birthday)) BETWEEN 41 AND 60 THEN '41-60'
  WHEN date_part('year', AGE(consultations.date, clients.birthday)) BETWEEN 60 AND 200 THEN '60+'
END AS "Age",
count(*) AS "Count"
FROM consultations
INNER JOIN clients ON consultations.client_id = clients.id 
  AND ("consultations"."date" BETWEEN '2022-01-01 00:00' AND '2022-12-31 23:59')
  AND "consultations"."type_of" = 'New'
INNER JOIN relations ON (relations.from_client_id = consultations.client_id OR relations.to_client_id = consultations.client_id)
  AND relations.relation_type = 'partner'
GROUP BY "Age"
ORDER BY "Age";

Here is what I get:

Age Count
26-40   1
41-60   2

but what I actually want is:

Age Count
26-40   1
41-60   1

This is because both John Doe and Jane Doe which are "Partners" have a consultation of type "new", but in this case I just want to count one consultation since I’m only interested in how many partners where in a consultation, this select counts me both Jane and John as Partner.

So my expected result would be 1 for John Doe and Jane Doe, and a second one for Fiddle Blaab and Agnes Blaab.

I’ve created a db fiddle

This is driving me nuts, I know I can code this but want to do it in sql which is much faster then looping through all these records in code.

Tags: postgresql sql

Answers

Though not sure it’s the efficient one or not it this query will do the work:

select 
  min(CASE 
  WHEN date_part('year', AGE(c.date, clients.birthday)) BETWEEN 0 AND 17 THEN '0 - 18'
  WHEN date_part('year', AGE(c.date, clients.birthday)) BETWEEN 18 AND 25 THEN '18-25'
  WHEN date_part('year', AGE(c.date, clients.birthday)) BETWEEN 26 AND 40 THEN '26-40'
  WHEN date_part('year', AGE(c.date, clients.birthday)) BETWEEN 41 AND 60 THEN '41-60'
  WHEN date_part('year', AGE(c.date, clients.birthday)) BETWEEN 60 AND 200 THEN '60+'
END) AS "Age",count(distinct r.id)
  from
(select id,from_client_id client_id from relations where relation_type='partner'
  union all
select id,to_client_id client_id from relations where relation_type='partner')r
inner join consultations c on r.client_id=c.client_id and c.type_of = 'New'
inner join clients on r.client_id=clients.id
group by r.id

Age	count
41-60	1
26-40	1

fiddle

try this..> 

select 
  summary.age, 
  count(distinct summary.age) as count_age > 
from 
  (
    select 
      allr.client_id >, 
      CASE > WHEN date_part(
        'year', 
        AGE(con.date, cli.birthday)
      ) BETWEEN 0 
      AND 17 THEN '0-18' > WHEN date_part(
        'year', 
        AGE(con.date, cli.birthday)
      ) BETWEEN 18 
      AND 25 THEN '18-25' > WHEN date_part(
        'year', 
        AGE(con.date, cli.birthday)
      ) BETWEEN 26 
      AND 40 THEN '26-40' > WHEN date_part(
        'year', 
        AGE(con.date, cli.birthday)
      ) BETWEEN 41 
      AND 60 THEN '41-60' > WHEN date_part(
        'year', 
        AGE(con.date, cli.birthday)
      ) BETWEEN 60 
      AND 200 THEN '60+' > END AS age > --
      > 
    from 
      (
        select 
          r1.from_client_id as client_id 
        from 
          relations r1 
        where 
          r1.relation_type = 'partner' > 
        union 
          > 
        select 
          r2.to_client_id as client_id 
        from 
          relations r2 
        where 
          r2.relation_type = 'partner' >
      ) allr > --
      > 
      inner join consultations con on (
        allr.client_id = con.client_id 
        and con.type_of = 'New'
      ) > 
      inner join clients cli on (allr.client_id = cli.id) >
  ) summary 
group by 
  summary.age

Please signup or login to give your own answer.

Click here to cancel reply.