Postgresql - Postgres PIVOT table on aggregated count

shaikfaiz
September 18, 2023
290 views
1 vote
2 Answers

I am trying to create a pivot table in Postgres from the following table
named product_info:

CREATE TABLE product_info (
    ID varchar(10),
    date VARCHAR(20) NOT NULL,
    product VARCHAR(20) NOT NULL,
    product_cost integer 
);

INSERT INTO product_info(ID, date, product, product_cost)
VALUES
  ('1', 'AUG-23','Laptop',100),
  ('1', 'AUG-23','Phone',80),
  ('1', 'AUG-23','Keypad',50),
  ('1', 'SEP-23','Laptop',200),
  ('1', 'SEP-23','Phone',100),
  ('1', 'SEP-23','Ipad',150),
  ('1', 'SEP-23','Keypad',80),
  ('2', 'AUG-23','Laptop',100),
  ('2', 'AUG-23','Phone',90),
  ('2', 'AUG-23','Keypad',70),
  ('2', 'SEP-23','Laptop',120),
  ('2', 'SEP-23','Phone',30),
  ('2', 'SEP-23','Ipad',80),
  ('2', 'SEP-23','Keypad',40);

Tried below PIVOT Query

SELECT * FROM crosstab(
    'SELECT date, id,product, sum(product_cost)
     FROM product_info
     GROUP BY 1, 2,3
     ORDER BY 1, 2,3'
,
    'SELECT DISTINCT product FROM product_info ORDER BY 1'
) as cte(
    date text,
    id text, 
    "Ipad" numeric,
    "Keypad" numeric,
    "Laptop" numeric,
    "Phone" numeric
)

RETURNED RESULT

date	id	Ipad	Keypad	Laptop	Phone
AUG-23	1	NULL	70	100	90
SEP-23	1	80	40	120	30

I get a single row per month, with id = '1' and values for id = '2'.
I expected one row per ID.

EXPECTED RESULT

id	date	Ipad	Keypad	Laptop	Phone
1	AUG-23	NULL	50	100	80
2	AUG-23	NULL	70	100	90
1	SEP-23	150	80	200	100
2	SEP-23	80	40	120	30

Answers

As you can use only obe row with crosstab, you can do something like this.

But simpler is a normal aggregation like the second query

SELECT split_part(id, '%', 1) as id,
  split_part(id, '%', 2) as date,
        "Ipad" ,
    "Laptop" ,
    "Phone" ,
    "Keypad" 
  FROM crosstab(
    'SELECT  CONCAT(id,''%'',date) as id, product, sum(cost)
     FROM product_info
     GROUP BY 1, 2
     ORDER BY 1, 2'
,
    'SELECT DISTINCT product FROM product_info ORDER BY 1'
) as cte(
    id text,
    "Ipad" numeric,
    "Laptop" numeric,
    "Phone" numeric,
    "Keypad" numeric
)

id	date	Ipad	Laptop	Phone	Keypad
1	Aug-23	null	50	250	100
1	Sep-23	200	60	260	100
2	Aug-23	null	80	220	100
2	Sep-23	90	80	100	100

SELECT
  id,date,
  SUM(CASE WHEN product = 'Ipad' THEN cost END ) as "Ipad",
  SUM(CASE WHEN product = 'Laptop' THEN cost END ) as "Laptop",
  SUM(CASE WHEN product = 'Phone' THEN cost END ) as "Phone",
  SUM(CASE WHEN product = 'Keypad' THEN cost END ) as "Keypad"
FROM 
product_info
GROUP BY id,date
ORDER BY id,date

id	date	Ipad	Laptop	Phone	Keypad
1	Aug-23	null	250	100	50
1	Sep-23	200	260	100	60
2	Aug-23	null	220	100	80
2	Sep-23	90	100	100	80

fiddle

- ErwinBrandstetter
- September 7, 2023 at 11:24 pm
- 0 votes
0
Proper crosstab() call

Quoting the manual for crosstab():

source_sql is an SQL statement that produces the source set of data.
This statement must return one row_name column, one category
column, and one value column. It may also have one or more “extra”
columns. The row_name column must be first. The category and
value columns must be the last two columns, in that order. Any
columns between row_name and category are treated as “extra”. The
“extra” columns are expected to be the same for all rows with the same
row_name value.

Your query does not produce data in the expected form:
```
row_name | extra ... | category | value
```
This query does:
```
SELECT dense_rank() OVER (ORDER BY date, id) AS rn  -- row_name
     , to_char(date, 'MON-YY') AS date, id          -- extra
     , product                                      -- category
     , sum(product_cost) AS sum_cost                -- value
FROM   product_info
GROUP  BY date, id, product
ORDER  BY date, id, product;
```
Your "row name" is a composite of (date, id). (Simply concatenating could produce ambiguous results.) Generate a single-column row_name with the window function dense_rank() as demonstrated. date and id can be passed as "extra" columns.

Also, a SELECT query that generates category names dynamically rarely makes sense, since the static (!) column definition list has to match.

The query is based off an actual date column. Don’t store date information as text, which does not sort properly.

This crosstab() call works:
```
-- CREATE TABLE pivot_test AS (
SELECT date, id, "Ipad", "Keypad", "Laptop", "Phone" 
FROM   crosstab(
   $q$
   SELECT dense_rank() OVER (ORDER BY date, id)
        , to_char(date, 'MON-YY'), id
        , product
        , sum(product_cost)
   FROM   product_info
   GROUP  BY date, id, product
   ORDER  BY date, id, product
   $q$

,  $v$
   SELECT unnest('{Ipad, Keypad,Laptop, Phone}'::text[])
   $v$
   ) AS ct(rn int, date text, id int
         , "Ipad"   int
         , "Keypad" int
         , "Laptop" int
         , "Phone"  int);
```
fiddle

See:
Plain SQL

Typically, crosstab() is the fastest option. But for only few products, and since this case requires aggregation and a window function anyway, plain SQL may be even faster.
```
SELECT to_char(date, 'MON-YY') AS date, id
     , SUM(product_cost) FILTER (WHERE product = 'Ipad')   AS "Ipad"
     , SUM(product_cost) FILTER (WHERE product = 'Keypad') AS "Keypad"
     , SUM(product_cost) FILTER (WHERE product = 'Laptop') AS "Laptop"
     , SUM(product_cost) FILTER (WHERE product = 'Phone')  AS "Phone"
FROM   product_info
GROUP  BY date, id
ORDER  BY date, id;
```
fiddle

See:
- Sum by month and put months as columns
- Aggregate columns with additional (distinct) filters
Login or Signup to reply.

Please signup or login to give your own answer.

Click here to cancel reply.

Postgresql – Postgres PIVOT table on aggregated count

Answers

Proper `crosstab()` call

Plain SQL

Postgresql – Postgres PIVOT table on aggregated count

Answers

Proper crosstab() call

Plain SQL

Proper `crosstab()` call