skip to Main Content

I am trying to create a pivot table in Postgres from the following table
named product_info:

CREATE TABLE product_info (
    ID varchar(10),
    date VARCHAR(20) NOT NULL,
    product VARCHAR(20) NOT NULL,
    product_cost integer 
);

INSERT INTO product_info(ID, date, product, product_cost)
VALUES
  ('1', 'AUG-23','Laptop',100),
  ('1', 'AUG-23','Phone',80),
  ('1', 'AUG-23','Keypad',50),
  ('1', 'SEP-23','Laptop',200),
  ('1', 'SEP-23','Phone',100),
  ('1', 'SEP-23','Ipad',150),
  ('1', 'SEP-23','Keypad',80),
  ('2', 'AUG-23','Laptop',100),
  ('2', 'AUG-23','Phone',90),
  ('2', 'AUG-23','Keypad',70),
  ('2', 'SEP-23','Laptop',120),
  ('2', 'SEP-23','Phone',30),
  ('2', 'SEP-23','Ipad',80),
  ('2', 'SEP-23','Keypad',40);

Tried below PIVOT Query

SELECT * FROM crosstab(
    'SELECT date, id,product, sum(product_cost)
     FROM product_info
     GROUP BY 1, 2,3
     ORDER BY 1, 2,3'
,
    'SELECT DISTINCT product FROM product_info ORDER BY 1'
) as cte(
    date text,
    id text, 
    "Ipad" numeric,
    "Keypad" numeric,
    "Laptop" numeric,
    "Phone" numeric
)

RETURNED RESULT

date id Ipad Keypad Laptop Phone
AUG-23 1 NULL 70 100 90
SEP-23 1 80 40 120 30

I get a single row per month, with id = '1' and values for id = '2'.
I expected one row per ID.

EXPECTED RESULT

id date Ipad Keypad Laptop Phone
1 AUG-23 NULL 50 100 80
2 AUG-23 NULL 70 100 90
1 SEP-23 150 80 200 100
2 SEP-23 80 40 120 30

2

Answers


  1. As you can use only obe row with crosstab, you can do something like this.

    But simpler is a normal aggregation like the second query

    SELECT split_part(id, '%', 1) as id,
      split_part(id, '%', 2) as date,
            "Ipad" ,
        "Laptop" ,
        "Phone" ,
        "Keypad" 
      FROM crosstab(
        'SELECT  CONCAT(id,''%'',date) as id, product, sum(cost)
         FROM product_info
         GROUP BY 1, 2
         ORDER BY 1, 2'
    ,
        'SELECT DISTINCT product FROM product_info ORDER BY 1'
    ) as cte(
        id text,
        "Ipad" numeric,
        "Laptop" numeric,
        "Phone" numeric,
        "Keypad" numeric
    )
    
    id date Ipad Laptop Phone Keypad
    1 Aug-23 null 50 250 100
    1 Sep-23 200 60 260 100
    2 Aug-23 null 80 220 100
    2 Sep-23 90 80 100 100
    SELECT
      id,date,
      SUM(CASE WHEN product = 'Ipad' THEN cost END ) as "Ipad",
      SUM(CASE WHEN product = 'Laptop' THEN cost END ) as "Laptop",
      SUM(CASE WHEN product = 'Phone' THEN cost END ) as "Phone",
      SUM(CASE WHEN product = 'Keypad' THEN cost END ) as "Keypad"
    FROM 
    product_info
    GROUP BY id,date
    ORDER BY id,date
    
    id date Ipad Laptop Phone Keypad
    1 Aug-23 null 250 100 50
    1 Sep-23 200 260 100 60
    2 Aug-23 null 220 100 80
    2 Sep-23 90 100 100 80

    fiddle

    Login or Signup to reply.
  2. Proper crosstab() call

    Quoting the manual for crosstab():

    source_sql is an SQL statement that produces the source set of data.
    This statement must return one row_name column, one category
    column, and one value column. It may also have one or more “extra”
    columns. The row_name column must be first. The category and
    value columns must be the last two columns, in that order. Any
    columns between row_name and category are treated as “extra”. The
    “extra” columns are expected to be the same for all rows with the same
    row_name value.

    Your query does not produce data in the expected form:

    row_name | extra ... | category | value
    

    This query does:

    SELECT dense_rank() OVER (ORDER BY date, id) AS rn  -- row_name
         , to_char(date, 'MON-YY') AS date, id          -- extra
         , product                                      -- category
         , sum(product_cost) AS sum_cost                -- value
    FROM   product_info
    GROUP  BY date, id, product
    ORDER  BY date, id, product;
    

    Your "row name" is a composite of (date, id). (Simply concatenating could produce ambiguous results.) Generate a single-column row_name with the window function dense_rank() as demonstrated. date and id can be passed as "extra" columns.

    Also, a SELECT query that generates category names dynamically rarely makes sense, since the static (!) column definition list has to match.

    The query is based off an actual date column. Don’t store date information as text, which does not sort properly.

    This crosstab() call works:

    -- CREATE TABLE pivot_test AS (
    SELECT date, id, "Ipad", "Keypad", "Laptop", "Phone" 
    FROM   crosstab(
       $q$
       SELECT dense_rank() OVER (ORDER BY date, id)
            , to_char(date, 'MON-YY'), id
            , product
            , sum(product_cost)
       FROM   product_info
       GROUP  BY date, id, product
       ORDER  BY date, id, product
       $q$
    
    ,  $v$
       SELECT unnest('{Ipad, Keypad,Laptop, Phone}'::text[])
       $v$
       ) AS ct(rn int, date text, id int
             , "Ipad"   int
             , "Keypad" int
             , "Laptop" int
             , "Phone"  int);
    

    fiddle

    See:

    Plain SQL

    Typically, crosstab() is the fastest option. But for only few products, and since this case requires aggregation and a window function anyway, plain SQL may be even faster.

    SELECT to_char(date, 'MON-YY') AS date, id
         , SUM(product_cost) FILTER (WHERE product = 'Ipad')   AS "Ipad"
         , SUM(product_cost) FILTER (WHERE product = 'Keypad') AS "Keypad"
         , SUM(product_cost) FILTER (WHERE product = 'Laptop') AS "Laptop"
         , SUM(product_cost) FILTER (WHERE product = 'Phone')  AS "Phone"
    FROM   product_info
    GROUP  BY date, id
    ORDER  BY date, id;
    

    fiddle

    See:

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search