skip to Main Content

I want to perform a fill-down activity in PgSQL

DDL:

create table brands 
(
id int,
category varchar(20),
brand_name varchar(20)
);
insert into brands values
(1,'chocolates','5-star')
,(2,null,'dairy milk')
,(3,null,'perk')
,(4,null,'eclair')
,(5,'Biscuits','britannia')
,(6,null,'good day')
,(7,null,'boost')
,(8,'shampoo','h&s')
,(9,null,'dove');

Expected output is:

category brand_name
chocolates 5-star
chocolates dairy milk
chocolates perk
chocolates eclair
Biscuits britannia
Biscuits good day
Biscuits boost
Shampoo h&s
Shampoo dove

I tried using the following script but it doesn’t seem to work.

select id,
      first_value(category)
      over(order by case when category is not null then id end desc nulls last) as category,
      brand_name
from brands

Can someone suggest a fix.

In MS SQL the following snippet seems to work fine.

select id,
       first_value (category) IGNORE NULLS
       over(order by id desc
        rows between current row and unbounded following) as category,
       brand_name
FROM brands
ORDER BY id

4

Answers


  1. with cte as (
    select id,
           category,
           count(category) over (order by id) as category_id,
           brand_name
      from brands)
    select id,
           first_value(category) over (partition by category_id order by id) as category,
           brand_name
      from cte;
    

    UPDATE: added query without CTE per request:

    select id,
           (array_agg(category) over (order by id))[max(case when category is null then 0 else id end) over (order by id)] as category,
           brand_name
      from brands;
    
    Login or Signup to reply.
  2. I think there’s nothing wrong with using a CTE (see JHH’s answer) and I would prefer that.

    Postgres DB’s don’t provide this IGNORE NULLS concept of SQLServer DB’s, so I guess you should stop thinking you will get a nearly identic query for Postgres DB’s like in MS SQL.

    Anyway, if you don’t want to use a CTE or complex subquery, you can define your own function and aggregation and run this.

    Function creation:

    -- CREATE your function
    CREATE FUNCTION yourFunction(STATE anyelement, VALUE anyelement)
        RETURNS anyelement
        IMMUTABLE PARALLEL safe
    AS
    $$
    SELECT COALESCE(VALUE, STATE); -- Replace NULL values here
    $$ LANGUAGE SQL;
    

    Aggregate creation using the function:

    -- CREATE your aggregate
    CREATE AGGREGATE yourAggregate(ANYELEMENT) (
        sfunc = yourFunction, -- Call your function here
        stype = ANYELEMENT
    );
    

    Your query using this aggregate:

    SELECT id, 
      yourAggregate(category) -- Call your aggregate here
      OVER (ORDER BY id, category), 
      brand_name
    FROM brands
    ORDER BY id;
    

    Of course, you should rename both function and aggregate and use more meaningful names.

    This will produce the same outcome like the CTE version.

    Try out: db<>fiddle

    If you are keen on defining and using own functions and you will use that often, you could do that.

    Otherwise, just use a CTE, that’s quite fine. There is no reason to do not use CTE’s.

    Always be aware that while using own functions, you are on risk of bad performance, so you should check whether this query is too slow.

    Login or Signup to reply.
  3. I am afraid that’s not implemented in Postgres (at least up to Postgres 15). The manual about window functions:

    The SQL standard defines a RESPECT NULLS or IGNORE NULLS option
    for lead, lag, first_value, last_value, and nth_value. This
    is not implemented in PostgreSQL: the behavior is always the same as
    the standard’s default, namely RESPECT NULLS.

    So you have to use a workaround with a CTE or subquery like JHH suggested, or roll your own window function (which will be comparatively slow).

    See (answer to similar question on dba.SE):

    Login or Signup to reply.
  4. select *, max(category) over (partition by r) 
    from(
    select *,sum(case when category is not null then 1 end) over (order by rn) r
    from
    (select *,row_number() over() as rn from brands) l
    )p
    
    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search