skip to Main Content

I need to create an id-column: on every TRUE statement it should be a normal serial number, but on every FALSE statement the field value should contain the previous number in this column. In other words, every FALSE should repeat the previous value.

create table t1(column_id,column_bool)as values
   (1,true)
  ,(2,true)
  ,(3,false)
  ,(4,true)
  ,(5,true)
  ,(6,true)
  ,(7,false)
  ,(8,true);

I tried to give every FALSE-value its own number, but I don’t know how to lower them by 1 after every single FALSE-iteration.

So here is the code:

SELECT *, 
CASE WHEN column_bool = true THEN ROW_NUMBER() OVER (PARTITION BY 
column_bool ORDER BY (column_id)) 
ELSE ROW_NUMBER() OVER (ORDER BY (column_id)) -1
END AS row_number_column
FROM t1
ORDER BY column_id

Here is what happens:

column_id column_bool row_number_column
1 t 1
2 t 2
3 F 2
4 t 3
5 t 4
6 t 5
7 F 6
8 t 6

And here is what I want to happen

column_id column_bool row_number_column
1 t 1
2 t 2
3 F 2
4 t 3
5 t 4
6 t 5
7 F 5
8 t 6

Every FALSE number should repeat the previous one. Is there any way to do it?

3

Answers


  1. We can use COUNT() here as a window function:

    SELECT *, 
        COUNT(CASE WHEN column_bool = true THEN 1 END)
            OVER (ORDER BY column_id) AS row_number_column
    FROM t1
    ORDER BY column_id;
    

    The above logic computed a running count of true values over the table as ordered by the column_id. Note that the above sequence is 0 based.

    Login or Signup to reply.
  2. In PostgreSQL casting bool to int should also do the work:

    SUM(column_bool::int) OVER (ORDER BY column_id) AS row_number_column
    
    Login or Signup to reply.
  3. If you set up a window that goes through your records in the order of column_id, you can keep count of rows, using filter to only count those where column_bool is true:
    demo at db<>fiddle

    select *,count(*)filter(where column_bool)over w1
    from t1
    window w1 as(order by column_id);
    
    column_id column_bool row_number_column
    1 t 1
    2 t 2
    3 F 2
    4 t 3
    5 t 4
    6 t 5
    7 F 5
    8 t 6

    The order specified in the window definition assumes a rows between unbounded preceding and current row frame, which means the count in each row becomes a stepping/tumbling/rolling count of stuff found from the first row up to the current one, not a grand total spanning the whole table:

    The default framing option is RANGE UNBOUNDED PRECEDING, which is the same as RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW. With ORDER BY, this sets the frame to be all rows from the partition start up through the current row’s last ORDER BY peer.

    Counting only filtered values you’re interested in is cheaper than casting all of them then summing, or emulating the filter clause with a conditional aggregate that also has to evaluate a case for all rows instead of only those of interest.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search