we want to "fill in the gaps" of the record_ids and fill in missing values with the most recent available value. If there is no value available, use the first available value instead. The result should look like:
category | record_id | value |
---|---|---|
A | 1 | 0.01 |
A | 2 | 0.23 |
A | 3 | 0.23 |
A | 4 | 0.23 |
A | 5 | 0.15 |
A | 6 | 0.20 |
A | 7 | 0.08 |
B | 1 | 1.00 |
B | 2 | 1.00 |
B | 3 | 0.75 |
B | 4 | 0.75 |
B | 5 | 0.75 |
B | 6 | 0.93 |
B | 7 | 0.87 |
I tried to approach the above requirement with cte, and created an another table with sequence and from there I tried to find the missing values. I am just trying to get the most recent available values and assign to the missing numbers. Can someone suggest on this
with table_1
as (
select 'A' as category
,1 as record_id
,0.01 as value
union all
select 'A', 2, 0.23 union all
select 'A', 5, 0.15 union all
select 'A', 6, 0.20 union all
select 'A', 7, 0.08 union all
select 'B', 2, 1.00 union all
select 'B', 3, 0.75 union all
select 'B', 6, 0.93 union all
select 'B', 7, 0.87
),seq_num as(
select 1 as record_id union all
select 2 union all
select 3 union all
select 4 union all
select 5 union all
select 6 union all
select 7),missing_seq
as (
select t1.record_id from seq_num t1 where t1.record_id not in (select t2.record_id from table_1 t2 where t2.category = 'A')
union all
select t1.record_id from seq_num t1 where t1.record_id not in (select t2.record_id from table_1 t2 where t2.category = 'B')
)
select * from missing_seq
2
Answers
with this i am missing 4 from A and 5 from B category
with table_1 as ( select 'A' as category ,1 as record_id ,0.01 as value union all select 'A', 2, 0.23 union all select 'A', 5, 0.15 union all select 'A', 6, 0.20 union all select 'A', 7, 0.08 union all select 'B', 2, 1.00 union all select 'B', 3, 0.75 union all select 'B', 6, 0.93 union all select 'B', 7, 0.87 ), data as ( select t.*, lead(record_id) over(partition by category order by record_id) lead_record_id from table_1 t union all select category, 1, null, min(record_id) from table_1 group by category having min(record_id) > 1 ), rcte as ( select category, record_id, value, lead_record_id from data union all select category, record_id + 1, value, lead_record_id from data where record_id + 1 < lead_record_id ) select category, record_id, value from rcte order by category, record_id;
One option uses a recursive query:
The first CTE computes the "next" record id to each row. Then, the recursive CTE generates as many rows as needed to fill the gaps needed, while assigning the "latest" values to the new rows.
If you want all categories to start from record id
1
(even if the minimum record id for that category is greater than that), then we can adapt the first CTE:Note that this will assign a
null
values to the "first" records, since there is no previous value available at that point.fiddle