Amazon web services - How to duplicate same row X number of times based on the value of a column?

ritarita
October 31, 2024
170 views
0 votes
2 Answers

I have a dataset looks like this:

Link_number	Houband	Time	Mean_speed	Sample_Number
Link 1	8	8:00	52	2
Link 1	8	8:30	55	5
Link 2	9	9:00	20	3
Link 2	9	9:30	40	4

I need to do duplicate each row X number of times, X = Sample_Number. So my result will look like this:

Link_number	Houband	Time	Mean_speed	Sample_Number
Link 1	8	8:00	52	2
Link 1	8	8:00	52	2
Link 1	8	8:30	55	5
Link 1	8	8:30	55	5
Link 1	8	8:30	55	5
Link 1	8	8:30	55	5
Link 1	8	8:30	55	5
Link 2	9	9:00	20	3
Link 2	9	9:00	20	3
Link 2	9	9:00	20	3
Link 2	9	9:30	40	4
Link 2	9	9:30	40	4
Link 2	9	9:30	40	4
Link 2	9	9:30	40	4

So then I can calculate the percentile of speed in each hourband for each link.

I have been able to do this in Postgresql using generate_series:

select Link_number, Hourband, percentile_cont(0.85) within group (order by Mean_Speed) as Speed_85th
from table, generate_series(1,Sample_Number)

How do I do the same in Amazon Athena?

Answers

- AmiraBedhiafi
- October 31, 2024 at 9:54 am
- 0 votes
0
As @Serg recommended, you can use a tally table to generate a sequence of numbers then use a cross join to duplicate the rows based on these values.
```
WITH numbers AS (
    SELECT n
    FROM (VALUES (1), (2), (3), (4), (5), (6), (7), (8), (9), (10)) AS t(n)
)

SELECT 
    Link_number,
    Houband,
    Time,
    Mean_speed,
    Sample_Number
FROM 
    your_table AS t
JOIN 
    numbers AS n
ON 
    n.n <= t.Sample_Number
```
Login or Signup to reply.

You can use unnest + sequence (which is kind of analog of generate_series based on what I see):

-- sample data
WITH dataset(Link_number, Houband, Time, Mean_speed, Sample_Number) as (
    values ('Link 1',   8,  '8:00', 52, 2),
        ('Link 1',  8,  '8:30', 55, 5),
        ('Link 2',  9,  '9:00', 20, 3),
        ('Link 2',  9,  '9:30', 40, 4)
)

-- query
select Link_number, Houband, Time, Mean_speed, Sample_Number
from dataset,
     unnest(sequence(1, Sample_Number)) as t(ignored);

Please signup or login to give your own answer.

Click here to cancel reply.

Amazon web services – How to duplicate same row X number of times based on the value of a column?

Answers