Trying to populate field has_impact based on the following for each feed view in the session.
has_impact is true when difference between (view_end_time – view_start_time) > 3 and view_perc > 0.8 else false.
view_logs table
create table view_logs(session_id varchar(10), post_id int,
ts int, event_name varchar(50), view_perc float);
insert into view_logs(session_id, post_id, ts, event_name, view_perc)
values
('m1', 1000, 1524600, 'view_start', null),
('m1', 1000, 1524602, 'view_end', 0.85),
('m1', 1000, 1524650, 'view_start', null),
('m1', 1000, 1524654, 'view_end', 0.9),
('m1', 2000, 1524700, 'view_start', null),
('m1', 2000, 1524707, 'view_end', 0.3),
('m1', 2000, 1524710, 'view_start', null),
('m1', 2000, 1524713, 'view_end', 0.9);
I have tried this fiddle but not getting desired output. can anyone help me to identify what I am doing wrong?
query:
with cte as
(
select
pv1.session_id,
pv1.post_id,
pv2.view_perc,
pv1.ts as start_time,
min(pv2.ts) as end_time
from view_logs pv1
join view_logs pv2
on pv1.session_id = pv2.session_id
and pv1.post_id = pv2.post_id
and pv1.event_name <> pv2.event_name
and pv1.ts < pv2.ts
group by
pv1.session_id,
pv1.post_id,
pv2.view_perc,
pv1.ts
)
select
session_id,
post_id,
start_time,
end_time,
case
when (end_time - start_time > 3 and view_perc > 0.8 )then 'yes'
else 'no'
end as has_meaningful_view
from cte
desired output:
*--------------------------------------------------------*
|session_id| post_id | start_time | end_time | has_impact|
*--------------------------------------------------------*
| m1 | 1000 | 1524600 | 1524602 | no |
| m1 | 1000 | 1524650 | 1524654 | yes |
| m1 | 2000 | 1524700 | 1524707 | no |
| m1 | 2000 | 1524710 | 1524714 | no |
*--------------------------------------------------------*
2
Answers
this might not get you exactly what you want but it does calculate the
has_impact
columnoutput:
Supposing that each start time has an end time, you can aggregate using row_number function as the following:
Demo