I have a table containing measurement information of several tests per devices:
- device_id gives the information on which device was tested
- measurement_no is an incrementing number giving the order in which tests has been performed
- test gives you the name of the test which is performed
- is_last_measurement_on_test is a boolean field, giving the information if the specific row is the last measurement of a test. It returns true, if the row is the last row of the device for an specific test. It returns false, if there is a subsequent row of the same device for the same test.
- ok gives information is the test was okay (=true) or not okay (=false)
- error_code gives you a specific error code if ok=false, or 0 if ok=true
WITH measurements (device_id,measurement_no,test,is_last_measurement_on_test,ok,error_code) AS ( VALUES
-- case 1: all measurements good, expecting to show test 3 only
('d1',1,'test1',true,true,0),
('d1',2,'test2',true,true,0),
('d1',3,'test3',true,true,0),
-- case 2: test 2, expecting to show test 2 only
('d2',1,'test1',true,true,0),
('d2',2,'test2',true,false,100),
('d2',3,'test3',true,true,0),
-- case 3: test 2 und 3 bad, expecting to show test 2 only
('d3',1,'test1',true,true,0),
('d3',2,'test2',true,false,100),
('d3',3,'test3',true,false,200),
-- case 4: test 2 bad on first try, second time good, expecting to show test 3 only
('d4',1,'test1',true,true,0),
('d4',2,'test2',false,false,100),
('d4',3,'test2',true,true,0),
('d4',4,'test3',true,true,0)
)
select * from measurements
where is_last_measurement_on_test=true
Now I want to filter these rows on following conditions per device:
- Only the last measurement on each test should be considered -> that’s easy: filtering on is_last_measurement_on_test=true
- For every device: If there is a bad result (ok=false) in any test where is_last_measurement_on_test=true, I want to display the first test on which the device failed.
- For every device: If there is no bad result at all (ok=true) in any test where is_last_measurement_on_test=true, I want to display the last test on which the device passed.
For the given example above, I am expecting that only these rows to display:
('d1',3,'test3',true,true,0)
('d2',2,'test2',true,false,100)
('d3',2,'test2',true,false,100)
('d4',4,'test3',true,true,0)
How can I receive this result? I already tried a lot on using first_value, for example
first_value(nullif(error_code,0)) over (partition by device_id)
but i wasn’t able to handle it in the way I wanted it to be.
2
Answers
Having this sample data:
It will be like:
The idea is get the name of the first fail test and to order the test using row_number starting from the latest one.
The important part is that here I am ordering the test by there name. In your real scenario, I am guessing you have an record_id or date, which can be used to do this. So, you will need to change a little bit the code.
distinct on
gets you the one, "top" record perdevice_id
.order by
lets you establish the order according to which a record might or might not end up on "top". Since your 2nd and 3rd case require opposite ordering/priority of tests perdevice_id
:You can flip that order accordingly, with a
case
.It’s a common misconception that
order by
section is somehow restricted to plain column names or aliases. Meanwhile, quoting the doc, it gives you as much freedom asselect
section does:Online demo.